Proportions and Chi-Square Tests
Many statistical questions involve counts rather than measurements: how many respondents prefer each candidate, how many parts are defective, how many patients improved, or whether two categorical variables are associated. The Lane text treats proportions and chi-square tests as central tools for categorical data. They connect probability, sampling distributions, and hypothesis testing in a setting where means and standard deviations are not the natural summaries.
The main distinction is between one categorical variable and two categorical variables. With one variable, we may estimate or test a population proportion, or test whether several category probabilities match a claimed distribution. With two variables, we ask whether the variables are independent or associated in a population. The calculations use counts, but the interpretation must return to proportions, context, and design.
Definitions
A sample proportion is
$$\hat{p} = \frac{x}{n},$$
where $x$ is the number of sampled cases with the characteristic and $n$ is the sample size.
A one-proportion z test tests
$$H_0: p = p_0 \quad \text{versus} \quad H_a: p \neq p_0$$
with statistic
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}.$$
The null value $p_0$ is used in the standard error because the p-value is computed under the null hypothesis.
A confidence interval for a proportion often has the approximate form
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.$$
For small samples or proportions near 0 or 1, improved intervals such as Wilson or exact methods are preferable.
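The Wald interval above and a Wilson alternative can be compared side by side. A minimal sketch (the function names `wald_interval` and `wilson_interval` are illustrative, not from a library; the Wilson formula follows the standard score-interval algebra):

```python
import numpy as np
from scipy import stats

def wald_interval(x, n, conf=0.95):
    """Simple Wald interval: phat +/- z* sqrt(phat(1-phat)/n)."""
    z = stats.norm.ppf(1 - (1 - conf) / 2)
    phat = x / n
    half = z * np.sqrt(phat * (1 - phat) / n)
    return phat - half, phat + half

def wilson_interval(x, n, conf=0.95):
    """Wilson score interval; better coverage for small n or extreme phat."""
    z = stats.norm.ppf(1 - (1 - conf) / 2)
    phat = x / n
    center = (phat + z**2 / (2 * n)) / (1 + z**2 / n)
    half = (z / (1 + z**2 / n)) * np.sqrt(
        phat * (1 - phat) / n + z**2 / (4 * n**2)
    )
    return center - half, center + half

# Hypothetical small sample with phat = 3/20 = 0.15, near the boundary.
print(wald_interval(3, 20))    # lower endpoint dips below 0
print(wilson_interval(3, 20))  # stays inside (0, 1)
```

Note how the Wald interval's lower endpoint goes negative, an impossible value for a proportion, while the Wilson interval is pulled toward 0.5 and stays in range.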
A chi-square goodness-of-fit test compares observed counts in one categorical variable with expected counts from a hypothesized distribution:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}.$$
The degrees of freedom are often $k - 1$ when no parameters are estimated from the data.
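As a sketch, `scipy.stats.chisquare` runs this test directly; the die-roll counts here are hypothetical, invented for illustration:

```python
from scipy import stats

# Hypothetical data: 120 die rolls, testing fairness (each face p = 1/6).
observed = [25, 17, 15, 23, 24, 16]
expected = [120 / 6] * 6  # 20 per face under H0

# chisquare computes sum((O - E)^2 / E) with df = k - 1 by default
chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {chi2:.3f}, df = {len(observed) - 1}, p = {p:.4f}")
```

With these counts the statistic is 5.0 on 5 degrees of freedom, nowhere near significance, so the data are consistent with a fair die.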
A contingency table displays counts for two categorical variables. A chi-square test of independence asks whether row and column variables are independent in the population. If row total $R_i$, column total $C_j$, and grand total $N$ are known, the expected count under independence is
$$E_{ij} = \frac{R_i C_j}{N}.$$
The test statistic is again
$$\chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}},$$
summed over all cells.
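The expected-count formula is a single outer product of the margins in NumPy. A quick sketch with a small hypothetical 2x2 table:

```python
import numpy as np

# Hypothetical 2x2 table of counts
observed = np.array([[30, 10],
                     [20, 40]])
row_totals = observed.sum(axis=1)   # [40, 60]
col_totals = observed.sum(axis=0)   # [50, 50]
grand = observed.sum()              # 100

# E_ij = (row total i)(column total j) / grand total, all cells at once
expected = np.outer(row_totals, col_totals) / grand
print(expected)  # [[20. 20.] [30. 30.]]
```

The expected table always reproduces the same margins as the observed table, which is a useful sanity check.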
Key results
For a one-proportion z test, the normal approximation works best when expected successes and failures under the null are sufficiently large:
$$np_0 \geq 10 \quad \text{and} \quad n(1 - p_0) \geq 10$$
is a common introductory rule. For intervals, the analogous check often uses $n\hat{p}$ and $n(1 - \hat{p})$.
For a two-way table with $r$ rows and $c$ columns, the degrees of freedom for the chi-square independence test are
$$df = (r - 1)(c - 1).$$
The chi-square statistic is always nonnegative. Larger values indicate larger discrepancies between observed and expected counts. The p-value is an upper-tail probability:
$$p = P\left(\chi^2_{df} \geq \chi^2_{\text{obs}}\right).$$
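In code, the upper-tail probability is the chi-square survival function. A one-line sketch, using an illustrative statistic of 8.73 on 2 degrees of freedom:

```python
from scipy import stats

# Upper-tail p-value for an observed chi-square statistic
chi2_obs, df = 8.73, 2
p_value = stats.chi2.sf(chi2_obs, df)  # sf = 1 - cdf, the upper tail
print(f"p = {p_value:.4f}")
```

Using `sf` rather than `1 - cdf` avoids loss of precision when the p-value is very small.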
Expected counts, not observed counts, determine whether the large-sample chi-square approximation is reliable. A common rule is that all expected counts should be at least 5, though modern practice is more nuanced. When counts are small in a table, Fisher's exact test is often used.
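Fisher's exact test is available as `scipy.stats.fisher_exact`. A sketch with a hypothetical 2x2 table whose expected counts fall below 5:

```python
from scipy import stats

# Hypothetical small 2x2 table; some expected counts are below 5,
# so the chi-square approximation is questionable here.
table = [[2, 7],
         [8, 3]]
odds_ratio, p = stats.fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.3f}, p = {p:.4f}")
```

The exact p-value comes from the hypergeometric distribution over tables with the same margins, so no large-sample condition is needed.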
For a $2 \times 2$ table, effect sizes include risk difference, relative risk, odds ratio, and phi. For larger tables, Cramer's $V$ summarizes association strength:
$$V = \sqrt{\frac{\chi^2}{N \min(r - 1, c - 1)}}.$$
The p-value answers whether the data are surprising under independence; effect size answers how strong the association is.
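Cramer's V is not returned by `scipy.stats.chi2_contingency`, but it is a one-line computation from the statistic. A sketch using the study-format table from Worked example 2:

```python
import numpy as np
from scipy import stats

# Study-format table: rows = class year, columns = format
table = np.array([[18, 22, 20],
                  [24, 28, 68]])
chi2, p, df, expected = stats.chi2_contingency(table, correction=False)

# Cramer's V = sqrt(chi2 / (N * min(r - 1, c - 1)))
n = table.sum()
r, c = table.shape
v = np.sqrt(chi2 / (n * min(r - 1, c - 1)))
print(f"chi2 = {chi2:.3f}, p = {p:.4f}, V = {v:.3f}")
```

Here the test is significant but V is modest, which illustrates the difference between surprise under independence and strength of association.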
Percentages should usually be reported in the direction that matches the question. If the rows are class years and the question is whether first-year and upper-year students prefer different formats, compare row percentages. If the columns are formats and the question is what kinds of students choose each format, compare column percentages. The same table can support both summaries, but mixing denominators in one sentence creates confusion. Always name the denominator: "60% of commuters work 10+ hours" is clearer than "60% work," especially when several groups appear in the table.
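Row and column percentages come from dividing by different marginal totals. A sketch using the study-format table from Worked example 2:

```python
import numpy as np

# Study-format table: rows = class year, columns = format
table = np.array([[18, 22, 20],
                  [24, 28, 68]], dtype=float)

# Row percentages: within each class year, what fraction prefers each format?
row_pct = table / table.sum(axis=1, keepdims=True) * 100
# Column percentages: within each format, what fraction is from each year?
col_pct = table / table.sum(axis=0, keepdims=True) * 100

print(np.round(row_pct, 1))  # denominators are the row totals 60 and 120
print(np.round(col_pct, 1))  # denominators are the column totals 42, 50, 88
```

Each row of `row_pct` sums to 100, as does each column of `col_pct`; naming the denominator in a report is what keeps the two from being confused.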
Visual
| Test | Data | Null hypothesis | Statistic | df |
|---|---|---|---|---|
| One-proportion z | binary counts | $p = p_0$ | standardized $z$ | normal, no df |
| Goodness-of-fit | one categorical variable | probabilities match claim | $\chi^2 = \sum (O - E)^2 / E$ | $k - 1$ |
| Independence | two categorical variables | variables independent | $\chi^2 = \sum (O - E)^2 / E$ | $(r - 1)(c - 1)$ |
| Fisher exact | small $2 \times 2$ table | variables independent | hypergeometric probability | exact |
Worked example 1: One-proportion z test
Problem: A city claims that 60% of commuters use public transit at least once per week. In a random sample of 250 commuters, 135 report using public transit weekly. Test the claim against a two-sided alternative at $\alpha = 0.05$.
Method:
- State hypotheses: $H_0: p = 0.60$ versus $H_a: p \neq 0.60$.
- Compute the sample proportion: $\hat{p} = 135/250 = 0.54$.
- Check expected counts under the null: $np_0 = 250(0.60) = 150$ and $n(1 - p_0) = 250(0.40) = 100$.
Both are large enough for the normal approximation.
- Compute the null standard error: $SE_0 = \sqrt{0.60(0.40)/250} \approx 0.0310$.
- Compute the z statistic: $z = (0.54 - 0.60)/0.0310 \approx -1.94$.
- Two-sided p-value: $p = 2\,P(Z \leq -1.94) \approx 0.053$.
Answer: Since $p \approx 0.053 > 0.05$, fail to reject $H_0$ at the 5% level. The data are close to the threshold but do not provide statistically significant evidence that the true proportion differs from 60% at $\alpha = 0.05$.
Checked answer: The sample proportion is 6 percentage points below the claim. With $SE_0 \approx 3.1$ percentage points, the observed difference is just under two standard errors.
Worked example 2: Chi-square test of independence
Problem: A survey records preferred study format and class year for 180 students.
| | Online | Hybrid | In-person | Total |
|---|---|---|---|---|
| First-year | 18 | 22 | 20 | 60 |
| Upper-year | 24 | 28 | 68 | 120 |
| Total | 42 | 50 | 88 | 180 |
Test whether study-format preference is independent of class year.
Method:
- State hypotheses: $H_0$: class year and study format are independent; $H_a$: they are associated.
- Compute expected counts using $E_{ij} = R_i C_j / N$.
For first-year online: $E_{11} = 60 \times 42 / 180 = 14$.
First-year hybrid: $E_{12} = 60 \times 50 / 180 \approx 16.67$.
First-year in-person: $E_{13} = 60 \times 88 / 180 \approx 29.33$.
Upper-year expected counts are 28, 33.33, and 58.67.
- Compute contributions: $(18 - 14)^2/14 \approx 1.14$, $(22 - 16.67)^2/16.67 \approx 1.71$, $(20 - 29.33)^2/29.33 \approx 2.97$, $(24 - 28)^2/28 \approx 0.57$, $(28 - 33.33)^2/33.33 \approx 0.85$, $(68 - 58.67)^2/58.67 \approx 1.48$.
- Sum: $\chi^2 \approx 8.73$.
- Degrees of freedom: $(2 - 1)(3 - 1) = 2$.
- The p-value is $P(\chi^2_2 \geq 8.73) \approx 0.013$.
Answer: Reject independence at $\alpha = 0.05$. There is evidence of an association between class year and preferred study format. The table suggests upper-year students choose in-person more often than expected under independence.
Checked answer: All expected counts exceed 5, so the chi-square approximation is reasonable. The largest discrepancy is first-year in-person, where the observed count 20 is much lower than expected 29.33.
Code
```python
import numpy as np
from scipy import stats

# One-proportion z test
x, n, p0 = 135, 250, 0.60
phat = x / n
se0 = np.sqrt(p0 * (1 - p0) / n)
z = (phat - p0) / se0
p_value = 2 * stats.norm.cdf(-abs(z))
print(f"z = {z:.3f}, p = {p_value:.4f}")

# Chi-square independence test
table = np.array([[18, 22, 20],
                  [24, 28, 68]])
chi2, p, df, expected = stats.chi2_contingency(table, correction=False)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p:.4f}")
print(expected)
```
The chi-square function returns the expected-count table. Always inspect it; a p-value is not enough if the approximation conditions are poor.
Common pitfalls
- Using the sample proportion in the standard error for a hypothesis test instead of the null proportion.
- Treating the chi-square statistic as if negative discrepancies cancel positive discrepancies. Squaring prevents cancellation.
- Interpreting a significant chi-square test without examining which cells drive the association.
- Forgetting that counts must be independent; repeated responses from the same person break the usual test.
- Using chi-square tests for percentages without the underlying counts.
- Reporting association as causation in an observational contingency table.
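The first pitfall is easy to demonstrate numerically. A sketch using the data from Worked example 1, contrasting the null standard error with the estimated one:

```python
import numpy as np

# Data from Worked example 1: x = 135, n = 250, p0 = 0.60
x, n, p0 = 135, 250, 0.60
phat = x / n

se_null = np.sqrt(p0 * (1 - p0) / n)     # correct for the hypothesis test
se_hat = np.sqrt(phat * (1 - phat) / n)  # correct for the confidence interval

z_null = (phat - p0) / se_null
z_wrong = (phat - p0) / se_hat  # the common mistake in the test setting
print(f"z with null SE = {z_null:.3f}, z with estimated SE = {z_wrong:.3f}")
```

The difference is small here, but near a significance threshold, as in this example, it can change the stated conclusion, which is why the null value belongs in the test's standard error.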