Chi-Square Goodness of Fit
Overview
The Chi-Square Goodness of Fit test assesses whether observed categorical counts are consistent with an expected distribution. Whether you’re testing if a die is fair, if color preferences follow a uniform distribution, or if demographic data matches census proportions, this test quantifies how well your observations align with theoretical expectations. The chi-square statistic measures the discrepancy between what you observe and what you expect.
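As a minimal sketch of the fair die case, the statistic can be computed by hand in a few lines. The roll counts below are hypothetical, made up purely for illustration.

```python
# Hypothetical data: 60 rolls of a six-sided die (counts for faces 1 to 6).
observed = [8, 12, 9, 11, 6, 14]

# Under the "fair die" hypothesis every face is expected equally often.
n = sum(observed)
expected = [n / len(observed)] * len(observed)

# Chi-square statistic: sum over categories of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"expected count per face: {expected[0]:.1f}")  # 10.0
print(f"chi-square statistic: {chi_square:.2f}")      # 4.20
```

With six categories there are 5 degrees of freedom, and the 5% critical value for 5 degrees of freedom is about 11.07, so a statistic of 4.20 gives no evidence against fairness for these made-up counts.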
Tips
- The chi-square statistic Σ(O-E)²/E sums, across all categories, the squared difference between the observed count O and the expected count E, divided by E
- Larger chi-square values indicate greater discrepancy between observed and expected frequencies
- Each category contributes its own (O-E)²/E term to the total chi-square; examine which categories deviate most
- Degrees of freedom = number of categories - 1 (or number of categories - 1 - k if k parameters of the expected distribution are estimated from the data)
- As a rule of thumb, the chi-square approximation is reliable when the expected frequency is ≥ 5 in each category; combine sparse categories if it is not
- Try the fair die example to see how random variation affects the test
- Compare uniform vs. custom expected distributions to see that the test works with any set of expected proportions
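- P-value < 0.05 suggests, at the 5% significance level, that the data doesn’t fit the expected distribution; a larger p-value means no significant misfit was detected, not that the fit is proven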
- Visual comparison of observed vs expected bars makes discrepancies obvious
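
The tips above can be pulled together in a short sketch that reports per-category contributions, degrees of freedom, a p-value, and any categories with small expected counts. The function names `chi2_sf` and `goodness_of_fit`, the `estimated_params` argument, and the color counts are all assumptions made for this illustration, not part of any library. The p-value uses the standard recurrence for the chi-square upper tail with integer degrees of freedom, so only the Python standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y) = erfc(sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y^a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def goodness_of_fit(observed, proportions, estimated_params=0):
    """Chi-square goodness of fit for counts against expected proportions.

    estimated_params is the number k of parameters estimated from the data.
    """
    n = sum(observed)
    expected = [n * p for p in proportions]
    # One (O - E)^2 / E term per category; their sum is the statistic.
    contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    statistic = sum(contributions)
    df = len(observed) - 1 - estimated_params
    return {
        "expected": expected,
        "contributions": contributions,
        "statistic": statistic,
        "df": df,
        "p_value": chi2_sf(statistic, df),
        # Indices of categories that break the "expected >= 5" rule of thumb.
        "small_expected": [i for i, e in enumerate(expected) if e < 5],
    }


# Hypothetical color-preference counts tested against a uniform distribution.
colors = ["red", "blue", "green", "yellow"]
result = goodness_of_fit([30, 14, 34, 22], [0.25, 0.25, 0.25, 0.25])

for color, term in zip(colors, result["contributions"]):
    print(f"{color:>6}: contribution {term:.2f}")
print(f"chi-square = {result['statistic']:.2f}, df = {result['df']}")
print(f"p-value = {result['p_value']:.4f}")
```

Passing different values in `proportions` tests against a custom expected distribution instead of a uniform one. If SciPy is available, `scipy.stats.chisquare(f_obs, f_exp, ddof)` computes the same statistic and p-value from observed and expected counts.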