Categorical Data & Chi-Square Lab
Test whether observed categorical data matches an expected distribution (goodness of fit) or whether two categorical variables are independent. Build contingency tables, compute expected counts, and interpret p-values from the chi-square statistic.
Guided Experiment: Testing a Fair Die
If a die is fair, each face should appear with equal probability (1/6). How can we test whether observed rolls deviate significantly from this expectation? What p-value would lead us to question the die's fairness?
Write your hypothesis in the Lab Report panel, then click Next.
Results
There is significant evidence that the observed distribution differs from the expected distribution (χ² = 13.28, df = 5, p ≈ 0.021).
| Category | Observed | Expected | Contribution |
|---|---|---|---|
| 1 | 18 | 16.67 | 0.107 |
| 2 | 15 | 16.67 | 0.167 |
| 3 | 23 | 16.67 | 2.407 |
| 4 | 25 | 16.67 | 4.167 |
| 5 | 8 | 16.67 | 4.507 |
| 6 | 11 | 16.67 | 1.927 |
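The Contribution column can be reproduced from the Observed and Expected columns. The sketch below assumes each contribution is (observed − expected)² / expected, with an expected count of n/6 per face; that assumption agrees with the table to three decimals. The variable names are illustrative and are not part of the lab.

```python
# Observed counts for faces 1-6, copied from the Results table.
observed = [18, 15, 23, 25, 8, 11]

n = sum(observed)               # total number of rolls
expected = n / len(observed)    # fair die: n * (1/6) for every face

# Per-face contribution to the chi-square statistic: (O - E)^2 / E.
contributions = [(o - expected) ** 2 / expected for o in observed]
chi_square = sum(contributions)

# Face with the largest contribution (faces are numbered from 1).
largest_face = max(range(len(observed)), key=lambda i: contributions[i]) + 1

for face, (o, c) in enumerate(zip(observed, contributions), start=1):
    print(f"face {face}: observed={o:2d} expected={expected:.2f} contribution={c:.3f}")
print(f"chi-square = {chi_square:.2f}, largest contributor: face {largest_face}")
```

Summing the contributions gives the test statistic, and the largest single contribution shows which face departs most from the fair-die expectation.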
Visualization
Observed vs Expected Counts
Contribution to χ² (heatmap)
Data Table
(0 rows)
| # | Trial | Test Type | χ² Statistic | df | p-value | Conclusion | Largest Contributor |
|---|---|---|---|---|---|---|---|
Reference Guide
The Chi-Square Statistic
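The chi-square statistic measures how far observed counts deviate from expected counts across all categories: χ² = Σ (O_i − E_i)² / E_i, where O_i and E_i are the observed and expected counts for category i.
Larger values indicate greater disagreement between observed and expected data. The statistic is always non-negative, and it equals zero only when every observed count matches its expected count.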
Goodness of Fit
Tests whether a single categorical variable follows a hypothesized distribution. Expected counts come from the hypothesized proportions: E_i = n × p_i, for i = 1, …, k.
Here n is the total sample size, p_i is the expected proportion for category i, and k is the number of categories. The test has k − 1 degrees of freedom.
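As a minimal sketch of how expected counts follow from hypothesized proportions, the snippet below multiplies the sample size by each proportion. The helper name `expected_counts` is hypothetical. The example uses the fair-die hypothesis from the guided experiment with n = 100 rolls, and the degrees of freedom follow the standard k − 1 convention for goodness of fit.

```python
def expected_counts(n, proportions):
    """Return the expected count n * p_i for each category."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("hypothesized proportions must sum to 1")
    return [n * p for p in proportions]

# Fair-die hypothesis: every face has probability 1/6.
fair_die = [1 / 6] * 6
counts = expected_counts(100, fair_die)
print([round(e, 2) for e in counts])

# Degrees of freedom for a goodness-of-fit test: k - 1.
df = len(fair_die) - 1
print("df =", df)
```

Unequal hypothesized proportions work the same way, provided they sum to 1.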
Test of Independence
Tests whether two categorical variables are associated. Expected counts are computed from row and column totals: for each cell, E = (row total × column total) / grand total.
If the variables are truly independent, observed counts should be close to these expected counts.
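The row-and-column-total calculation can be sketched as follows. The 2×3 table is made up for illustration and is not lab data, and the function name `expected_under_independence` is hypothetical. The degrees of freedom use the standard (rows − 1) × (columns − 1) rule.

```python
def expected_under_independence(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical contingency table (illustrative counts only).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]
expected = expected_under_independence(observed)

# Chi-square statistic: sum of (O - E)^2 / E over every cell.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(expected, round(chi_square, 3), df)
```

Because both rows in this made-up table have the same total, each row gets the same expected counts; with unequal row totals the expected counts would differ by row.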
Conditions & Interpretation
For the chi-square approximation to be valid, all expected counts should be at least 5. The p-value is the probability of observing a chi-square statistic at least as large as the one computed from the data if the null hypothesis were true.
A small p-value (below the significance level) provides evidence against the null hypothesis.
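To make the condition check and the p-value concrete, here is a standard-library-only sketch applied to the die data from the Results table. The function name `chi_square_p_value` is hypothetical, and the significance level of 0.05 is an assumption, since the lab does not state one. The upper-tail probability is built up from `math.erfc` (odd df) or `math.exp` (even df) using a recurrence for the incomplete gamma function, so it applies only to whole-number degrees of freedom.

```python
import math

def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for chi-square with integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if statistic <= 0:
        return 1.0
    y = statistic / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y) = e^(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y) = erfc(sqrt(y))
    while a < df / 2:
        # Recurrence: Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1)
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1))
        a += 1
    return q

observed = [18, 15, 23, 25, 8, 11]            # from the Results table
expected = [sum(observed) / 6] * 6            # fair-die expectation

# Condition: every expected count should be at least 5.
condition_ok = all(e >= 5 for e in expected)

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi_square_p_value(statistic, df=len(observed) - 1)

alpha = 0.05                                  # assumed significance level
reject_null = p_value < alpha
print(f"condition met: {condition_ok}")
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

With these counts the p-value falls below the assumed 0.05 level, which is consistent with the conclusion shown in the Results panel.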