Correlation vs Causation Cheat Sheet
A printable reference covering correlation, causation, scatter plots, correlation coefficient, lurking variables, and experimental evidence for grades 8-12.
Distinguishing correlation from causation helps students decide whether two variables are simply related or whether one variable truly produces a change in the other. This cheat sheet is useful because graphs, headlines, and studies often show patterns that can be misread. Students need clear rules for interpreting scatter plots, correlation values, and evidence claims. The goal is to make statistical conclusions careful, accurate, and supported by data.

The core idea is that correlation measures association, while causation requires stronger evidence. A scatter plot shows direction, form, and strength, and the correlation coefficient $r$ summarizes the strength and direction of a linear relationship. A strong value of $r$ does not prove that one variable causes the other. To argue causation, students should look for controlled experiments, random assignment, and plausible mechanisms, and should rule out possible lurking variables.
Key Facts
- Correlation means two variables are associated, so as one variable changes, the other tends to change in a pattern.
- A positive correlation means both variables tend to increase together, while a negative correlation means one tends to decrease as the other increases.
- The correlation coefficient $r$ measures the direction and strength of a linear relationship, with $-1 \le r \le 1$.
- Values of $r$ near $-1$ or $1$ show a strong linear relationship, while values near $0$ show little or no linear relationship.
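- The sample correlation coefficient can be calculated with $r = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}$.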
- The coefficient of determination $r^2$ gives the fraction of variation in the response variable explained by a linear model.
- Correlation does not prove causation because a lurking variable may affect both variables or the direction of cause may be reversed.
- Strong evidence for causation usually comes from a controlled experiment with random assignment, comparison groups, and careful control of other variables.
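As a minimal sketch of the calculation described above, the Python below computes the sample correlation coefficient from its standard definition (the sum of products of deviations, divided by the square root of the product of the summed squared deviations) and then squares it to get the coefficient of determination. The function name `pearson_r` and the data values are illustrative assumptions, not taken from any real study.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative data (hypothetical, not from a real study).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

r = pearson_r(hours, scores)
r_squared = r * r
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

For this made-up data the points lie close to a rising line, so $r$ comes out close to $1$. That describes the pattern only; it says nothing about whether studying caused the scores.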
Vocabulary
- Correlation: A statistical relationship showing how two variables tend to change together.
- Causation: A cause-and-effect relationship in which a change in one variable directly produces a change in another variable.
- Scatter Plot: A graph of paired data values that helps show the direction, form, and strength of a relationship.
- Correlation Coefficient: A number between $-1$ and $1$ that describes the strength and direction of a linear relationship.
- Lurking Variable: An unmeasured variable that may explain or influence the relationship between two studied variables.
- Controlled Experiment: A study design that compares groups while controlling conditions so researchers can test for cause and effect.
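To make the idea of a lurking variable concrete, here is a small simulation sketch. It assumes a hypothetical town where daily temperature drives both ice cream sales and pool visits, and neither of those affects the other. Every number in it (the slopes, the noise levels, the seed, the 200 days) is an invented assumption chosen only for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired lists xs and ys."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)  # fixed seed so the simulation is repeatable

# Lurking variable: daily temperature (hypothetical, in degrees Celsius).
temps = [random.uniform(15, 35) for _ in range(200)]

# Each quantity depends on temperature plus its own independent noise.
# Neither one appears in the formula for the other.
ice_cream = [10 * t + random.gauss(0, 20) for t in temps]
pool_visits = [5 * t + random.gauss(0, 15) for t in temps]

r_raw = pearson_r(ice_cream, pool_visits)

# Remove the (known, simulated) temperature effect from both quantities.
ice_left = [s - 10 * t for s, t in zip(ice_cream, temps)]
pool_left = [v - 5 * t for v, t in zip(pool_visits, temps)]
r_adjusted = pearson_r(ice_left, pool_left)

print(f"r before accounting for temperature: {r_raw:.2f}")
print(f"r after accounting for temperature:  {r_adjusted:.2f}")
```

In this sketch the first correlation should be strongly positive even though ice cream sales and pool visits never influence each other, while the second should sit near $0$ once temperature is accounted for. In real data the lurking variable is usually unmeasured, so this kind of adjustment is often not available, which is why controlled experiments matter.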
Common Mistakes to Avoid
- Saying a strong correlation proves causation is wrong because a value of $r$ close to $1$ or $-1$ can still happen when another variable affects both quantities.
- Ignoring lurking variables is wrong because a hidden factor can create the pattern, such as temperature affecting both ice cream sales and swimming pool visits.
- Using $r$ for a curved relationship is wrong because the correlation coefficient measures linear association, not all possible patterns.
- Assuming that $r$ near $0$ means no relationship is wrong because the data may have a strong nonlinear pattern even when the linear correlation is near $0$.
- Confusing direction with strength is wrong because the sign of $r$ shows direction, while the distance of $r$ from $0$ shows strength.
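The mistake about nonlinear patterns can be checked with a short sketch. The example below uses made-up points that lie exactly on the curve $y = x^2$, so $y$ is completely determined by $x$, yet the linear correlation is $0$ because the falling left half and the rising right half cancel. The helper name `pearson_r` and the data are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired lists xs and ys."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up points on a perfect U-shaped curve, symmetric around x = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curve = pearson_r(xs, ys)
print(f"r for y = x^2 on symmetric x values: {r_curve:.3f}")
```

A scatter plot of these points would show the U shape immediately, which is why it helps to look at the plot before relying on $r$ alone.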
Practice Questions
- 1 A study finds that hours studied and test score have . Describe the direction and strength of the relationship.
- 2 A data set has . What is $r^2$, and what does it mean in context for a linear model?
- 3 A city finds that daily temperature and lemonade sales have . Does this prove that buying lemonade raises the temperature? Explain briefly.
- 4 A school reports that students who join a math club have higher math scores than students who do not. Explain why this observation alone does not prove the club caused the higher scores.