Statistics
Grade 11-12
Multiple Regression Cheat Sheet
A printable reference covering multiple regression equations, coefficients, residuals, $R^2$, adjusted $R^2$, and prediction for grades 11-12.
Multiple regression is used to predict one response variable from two or more explanatory variables. This cheat sheet helps students organize the model equation, interpret coefficients, and check how well a model fits data. It is especially useful when several factors may affect the same outcome, such as predicting test scores from study time, attendance, and sleep.
Key Facts
- A multiple regression model with two predictors is written as $\hat{y} = b_0 + b_1x_1 + b_2x_2$, where $\hat{y}$ is the predicted response.
- The intercept $b_0$ is the predicted value of $y$ when all explanatory variables equal $0$, if that situation makes sense in context.
- A slope coefficient $b_i$ estimates the change in $\hat{y}$ for a $1$-unit increase in $x_i$ while all other predictors are held constant.
- A residual is the prediction error for one data point, calculated by $e = y - \hat{y}$.
- The coefficient of determination $R^2$ is the proportion of variation in $y$ explained by the regression model, with $0 \le R^2 \le 1$.
- Adjusted $R^2$ penalizes unnecessary predictors and is often better than $R^2$ for comparing models with different numbers of explanatory variables.
- Multicollinearity occurs when predictors are strongly related to each other, which can make coefficient estimates unstable and hard to interpret.
- A prediction should usually be made only within the range of the original data because extrapolation can be unreliable.
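The facts above can be tried on a small example. The sketch below fits a model of the form $\hat{y} = b_0 + b_1x_1 + b_2x_2$ by least squares, then computes predictions, residuals $y - \hat{y}$, and $R^2$. The data values and the helper name `fit_two_predictors` are made up for illustration, and solving the normal equations by hand is only one of several ways to get the coefficients.

```python
# Hypothetical data (not from any real study):
# x1 = study hours, x2 = hours of sleep, y = exam score.
x1 = [2, 4, 5, 7, 8, 10]
x2 = [6, 7, 5, 8, 6, 7]
y = [62, 71, 70, 85, 83, 94]


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y-hat = b0 + b1*x1 + b2*x2 via the normal equations.

    Illustrative helper. It divides by zero if the predictors are
    perfectly collinear, because the system then has no unique solution.
    """
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    # Augmented 3x4 system: (X^T X | X^T y).
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]


b0, b1, b2 = fit_two_predictors(x1, x2, y)

# Predicted responses and residuals (observed minus predicted).
y_hat = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

# R^2 = 1 - SSE / SST.
y_bar = sum(y) / len(y)
sse = sum(e ** 2 for e in residuals)
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / sst

print(f"y-hat = {b0:.2f} + {b1:.2f}*x1 + {b2:.2f}*x2")
print("residuals:", [round(e, 2) for e in residuals])
print(f"R^2 = {r_squared:.3f}")
```

Because the model includes an intercept and is fit by least squares, the residuals should sum to approximately zero, which is a quick way to check the arithmetic.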
Vocabulary
- Multiple Regression
- A statistical method that predicts one response variable using two or more explanatory variables.
- Response Variable
- The variable being predicted or explained, usually represented by $y$.
- Explanatory Variable
- A variable used to predict the response variable, often represented by $x_1$, $x_2$, and so on.
- Coefficient
- A number in the regression equation that shows how a predictor is associated with the predicted response when other predictors are held constant.
- Residual
- The difference between an observed value and its predicted value, calculated as $y - \hat{y}$.
- Multicollinearity
- A problem that occurs when explanatory variables are highly correlated with each other.
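One simple, rough screen for multicollinearity is to compute the correlation between each pair of explanatory variables before fitting the model. The sketch below does this with a hand-written Pearson correlation. The variable names and data are hypothetical, and there is no single agreed cutoff for "too correlated", so treat the result as a warning sign rather than a rule.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical predictors for six students.
study_hours = [2, 4, 5, 7, 8, 10]
homework_hours = [1, 2, 3, 3, 4, 5]  # rises almost in step with study_hours
sleep_hours = [6, 7, 5, 8, 6, 7]

r_study_homework = pearson_r(study_hours, homework_hours)
r_study_sleep = pearson_r(study_hours, sleep_hours)

print(f"study vs homework: r = {r_study_homework:.2f}")  # about 0.98
print(f"study vs sleep:    r = {r_study_sleep:.2f}")     # about 0.33
```

In this made-up data, study hours and homework hours carry nearly the same information, so a model using both could have slopes that are hard to interpret separately.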
Common Mistakes to Avoid
- Interpreting a coefficient without saying other variables are held constant is wrong because each slope in multiple regression adjusts for the other predictors in the model.
- Assuming a larger $R^2$ always means a better model is wrong because adding more predictors can increase $R^2$ even when those predictors are not useful.
- Using the model far outside the data range is wrong because extrapolated predictions may not follow the same pattern seen in the sample.
- Treating correlation between predictors as harmless is wrong because strong multicollinearity can make slopes change dramatically when the model changes.
- Confusing residuals with predicted values is wrong because a residual measures error, while $\hat{y}$ is the model's predicted response.
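The warning about relying on a larger $R^2$ can be illustrated with the standard formula for adjusted $R^2$, which is $1 - (1 - R^2)\frac{n - 1}{n - k - 1}$ for $n$ observations and $k$ predictors. The sketch below compares two models; the $R^2$ values, sample size, and predictor counts are invented for illustration.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n observations and k predictors (requires n > k + 1)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


n = 20  # hypothetical sample size

# Hypothetical models: the larger one adds three predictors
# but raises R^2 only slightly.
r2_small, k_small = 0.80, 2
r2_large, k_large = 0.81, 5

adj_small = adjusted_r_squared(r2_small, n, k_small)  # about 0.776
adj_large = adjusted_r_squared(r2_large, n, k_large)  # about 0.742

print(f"2 predictors: R^2 = {r2_small:.2f}, adjusted R^2 = {adj_small:.3f}")
print(f"5 predictors: R^2 = {r2_large:.2f}, adjusted R^2 = {adj_large:.3f}")
```

Here the five-predictor model has the higher $R^2$ but the lower adjusted $R^2$, which suggests the extra predictors may not be earning their place.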
Practice Questions
- 1 A model predicts final exam score with , where $x_1$ is study hours and $x_2$ is hours of sleep. Find $\hat{y}$ when and .
- 2 For one student, the observed score is and the predicted score is . Find the residual $y - \hat{y}$.
- 3 In the model , where $x_1$ is years of experience and $x_2$ is commute distance in miles, interpret the coefficient in context.
- 4 A model has a high $R^2$, but two predictors are strongly correlated with each other. Explain why the model may still be difficult to interpret.