Statistics: Multiple Regression and Interaction Terms
Interpreting models with more than one predictor
Statistics - Grade 9-12
- 1
A regression model predicts a student's exam score from hours studied and hours slept: predicted score = 48 + 6.2(hours studied) + 3.5(hours slept). What is the predicted score for a student who studied 5 hours and slept 7 hours?
Substitute each value into the equation, multiply it by its coefficient, and then add the terms.
The predicted score is 103.5. Substitute 5 for hours studied and 7 for hours slept: 48 + 6.2(5) + 3.5(7) = 48 + 31 + 24.5 = 103.5.
- 2
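As a quick check on Problem 1, the substitution can be scripted. This is a minimal Python sketch; the function name `predicted_score` is an illustrative choice and is not part of the worksheet.

```python
def predicted_score(hours_studied, hours_slept):
    """Predicted exam score from the Problem 1 model."""
    # intercept + 6.2 points per hour studied + 3.5 points per hour slept
    return 48 + 6.2 * hours_studied + 3.5 * hours_slept


score = predicted_score(5, 7)
print(round(score, 1))  # 103.5
```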
In the model predicted house price in thousands of dollars = 120 + 35(bedrooms) + 0.8(square feet in hundreds), interpret the coefficient 35 for bedrooms.
The coefficient 35 means that, holding square footage constant, each additional bedroom is associated with an increase of 35 thousand dollars in the predicted house price.
- 3
A model predicts monthly sales using advertising spending and store size: predicted sales = 20 + 4(ad spending in thousands) + 1.5(store size in hundreds of square feet). Compare two stores that are the same size. If Store A spends $3,000 more on advertising than Store B, how much higher are Store A's predicted monthly sales?
Use only the advertising coefficient because store size is held constant.
Store A's predicted monthly sales are 12 units higher. Since advertising is measured in thousands of dollars, $3,000 is 3 units, and 4(3) = 12.
- 4
A regression equation is predicted final grade = 50 + 4(homework hours) + 2(class participation score). A student has 6 homework hours and a class participation score of 8. Another student has 4 homework hours and a class participation score of 10. Which student has the higher predicted final grade, and by how much?
The first student has the higher predicted final grade, by 4 points. The first student has 50 + 4(6) + 2(8) = 50 + 24 + 16 = 90, and the second student has 50 + 4(4) + 2(10) = 50 + 16 + 20 = 86, so the difference is 90 - 86 = 4.
- 5
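The comparison in Problem 4 is easy to get wrong by hand, so it can help to compute both predictions the same way. The following is a minimal Python sketch; the name `predicted_grade` is hypothetical. It prints each student's predicted grade and the difference between them.

```python
def predicted_grade(homework_hours, participation):
    """Predicted final grade from the Problem 4 model."""
    return 50 + 4 * homework_hours + 2 * participation


first = predicted_grade(6, 8)    # 6 homework hours, participation score 8
second = predicted_grade(4, 10)  # 4 homework hours, participation score 10
print(first, second, first - second)
```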
A multiple regression model uses temperature and number of customers to predict daily ice cream sales. The coefficient for temperature is 18 when number of customers is also in the model. What does it mean to say the coefficient for temperature is interpreted while holding number of customers constant?
Holding a variable constant means comparing cases in which that variable stays the same while the other predictor changes.
It means the model compares days with the same number of customers. For each 1-degree increase in temperature, predicted sales increase by 18 units, assuming the number of customers does not change.
- 6
A model for predicted test score includes an interaction term: predicted score = 40 + 5(study hours) + 3(tutor) + 2(study hours × tutor), where tutor = 1 if the student had a tutor and 0 if not. What is the predicted score for a student who studied 4 hours and had a tutor?
When tutor = 1, the interaction term is study hours multiplied by 1.
The predicted score is 71. Substitute study hours = 4 and tutor = 1: 40 + 5(4) + 3(1) + 2(4 × 1) = 40 + 20 + 3 + 8 = 71.
- 7
Using the same model, predicted score = 40 + 5(study hours) + 3(tutor) + 2(study hours × tutor), what is the predicted score for a student who studied 4 hours and did not have a tutor?
The predicted score is 60. Substitute study hours = 4 and tutor = 0: 40 + 5(4) + 3(0) + 2(4 × 0) = 40 + 20 + 0 + 0 = 60.
- 8
Compare the two students from Problems 6 and 7. According to the model, how much higher is the predicted score for the student with a tutor after 4 hours of studying?
Subtract the two predicted scores.
The predicted score is 11 points higher for the student with a tutor. The tutored student's predicted score is 71, and the non-tutored student's predicted score is 60, so 71 - 60 = 11.
- 9
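The tutor model from Problems 6 through 8 can be sketched in Python to show how the interaction term changes the gap between the two groups. The function name `score_with_interaction` is an illustrative assumption. Because the gap works out to 3 + 2(study hours), it is not fixed: it grows as study hours increase.

```python
def score_with_interaction(study_hours, tutor):
    """Predicted score; tutor is 1 if the student had a tutor, 0 if not."""
    return 40 + 5 * study_hours + 3 * tutor + 2 * (study_hours * tutor)


with_tutor = score_with_interaction(4, 1)     # 71
without_tutor = score_with_interaction(4, 0)  # 60
gap = with_tutor - without_tutor              # 11

# The tutor gap at several study-hour levels: 3, 7, and 11 points.
for hours in (0, 2, 4):
    print(hours, score_with_interaction(hours, 1) - score_with_interaction(hours, 0))
```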
A model predicts plant height in centimeters: predicted height = 10 + 4(water) + 6(fertilizer) + 3(water × fertilizer), where water is measured in liters and fertilizer = 1 if fertilizer was used and 0 if not. What is the effect of one additional liter of water when fertilizer is not used?
Set fertilizer equal to 0 and look at how the prediction changes when water increases by 1.
When fertilizer is not used, one additional liter of water increases predicted plant height by 4 centimeters. This is because fertilizer = 0, so the interaction term adds 0 to the water effect.
- 10
Using the model predicted height = 10 + 4(water) + 6(fertilizer) + 3(water × fertilizer), what is the effect of one additional liter of water when fertilizer is used?
When fertilizer is used, one additional liter of water increases predicted plant height by 7 centimeters. The water coefficient is 4, and the interaction adds 3 more when fertilizer = 1, so 4 + 3 = 7.
- 11
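The two water effects from Problems 9 and 10 can be checked by computing the change in predicted height for one extra liter. This is a minimal Python sketch; the helper names and the starting level of 2 liters are arbitrary assumptions. Since the model is linear in water, any starting level gives the same result.

```python
def predicted_height(water, fertilizer):
    """Predicted plant height in cm; fertilizer is 1 if used, 0 if not."""
    return 10 + 4 * water + 6 * fertilizer + 3 * (water * fertilizer)


def water_effect(fertilizer, water=2):
    """Change in predicted height when water rises by 1 liter from `water`."""
    return predicted_height(water + 1, fertilizer) - predicted_height(water, fertilizer)


print(water_effect(0))  # 4
print(water_effect(1))  # 7
```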
A table shows predictions from a regression model with an interaction between exercise hours and diet plan. The predicted weight loss is 2 pounds for 0 exercise hours with no diet plan, 5 pounds for 1 exercise hour with no diet plan, 4 pounds for 0 exercise hours with the diet plan, and 9 pounds for 1 exercise hour with the diet plan. Does the effect of exercise appear to be the same for both diet groups? Explain.
Find the change from 0 to 1 exercise hour separately for each diet group.
The effect of exercise is not the same for both diet groups. With no diet plan, going from 0 to 1 exercise hour increases predicted weight loss by 3 pounds, from 2 to 5. With the diet plan, it increases predicted weight loss by 5 pounds, from 4 to 9. This difference suggests an interaction.
- 12
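For Problem 11, the comparison amounts to a difference of differences across the four table entries. The Python sketch below stores the table in a dictionary (a layout chosen only for illustration). If the table came from a model of the form a + b(exercise) + c(diet) + d(exercise × diet), which is an assumption about the model's form, the final value would be the interaction coefficient d.

```python
# Predicted weight loss in pounds, keyed by (exercise_hours, diet_plan).
predicted_loss = {(0, 0): 2, (1, 0): 5, (0, 1): 4, (1, 1): 9}

# Effect of one exercise hour within each diet group.
effect_no_diet = predicted_loss[(1, 0)] - predicted_loss[(0, 0)]  # 3
effect_diet = predicted_loss[(1, 1)] - predicted_loss[(0, 1)]     # 5

# A nonzero difference between the two effects indicates an interaction.
interaction = effect_diet - effect_no_diet                        # 2
print(effect_no_diet, effect_diet, interaction)
```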
A scatterplot shows that income and years of education are both positively related to spending on books. A multiple regression model includes both predictors. Why might the coefficient for education be different in the multiple regression model than in a simple regression model using education alone?
Think about how two predictors can overlap in what they explain.
The coefficient for education may be different because, in the multiple regression model, it describes the association between education and book spending while holding income constant. If education and income are related to each other, a simple regression using only education attributes some of income's association with book spending to education.
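The idea in Problem 12 can be illustrated with a small made-up data set. In the Python sketch below, every number is hypothetical and is not taken from the worksheet: book spending is constructed to follow 10 + 2(education) + 3(income) exactly, and income tends to rise with education. The simple regression slope on education alone comes out much larger than 2 because it also picks up the income effect, while the two-predictor fit recovers the coefficients 2 and 3.

```python
# Made-up data for illustration only (not from the worksheet).
education = [10, 12, 14, 16, 18]  # years of education
income = [20, 30, 25, 45, 50]     # income in thousands of dollars
# Spending is built to follow 10 + 2(education) + 3(income) exactly.
spending = [10 + 2 * ed + 3 * inc for ed, inc in zip(education, income)]


def centered(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    """Sum of elementwise products."""
    return sum(x * y for x, y in zip(a, b))


e_c, i_c, y_c = centered(education), centered(income), centered(spending)

# Simple regression: least-squares slope of spending on education alone.
simple_slope = dot(e_c, y_c) / dot(e_c, e_c)

# Multiple regression: solve the two least-squares normal equations.
det = dot(e_c, e_c) * dot(i_c, i_c) - dot(e_c, i_c) ** 2
education_coef = (dot(i_c, i_c) * dot(e_c, y_c) - dot(e_c, i_c) * dot(i_c, y_c)) / det
income_coef = (dot(e_c, e_c) * dot(i_c, y_c) - dot(e_c, i_c) * dot(e_c, y_c)) / det

print(simple_slope)    # 13.25
print(education_coef)  # 2.0
print(income_coef)     # 3.0
```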