Question
Suppose we obtain a data set of n = 40 from a car insurance company. For each individual, we obtain their age, their gender, and the monthly price of their car insurance. In order to include gender in our model, we create an indicator variable "male" which is equal to 1 if the individual is male and equal to 0 otherwise. We fit the following model:

Price = β0 + β1 * age + β2 * male + β3 * age * male

We obtain the following R output:

             Estimate Std. Error t value Pr(>|t|)
(Intercept)  169.9502    15.3079  11.102 3.54e-13 ***
age           -3.1192     0.4936  -6.319 2.61e-07 ***
male         100.2111    21.4729   4.667 4.14e-05 ***
age:male      -2.1295     0.6967  -3.057   0.0042 **

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 15.56 on 36 degrees of freedom
Multiple R-squared: 0.8558, Adjusted R-squared: 0.8437
F-statistic: 71.19 on 3 and 36 DF, p-value: 3.32e-15

1) Write down the full fitted model, as well as the two "sub-models" fit within this model (i.e. give the different fitted models for male versus female).
2) Interpret the coefficient of the interaction between age and male. Interpret the t-test R performed for this coefficient. What can we conclude from this test? (i.e. what does this tell us about the difference between the two "sub-models"?)

Explanation / Answer
(1) Full fitted model:
price = 169.9502 - 3.1192*age + 100.2111*male - 2.1295*age*male
Sub-model for males (male = 1):
price = 169.9502 - 3.1192*age + 100.2111*1 - 2.1295*age*1 = 270.1613 - 5.2487*age
Sub-model for females (male = 0):
price = 169.9502 - 3.1192*age + 100.2111*0 - 2.1295*age*0 = 169.9502 - 3.1192*age
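The arithmetic behind the two sub-models can be checked with a short script. This is only a sketch in Python (the original analysis was done in R); the coefficient values are copied from the fitted model above, while the names `submodel` and `predict` and the example age of 30 are illustrative assumptions, not part of the question.

```python
# Coefficients copied from the fitted model (intercept, age, male, age*male).
b0, b_age, b_male, b_inter = 169.9502, -3.1192, 100.2111, -2.1295


def submodel(male):
    """Return (intercept, slope) of price on age for male = 0 or male = 1."""
    return b0 + b_male * male, b_age + b_inter * male


def predict(age, male):
    """Fitted monthly price for a given age and gender indicator."""
    intercept, slope = submodel(male)
    return intercept + slope * age


male_line = submodel(1)    # intercept and slope for males
female_line = submodel(0)  # intercept and slope for females

print("male:   price = %.4f + (%.4f)*age" % male_line)
print("female: price = %.4f + (%.4f)*age" % female_line)

# Hypothetical example: fitted prices for a 30-year-old of each gender.
print("age 30, male:  ", round(predict(30, 1), 4))
print("age 30, female:", round(predict(30, 0), 4))
```

Setting male = 1 adds the male coefficient to the intercept and the interaction coefficient to the age slope, which is exactly how the two lines above are obtained.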
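(2) The estimated interaction coefficient is -2.1295. It is the difference between the age slopes of the two sub-models: for each additional year of age, the monthly price is estimated to fall by 2.1295 more for males than for females (a slope of -5.2487 for males versus -3.1192 for females).
The t-test for this coefficient tests H0: β3 = 0 against H1: β3 ≠ 0. The test statistic is t = -3.057 with p-value = 0.0042, which is less than alpha = 0.05, so we reject H0. We conclude that the interaction is negative and statistically significant: the effect of age on price differs between males and females, so the two sub-models have significantly different slopes and are not parallel lines.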
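The t statistic and p-value that R reports for the interaction can be approximately reproduced from the estimate and standard error in the output. The sketch below uses only the Python standard library and approximates the t-distribution probability with Simpson's rule, which is an assumption made for illustration and not how R computes it; the helper names `t_pdf` and `two_sided_p` are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, n=2000):
    """Two-sided p-value: 1 minus the area under the density on [-|t|, |t|].

    The area is approximated with Simpson's rule on n (even) subintervals.
    """
    a = abs(t_stat)
    h = 2 * a / n
    total = t_pdf(-a, df) + t_pdf(a, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(-a + i * h, df)
    return 1 - total * h / 3


# Values taken from the R output: interaction estimate, its standard error,
# and the residual degrees of freedom (n - 4 = 40 - 4 = 36).
estimate, std_error, df = -2.1295, 0.6967, 36

t_stat = estimate / std_error
p_value = two_sided_p(t_stat, df)

print("t =", round(t_stat, 3))
print("p =", round(p_value, 4))
print("reject H0 at alpha = 0.05:", p_value < 0.05)
```

The statistic is simply the estimate divided by its standard error, and the p-value should land close to the 0.0042 shown in the output; small differences can arise because the inputs here are rounded.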