Question
I've attempted to work through the problems and will continue to do so, but I wanted to see if you had any direction. Thanks!
On the last page of this file you will find the Excel output for a multiple linear regression model. The model was built in an attempt to better understand why students at area high schools perform differently on the state high school mathematics exam. The average test score for a class of students is what we are trying to predict. In our attempt to understand why these exam scores differ, we use 3 independent variables: a rating (0-100) for the quality of the math degree obtained by the instructor, the age of the instructor, and the salary (in thousands) of the instructor. You are to address the following questions based on the output. Worth 25 points total.
Estimate the average math score for a class of students whose instructor is 52 years old, earns $48,000, and got her degree in a math program rated 72.
Y = 35.68 + b1(x1) + b2(x2) + b3(x3)
Y=35.67+ 72(.25)+ 52(.24)+48,000(.13)
Y=6306.15 (Income swayed this too much???)
What percentage of the variations in math scores can be explained by this model?
~35.70% (R Squared %)
Conduct a test to determine if the model, taken as a whole, provided us with any significant explanation of the differences in math scores. That is, should the model be retained for further analysis?
Which of the independent variables appear to be significant to the model? Which appear to be insignificant? What leads you to these conclusions?
The Math Degree is the only significant variable within this scenario as the “T Stat” is greater than 2. The other T Stats are so small in value that they are insignificant to the dependent variable.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.597512233
R Square             0.357020869
Adjusted R Square    0.303439274
Standard Error       7.724526046
Observations         40

ANOVA
              df    SS            MS          F          Significance F
Regression     3    1192.732105   397.5774    6.663125   0.001076925
Residual      36    2148.058895   59.6683
Total         39    3340.791

              Coefficients   Standard Error   t Stat     P-value    Lower 95%
Intercept     35.67761801    7.278849159      4.901547   2.03E-05   20.9154278
Math Degree   0.247481581    0.069845662      3.543263   0.001115   0.105828014
Age           0.244830604    0.185213036      1.321886   0.194545   -0.130798841
Income        0.133296712    0.152818937      0.872253   0.388851   -0.176634456
Explanation / Answer
(1)
Y = 35.68 + 0.247*Math Degree + 0.245*Age + 0.133*Income
Given: Math Degree = 72; Age = 52; Income = 48 (salary enters the model in thousands of dollars, so $48,000 is entered as 48, not 48,000)
Y = 35.68 + 0.247*72 + 0.245*52 + 0.133*48 = 72.59
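As a quick check of the arithmetic, here is a minimal Python sketch of the same prediction. It uses the unrounded coefficients from the Excel output, so the result differs slightly from the hand calculation with rounded coefficients. The function name `predict_score` and the constant names are illustrative only, not part of the assignment.

```python
# Coefficients copied from the "Coefficients" column of the Excel output.
INTERCEPT = 35.67761801
B_MATH_DEGREE = 0.247481581
B_AGE = 0.244830604
B_INCOME = 0.133296712  # effect per $1,000 of salary


def predict_score(math_degree, age, income_thousands):
    """Estimate the class average math score from the fitted model.

    income_thousands must be in thousands of dollars (48 for $48,000),
    because that is the unit the model was fitted with.
    """
    return (INTERCEPT
            + B_MATH_DEGREE * math_degree
            + B_AGE * age
            + B_INCOME * income_thousands)


estimate = predict_score(math_degree=72, age=52, income_thousands=48)
print(round(estimate, 2))  # 72.63 with unrounded coefficients
```

Entering 48,000 instead of 48 multiplies the income term by 1,000, which is what produces an impossible score in the thousands.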
(2)
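About 35.7% of the variation in math scores is explained by the model. This is the R Square value (0.357) from the Regression Statistics.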
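For reference, R Square can be reproduced from the ANOVA sums of squares as SS Regression / SS Total. The sketch below does that in Python and also recomputes Adjusted R Square as a cross-check against the output. The variable names are my own.

```python
# Sums of squares and sizes copied from the ANOVA table.
ss_regression = 1192.732105
ss_residual = 2148.058895
n = 40  # observations
k = 3   # independent variables

ss_total = ss_regression + ss_residual

# Share of the total variation in math scores explained by the model.
r_squared = ss_regression / ss_total

# Adjusted R Square penalizes the model for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R Square          = {r_squared:.4f}")
print(f"Adjusted R Square = {adj_r_squared:.4f}")
```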
(3)
Look at the 'Significance F' value in the ANOVA table. It is 0.00107, which is less than 0.05. F is calculated as MSM/MSE, so F = 397.58/59.67 = 6.66 means that the Mean Square of the Model (MSM) is much larger than the Mean Square Error (MSE). We therefore reject the null hypothesis that all three slope coefficients are zero. The model 'as a whole' is statistically significant at the 5% level and should be retained for further analysis.
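The F statistic can be rebuilt from the ANOVA table as a sanity check. This is only a sketch: it recomputes F from the sums of squares and degrees of freedom, and takes the p-value (Significance F) directly from the Excel output rather than deriving it. The variable names are illustrative.

```python
# Values copied from the ANOVA table.
ss_regression, df_regression = 1192.732105, 3
ss_residual, df_residual = 2148.058895, 36
significance_f = 0.001076925  # p-value reported by Excel, not recomputed here
ALPHA = 0.05

ms_regression = ss_regression / df_regression  # MSM
ms_residual = ss_residual / df_residual        # MSE
f_stat = ms_regression / ms_residual

# Reject H0 (all slopes equal zero) when the p-value is below alpha.
model_is_significant = significance_f < ALPHA

print(f"F = {f_stat:.4f}, significant at 5%: {model_is_significant}")
```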
(4)
n = no. of samples = 40
k = regression df = 3
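Critical value = t(0.05/2, n - (k+1)) = t(0.025, 36) = 2.03
The 't Stat' values exceed this critical value in only two rows: the 'Intercept' (4.90) and the coefficient of 'Math Degree' (3.54). For these two we reject the null hypothesis that the coefficient is equal to zero, so they are statistically significant at the 5% level. The 't Stat' values for 'Age' (1.32) and 'Income' (0.87) are below 2.03, so we cannot reject the null hypothesis for them. Of the three independent variables, Math Degree is therefore the only significant one; Age and Income are insignificant.
Another, easier way to determine this is to look at the P-values. When a P-value is less than 0.05, the variable is statistically significant at the 5% level. Here the P-value is 0.0011 for Math Degree, but 0.1945 for Age and 0.3889 for Income, which leads to the same conclusion.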
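Both decision rules can be checked side by side with a short Python sketch. It recomputes each t Stat as coefficient / standard error and compares it with the critical value of 2.03 used above, then applies the P-value rule with the P-values taken from the Excel output. The dictionary layout and variable names are my own.

```python
# (coefficient, standard error, p-value) copied from the Excel output.
rows = {
    "Intercept":   (35.67761801, 7.278849159, 2.03e-05),
    "Math Degree": (0.247481581, 0.069845662, 0.001115),
    "Age":         (0.244830604, 0.185213036, 0.194545),
    "Income":      (0.133296712, 0.152818937, 0.388851),
}
T_CRITICAL = 2.03  # t(0.025, 36), as used in the answer
ALPHA = 0.05

t_stats = {}
significant = []
for name, (coef, std_err, p_value) in rows.items():
    t_stat = coef / std_err
    t_stats[name] = t_stat
    by_t = abs(t_stat) > T_CRITICAL  # critical-value rule
    by_p = p_value < ALPHA           # p-value rule
    if by_t and by_p:
        significant.append(name)
    print(f"{name:12s} t = {t_stat:6.3f}  p = {p_value:.6f}  "
          f"significant: {by_t and by_p}")
```

For this output the two rules agree on every row, which is expected because the P-value is derived from the t Stat.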