In the early part of the 20th century considerable scientific effort was focused
Question
In the early part of the 20th century, considerable scientific effort was focused on measuring genetic inheritance as part of a new field called biometrics. To study genetic variation in human populations, a large data set (n > 1000) of fathers' heights and their sons' heights was collected. A small part of the sample, with n = 10, is given below:
(a) Fit a linear regression to predict the son's height from the father's height. Comment on the estimated slope and intercept.
(b) Obtain the MSE as an estimate of the random variability in the experiment.
    (i) SST = SSy
    (ii) SSR = b * SSxy
    (iii) SSE = SST - SSR
    (iv) MSE = SSE/(n-2)
(c) Give the ANOVA table.
(d) Test at significance level 0.01 whether the true slope has a value of 0. Comment on your findings.
(e) What percent of the total variability in the sons' heights is explained by the linear regression of sons' heights on the fathers' heights?
(f) What is the height of a son whose father's height is 6.5 feet?
Explanation / Answer
Part a)
The linear regression of the son's height on the father's height is summarized in the output below:
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.736269089
R Square             0.542092171
Adjusted R Square    0.484853692
Standard Error       0.175156572
Observations         10

ANOVA
             df    SS             MS          F           Significance F
Regression   1     0.290561404    0.290561    9.470765    0.015175223
Residual     8     0.245438596    0.03068
Total        9     0.536

             Coefficients    Standard Error    t Stat      P-value     Lower 95%       Upper 95%
Intercept    1.106140351     1.500265218       0.737297    0.482004    -2.353477444    4.565758146
Father X     0.798245614     0.259384496       3.077461    0.015175    0.200103893     1.396387335
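The regression equation is given below:
Son’s height = 1.1061 + 0.7982*Father’s height
where the intercept is 1.1061 and the slope is 0.7982.
The slope indicates that for each one-unit increase in the father's height, the predicted height of the son increases by about 0.80 units. The intercept is the predicted height of a son whose father's height is 0, which has no practical meaning here; its p-value of 0.482 also shows that it is not significantly different from 0.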
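The n = 10 sample itself is not reproduced in this answer, so the fit cannot be re-run here. As a minimal sketch of how the slope and intercept are obtained from raw data, assuming the usual least-squares formulas b = SSxy/SSxx and a = ȳ - b·x̄, one could write the following (the function name `fit_line` is illustrative and is not part of the original answer):

```python
def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (intercept a, slope b)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sums of squares and cross-products
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if ss_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b = ss_xy / ss_xx          # slope
    a = y_bar - b * x_bar      # intercept
    return a, b
```

Calling `fit_line(fathers, sons)` on the ten (father, son) pairs from the question should give values close to the reported 1.1061 and 0.7982, if the data match the output shown above.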
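c) The ANOVA table is given below. Its residual mean square, MSE = SSE/(n-2) = 0.245438596/8 = 0.03068, is the estimate of the random variability asked for in part (b):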
ANOVA
             df    SS             MS          F           Significance F
Regression   1     0.290561404    0.290561    9.470765    0.015175223
Residual     8     0.245438596    0.03068
Total        9     0.536
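The entries of the ANOVA table follow from the formulas listed in the question (SSE = SST - SSR, MSE = SSE/(n-2)). A short sketch that rebuilds them from the reported SSR and SST, using an illustrative helper name `anova_summary`, might look like this:

```python
# Values reported in the ANOVA table above
SSR = 0.290561404   # regression sum of squares
SST = 0.536         # total sum of squares
N = 10              # number of observations

def anova_summary(ssr, sst, n):
    """Derive SSE, mean squares, F and R^2 for simple linear regression."""
    sse = sst - ssr
    df_reg, df_res = 1, n - 2
    msr = ssr / df_reg
    mse = sse / df_res
    return {"SSE": sse, "MSR": msr, "MSE": mse,
            "F": msr / mse, "R2": ssr / sst}

summary = anova_summary(SSR, SST, N)
for name, value in summary.items():
    print(f"{name}: {value:.6f}")
```

The printed MSE and F should agree with the table (about 0.03068 and 9.4708), and R2 is the proportion of variability used in part (e).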
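d) Here, we test the null hypothesis H0: slope = 0 against the alternative H1: slope ≠ 0. The t statistic for the slope is 3.0775 on 8 degrees of freedom, with a two-sided p-value of 0.0152. Since the p-value is greater than the given significance level α = 0.01, we fail to reject the null hypothesis: at the 1% level there is not sufficient evidence that the true slope differs from 0. (The slope would be significant at the 5% level, and the 95% confidence interval from 0.2001 to 1.3964 excludes 0.)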
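The t statistic and p-value used in this test can be checked from the reported slope and its standard error. The sketch below uses only the Python standard library and integrates the Student t density numerically with Simpson's rule; the helper names `t_pdf` and `two_sided_p` are illustrative, and a statistics library would normally be used instead:

```python
import math

SLOPE = 0.798245614      # reported slope estimate
SE_SLOPE = 0.259384496   # reported standard error of the slope
DF = 8                   # residual degrees of freedom, n - 2
ALPHA = 0.01

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3          # P(0 < T < |t|)
    return 2 * (0.5 - central)

t_stat = SLOPE / SE_SLOPE
p_value = two_sided_p(t_stat, DF)
reject = p_value < ALPHA
print(f"t = {t_stat:.4f}, p = {p_value:.4f}, reject H0 at 0.01: {reject}")
```

With the reported values, the p-value lies between 0.01 and 0.05, so the null hypothesis is retained at the 1% level.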
e) R Square = SSR/SST = 0.290561404/0.536 = 0.5421, so about 54.21% of the total variability in the sons' heights is explained by the linear regression of the sons' heights on the fathers' heights.
f) Here, we have to find the estimated height of a son whose father's height is 6.5 feet.
The regression equation is given below:
Son’s height = 1.1061 + 0.7982*Father’s height
Son’s height = 1.1061 + 0.7982*6.5
Son’s height ≈ 6.29 feet (6.294737 when the unrounded coefficients 1.106140351 and 0.798245614 are used)
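The prediction is a direct evaluation of the fitted line. As a small sketch (the function name `predict_son_height` is illustrative), the calculation with the unrounded coefficients from the regression output is:

```python
# Unrounded coefficients from the regression output
INTERCEPT = 1.106140351
SLOPE = 0.798245614

def predict_son_height(father_height, intercept=INTERCEPT, slope=SLOPE):
    """Predicted son's height from the fitted line, in the same units as the input."""
    return intercept + slope * father_height

prediction = predict_son_height(6.5)
rounded_prediction = predict_son_height(6.5, intercept=1.1061, slope=0.7982)
print(f"unrounded coefficients: {prediction:.6f}")
print(f"rounded coefficients:   {rounded_prediction:.4f}")
```

The two results differ only in the fourth decimal place, which is why the hand calculation with the rounded coefficients and the figure 6.294737 do not match exactly.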