Question
Please do it in R.
Recall the hw problem (hw_lect14_1) where you showed that adding useless predictors to a regression model can increase R2. This time, suppose the true/population fit is y = 1 + error (i.e., no x at all), and so a possible sample from the population could be the following:
# Use this line to make sure we all get the same answers
set.seed(123)
n=20
y=1+rnorm(n,0,1)
a) In the old hw, there was one useful predictor (x) and 4 useless predictors (x2, x3, x4, x5). Here, revise that code to have data on 10 useless predictors (and no useful predictors), fit the model y = alpha + beta1 x1 + ... + beta10 x10, perform the test of model utility, and perform t-tests on each of the 10 coefficients to see if they are zero. Just show your code, in R.
b) According to the F-test of model utility, are any of the predictors useful at alpha = 0.1?
c) According to the t-tests, are any of the predictors useful at alpha = 0.1?
See the solns to make sure you understand the moral of this exercise.
Explanation / Answer
a)
set.seed(123)
n=20
y=1+rnorm(n,0,1)
x1=rnorm(n,0,1)
x2=rnorm(n,0,1)
x3=rnorm(n,0,1)
x4=rnorm(n,0,1)
x5=rnorm(n,0,1)
x6=rnorm(n,0,1)
x7=rnorm(n,0,1)
x8=rnorm(n,0,1)
x9=rnorm(n,0,1)
x10=rnorm(n,0,1)
lm.output=lm(y~x1+x2+x3+x4+x5+x6+x7+x8+x9+x10)
summary(lm.output)
Call:
lm(formula = y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 +
x10)
Residuals:
Min 1Q Median 3Q Max
-0.59959 -0.34452 -0.02253 0.30164 0.70460
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.96288 0.24775 3.887 0.00369 **
x1 -0.89316 0.43102 -2.072 0.06812 .
x2 -0.09599 0.34559 -0.278 0.78747
x3 -0.28753 0.35285 -0.815 0.43617
x4 -0.85735 0.33803 -2.536 0.03190 *
x5 -0.11829 0.35873 -0.330 0.74914
x6 -0.26424 0.34314 -0.770 0.46100
x7 -0.42532 0.51663 -0.823 0.43164
x8 0.63280 0.48104 1.315 0.22087
x9 -0.53676 0.39519 -1.358 0.20746
x10 -0.14151 0.41154 -0.344 0.73886
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.6048 on 9 degrees of freedom
Multiple R-squared: 0.6475, Adjusted R-squared: 0.2558
F-statistic: 1.653 on 10 and 9 DF, p-value: 0.2312
To manually perform the t-test for each parameter, calculate t-stat = estimate / std. error
and compare |t-stat| with the t critical value with n - (p+1) = 20 - 11 = 9 degrees of freedom.
E.g., for x1: t-stat = -0.89316 / 0.43102 = -2.072
qt(0.95, 9) = 1.833113 (two-sided test at alpha = 0.1)
Since |-2.072| = 2.072 > 1.833113, x1 is significant at alpha = 0.1.
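The manual comparison above can be scripted for all 10 coefficients at once. A minimal sketch, refitting the model so the snippet is self-contained; note that this generates the x's in one batch, so the exact numbers need not match the printed table, but the logic is the same:

```r
# Self-contained setup: 10 useless predictors, no real signal
set.seed(123)
n <- 20
y <- 1 + rnorm(n, 0, 1)                      # true model: intercept plus noise only
X <- as.data.frame(matrix(rnorm(n * 10), n)) # 10 noise predictors
names(X) <- paste0("x", 1:10)
lm.output <- lm(y ~ ., data = data.frame(y, X))

cf <- summary(lm.output)$coefficients[-1, ]      # drop the intercept row
t.stat <- cf[, "Estimate"] / cf[, "Std. Error"]  # same as cf[, "t value"]
t.crit <- qt(0.95, df = n - (10 + 1))            # two-sided cutoff at alpha = 0.1
abs(t.stat) > t.crit                             # TRUE = "significant" at alpha = 0.1
```

This reproduces by hand exactly what the Pr(>|t|) column reports, just with an explicit critical-value comparison instead of a p-value.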
To calculate the F-stat: F-stat = (regression sum of squares / p) / (error sum of squares / (n - p - 1))
f.stat = (sum((lm.output$fitted.values - mean(lm.output$fitted.values))^2)/10) / (sum(lm.output$residuals^2)/9)
       = 0.6046627 / 0.365744 = 1.653241
To check for significance at alpha = 0.1, compare the F-stat with the F critical value qf(0.9, 10, 9) on 10 and 9 degrees of freedom.
Since the p-value is 0.2312 > 0.1 (equivalently, 1.653241 falls below that critical value), the F-stat is not significant at alpha = 0.1.
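The same conclusion can be reached from the p-value directly: pf() gives the F distribution function, so the upper tail at the observed statistic is the p-value reported by summary(). A quick check using the numbers from the output above:

```r
# F statistic and degrees of freedom taken from the summary() output above
f.stat <- 1.653241
df1 <- 10   # number of predictors, p
df2 <- 9    # n - p - 1 = 20 - 10 - 1

p.value <- pf(f.stat, df1, df2, lower.tail = FALSE)
round(p.value, 4)   # agrees with the 0.2312 printed by summary()
p.value > 0.1       # TRUE: cannot reject the model-utility null at alpha = 0.1
```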
b) According to the F-test, the p-value = 0.2312 > 0.1, so we cannot reject the null hypothesis that all slope coefficients are 0 at the 0.1 significance level. None of the predictors is useful at alpha = 0.1.
c) According to the t-tests, predictors x1 (p = 0.068) and x4 (p = 0.032) are significant at the 0.1 level. That is the moral of the exercise: even though every predictor is pure noise, running 10 separate t-tests at alpha = 0.1 will often flag one or two of them by chance, while the overall F-test correctly finds no model utility.
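The moral can also be seen numerically. A Monte Carlo sketch (not part of the solutions, just an illustration): repeat the whole experiment many times and count how often at least one of the 10 t-tests on pure-noise predictors comes out "significant" at alpha = 0.1.

```r
# Monte Carlo sketch: with 10 useless predictors and alpha = 0.1 per test,
# how often does at least one individual t-test look significant?
set.seed(123)
n <- 20; p <- 10; reps <- 1000

false.alarm <- replicate(reps, {
  y <- 1 + rnorm(n)               # true model has no predictors
  X <- matrix(rnorm(n * p), n)    # 10 columns of pure noise
  pvals <- summary(lm(y ~ X))$coefficients[-1, 4]  # the 10 t-test p-values
  any(pvals < 0.1)
})
mean(false.alarm)   # far above 0.1: per-test alpha does not control the family
```

With 10 tests at alpha = 0.1 each, a spurious "discovery" in at least one of them is the rule rather than the exception, which is exactly why the single overall F-test is the right first check of model utility.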