Problem 4: Consider the data shown below: Please use R, and show the codes, outp
Question
Problem 4: Consider the data shown below:
Please use R, and show the codes, outputs, formulas, and brief explanation.
x       y        x       y
4       24.6     6.5     67.11
4       24.71    6.5     67.24
4       23.9     6.75    67.15
5       39.5     7       77.87
5       39.6     7.1     80.11
6       57.12    7.3     84.67
a.- Fit a second-order polynomial model to these data.
b.- Test for significance of regression.
c.- Test for lack of fit and comment on the adequacy of the second-order model.
d.- Test the hypothesis H0: B2=0. Can the quadratic term be deleted from this equation?
Explanation / Answer
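> x=c(4,4,4,5,5,6,6.5,6.5,6.75,7,7.1,7.3)
> ## Note: the last y value is entered as 84.61, while the table in the question lists 84.67. All outputs below were computed from the values as entered here.
> y=c(24.6,24.71,23.9,39.5,39.6,57.12,67.11,67.24,67.15,77.87,80.11,84.61)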
> ##a)
> z=lm(y~x+I(x^2))
> z
Call:
lm(formula = y ~ x + I(x^2))
Coefficients:
(Intercept)            x       I(x^2)
     -4.666        1.466        1.459
> ##INTERPRETATION: The fitted second-order model is y-hat = -4.666 + 1.466*x + 1.459*x^2.
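Since the question also asks for formulas: the least-squares estimate behind lm() is b = (X'X)^(-1) X'y, where X has columns 1, x and x^2. The following is a minimal sketch of that computation, using the data as entered in this answer; the names X and beta_hat are illustrative only.

```r
# Data as entered in the answer above (last y value 84.61).
x <- c(4, 4, 4, 5, 5, 6, 6.5, 6.5, 6.75, 7, 7.1, 7.3)
y <- c(24.6, 24.71, 23.9, 39.5, 39.6, 57.12, 67.11, 67.24, 67.15, 77.87, 80.11, 84.61)

# Design matrix for the model y = b0 + b1*x + b2*x^2 + error.
X <- cbind(1, x, x^2)

# Normal equations: solve (X'X) b = X'y for b.
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Same model through lm(); the two columns should agree.
z <- lm(y ~ x + I(x^2))
print(cbind(normal_eq = as.numeric(beta_hat), lm = as.numeric(coef(z))))
```

Solving the normal equations directly is fine for a small illustration like this; lm() itself uses a QR decomposition, which is numerically more stable.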
> ##b)
## Testing:
H0: beta = 0 (the slope is zero; no linear relationship between x and y).
vs.
H1: beta is not equal to 0.
> fit=lm(y~x)
> fit
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
-47.40 17.68
> summary(fit)
Call:
lm(formula = y ~ x)
Residuals:
     Min       1Q   Median       3Q      Max
 -4.7648  -1.4072   0.1688   1.4367   2.9735
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  -47.3966     3.0389  -15.60 2.40e-08 ***
x             17.6758     0.5157   34.28 1.06e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.205 on 10 degrees of freedom
Multiple R-squared: 0.9916, Adjusted R-squared: 0.9907
F-statistic: 1175 on 1 and 10 DF, p-value: 1.057e-11
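> ##INTERPRETATION: The p-value is 1.057e-11, which is far below 0.05, so we reject the null hypothesis that beta = 0. There is a significant relationship between x and y. Note that this fit is the first-order model y ~ x; for the second-order model from part (a), the overall F-test (H0: beta1 = beta2 = 0) is the F-statistic reported by summary(z).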
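The summary above is for the first-order fit y ~ x. As a sketch, the significance-of-regression test for the second-order model of part (a) can be computed from F0 = (SSR / p) / (SSE / (n - p - 1)), with p = 2 regressors, and compared with what summary() reports. Variable names such as sse, ssr and f0 are illustrative.

```r
# Data as entered in the answer above (last y value 84.61).
x <- c(4, 4, 4, 5, 5, 6, 6.5, 6.5, 6.75, 7, 7.1, 7.3)
y <- c(24.6, 24.71, 23.9, 39.5, 39.6, 57.12, 67.11, 67.24, 67.15, 77.87, 80.11, 84.61)
z <- lm(y ~ x + I(x^2))

n <- length(y)
p <- 2                                   # regressors: x and x^2
sse <- sum(resid(z)^2)                   # residual sum of squares
sst <- sum((y - mean(y))^2)              # total (corrected) sum of squares
ssr <- sst - sse                         # regression sum of squares

# F statistic for H0: beta1 = beta2 = 0, with p and n - p - 1 degrees of freedom.
f0 <- (ssr / p) / (sse / (n - p - 1))
p_value <- pf(f0, p, n - p - 1, lower.tail = FALSE)

print(c(F = f0, p_value = p_value))
print(summary(z)$fstatistic)             # should show the same F on 2 and 9 df
```

If the p-value is below 0.05, H0 is rejected and at least one of the linear and quadratic terms contributes significantly.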
> ##c)
## Testing:
H0: the second-order model fits the data adequately (no lack of fit).
vs.
H1: the second-order model shows lack of fit.
> anova(lm(y~x+I(x^2)),lm(y~factor(x)))
Analysis of Variance Table
Model 1: y ~ x + I(x^2)
Model 2: y ~ factor(x)
  Res.Df     RSS Df Sum of Sq    F   Pr(>F)
1      9 24.6204
2      4  0.3995  5    24.221 48.5 0.001133 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
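The table above can be rebuilt by hand, which makes the formulas explicit. The residual sum of squares splits into pure error (variation of y around its mean at each repeated x) and lack of fit: SS_LOF = SS_Res - SS_PE, and F0 = (SS_LOF / (m - 3)) / (SS_PE / (n - m)), where m is the number of distinct x values and 3 is the number of model parameters. A minimal sketch, with illustrative variable names:

```r
# Data as entered in the answer above (last y value 84.61).
x <- c(4, 4, 4, 5, 5, 6, 6.5, 6.5, 6.75, 7, 7.1, 7.3)
y <- c(24.6, 24.71, 23.9, 39.5, 39.6, 57.12, 67.11, 67.24, 67.15, 77.87, 80.11, 84.61)
z <- lm(y ~ x + I(x^2))

sse <- sum(resid(z)^2)                    # residual SS of the quadratic model

# Pure error: deviations of y from the mean of y at the same x value.
ss_pe <- sum((y - ave(y, x))^2)
df_pe <- length(y) - length(unique(x))    # n - m

# Lack of fit: what remains of the residual SS after removing pure error.
ss_lof <- sse - ss_pe
df_lof <- df.residual(z) - df_pe          # m - 3

f_lof <- (ss_lof / df_lof) / (ss_pe / df_pe)
p_lof <- pf(f_lof, df_lof, df_pe, lower.tail = FALSE)

print(c(SS_PE = ss_pe, df_PE = df_pe, SS_LOF = ss_lof, df_LOF = df_lof,
        F = f_lof, p_value = p_lof))
```

These values should match the RSS, Df, Sum of Sq, F and Pr(>F) columns of the anova() table.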
> ##INTERPRETATION: The p-value is 0.001133, which is less than 0.05, so we reject the null hypothesis of no lack of fit. The second-order polynomial model is therefore inadequate for these data.
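Part (d) is not worked above. One way to test H0: beta2 = 0 is the extra-sum-of-squares (partial F) test comparing the first-order and second-order fits; it is equivalent to the t-test on the I(x^2) coefficient, since F = t^2. From the outputs already shown, the residual sums of squares are roughly 48.6 on 10 df (from the residual standard error 2.205) and 24.62 on 9 df, which suggests an F of about 8.8 on 1 and 9 df, above the 5% critical value of about 5.12. On that basis the quadratic term would not be deleted; the sketch below gives the exact figures. The names fit1 and fit2 are illustrative.

```r
# Data as entered in the answer above (last y value 84.61).
x <- c(4, 4, 4, 5, 5, 6, 6.5, 6.5, 6.75, 7, 7.1, 7.3)
y <- c(24.6, 24.71, 23.9, 39.5, 39.6, 57.12, 67.11, 67.24, 67.15, 77.87, 80.11, 84.61)

fit1 <- lm(y ~ x)              # reduced model (beta2 = 0)
fit2 <- lm(y ~ x + I(x^2))     # full second-order model

# Partial F-test for the quadratic term.
tab <- anova(fit1, fit2)
print(tab)

# Equivalent t-test: the row for I(x^2) in the coefficient table.
coefs <- summary(fit2)$coefficients
print(coefs)
```

A p-value below 0.05 in either output means H0: beta2 = 0 is rejected and the quadratic term should stay in the model.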