Question
Assignment 5: Regression
Finding the better model

In a statistics class last spring, the students measured their height, their arm span (fingertip to fingertip), and the length of their forearm (elbow to fingertip). All distances were measured in inches.

We collected data to answer this question: which is a better predictor of someone's height, their arm span or their forearm length? In other words, will someone's forearm length or their arm span more accurately predict their height?

Listed below are the data that were collected.

Forearm Student Height span
60.5 68 60 64.5 63.5 61.5 67 67
62 67 61 65 62.5 62.5 68 71.5
17.5 16.6 16.5 18 18.5

To do: Use your skills from Chapter 6 8 to create the better linear regression line to predict a person's height. You'll need to do two linear regressions, then determine which equation is "better" than the other. (Making a regression line is also called "building a model" because our whole goal is to find a line that is a really good estimate of the pattern of data points. Just like modeling clay is shaped to look like an object, the regression model is shaped to the pattern of the data.)

Explanation / Answer
Used R software.
Built 3 regression models:
mod1 is height on armspan
mod2 is height on forearmlength
mod3 is height on armspan and forearmlength
R code:
armspan <- c(60.5,68,60,64.5,63.5,61.5,67,67)
Forearmlength <- c(16,17.5,16.6,17,17,16.5,18,18.5)
Height <- c(62,67,61,65,62.5,62.5,68,71.5)
mod1 <- lm(Height~armspan)
mod2 <- lm(Height~Forearmlength)
mod3 <- lm(Height~armspan+Forearmlength)
summary(mod1)
summary(mod2)
summary(mod3)
Output:
Call:
lm(formula = Height ~ armspan)
Residuals:
Min 1Q Median 3Q Max
-2.0245 -0.8179 0.0571 0.2717 3.4973
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.4538    14.3211  -0.032  0.97575
armspan       1.0217     0.2235   4.571  0.00381 **
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.857 on 6 degrees of freedom
Multiple R-squared: 0.7769, Adjusted R-squared: 0.7397
F-statistic: 20.89 on 1 and 6 DF, p-value: 0.003807
Call:
lm(formula = Height ~ Forearmlength)
Residuals:
Min 1Q Median 3Q Max
-1.8692 -0.8057 0.3808 0.7059 1.7640
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    -5.8948    10.8513  -0.543 0.606544
Forearmlength   4.1332     0.6325   6.534 0.000614 ***
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.38 on 6 degrees of freedom
Multiple R-squared: 0.8768, Adjusted R-squared: 0.8562
F-statistic: 42.7 on 1 and 6 DF, p-value: 0.0006138
Call:
lm(formula = Height ~ armspan + Forearmlength)
Residuals:
      1       2       3       4       5       6       7       8
 1.6751 -0.2954 -1.0357  0.3359 -1.8563  0.3133 -0.5415  1.4045
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    -8.0249    11.3203  -0.709   0.5101
armspan         0.3078     0.3493   0.881   0.4185
Forearmlength   3.1079     1.3301   2.337   0.0667 .
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.406 on 5 degrees of freedom
Multiple R-squared: 0.8933, Adjusted R-squared: 0.8507
F-statistic: 20.94 on 2 and 5 DF, p-value: 0.003715
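mod1 regression equation:
height = -0.4538 + 1.0217(armspan)
armspan is a significant variable (p = 0.00381).
mod2 regression equation:
height = -5.8948 + 4.1332(forearmlength)
forearmlength is also a significant variable (p = 0.000614).
mod3 regression equation:
height = -8.0249 + 0.3078(armspan) + 3.1079(forearmlength)
In mod3, neither armspan (p = 0.4185) nor forearmlength (p = 0.0667) is individually significant at the 5% level, even though the overall F-test is significant (p = 0.003715). Its adjusted R-squared (0.8507) is also slightly lower than that of mod2 (0.8562), so adding armspan does not improve on mod2.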
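One plausible reason the individual predictors lose significance in mod3 is that arm span and forearm length are strongly correlated with each other, so each carries much of the same information about height. The sketch below checks that correlation and runs a partial F-test of mod2 against mod3. The names r_pred, cmp and p_add are introduced here for illustration only; the data vectors are copied from the R code above.

```r
# Data vectors copied from the answer's R code
armspan <- c(60.5, 68, 60, 64.5, 63.5, 61.5, 67, 67)
Forearmlength <- c(16, 17.5, 16.6, 17, 17, 16.5, 18, 18.5)
Height <- c(62, 67, 61, 65, 62.5, 62.5, 68, 71.5)

# Correlation between the two predictors (illustrative name: r_pred)
r_pred <- cor(armspan, Forearmlength)
print(r_pred)

# Refit the nested models
mod2 <- lm(Height ~ Forearmlength)
mod3 <- lm(Height ~ armspan + Forearmlength)

# Partial F-test: does adding armspan to mod2 significantly improve the fit?
cmp <- anova(mod2, mod3)
print(cmp)

# p-value for the added term; with one added predictor this matches
# the t-test p-value for armspan in summary(mod3)
p_add <- cmp[["Pr(>F)"]][2]
print(p_add)
```

A large p-value from the partial F-test would suggest that armspan adds little once forearmlength is already in the model.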
Correlations for models 1 and 2:
cor(Height,armspan)
r = 0.8814
There is a strong positive relationship between height and arm span: as arm span increases, height tends to increase.
cor(Height,Forearmlength)
r = 0.9364
There is a strong positive relationship between height and forearm length: as forearm length increases, height tends to increase.
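In a simple linear regression with one predictor, the Multiple R-squared reported by summary() is the square of the correlation between the response and the predictor. A minimal sketch of that check, using the same data vectors as the R code above (r_arm and r_fore are names chosen here for illustration):

```r
# Data vectors copied from the answer's R code
armspan <- c(60.5, 68, 60, 64.5, 63.5, 61.5, 67, 67)
Forearmlength <- c(16, 17.5, 16.6, 17, 17, 16.5, 18, 18.5)
Height <- c(62, 67, 61, 65, 62.5, 62.5, 68, 71.5)

# Pearson correlations of height with each predictor
r_arm <- cor(Height, armspan)
r_fore <- cor(Height, Forearmlength)
print(c(armspan = r_arm, Forearmlength = r_fore))

# Refit the two single-predictor models
mod1 <- lm(Height ~ armspan)
mod2 <- lm(Height ~ Forearmlength)

# Squared correlation should agree with R-squared from each model
print(all.equal(r_arm^2, summary(mod1)$r.squared))
print(all.equal(r_fore^2, summary(mod2)$r.squared))
```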
Coefficient of determination for both models:
mod1:
r2 = 0.7769
77.69% of the variation in height is explained by armspan.
mod2:
r2 = 0.8768
87.68% of the variation in height is explained by forearmlength.
mod2 is the better model: it has the higher coefficient of determination (0.8768 vs. 0.7769) and the smaller residual standard error (1.38 vs. 1.857).
USE FOREARMLENGTH TO PREDICT HEIGHT.
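The comparison can be collected into one small table, and the chosen model can then be used for a prediction. This is a sketch only: the 17-inch forearm is a hypothetical input, not a student from the data set, and fits, comparison, new_student and pred are names introduced here for illustration.

```r
# Data vectors copied from the answer's R code
armspan <- c(60.5, 68, 60, 64.5, 63.5, 61.5, 67, 67)
Forearmlength <- c(16, 17.5, 16.6, 17, 17, 16.5, 18, 18.5)
Height <- c(62, 67, 61, 65, 62.5, 62.5, 68, 71.5)

# Fit the two single-predictor models
fits <- list(
  mod1 = lm(Height ~ armspan),
  mod2 = lm(Height ~ Forearmlength)
)

# Side-by-side fit statistics taken from summary()
comparison <- data.frame(
  r.squared = sapply(fits, function(m) summary(m)$r.squared),
  adj.r.squared = sapply(fits, function(m) summary(m)$adj.r.squared),
  resid.se = sapply(fits, function(m) summary(m)$sigma)
)
print(round(comparison, 4))

# Predict height for a hypothetical forearm length of 17 inches,
# with a 95% prediction interval
new_student <- data.frame(Forearmlength = 17)
pred <- predict(fits$mod2, newdata = new_student, interval = "prediction")
print(pred)
```

With only 8 observations the prediction interval will be fairly wide, so the point estimate (roughly 64.4 inches from the fitted equation) should be read with that in mind.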