
Assignment 5: Regression. Finding the better model


Question

In a statistics class last spring, the students measured their height, their arm span (fingertip to fingertip), and the length of their forearm (elbow to fingertip). All distances were measured in inches.

We collected data to answer this question: which is a better predictor of someone's height, their arm span or their forearm length? In other words, will someone's forearm length or their arm span more accurately predict their height?

Listed below are the data that were collected:

Student   Arm span   Height   Forearm
1         60.5       62       16
2         68         67       17.5
3         60         61       16.6
4         64.5       65       17
5         63.5       62.5     17
6         61.5       62.5     16.5
7         67         68       18
8         67         71.5     18.5

To do: Use your skills from Chapter 6 8 to create the better linear regression line to predict a person's height. You'll need to do two linear regressions, then determine which equation is "better" than the other. (Making a regression line is also called "building a model" because our whole goal is to find a line that is a really good estimate of the pattern of data points. Just like modeling clay is shaped to look like an object, the regression model is shaped to the pattern of the data.)

Explanation / Answer

Used R software to build three regression models:

mod1 is Height on armspan

mod2 is Height on Forearmlength

mod3 is Height on armspan and Forearmlength

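R code:

# Measurements in inches, one entry per student
armspan <- c(60.5, 68, 60, 64.5, 63.5, 61.5, 67, 67)
Forearmlength <- c(16, 17.5, 16.6, 17, 17, 16.5, 18, 18.5)
Height <- c(62, 67, 61, 65, 62.5, 62.5, 68, 71.5)

# Fit the three linear models
mod1 <- lm(Height ~ armspan)
mod2 <- lm(Height ~ Forearmlength)
mod3 <- lm(Height ~ armspan + Forearmlength)

# Print coefficient tables and fit statistics
summary(mod1)
summary(mod2)
summary(mod3)

Output: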

Call:
lm(formula = Height ~ armspan)

Residuals:
    Min      1Q  Median      3Q     Max
-2.0245 -0.8179  0.0571  0.2717  3.4973

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.4538    14.3211  -0.032  0.97575
armspan       1.0217     0.2235   4.571  0.00381 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.857 on 6 degrees of freedom
Multiple R-squared: 0.7769, Adjusted R-squared: 0.7397
F-statistic: 20.89 on 1 and 6 DF, p-value: 0.003807

Call:
lm(formula = Height ~ Forearmlength)

Residuals:
    Min      1Q  Median      3Q     Max
-1.8692 -0.8057  0.3808  0.7059  1.7640

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    -5.8948    10.8513  -0.543 0.606544
Forearmlength   4.1332     0.6325   6.534 0.000614 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.38 on 6 degrees of freedom
Multiple R-squared: 0.8768, Adjusted R-squared: 0.8562
F-statistic: 42.7 on 1 and 6 DF, p-value: 0.0006138

Call:
lm(formula = Height ~ armspan + Forearmlength)

Residuals:
      1       2       3       4       5       6       7       8
 1.6751 -0.2954 -1.0357  0.3359 -1.8563  0.3133 -0.5415  1.4045

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    -8.0249    11.3203  -0.709   0.5101
armspan         0.3078     0.3493   0.881   0.4185
Forearmlength   3.1079     1.3301   2.337   0.0667 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.406 on 5 degrees of freedom
Multiple R-squared: 0.8933, Adjusted R-squared: 0.8507
F-statistic: 20.94 on 2 and 5 DF, p-value: 0.003715
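The two-predictor output has a significant overall F-test but no slope that is individually significant at the 5% level. One common reason is that the two predictors overlap in the information they carry. The sketch below is a suggested check, not part of the original answer: it uses the same vectors as the R code above, and the names r_pred, cmp and p_add_armspan are introduced here for illustration only.

```r
# Data as entered in the answer's R code (inches)
armspan <- c(60.5, 68, 60, 64.5, 63.5, 61.5, 67, 67)
Forearmlength <- c(16, 17.5, 16.6, 17, 17, 16.5, 18, 18.5)
Height <- c(62, 67, 61, 65, 62.5, 62.5, 68, 71.5)

mod2 <- lm(Height ~ Forearmlength)
mod3 <- lm(Height ~ armspan + Forearmlength)

# Correlation between the two predictors; a value near 1 means they
# carry largely the same information about height
r_pred <- cor(armspan, Forearmlength)
print(r_pred)

# Nested-model F-test: does adding armspan to the forearm-only model
# reduce the residual sum of squares by more than chance would?
cmp <- anova(mod2, mod3)
print(cmp)

# p-value of that F-test (second row of the ANOVA table)
p_add_armspan <- cmp[2, "Pr(>F)"]
print(p_add_armspan)
```

Because only one predictor is added, this F-test is equivalent to the t-test for armspan in summary(mod3), so its p-value should agree with the 0.4185 reported there.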

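mod1 regression equation:

Height = -0.4538 + 1.0217(armspan)

armspan is a significant predictor (p = 0.00381).

mod2 regression equation:

Height = -5.8948 + 4.1332(Forearmlength)

Forearmlength is also a significant predictor (p = 0.000614).

mod3 regression equation:

Height = -8.0249 + 0.3078(armspan) + 3.1079(Forearmlength)

With both predictors in the model, neither armspan (p = 0.4185) nor Forearmlength (p = 0.0667) is individually significant at the 5% level, even though the overall F-test is significant (p = 0.003715). The adjusted R-squared of mod3 (0.8507) is also slightly lower than that of mod2 (0.8562), so adding armspan does not improve on the forearm-only model.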
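To show how the fitted equations are used, here is a minimal sketch that predicts height from each single-predictor model. The student in new_student (arm span 65 in, forearm 17 in) is hypothetical and made up for illustration; the names new_student, pred1 and pred2 are not from the original answer.

```r
# Data as entered in the answer's R code (inches)
armspan <- c(60.5, 68, 60, 64.5, 63.5, 61.5, 67, 67)
Forearmlength <- c(16, 17.5, 16.6, 17, 17, 16.5, 18, 18.5)
Height <- c(62, 67, 61, 65, 62.5, 62.5, 68, 71.5)

mod1 <- lm(Height ~ armspan)
mod2 <- lm(Height ~ Forearmlength)

# Hypothetical new student (values chosen only for illustration)
new_student <- data.frame(armspan = 65, Forearmlength = 17)

# Each model plugs the matching column into its own equation
pred1 <- predict(mod1, newdata = new_student)
pred2 <- predict(mod2, newdata = new_student)
print(c(from_armspan = unname(pred1), from_forearm = unname(pred2)))

# Same prediction written out from the fitted coefficients of mod1
by_hand <- unname(coef(mod1)[1] + coef(mod1)[2] * 65)
print(by_hand)
```

Both chosen values lie inside the observed ranges; predictions outside those ranges would be extrapolation and less trustworthy with only eight observations.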

Correlations for models 1 and 2:

cor(Height, armspan)

r = 0.8814

There is a strong positive linear relationship between height and arm span: as arm span increases, height tends to increase.

cor(Height, Forearmlength)

r = 0.9364

There is an even stronger positive linear relationship between height and forearm length: as forearm length increases, height tends to increase, and vice versa.
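The two correlations can be reproduced as follows. This is a small sketch using the vectors from the R code above; the helper pearson_r is a hypothetical name introduced here to spell out the Pearson formula that cor() computes by default.

```r
# Data as entered in the answer's R code (inches)
armspan <- c(60.5, 68, 60, 64.5, 63.5, 61.5, 67, 67)
Forearmlength <- c(16, 17.5, 16.6, 17, 17, 16.5, 18, 18.5)
Height <- c(62, 67, 61, 65, 62.5, 62.5, 68, 71.5)

# Built-in Pearson correlation
r_arm <- cor(Height, armspan)
r_fore <- cor(Height, Forearmlength)

# Illustrative helper: Pearson r from sums of centered products
pearson_r <- function(x, y) {
  dx <- x - mean(x)
  dy <- y - mean(y)
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}
r_arm_manual <- pearson_r(armspan, Height)

print(round(c(armspan = r_arm, forearm = r_fore), 4))
print(all.equal(r_arm, r_arm_manual))
```

The manual formula is symmetric in its two arguments, which is why the relationship can be read in either direction.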

Coefficient of determination for both models:

mod1:

R-squared = 0.7769

77.69% of the variation in height is explained by arm span.

mod2:

R-squared = 0.8768

87.68% of the variation in height is explained by forearm length.

mod2 is the better model: its coefficient of determination is higher (0.8768 versus 0.7769) and its residual standard error is smaller (1.38 versus 1.857 inches).
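The comparison can also be made programmatically by pulling the fit statistics out of each summary object. This is a sketch under the same data as the R code above; the names s1, s2, comparison and better are illustrative and not part of the original answer.

```r
# Data as entered in the answer's R code (inches)
armspan <- c(60.5, 68, 60, 64.5, 63.5, 61.5, 67, 67)
Forearmlength <- c(16, 17.5, 16.6, 17, 17, 16.5, 18, 18.5)
Height <- c(62, 67, 61, 65, 62.5, 62.5, 68, 71.5)

s1 <- summary(lm(Height ~ armspan))
s2 <- summary(lm(Height ~ Forearmlength))

# Collect the statistics used to judge the two single-predictor models
comparison <- data.frame(
  model = c("mod1: armspan", "mod2: Forearmlength"),
  r_squared = c(s1$r.squared, s2$r.squared),
  adj_r_squared = c(s1$adj.r.squared, s2$adj.r.squared),
  resid_se = c(s1$sigma, s2$sigma)
)
print(comparison)

# Model with the larger R-squared
better <- as.character(comparison$model[which.max(comparison$r_squared)])
print(better)

# In simple linear regression, R-squared equals the squared correlation
r2_from_cor <- cor(Height, Forearmlength)^2
print(all.equal(r2_from_cor, s2$r.squared))
```

Comparing plain R-squared is fair here because both models have one predictor; when models differ in the number of predictors, adjusted R-squared is the more appropriate column to read.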

USE FOREARMLENGTH TO PREDICT HEIGHT.