


Question

Suppose we have a dataset consisting of 5 pairs of observations (Xi, Yi). Since the Y-values are highly skewed, we first transform Y to Z = log10(Y). Then we fit a simple linear regression model with response variable Z and predictor variable X. The residuals (on the log10 scale) for the first 4 observations are:

Observation                     1       2       3       4
Residual (on log10 scale)    0.40    1.35   -0.99   -1.64

(a) What is the residual (on the log10 scale) for the 5th observation?

(b) Given that Y5 = 11.75, what is Z5?

(c) Use the answers to (a) and (b) to calculate the predicted value for the 5th observation (i.e., Ẑ5).

(d) Use the answer to (c) to calculate the predicted value of Y5 (on the original measurement scale).

Explanation / Answer

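a) We have 5 observations, so there are 5 residuals, and only the 5th is unknown.

     For a least-squares fit that includes an intercept, the residuals sum to zero (residuals property). The first four sum to 0.40 + 1.35 - 0.99 - 1.64 = -0.88, so the 5th residual = 0.88.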
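The zero-sum property used in (a) comes from the least-squares condition on the intercept. A short derivation follows, assuming the fitted model includes an intercept term. The symbols b_0, b_1 (fitted intercept and slope) and e_i (residuals) are notation introduced here for illustration and do not appear in the question.

```latex
\begin{align*}
\frac{\partial}{\partial b_0}\sum_{i=1}^{5}\left(Z_i - b_0 - b_1 X_i\right)^2 = 0
  &\;\Longrightarrow\; \sum_{i=1}^{5}\left(Z_i - b_0 - b_1 X_i\right) = \sum_{i=1}^{5} e_i = 0,\\
e_5 &= -(e_1 + e_2 + e_3 + e_4) = -(0.40 + 1.35 - 0.99 - 1.64) = 0.88.
\end{align*}
```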

b) Since Y5 = 11.75, Z5 = log10(Y5) = log10(11.75) ≈ 1.07

c) Residual = actual value - predicted value (formula)

    0.88    = 1.07 - predicted value

Therefore, predicted value of Z5 = 1.07 - 0.88 = 0.19

d) Undoing the log10 transform, the predicted value of Y5 is 10^0.19 ≈ 1.5488
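The four steps can be checked numerically. Below is a minimal Python sketch, assuming the regression was fit by ordinary least squares with an intercept (which is what makes the residuals sum to zero). The variable names are illustrative and not part of the original problem.

```python
import math

# Residuals on the log10 scale for observations 1-4, as given in the question.
known_residuals = [0.40, 1.35, -0.99, -1.64]

# (a) With an intercept in the model, least-squares residuals sum to zero,
#     so the 5th residual is the negative of the sum of the other four.
e5 = -sum(known_residuals)

# (b) Transform the observed Y5 to the log10 scale.
y5 = 11.75
z5 = math.log10(y5)

# (c) residual = observed - predicted, so predicted = observed - residual.
z5_hat = z5 - e5

# (d) Back-transform the prediction to the original scale.
y5_hat = 10 ** z5_hat

print(f"(a) residual 5      = {e5:.2f}")
print(f"(b) Z5              = {z5:.4f}")
print(f"(c) predicted Z5    = {z5_hat:.4f}")
print(f"(d) predicted Y5    = {y5_hat:.4f}")
```

Carrying full precision through the steps gives a predicted Y5 of roughly 1.549. The small difference from 1.5488 comes from rounding Z5 to 1.07 before subtracting.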