
The depth of wetting of a soil is the depth to which water content will increase


Question

The depth of wetting of a soil is the depth to which water content will increase owing to external factors. The article "Discussion of Method for Evaluation of Depth of Wetting in Residential Areas" (J. Nelson, K. Chao, and D. Overton, Journal of Geotechnical and Geoenvironmental Engineering, 2011:293-296) discusses the relationship between depth of wetting beneath a structure and the age of the structure. The article presents measurements of depth of wetting, in meters, and the ages, in years, of 21 houses, as shown in the following table.

a. Compute the least-squares line for predicting depth of wetting (y) from age (x).

b. Identify a point with an unusually large x-value. Compute the least-squares line that results from deletion of this point.

c. Identify another point which can be classified as an outlier. Compute the least-squares line that results from deletion of the outlier, replacing the point with the unusually large x-value.

d. Which of these two points is more influential? Explain.

Explanation / Answer

Answer:

a).

The least-squares line is y = 2.9122 + 0.8773x, where y is the depth of wetting in meters and x is the age of the house in years.

Regression Analysis

r²           0.501
r            0.708
Std. Error   2.460
n            21
k            1
Dep. Var.    y

ANOVA table

Source       SS         df   MS         F       p-value
Regression   115.4571    1   115.4571   19.08   .0003
Residual     114.9610   19     6.0506
Total        230.4181   20

Regression output

Variable    Coefficient   Std. error   t (df=19)   p-value   95% CI lower   95% CI upper
Intercept   2.9122        1.3192       2.208       .0398     0.1511         5.6733
x           0.8773        0.2008       4.368       .0003     0.4570         1.2977

Observation   y       Predicted   Residual   Leverage   Studentized Residual   Studentized Deleted Residual
1              7.60    5.54        2.06      0.108       0.885                  0.879
2              4.60    6.42       -1.82      0.074      -0.770                 -0.761
3              6.10    8.18       -2.08      0.048      -0.865                 -0.859
4              9.10    8.18        0.92      0.048       0.385                  0.376
5              4.30    5.54       -1.24      0.108      -0.535                 -0.525
6              7.30    9.93       -2.63      0.074      -1.112                 -1.119
7              5.20    7.30       -2.10      0.054      -0.877                 -0.872
8             10.40    9.93        0.47      0.074       0.198                  0.193
9             15.50    8.18        7.32      0.048       3.051                  4.158
10             5.80    4.67        1.13      0.154       0.501                  0.491
11            10.70    8.18        2.52      0.048       1.051                  1.054
12             5.50    6.42       -0.92      0.074      -0.389                 -0.381
13             6.10    5.54        0.56      0.108       0.239                  0.233
14            10.70    9.93        0.77      0.074       0.325                  0.317
15            10.40    8.18        2.22      0.048       0.926                  0.923
16             4.60    6.42       -1.82      0.074      -0.770                 -0.761
17             7.00    9.05       -2.05      0.054      -0.858                 -0.852
18             6.10    8.18       -2.08      0.048      -0.865                 -0.859
19            16.80   15.19        1.61      0.474       0.900                  0.895
20             9.10   11.69       -2.59      0.154      -1.143                 -1.153
21             8.80    9.05       -0.25      0.054      -0.106                 -0.103
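
The fit in part a can be checked by hand with the closed-form least-squares formulas. The sketch below is a minimal illustration, not the software that produced the output above. One assumption to note: the ages (x) are not listed in this answer, so the AGE list is reconstructed from the Predicted column by inverting y = 2.9122 + 0.8773x. The depths (y) are copied from the observation table. The function name least_squares is hypothetical.

```python
# Ages (x, years) are an ASSUMPTION: they are reconstructed from the
# "Predicted" column by inverting y_hat = 2.9122 + 0.8773 x, because the
# original data table is not reproduced in the text.
# Depths (y, meters) are copied from the observation table.
AGE = [3, 4, 6, 6, 3, 8, 5, 8, 6, 2, 6, 4, 3, 8, 6, 4, 7, 6, 14, 10, 7]
DEPTH = [7.6, 4.6, 6.1, 9.1, 4.3, 7.3, 5.2, 10.4, 15.5, 5.8, 10.7,
         5.5, 6.1, 10.7, 10.4, 4.6, 7.0, 6.1, 16.8, 9.1, 8.8]


def least_squares(x, y):
    """Return (intercept, slope, r_squared) for the simple linear fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sums of squares and cross-products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


intercept, slope, r_squared = least_squares(AGE, DEPTH)
print(f"y = {intercept:.4f} + {slope:.4f} x, r^2 = {r_squared:.3f}")
```

If the reconstructed ages are right, the printed coefficients and r² should agree with the regression output above to the reported precision.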

b).

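The point with the unusually large x-value is observation 19, (14, 16.8). Its leverage, 0.474, is far larger than that of any other observation. After removing this point, the least-squares line is

y = 3.7438 + 0.7145x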

Regression Analysis

r²           0.277
r            0.527
Std. Error   2.473
n            20
k            1
Dep. Var.    y

ANOVA table

Source       SS         df   MS        F      p-value
Regression    42.2694    1   42.2694   6.91   .0170
Residual     110.0601   18    6.1145
Total        152.3295   19

Regression output

Variable    Coefficient   Std. error   t (df=18)   p-value   95% CI lower   95% CI upper
Intercept   3.7438        1.6191       2.312       .0328     0.3422         7.1455
x           0.7145        0.2717       2.629       .0170     0.1436         1.2854
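
The high-leverage point in part b can also be located numerically instead of by eye. The sketch below computes the leverage of each observation for a simple linear regression, h_i = 1/n + (x_i - x̄)² / Sxx, drops the observation with the largest leverage, and refits. As before, the ages are an assumption reconstructed from the Predicted column, and the helper names fit_line and leverages are hypothetical.

```python
# Ages (x) are an ASSUMPTION: reconstructed from the "Predicted" column,
# because the original data table is not reproduced in the text.
AGE = [3, 4, 6, 6, 3, 8, 5, 8, 6, 2, 6, 4, 3, 8, 6, 4, 7, 6, 14, 10, 7]
DEPTH = [7.6, 4.6, 6.1, 9.1, 4.3, 7.3, 5.2, 10.4, 15.5, 5.8, 10.7,
         5.5, 6.1, 10.7, 10.4, 4.6, 7.0, 6.1, 16.8, 9.1, 8.8]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def leverages(x):
    """Leverage h_i = 1/n + (x_i - x_bar)^2 / Sxx for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]


h = leverages(AGE)
# 0-based index of the observation with the largest leverage.
high = max(range(len(AGE)), key=lambda i: h[i])

# Refit with that single observation dropped.
x_kept = AGE[:high] + AGE[high + 1:]
y_kept = DEPTH[:high] + DEPTH[high + 1:]
intercept_b, slope_b = fit_line(x_kept, y_kept)

print(f"observation {high + 1}: x = {AGE[high]}, leverage = {h[high]:.3f}")
print(f"refit: y = {intercept_b:.4f} + {slope_b:.4f} x")
```

Picking the point by leverage rather than by "largest x" is a design choice: in a one-predictor model leverage depends only on the distance of x from the mean age, so the two criteria select the same observation here.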

c).

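The point (6, 15.5), observation 9, can be classified as an outlier: its residual is 7.32 and its studentized deleted residual is 4.158, by far the largest in the data set.

After removing this point and restoring the point (14, 16.8), the least-squares line is

y = 2.5460 + 0.8773x

The slope is the same as in part a. The leverage of this point, 0.048, equals 1/21, which means its x-value coincides with the mean age, so deleting it shifts only the intercept.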

Regression Analysis

r²           0.663
r            0.814
Std. Error   1.805
n            20
k            1
Dep. Var.    y

ANOVA table

Source       SS         df   MS         F       p-value
Regression   115.4571    1   115.4571   35.44   1.24E-05
Residual      58.6409   18     3.2578
Total        174.0980   19

Regression output

Variable    Coefficient   Std. error   t (df=18)   p-value    95% CI lower   95% CI upper
Intercept   2.5460        0.9720       2.619       .0174      0.5039         4.5881
x           0.8773        0.1474       5.953       1.24E-05   0.5677         1.1870
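
The outlier in part c can be found as the observation with the largest absolute residual from the full-data fit. The sketch below does that, then refits without it while keeping the high-leverage point in the data. It is an illustration under the same assumption as before (ages reconstructed from the Predicted column); fit_line is a hypothetical helper name.

```python
# Ages (x) are an ASSUMPTION: reconstructed from the "Predicted" column,
# because the original data table is not reproduced in the text.
AGE = [3, 4, 6, 6, 3, 8, 5, 8, 6, 2, 6, 4, 3, 8, 6, 4, 7, 6, 14, 10, 7]
DEPTH = [7.6, 4.6, 6.1, 9.1, 4.3, 7.3, 5.2, 10.4, 15.5, 5.8, 10.7,
         5.5, 6.1, 10.7, 10.4, 4.6, 7.0, 6.1, 16.8, 9.1, 8.8]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Residuals from the full 21-point fit of part a.
intercept_a, slope_a = fit_line(AGE, DEPTH)
residuals = [yi - (intercept_a + slope_a * xi) for xi, yi in zip(AGE, DEPTH)]

# 0-based index of the observation with the largest absolute residual.
out = max(range(len(residuals)), key=lambda i: abs(residuals[i]))

# Refit without the outlier; the high-leverage point stays in the data.
x_kept = AGE[:out] + AGE[out + 1:]
y_kept = DEPTH[:out] + DEPTH[out + 1:]
intercept_c, slope_c = fit_line(x_kept, y_kept)

mean_age = sum(AGE) / len(AGE)
print(f"observation {out + 1}: ({AGE[out]}, {DEPTH[out]}), residual = {residuals[out]:.2f}")
print(f"mean age = {mean_age}, refit: y = {intercept_c:.4f} + {slope_c:.4f} x")
```

Because the deleted point sits at the mean age, the refit slope should match the part a slope up to floating-point rounding, with only the intercept moving.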

d).

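The point with the unusually large x-value, (14, 16.8), is more influential. A point is influential when deleting it changes the least-squares line substantially. Deleting (14, 16.8) changes the slope from 0.8773 to 0.7145 and the intercept from 2.9122 to 3.7438. Deleting the outlier (6, 15.5) leaves the slope at 0.8773 and changes only the intercept, from 2.9122 to 2.5460.

The outlier mainly affects the quality of the fit rather than the line itself: the model without (6, 15.5) has a larger r² than the first model (0.663 versus 0.501) and a smaller standard error (1.805 versus 2.460).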
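
One way to make the comparison in part d concrete is to compute how far each coefficient moves when a single observation is deleted. The sketch below does this for the two points discussed above. It is a minimal illustration, assuming the ages reconstructed from the Predicted column; the names fit_line and drop_one_shift are hypothetical, and other influence measures could be used instead.

```python
# Ages (x) are an ASSUMPTION: reconstructed from the "Predicted" column,
# because the original data table is not reproduced in the text.
AGE = [3, 4, 6, 6, 3, 8, 5, 8, 6, 2, 6, 4, 3, 8, 6, 4, 7, 6, 14, 10, 7]
DEPTH = [7.6, 4.6, 6.1, 9.1, 4.3, 7.3, 5.2, 10.4, 15.5, 5.8, 10.7,
         5.5, 6.1, 10.7, 10.4, 4.6, 7.0, 6.1, 16.8, 9.1, 8.8]

HIGH_LEVERAGE = 18  # 0-based index of observation 19, the point (14, 16.8)
OUTLIER = 8         # 0-based index of observation 9, the point (6, 15.5)


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def drop_one_shift(x, y, i):
    """Return the change in (intercept, slope) when observation i is deleted."""
    b0, b1 = fit_line(x, y)
    b0_i, b1_i = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    return b0_i - b0, b1_i - b1


shift_high = drop_one_shift(AGE, DEPTH, HIGH_LEVERAGE)
shift_out = drop_one_shift(AGE, DEPTH, OUTLIER)

for label, (d0, d1) in [("high-leverage point", shift_high), ("outlier", shift_out)]:
    print(f"delete {label}: intercept shift = {d0:+.4f}, slope shift = {d1:+.4f}")
```

A larger shift in the coefficients indicates a more influential point, which is the criterion used in the answer to part d.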
