Question
Refer to the Real Estate data. Use the selling price of the home as the dependent variable and determine the regression equation with the number of bedrooms, size of the house, whether there is a pool, distance from the center of the city, township, whether there is an attached garage, and the number of bathrooms as independent variables.
Write out the regression equation. How much does a garage or an extra bathroom add to the selling price of a home?
Determine the value of R-squared. Provide an interpretation of the variance R-squared represents.
Develop a correlation matrix. Which independent variables have strong or weak correlations with the dependent variable (price)?
Price     Bedrooms  Square Feet  Pool  Distance  Township  Garage  Baths
245,400   2         2100         0     12        1         1       2
221,100   3         2300         0     18        1         0       1.5
232,200   3         1900         0     16        1         1       1.5
198,300   4         2100         0     19        1         1       1.5
192,600   6         2200         0     14        1         0       2
147,400   6         1700         0     12        1         0       2
224,000   3         1900         0     6         1         1       2
220,900   2         2300         0     12        1         1       2
199,000   3         2500         0     18        1         0       1.5
139,900   2         2100         1     28        1         0       1.5
224,800   3         2200         1     17        1         1       2.5
216,800   3         2200         1     15        1         1       2
176,000   4         2200         1     15        1         1       2
189,400   4         2200         1     24        1         1       2
125,900   2         2400         1     28        1         0       1.5
192,900   4         1900         0     14        2         1       2.5
166,200   3         2000         0     16        2         1       2
307,800   3         2400         0     21        2         1       3
209,700   5         2200         0     13        2         1       2
207,500   3         2100         0     10        2         0       2
209,700   4         2200         0     19        2         1       2
173,600   4         2100         0     14        2         1       2.5
188,300   6         2100         0     14        2         1       2.5
213,600   2         2200         1     16        2         0       2.5
271,800   2         2100         1     9         2         1       2.5
281,300   3         2100         1     16        2         1       2
247,700   5         2400         1     16        2         1       2
216,000   4         2300         1     19        2         0       2
273,200   5         2200         1     16        2         1       3
251,400   3         1900         1     12        2         1       2
154,300   2         2000         1     13        2         0       2
294,000   2         2100         1     13        2         1       2.5
192,200   2         2400         1     16        2         0       2.5
244,600   2         2300         1     9         2         1       2.5
253,200   3         2300         1     16        2         1       2
172,700   4         2200         0     16        3         0       2
206,000   3         2100         0     9         3         0       1.5
166,500   3         1600         0     19        3         0       2.5
190,900   3         2200         0     18        3         1       2
254,300   4         2500         0     15        3         1       2
176,300   2         2000         0     17        3         0       2
155,400   4         2400         0     16        3         0       2
242,100   3         2300         1     12        3         0       2
327,200   6         2500         1     15        3         1       2
292,400   4         2100         1     14        3         1       2
246,100   4         2100         1     18        3         1       2
194,400   2         2300         1     11        3         0       2
233,000   3         2200         1     14        3         1       1.5
234,000   2         1700         1     19        3         1       2
199,800   3         2100         1     19        3         1       2
236,400   5         2200         1     20        3         1       2
172,400   3         2200         1     23        3         0       2
246,000   6         2300         1     7         3         1       3
312,100   7         2400         1     13        3         1       3
289,800   6         2000         1     21        3         1       3
217,800   3         2500         1     12        3         0       2
294,500   6         2700         1     15        3         1       2
263,200   4         2300         1     14        3         1       2
221,500   4         2300         1     18        3         1       2
175,000   2         2500         1     11        3         0       2
207,500   5         2300         0     21        4         0       2.5
198,900   3         2200         0     10        4         1       2
209,300   6         1900         0     15        4         1       2
182,700   4         2000         0     14        4         0       2.5
205,100   3         2000         0     20        4         0       2
175,600   4         2300         0     24        4         1       2
171,600   3         2000         0     16        4         0       2
269,900   5         2200         0     11        4         1       2.5
186,700   5         2500         0     21        4         0       2.5
179,000   3         2400         0     10        4         1       2
188,300   6         2100         0     15        4         1       2
182,400   4         2100         1     19        4         0       2
266,600   4         2400         1     13        4         1       2
209,000   2         1700         1     8         4         1       1.5
270,800   6         2500         1     7         4         1       2
252,300   4         2600         1     8         4         1       2
345,300   8         2600         1     9         4         1       2
187,000   2         1900         1     26        4         0       2
257,200   2         2100         1     9         4         1       2
294,300   7         2400         1     8         4         1       2
125,000   2         1900         1     18        4         0       1.5
164,100   4         2300         1     19        4         0       2
240,000   4         2600         1     13        4         1       2
188,100   2         1900         1     8         4         1       1.5
243,700   6         2700         1     7         4         1       2
227,100   4         2900         1     8         4         1       2
310,800   8         2900         1     9         4         1       2
179,000   3         2400         1     8         4         1       2
173,600   4         2100         1     9         4         1       2
263,100   4         2300         0     17        5         1       2
173,100   2         2200         0     21        5         1       1.5
236,800   4         2600         0     17        5         1       2
209,300   5         2100         1     20        5         0       1.5
326,300   6         2100         1     11        5         1       3
180,400   2         2000         1     11        5         0       2
207,100   2         2000         1     11        5         1       2
177,100   2         1900         1     10        5         1       2
312,100   6         2600         1     7         5         1       2.5
269,200   5         2200         1     8         5         1       3
228,400   3         2300         1     17        5         1       1.5
222,100   2         2100         1     9         5         1       2
188,300   5         2300         1     20        5         0       1.5
293,700   6         2400         1     11        5         1       3
227,100   4         2900         1     20        5         0       1.5
188,300   5         2300         1     11        5         1       3
Explanation / Answer
Solution:
The regression output and the correlation matrix are given below:
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.730476016
R Square             0.53359521
Adjusted R Square    0.499937132
Standard Error       33310.64487
Observations         105

ANOVA
              df     SS             MS             F              Significance F
Regression    7      1.23136E+11    1.7591E+10     15.85340718    1.00797E-13
Residual      97     1.07631E+11    1109599062
Total         104    2.30768E+11

              Coefficients     Standard Error    t Stat        P-value        Lower 95%       Upper 95%
Intercept     43137.24985      39739.27819       1.08550663    0.280387731    -35734.215      122008.7147
Bedrooms      7375.497615      2590.021158       2.84765921    0.005376578    2235.022699     12515.97253
Square Feet   38.62695458      14.75462387       2.6179559     0.010263711    9.343111206     67.91079794
Pool          19111.4418       7126.552713       2.68172321    0.008609583    4967.207745     33255.67585
Distance      -1012.668849     741.384712        -1.3659155    0.175124168    -2484.11224     458.7745423
Township      -1739.007792     2699.416357       -0.6442162    0.520955701    -7096.601891    3618.586306
Garage        35498.01891      7675.838476       4.62464381    1.15902E-05    20263.6047      50732.43313
Baths         23092.54587      9058.307715       2.5493223     0.012360444    5114.31297      41070.77877
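As a quick consistency check on the coefficient table, each t Stat should equal the coefficient divided by its standard error. The sketch below recomputes that ratio from the printed values; the dictionary name and layout are illustrative choices, not part of the original spreadsheet output.

```python
# (coefficient, standard error, reported t Stat), copied from the table above.
coefficient_table = {
    "Intercept":   (43137.24985, 39739.27819, 1.08550663),
    "Bedrooms":    (7375.497615, 2590.021158, 2.84765921),
    "Square Feet": (38.62695458, 14.75462387, 2.6179559),
    "Pool":        (19111.4418, 7126.552713, 2.68172321),
    "Distance":    (-1012.668849, 741.384712, -1.3659155),
    "Township":    (-1739.007792, 2699.416357, -0.6442162),
    "Garage":      (35498.01891, 7675.838476, 4.62464381),
    "Baths":       (23092.54587, 9058.307715, 2.5493223),
}

# t statistic = coefficient / standard error
t_stats = {name: coef / se for name, (coef, se, _) in coefficient_table.items()}

for name, (_, _, reported) in coefficient_table.items():
    print(f"{name:12s} computed t = {t_stats[name]:8.4f}   reported t = {reported:8.4f}")
```

The larger the absolute t Stat, the smaller the P-value, which is why Garage (t of about 4.62) has the smallest P-value in the table and Township (t of about -0.64) the largest.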
Correlation matrix

              Price           Bedrooms        Square Feet    Pool           Distance        Township       Garage      Baths
Price         1
Bedrooms      0.467377108     1
Square Feet   0.371041595     0.383456103     1
Pool          0.29406475      0.005301227     0.20059049     1
Distance      -0.347031166    -0.153355767    -0.1171945     -0.13938244    1
Township      0.128175517     0.200126798     0.18464617     0.201094525    -0.208592968    1
Garage        0.526273941     0.234102158     0.08302732     0.114153335    -0.359294882    0.056667827    1
Baths         0.382172576     0.328930238     0.02436486     0.054532583    -0.194992972    0.049669636    0.221289    1
The regression equation is:
Price = 43137.25 + 7375.50 * Bedrooms + 38.63 * Square Feet + 19111.44 * Pool - 1012.67 * Distance - 1739.01 * Township + 35498.02 * Garage + 23092.55 * Baths
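To see how the equation is used, the following minimal sketch plugs the first home in the data set into it and then repeats the calculation without the garage. The function name `predict_price` and the dictionary layout are illustrative assumptions, and Garage is assumed to be coded 1 for an attached garage and 0 otherwise.

```python
# Coefficients copied from the regression output.
COEFFICIENTS = {
    "Intercept": 43137.24985,
    "Bedrooms": 7375.497615,
    "Square Feet": 38.62695458,
    "Pool": 19111.4418,
    "Distance": -1012.668849,
    "Township": -1739.007792,
    "Garage": 35498.01891,
    "Baths": 23092.54587,
}


def predict_price(home):
    """Return the fitted selling price for a dict of predictor values."""
    price = COEFFICIENTS["Intercept"]
    for name, value in home.items():
        price += COEFFICIENTS[name] * value
    return price


# First home in the data set (actual selling price: 245,400).
first_home = {"Bedrooms": 2, "Square Feet": 2100, "Pool": 0, "Distance": 12,
              "Township": 1, "Garage": 1, "Baths": 2}
fitted = predict_price(first_home)

# Same home with the garage indicator switched off (assumed coding: 0 = no garage).
without_garage = dict(first_home, Garage=0)
garage_effect = fitted - predict_price(without_garage)

print(f"Fitted price:  {fitted:,.0f}")
print(f"Garage effect: {garage_effect:,.0f}")
```

Because the model is linear, the difference between the two fitted prices is exactly the Garage coefficient, whatever the values of the other variables.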
Holding the other variables constant, an attached garage (Garage = 1 rather than 0) adds about $35,498 to the selling price.
Holding the other variables constant, each additional bathroom adds about $23,093 to the selling price.
R-squared (the coefficient of determination) is 0.5336, which means that about 53.36% of the variation in selling price is explained by the seven independent variables together; the remaining 46.64% is not explained by the model.
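R-squared can be recovered from the ANOVA sums of squares as 1 - SS(Residual) / SS(Total), and adjusted R-squared additionally corrects for the number of predictors. The sketch below uses the rounded values printed in the ANOVA table, so the results agree with the reported figures only to about four decimal places; the variable names are illustrative.

```python
# Sums of squares as printed (rounded) in the ANOVA table.
ss_residual = 1.07631e11
ss_total = 2.30768e11

n = 105  # observations
k = 7    # independent variables

# Share of total variation in price explained by the model.
r_squared = 1 - ss_residual / ss_total

# Adjusted R-squared penalizes the fit for the number of predictors used.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R-squared:          {r_squared:.4f}")
print(f"Adjusted R-squared: {adjusted_r_squared:.4f}")
```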
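In the correlation matrix, Garage has the strongest correlation with price (0.526), followed by Bedrooms (0.467), Baths (0.382), Square Feet (0.371), and Distance (-0.347); these correlations are moderate rather than strong. Pool (0.294) is weaker, and Township has the weakest correlation with price (0.128).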
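The ranking can be reproduced by sorting the Price column of the correlation matrix by absolute value, since a negative correlation such as Distance is as informative as a positive one of the same size. This is a minimal sketch using the printed values; the names are illustrative.

```python
# Correlation of each independent variable with Price, from the correlation matrix.
price_correlations = {
    "Bedrooms": 0.467377108,
    "Square Feet": 0.371041595,
    "Pool": 0.29406475,
    "Distance": -0.347031166,
    "Township": 0.128175517,
    "Garage": 0.526273941,
    "Baths": 0.382172576,
}

# Rank by strength of the linear relationship, ignoring the sign.
ranked = sorted(price_correlations.items(),
                key=lambda item: abs(item[1]),
                reverse=True)

for name, r in ranked:
    print(f"{name:12s} {r:+.3f}")
```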