Question
Illustration 7.3 (pp. 262-264) describes time-series forecasting of new home sales, but the data there is old. Go to https://www.census.gov/construction/nrs/historical_data/index.html and download the first table: Houses Sold – Seasonal Factors, Total (the Excel file is sold_cust.xls). Look at the monthly data on the “Reg Sold” tab.
Keep only the observations from January 2008 onward and delete the earlier ones. Keep only the US data: the seasonally unadjusted monthly series (column B) and the seasonally adjusted annual series (column G). Make a new column of seasonally adjusted monthly data by dividing the annual data by 12. Make a column called “t” similar to the book’s column 4 on page 262 (t runs from 1 to 110, through Feb. 2017), and make a t^2 column too (if you look at the data, you can see sales dropping until about mid-2011 and then rising again, hence the quadratic). Also make a column “D” that is a dummy variable equal to one during the spring and summer months, similar to the book’s column 5.
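The column construction above can be sketched in Python as a cross-check on the spreadsheet. This is a minimal sketch under stated assumptions: the names `month_range`, `build_rows`, and `SPRING_SUMMER_MONTHS` are hypothetical, the spring/summer months are assumed to be March through August (check the book's column 5), and the annual values are placeholders rather than Census data.

```python
from datetime import date

def month_range(start_year, start_month, end_year, end_month):
    """Yield the first day of each month from start to end, inclusive."""
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1

# ASSUMPTION: "spring and summer" is taken to mean March through August.
# The book's column 5 may use a different set of months.
SPRING_SUMMER_MONTHS = {3, 4, 5, 6, 7, 8}

def build_rows(months, sa_annual):
    """Build t, t^2, D, and seasonally adjusted monthly values per month."""
    rows = []
    for t, (d, annual) in enumerate(zip(months, sa_annual), start=1):
        rows.append({
            "date": d,
            "t": t,                      # time index, 1 for January 2008
            "t2": t * t,                 # quadratic trend term
            "D": 1 if d.month in SPRING_SUMMER_MONTHS else 0,
            "sa_monthly": annual / 12,   # annual rate divided by 12
        })
    return rows

months = list(month_range(2008, 1, 2017, 2))
# Placeholder values only, NOT Census data: a constant annual rate of 600.
placeholder_annual = [600.0] * len(months)
rows = build_rows(months, placeholder_annual)
print(len(rows), rows[0]["t"], rows[-1]["t"], rows[-1]["t2"])
```

With the real column G values in place of `placeholder_annual`, the `sa_monthly` field corresponds to the new seasonally adjusted monthly column.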
Determine the correlation between the unadjusted and the adjusted monthly data (=CORREL(unadjust., adjust.) in Excel), and produce scatterplots (with connectors) of both.
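For reference, the correlation that Excel's CORREL reports is the Pearson correlation coefficient, which can be sketched in plain Python as below. The function name `correl` and the two short series are illustrative placeholders, not the Census figures; plotting is left out of the sketch.

```python
from math import sqrt

def correl(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Placeholder numbers only, NOT the downloaded data.
unadjusted = [30.0, 34.0, 41.0, 38.0, 33.0, 29.0]
adjusted = [32.0, 33.0, 37.0, 36.0, 34.0, 31.0]
r = correl(unadjusted, adjusted)
print(round(r, 4))
```

The function divides by zero if either series is constant, so it assumes both columns actually vary.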
Run four regressions:
Model 1: seasonally unadjusted monthly as the dependent variable, with t and t^2 as the independent variables,
Model 2: seasonally unadjusted monthly as the dependent variable, with t, t^2, and D as the independent variables,
Model 3: seasonally adjusted monthly as the dependent variable, with t and t^2 as the independent variables, and
Model 4: seasonally adjusted monthly as the dependent variable, with t, t^2, and D as the independent variables.
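The regressions listed above are ordinary least squares fits, and the sketch below shows one way to reproduce the coefficients, R^2, and adjusted R^2 outside Excel. It is a minimal illustration, not the book's method: `solve` and `ols` are hypothetical helpers that use the normal equations, the dependent variable is synthetic (a noise-free quadratic plus a dummy effect, NOT Census data), and D again assumes March through August. It does not compute p-values, which Excel's regression output reports directly.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(y, columns):
    """Fit y = b0 + b1*x1 + ...; return (coefficients, R^2, adjusted R^2)."""
    n, k = len(y), len(columns)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = k + 1
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(p)]
           for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return beta, r2, adj_r2

t = [float(i) for i in range(1, 111)]
t2 = [v * v for v in t]
# ASSUMPTION: D = 1 for March through August; t = 1 is January 2008.
D = [1.0 if (int(v) - 1) % 12 + 1 in range(3, 9) else 0.0 for v in t]
# Synthetic dependent variable, NOT Census data.
y = [40.0 - 0.5 * a + 0.005 * b + 4.0 * d for a, b, d in zip(t, t2, D)]

beta_a, r2_a, adj_a = ols(y, [t, t2])      # trend only
beta_b, r2_b, adj_b = ols(y, [t, t2, D])   # trend plus seasonal dummy
print([round(v, 4) for v in beta_b], round(r2_a, 4), round(r2_b, 4))
```

Replacing `y` with the unadjusted or adjusted monthly column gives the four fits; comparing `adj_a` with `adj_b` is the adjusted R^2 comparison the questions ask about.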
When interpreting your p-values, remember that a value such as 1.0E-08 means 1.0 * 10^-8, which is 0.00000001.
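A quick way to check the E-notation reading above is to let Python parse the string and print it in fixed-point form. The 0.05 threshold in the comparison is an assumed conventional significance level, not something the assignment specifies.

```python
# Parse Excel-style scientific notation and show it as a plain decimal.
p_value = float("1.0E-08")
fixed_point = f"{p_value:.8f}"   # eight decimal places
print(fixed_point)               # prints 0.00000001

# ASSUMPTION: 0.05 is used here as a conventional significance level.
is_significant = p_value < 0.05
print(is_significant)            # prints True
```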
1. In comparing the regression results between models 2 and 3, it is notable that
Select one:
a. including the D variable in model 2 results in a much larger adjusted R^2, suggesting that the inclusion of the dummy variable is necessary to boost predictive power.
b. the coefficient estimates for t and t^2 change dramatically, even though the models are very comparable (unadjusted with a seasonal dummy is pretty close to seasonally adjusted).
c. dropping the D variable in model 3 pulls the R^2 down, which is unexpected since D in model 2 is statistically insignificant.
d. the D variable in model 2 does a decent job of capturing the seasonal effect, since the results between the two models are not hugely different and D has the expected sign and is statistically significant.
2. The regression results for model 4 are notable because
Select one:
a. making the seasonal adjustment in the dependent variable, in addition to adding the D dummy, yields the best results in terms of significant coefficients, explanatory power, and expected signs.
b. adding the redundant D variable to the seasonally adjusted data causes the coefficient estimates for t and t^2 to be dramatically different than they were in models 2 and 3.
c. the adjusted R^2 is higher than in the comparable model 3 (without the D).
d. adding a redundant seasonal dummy to already seasonally-adjusted data results in the D variable being insignificant, as expected, and the model's explanatory power is essentially the same as models 2 and 3.
Explanation / Answer
1. Answer: b. The coefficient estimates for t and t^2 change dramatically, even though the models are very comparable (unadjusted with a seasonal dummy is pretty close to seasonally adjusted).
2. Answer: d. Adding a redundant seasonal dummy to already seasonally-adjusted data results in the D variable being insignificant, as expected, and the model's explanatory power is essentially the same as models 2 and 3.