Fortune magazine publishes an annual list of the 100 best companies to work for.
Question
Fortune magazine publishes an annual list of the 100 best companies to work for. The data in the file named FortuneBest shows a portion of the data for a random sample of 30 of the companies that made the top 100 list for 2012 (Fortune, February 6, 2012). The column labeled Rank shows the rank of the company in the Fortune 100 list; the column labeled Size indicates whether the company is a small, midsize, or large company; the column labeled Salaried ($1000s) shows the average annual salary for salaried employees rounded to the nearest $1000; and the column labeled Hourly ($1000s) shows the average annual salary for hourly employees rounded to the nearest $1000. Fortune defines large companies as having more than 10,000 employees, midsize companies as having between 2500 and 10,000 employees, and small companies as having fewer than 2500 employees.
To incorporate the effect of size, a categorical variable with three levels, we used two dummy variables: Size-Midsize and Size-Small. The value of Size-Midsize = 1 if the company is a midsize company and 0 otherwise, and the value of Size-Small = 1 if the company is a small company and 0 otherwise. Develop an estimated regression equation that could be used to predict the average annual salary for salaried employees given the average annual salary for hourly employees and the size of the company.
1. Interpret the regression constant and regression coefficients.
2. Interpret the coefficient of determination. Does R Square only apply to significant independent variables or all independent variables?
3. Interpret the Multiple Correlation Coefficient
4. For the estimated regression equation developed above, use the t test to determine the significance of the independent variables. Use alpha = 0.05.
5. Do a global overall test.
The Summary Output of the data file is as follows.
Regression Statistics
Multiple R          0.758226532
R Square            0.574907473
Adjusted R Square   0.525858336
Standard Error      25.47515084
Observations        30

ANOVA
             df   SS            MS            F             Significance F
Regression   3    22820.30059   7606.766865   11.72105159   4.81713E-05
Residual     26   16873.56607   648.9833105
Total        29   39693.86667

                  Coefficients   Standard Error   t Stat         P-value       Lower 95%      Upper 95%      Lower 95.0%    Upper 95.0%
Intercept         26.96589812    14.00431214      1.925542494    0.065164701   -1.820377748   55.75217398    -1.820377748   55.75217398
Hourly ($1000s)   1.224044868    0.258106065      4.742410332    6.63374E-05   0.693500254    1.754589482    0.693500254    1.754589482
Size-Midsize      -3.208216286   12.63462389      -0.253922579   0.801552712   -29.17905763   22.76262506    -29.17905763   22.76262506
Size-Small        34.40215452    10.437669        3.295961437    0.0028371     12.94721862    55.85709042    12.94721862    55.85709042

Explanation / Answer
1. Interpret the regression constant and regression coefficients.
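y^ = 26.966 + 1.224*hourly - 3.208*size_midsize + 34.402*size_small
Here y^ is the predicted average annual salary for salaried employees in $1000s. Large companies are the baseline, since both dummy variables are 0 for them.
Constant: 26.966 is the predicted salaried average ($26,966) for a large company when hourly = 0. An hourly average of 0 is not a realistic value, so the constant mainly anchors the line.
hourly: holding size fixed, when hourly increases by 1 unit ($1000), y^ increases by 1.224 units ($1,224).
size_midsize: holding hourly fixed, y^ is on average 3.208 ($3,208) lower for a midsize company than for a large company.
size_small: holding hourly fixed, y^ is on average 34.402 ($34,402) higher for a small company than for a large company.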
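To see how the estimated equation turns into a prediction, here is a minimal sketch. The coefficients are copied from the Summary Output; the function name predict_salaried and the example hourly value of 40 ($40,000) are only illustrative assumptions, not part of the question.

```python
# Coefficients copied from the Summary Output (all salaries in $1000s).
INTERCEPT = 26.96589812
B_HOURLY = 1.224044868
B_MIDSIZE = -3.208216286
B_SMALL = 34.40215452


def predict_salaried(hourly, size):
    """Predicted average salaried pay in $1000s.

    `size` must be 'large', 'midsize', or 'small'; large is the baseline,
    so both dummy variables are 0 for it.
    """
    if size not in ("large", "midsize", "small"):
        raise ValueError("size must be 'large', 'midsize', or 'small'")
    size_midsize = 1 if size == "midsize" else 0
    size_small = 1 if size == "small" else 0
    return (INTERCEPT
            + B_HOURLY * hourly
            + B_MIDSIZE * size_midsize
            + B_SMALL * size_small)


# Hypothetical example: hourly average of 40 ($40,000) for each size class.
for size in ("large", "midsize", "small"):
    print(size, round(predict_salaried(40, size), 2))
```

The gap between the small and large predictions is always the size_small coefficient, whatever hourly value is used, which is what "holding hourly fixed" means for a dummy variable.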
2. Interpret the coefficient of determination. Does R Square only apply to significant independent variables or all independent variables?
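R^2 = 0.5749
It means that about 57.49% of the variation in y (average salaried pay) across the 30 companies is explained by this model.
R Square applies to all independent variables in the model, not only the significant ones, because it is computed from the fit of the whole equation. Adjusted R Square (0.5259) is the version that penalizes for the number of independent variables.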
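As a quick check, R Square and Adjusted R Square can be reproduced from the ANOVA table. This is only a sketch using the sums of squares printed in the Summary Output; the variable names are my own.

```python
# Sums of squares copied from the ANOVA section of the Summary Output.
ss_regression = 22820.30059
ss_residual = 16873.56607
ss_total = 39693.86667

n = 30  # observations
k = 3   # independent variables: Hourly, Size-Midsize, Size-Small

# R Square: share of total variation explained by the full model.
r_square = ss_regression / ss_total

# Adjusted R Square: penalizes for the number of independent variables.
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(round(r_square, 4), round(adjusted_r_square, 4))
```

Both values use all three independent variables, which is why R Square does not depend on which of them turn out to be significant.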
3. Interpret the Multiple Correlation Coefficient
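Multiple R = 0.7582. It is the correlation between the observed values of y and the best predictions y^ that can be computed linearly from the independent variables, and it equals the square root of R Square. A value of 0.76 indicates a fairly strong linear relationship between the observed and predicted salaries.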
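The link between Multiple R and R Square can be checked in one line. A small sketch, taking R Square from the Summary Output:

```python
import math

# R Square copied from the Summary Output.
r_square = 0.574907473

# Multiple R is the non-negative square root of R Square.
multiple_r = math.sqrt(r_square)

print(round(multiple_r, 4))
```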
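4. For the estimated regression equation developed above, use the t test to determine the significance of the independent variables. Use alpha = 0.05.
For each variable, H0: coefficient = 0, and we reject H0 when the p-value is below 0.05.
hourly: t = 4.742, p-value = 0.0000663 < 0.05, significant
size_midsize: t = -0.254, p-value = 0.8016 > 0.05, not significant
size_small: t = 3.296, p-value = 0.0028 < 0.05, significant
So hourly and size_small are significant, and size_midsize is not: the difference between midsize and large companies cannot be distinguished from zero.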
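The t statistics can be recomputed as coefficient divided by standard error and compared with a critical value. This is a sketch: the estimates come from the Summary Output, and the critical value 2.056 (two-tailed, alpha = 0.05, 26 residual degrees of freedom) is taken from a standard t table rather than computed here.

```python
# Two-tailed critical value for alpha = 0.05 with 26 degrees of freedom,
# read from a standard t table (assumption: not computed in this sketch).
T_CRITICAL = 2.056

# (coefficient, standard error) copied from the Summary Output.
estimates = {
    "Hourly ($1000s)": (1.224044868, 0.258106065),
    "Size-Midsize": (-3.208216286, 12.63462389),
    "Size-Small": (34.40215452, 10.437669),
}

results = {}
for name, (coef, std_err) in estimates.items():
    t_stat = coef / std_err
    significant = abs(t_stat) > T_CRITICAL  # reject H0: coefficient = 0
    results[name] = (t_stat, significant)
    print(f"{name}: t = {t_stat:.3f}, significant = {significant}")
```

Comparing |t| with the critical value gives the same decisions as comparing the p-values with 0.05.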
5. Do a global overall test.
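H0: all three slope coefficients are 0; Ha: at least one is not 0.
F = 11.721 with 3 and 26 degrees of freedom, and the overall p-value (Significance F) = 4.81713E-05 << 0.05.
Hence we reject H0: the model is significant overall.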
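The F statistic for the overall test is the ratio of the two mean squares in the ANOVA table. A short sketch with the values from the Summary Output (Significance F is copied from the output, not recomputed):

```python
# Values copied from the ANOVA section of the Summary Output.
ss_regression, df_regression = 22820.30059, 3
ss_residual, df_residual = 16873.56607, 26

# Mean squares: sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Overall F statistic for H0: all slope coefficients are 0.
f_stat = ms_regression / ms_residual

significance_f = 4.81713e-05  # p-value reported in the output
alpha = 0.05
reject_h0 = significance_f < alpha

print(round(f_stat, 3), reject_h0)
```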