Assignment 2: Correlation, Simple Linear, and Multiple Regression Analysis
Question
Multiple regression analysis is widely used in business research for forecasting and prediction. It is also used to determine which independent variables influence a dependent variable, such as sales.
Sales can be attributed to quality, customer service, and location. In multiple regression analysis, we can determine which independent variable contributes the most to sales; it could be quality or customer service or location.
Now, consider the following scenario. You have been assigned the task of creating a multiple regression equation of at least three variables that explains Microsoft’s annual sales.
Use a time series of data of at least 10 years. You can search for this data using the Internet.
Before running the regression analysis, predict what sign each variable's coefficient will have and explain why you made that prediction.
Run three simple linear regressions, considering one independent variable at a time.
After running each of the three linear regressions, interpret the regression.
Does the regression fit the data well?
Run a multiple regression using all three independent variables.
Interpret the multiple regression. Does the regression fit the data well?
Does each predictor play a significant role in explaining the significance of the regression?
Are some predictors not useful?
If so, did you consider removing those and rerunning the regression?
Are the predictors too strongly related to one another? What is the coefficient of correlation “r”? Do you think this “r” value suggests a strong correlation among the predictors (the independent variables)?
All questions are important.
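The workflow the assignment describes (three simple regressions, then one multiple regression) can be sketched in a few lines. The example below is only an illustration: it fits ordinary least squares through the normal equations using the Python standard library, and every number in it is synthetic, generated from a made-up linear rule rather than taken from Microsoft's figures. The function names `solve`, `ols`, and `r_squared` are hypothetical helpers written for this sketch.

```python
# Illustrative sketch: ordinary least squares via the normal equations.
# All data below are synthetic (hypothetical), not Microsoft data.

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no perfectly collinear predictors).
    """
    n = len(b)
    # Augmented copy so the caller's lists are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(predictors, y):
    """Return [intercept, b1, b2, ...] for y regressed on the predictor columns."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def r_squared(predictors, y, coefs):
    """Share of the variation in y explained by the fitted equation."""
    fitted = [coefs[0] + sum(b * col[i] for b, col in zip(coefs[1:], predictors))
              for i in range(len(y))]
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1.0 - ss_error / ss_total

# Ten synthetic "years" of three predictors plus a small made-up disturbance.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0, 10.0, 9.0]
x3 = [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0, 6.0, 0.0]
noise = [0.5, -0.5, 0.3, -0.3, 0.2, -0.2, 0.4, -0.4, 0.1, -0.1]
y = [10 + 2 * a + 0.5 * b - c + e for a, b, c, e in zip(x1, x2, x3, noise)]

# Three simple regressions, one predictor at a time.
simple_fits = {name: ols([col], y) for name, col in (("x1", x1), ("x2", x2), ("x3", x3))}
# One multiple regression with all three predictors.
multiple_fit = ols([x1, x2, x3], y)

for name, col in (("x1", x1), ("x2", x2), ("x3", x3)):
    print(name, simple_fits[name], r_squared([col], y, simple_fits[name]))
print("all", multiple_fit, r_squared([x1, x2, x3], y, multiple_fit))
```

Comparing the sign of each fitted coefficient with the sign predicted beforehand, and the R-squared of each simple fit with that of the multiple fit, covers the interpretation steps the assignment asks about. In practice a statistics package would also report standard errors and p-values, which this sketch leaves out.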
The analysis I have:
Regression Analysis: Y vs. X1
S = 25.4009   R-Sq = 66.0%   R-Sq(adj) = 61.8%
Estimated regression equation:
^y = 45.1 + 1.94(X1)
Estimate of Y when X1 = 45:
^y = 45.1 + 1.94(45)
= 132.4
Regression Analysis: Y vs. X2
Y = 85.2 + 4.32(X2)
Predictor   Coef    SE Coef   T      P
Constant    85.22   38.35     2.22   0.057
X2          4.321   2.864     1.51   0.170
S = 38.4374   R-Sq = 22.2%   R-Sq(adj) = 12.4%
Analysis of Variance
Source           DF   SS      MS     F      P
Regression       1    3363    3363   2.28   0.170
Residual Error   8    11819   1477
Total            9    15183
Estimate of Y when X2 = 15:
^y = 85.2 + 4.32(X2)
= 85.2 + 4.32(15)
= 150
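As a cross-check, the summary statistics of the Y vs. X2 output can be recomputed from its Analysis of Variance table. The short sketch below uses only numbers printed in that output. The variable names are my own, and the observation count n = 10 is inferred from the Total DF of 9.

```python
import math

# Values copied from the Y vs. X2 output.
n = 10                      # observations, inferred from Total DF = 9
p = 1                       # number of predictors
ss_regression = 3363.0
ss_error = 11819.0
ss_total = ss_regression + ss_error   # 15182; the table prints the rounded 15183

ms_regression = ss_regression / p
ms_error = ss_error / (n - p - 1)

r_sq = ss_regression / ss_total
r_sq_adj = 1 - (1 - r_sq) * (n - 1) / (n - p - 1)
f_stat = ms_regression / ms_error
s = math.sqrt(ms_error)               # standard error of the regression
t_slope = 4.321 / 2.864               # Coef / SE Coef for X2

# Point estimate at X2 = 15 from the rounded equation.
y_hat = 85.2 + 4.32 * 15

print(f"R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}")
print(f"F = {f_stat:.2f}, S = {s:.2f}, T = {t_slope:.2f}, T^2 = {t_slope ** 2:.2f}")
print(f"Estimate at X2 = 15: {y_hat:.1f}")
```

With a single predictor, the squared t statistic of the slope equals the overall F statistic, which is why both tests report the same p-value of 0.170.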
Regression Analysis: Y vs. X1, X2
Y = -18.4 + 2.01(X1) + 4.74(X2)
Predictor   Coef     SE Coef   T       P
Constant    -18.37   17.97     -1.02   0.341
X1          2.0102   0.2471    8.13    0.000
X2          4.7378   0.9484    5.00    0.002
S = 12.7096   R-Sq = 92.6%   R-Sq(adj) = 90.4%
Analysis of Variance
Source           DF   SS        MS       F       P
Regression       2    14052.2   7026.1   43.50   0.000
Residual Error   7    1130.7    161.5
Total            9    15182.9
Source   DF   Seq SS
X1       1    10021.2
X2       1    4030.9
Estimated regression equation:
^y = -18.4 + 2.01(X1) + 4.74(X2)
Estimate of Y when X1 = 45 and X2 = 15:
^y = -18.4 + 2.01(45) + 4.74(15)
= 143.15
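The two-predictor output can be checked in the same way, and its sequential sums of squares show how much X2 adds once X1 is already in the model. The sketch below uses only the printed values. The variable names are my own, and n = 10 is inferred from the Total DF of 9.

```python
# Values copied from the Y vs. X1, X2 output.
n, p = 10, 2
ss_regression, ss_error, ss_total = 14052.2, 1130.7, 15182.9
seq_ss_x1, seq_ss_x2 = 10021.2, 4030.9

ms_error = ss_error / (n - p - 1)
r_sq = ss_regression / ss_total
r_sq_adj = 1 - (ss_error / ss_total) * (n - 1) / (n - p - 1)
f_overall = (ss_regression / p) / ms_error

# R-Sq of X1 alone is its sequential SS divided by the total SS.
r_sq_x1_only = seq_ss_x1 / ss_total

# Partial F for adding X2 after X1 (1 numerator DF); it should match
# the squared t statistic of X2 up to rounding.
f_partial_x2 = seq_ss_x2 / ms_error
t_x2 = 4.7378 / 0.9484

# Point estimate at X1 = 45, X2 = 15 from the rounded equation.
y_hat = -18.4 + 2.01 * 45 + 4.74 * 15

print(f"R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}, F = {f_overall:.2f}")
print(f"R-Sq with X1 only = {r_sq_x1_only:.1%}")
print(f"Partial F for X2 = {f_partial_x2:.2f}, T^2 for X2 = {t_x2 ** 2:.2f}")
print(f"Estimate at X1 = 45, X2 = 15: {y_hat:.2f}")
```

The sequential SS for X1 reproduces the 66.0% R-Sq of the one-predictor model, and the jump to 92.6% corresponds to the extra 4030.9 that X2 explains.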
The Microsoft annual sales data I am using:
Year   Sales (USD billions)
2007   51.12
2008   60.42
2009   58.44
2010   62.48
2011   69.94
2012   73.72
2013   77.85
2014   86.83
2015   93.58
2016   85.32
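To show how a simple linear regression would be run on this series, the sketch below regresses the listed sales on the calendar year. Using the year as the predictor is purely my assumption for illustration. The question does not say what X1 and X2 are, and this trend fit is not one of the regressions reported above.

```python
# Sales series as listed in the question; the year is used as an
# illustrative predictor (an assumption, not the X1 or X2 of the analysis).
years = list(range(2007, 2017))
sales = [51.12, 60.42, 58.44, 62.48, 69.94, 73.72, 77.85, 86.83, 93.58, 85.32]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(sales) / n

sxx = sum((x - mean_x) ** 2 for x in years)
syy = sum((y - mean_y) ** 2 for y in sales)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, sales))

slope = sxy / sxx                      # average change in sales per year
intercept = mean_y - slope * mean_x
r = sxy / (sxx * syy) ** 0.5           # Pearson correlation of year and sales
r_sq = r ** 2                          # R-Sq of the simple regression

forecast_2017 = intercept + slope * 2017

print(f"sales = {intercept:.2f} + {slope:.3f} * year")
print(f"r = {r:.3f}, R-Sq = {r_sq:.1%}, trend forecast for 2017 = {forecast_2017:.2f}")
```

A positive slope here would match the expectation that sales grow over time. The same three sums (sxx, syy, sxy) give the slope, the correlation, and R-Sq for any single predictor substituted for the year.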
Coefficients for the regression of Y vs. X1:
Predictor   Coef     SE Coef   T      P
Constant    45.06    25.42     1.77   0.114
X1          1.9436   0.4932    3.94   0.004
Explanation / Answer
Let the level of significance be 5%, i.e. alpha = 0.05.
Since the p-value of the F-test for model 1 (^y = 45.1 + 1.94(X1)) is 0.004, which is less than alpha = 0.05, the model is significant and explains the variation in Y well.
Since the p-value of the F-test for model 2 (^y = 85.2 + 4.32(X2)) is 0.170, which is greater than alpha = 0.05, the model is not significant and does not explain the variation in Y well.
Since the p-value of the F-test for model 3 (^y = -18.4 + 2.01(X1) + 4.74(X2)) is below 0.001, which is less than alpha = 0.05, the model is significant and explains the variation in Y well.
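The p-value quoted for model 3 can be verified without statistical tables, because an F distribution with 2 numerator degrees of freedom has a closed-form upper tail: P(F > f) = (1 + 2f/df2)^(-df2/2). The sketch below applies that formula to the printed F = 43.50 with 7 residual degrees of freedom. The function name is hypothetical, and the shortcut applies only when the numerator DF is exactly 2, so it does not cover models 1 and 2.

```python
def f_upper_tail_df1_2(f_stat, df2):
    """P(F > f_stat) for an F distribution with 2 and df2 degrees of freedom.

    Closed form valid only for 2 numerator degrees of freedom.
    """
    return (1 + 2 * f_stat / df2) ** (-df2 / 2)

alpha = 0.05
p_model3 = f_upper_tail_df1_2(43.50, 7)   # F and residual DF from the model 3 output

print(f"p-value = {p_model3:.6f}")
print("significant at alpha = 0.05:", p_model3 < alpha)
```

For one-predictor models the F-test p-value equals the p-value of the slope's t-test, which is why 0.004 and 0.170 can be read straight from the coefficient tables.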
Each predictor plays a significant role in model 3, because the p-values of X1 (0.000) and X2 (0.002) are both less than alpha = 0.05. There is no need to remove either variable.
Neither the raw data for X1 and X2 nor the correlation between them is supplied, so it is difficult to answer the question about the correlation r.
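Even without the raw data, the printed output may allow a rough estimate of r. With exactly two predictors, the variance inflation factor of either coefficient is 1 / (1 - r^2), and it can be recovered by comparing SE Coef / S between the multiple regression and the matching simple regression. The sign of r follows from how the X1 slope changes when X2 is added. The sketch below is a back-calculation under those assumptions and is sensitive to rounding in the printed figures. The variable and function names are my own.

```python
import math

# Simple regressions (one predictor at a time): S, SE Coef, Coef.
s_x1_only, se_x1_only, b_x1_only = 25.4009, 0.4932, 1.9436
s_x2_only, se_x2_only = 38.4374, 2.864
# Multiple regression Y vs. X1, X2.
s_full = 12.7096
b_x1_full, se_x1_full = 2.0102, 0.2471
b_x2_full, se_x2_full = 4.7378, 0.9484

def vif(se_full, s_full_model, se_simple, s_simple):
    """Variance inflation factor from printed standard errors.

    (SE/S)^2 is 1/Sxx in the simple model and 1/(Sxx * (1 - r^2)) in the
    two-predictor model, so their ratio is 1 / (1 - r^2).
    """
    return (se_full / s_full_model) ** 2 / (se_simple / s_simple) ** 2

vif_x1 = vif(se_x1_full, s_full, se_x1_only, s_x1_only)
vif_x2 = vif(se_x2_full, s_full, se_x2_only, s_x2_only)

# Two independent estimates of |r|; max() guards against rounding pushing VIF below 1.
r_abs_from_x1 = math.sqrt(max(0.0, 1 - 1 / vif_x1))
r_abs_from_x2 = math.sqrt(max(0.0, 1 - 1 / vif_x2))

# Simple slope of X1 = full slope of X1 + full slope of X2 * (Sx1x2 / Sx1x1),
# so this shift carries the sign of r.
shift = (b_x1_only - b_x1_full) / b_x2_full
r_sign = math.copysign(1.0, shift)

print(f"VIF(X1) = {vif_x1:.4f}, VIF(X2) = {vif_x2:.4f}")
print(f"r is roughly {r_sign * r_abs_from_x1:.3f} (from X1) or {r_sign * r_abs_from_x2:.3f} (from X2)")
```

Both estimates come out small in magnitude (roughly 0.05), which would suggest that X1 and X2 are only weakly correlated and that multicollinearity is not a concern here. A correlation computed from the raw data would be the proper confirmation.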