
Assignment 2: Correlation, simple linear, and Multiple Regression Analysis


Question


Multiple regression analysis is widely used in business research for forecasting and prediction. It is also used to determine which independent variables influence a dependent variable, such as sales.

Sales can be attributed to quality, customer service, and location. In multiple regression analysis, we can determine which independent variable contributes the most to sales; it could be quality or customer service or location.

Now, consider the following scenario. You have been assigned the task of creating a multiple regression equation of at least three variables that explains Microsoft’s annual sales.

Use a time series of data of at least 10 years. You can search for this data using the Internet.

Before running the regression analysis, predict what sign each variable will have and explain why you made that prediction.

Run three simple linear regressions by considering one independent variable at a time.

After running each of the three linear regressions, interpret the regression.

Does the regression fit the data well?

Run a multiple regression using all three independent variables.

Interpret the multiple regression. Does the regression fit the data well?

Does each predictor play a significant role in explaining the significance of the regression?

Are some predictors not useful?

If so, did you consider removing those and rerunning the regression?

Are the predictors too strongly related to one another? What is the coefficient of correlation “r”? Do you think this “r” value suggests a strong correlation among the predictors (the independent variables)?

All questions are important.

The analysis I have:

Regression Analysis: Y vs. X1

S = 25.4009   R-Sq = 66.0%   R-Sq(adj) = 61.8%

Estimated regression equation:

^y = 45.1 + 1.94(X1)

Estimate of Y when X1 = 45:

^y = 45.1 + 1.94(45)

= 132.4
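The analysis-of-variance figures for this regression are missing from the output above. As a rough cross-check, they can be rebuilt from the Y vs. X1, X2 output later in the question, assuming that output describes the same ten observations: its sequential SS for X1 (10021.2) would then be the regression SS here, and its total SS (15182.9) would be shared. The variable names below are illustrative only.

```python
import math

# Assumption: both sums of squares are taken from the "Y vs. X1, X2" output
# and refer to the same ten observations as this simple regression.
n = 10
ss_total = 15182.9        # Total SS
ss_regression = 10021.2   # Seq. SS for X1 (X1 entered first)

ss_error = ss_total - ss_regression
df_error = n - 2                      # one predictor plus the constant
ms_error = ss_error / df_error

r_squared = ss_regression / ss_total
s = math.sqrt(ms_error)               # standard error of the regression
f_stat = ss_regression / ms_error     # regression df = 1, so MSR = SSR

print(f"R-Sq = {r_squared:.1%}, S = {s:.4f}, F = {f_stat:.2f}")
```

Under that assumption, the rebuilt R-Sq and S agree with the reported 66.0% and 25.4009 to rounding, and F is close to the square of the reported t statistic for X1 (3.94).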

Regression Analysis: Y vs. X2

Y = 85.2 + 4.32(X2)

Predictor    Coef     SE Coef    T       P
Constant     85.22    38.35      2.22    0.057
X2           4.321    2.864      1.51    0.170

S = 38.4374   R-Sq = 22.2%   R-Sq(adj) = 12.4%

Analysis of Variance

Source            DF    SS       MS      F       P
Regression        1     3363     3363    2.28    0.170
Residual Error    8     11819    1477
Total             9     15183

^y = 85.2 + 4.32(X2)

= 85.2 + 4.32(15)

= 150
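The pieces of the Y vs. X2 output are linked by a few standard identities (T = Coef / SE Coef, F = MSR / MSE, R-Sq = SS Regression / SS Total). The sketch below recomputes them from the reported figures as a sanity check; the variable names are illustrative, and small differences from the printed values come from rounding in the output.

```python
n = 10                                  # ten yearly observations
coef_x2, se_x2 = 4.321, 2.864           # X2 row of the predictor table
ss_regression, ss_error, ss_total = 3363.0, 11819.0, 15183.0

t_x2 = coef_x2 / se_x2                  # t statistic for the slope
ms_error = ss_error / (n - 2)           # residual mean square
f_stat = ss_regression / ms_error       # regression df = 1, so MSR = SSR
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

# Point estimate at X2 = 15 from the rounded equation in the output
y_hat = 85.2 + 4.32 * 15

print(f"T = {t_x2:.2f}, F = {f_stat:.2f}")
print(f"R-Sq = {r_squared:.1%}, R-Sq(adj) = {adj_r_squared:.1%}")
print(f"Estimate of Y at X2 = 15: {y_hat:.1f}")
```

With a single predictor, F equals the square of the slope's t statistic, which is why the F test and the t test for X2 share the same p-value of 0.170.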

Regression Analysis: Y vs. X1, X2

Y = -18.4 + 2.01(X1) + 4.74(X2)

Predictor    Coef      SE Coef    T        P
Constant     -18.37    17.97      -1.02    0.341
X1           2.0102    0.2471     8.13     0.000
X2           4.7378    0.9484     5.00     0.002

S = 12.7096   R-Sq = 92.6%   R-Sq(adj) = 90.4%

Analysis of Variance

Source            DF    SS         MS        F        P
Regression        2     14052.2    7026.1    43.50    0.000
Residual Error    7     1130.7     161.5
Total             9     15182.9

Source    DF    Seq. SS
X1        1     10021.2
X2        1     4030.9

Estimated regression equation:

^y = -18.4 + 2.01(X1) + 4.74(X2)

Estimate of Y when X1 = 45 and X2 = 15:

^y = -18.4 + 2.01(45) + 4.74(15)

= 143.15
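The summary figures of the two-predictor model follow from its ANOVA table. The sketch below recomputes F, S, R-Sq, and R-Sq(adj) from the reported sums of squares and reproduces the point estimate. The `predict` helper is a hypothetical name introduced for illustration, and it uses the rounded coefficients from the printed equation.

```python
import math

n, k = 10, 2                            # observations, predictors
ss_regression, ss_error, ss_total = 14052.2, 1130.7, 15182.9

ms_regression = ss_regression / k
ms_error = ss_error / (n - k - 1)       # residual df = 7
f_stat = ms_regression / ms_error
s = math.sqrt(ms_error)                 # standard error of the regression
r_squared = ss_regression / ss_total
adj_r_squared = 1 - ms_error / (ss_total / (n - 1))


def predict(x1, x2):
    """Point estimate from the rounded equation in the output."""
    return -18.4 + 2.01 * x1 + 4.74 * x2


y_hat = predict(45, 15)

print(f"F = {f_stat:.2f}, S = {s:.4f}")
print(f"R-Sq = {r_squared:.1%}, R-Sq(adj) = {adj_r_squared:.1%}")
print(f"Estimate of Y at X1 = 45, X2 = 15: {y_hat:.2f}")
```

The adjusted R-Sq penalizes the second predictor through the degrees of freedom, so it stays below the plain R-Sq even though both are high for this model.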

The Microsoft annual sales data I am using:

Year    Sales
2007    51.12
2008    60.42
2009    58.44
2010    62.48
2011    69.94
2012    73.72
2013    77.85
2014    86.83
2015    93.58
2016    85.32
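For readers who want to see how a simple linear regression is computed on a series like this one, here is a minimal sketch that fits a straight-line trend to the listed sales figures. The year is used as a stand-in predictor only because the question does not list the X1 or X2 values; it is an assumption for illustration, not one of the assignment's variables, and `simple_ols` is a hypothetical helper name.

```python
years = list(range(2007, 2017))
sales = [51.12, 60.42, 58.44, 62.48, 69.94,
         73.72, 77.85, 86.83, 93.58, 85.32]


def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for the fit y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


intercept, slope, r_squared = simple_ols(years, sales)
print(f"sales = {intercept:.2f} + {slope:.3f} * year, R-Sq = {r_squared:.1%}")
```

The same function could be reused with any single predictor in place of the year once its ten yearly values are available.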

Predictor table for Y vs. X1:

Predictor    Coef      SE Coef    T       P
Constant     45.06     25.42      1.77    0.114
X1           1.9436    0.4932     3.94    0.004

Explanation / Answer

Let the level of significance be 5%, i.e., alpha = 0.05.

Since the p-value of the F test for model 1 (^y = 45.1 + 1.94(X1)) is 0.004, which is less than alpha = 0.05, the model is good at explaining the variation in Y.

Since the p-value of the F test for model 2 (^y = 85.2 + 4.32(X2)) is 0.170, which is greater than alpha = 0.05, the model is not good at explaining the variation in Y.

Since the p-value of the F test for model 3 (^y = -18.4 + 2.01(X1) + 4.74(X2)) is less than 0.001, and therefore less than alpha = 0.05, the model is good at explaining the variation in Y.
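The decision rule used in the three statements above can be written out as a short sketch. For model 1 the question reports only the t-test p-value for X1; in a regression with a single predictor the overall F test is equivalent to the slope's t test (F = t²), so the same p-value applies. The dictionary keys are illustrative labels.

```python
ALPHA = 0.05

# p-values of the overall F test, as reported in the question
f_test_p_values = {
    "model 1 (Y vs. X1)": 0.004,
    "model 2 (Y vs. X2)": 0.170,
    "model 3 (Y vs. X1, X2)": 0.000,   # printed as 0.000, i.e. below 0.0005
}

# A model is declared significant when its p-value falls below alpha
decisions = {name: p < ALPHA for name, p in f_test_p_values.items()}

for name, significant in decisions.items():
    verdict = "significant" if significant else "not significant"
    print(f"{name}: p = {f_test_p_values[name]:.3f} -> {verdict} at alpha = {ALPHA}")
```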

Each predictor plays a significant role in model 3, as the p-values of X1 (0.000) and X2 (0.002) are both less than alpha = 0.05. There is no need to remove either variable.
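The same conclusion can be reached from the t statistics rather than the p-values. The sketch below recomputes T = Coef / SE Coef for model 3 and compares each value with the two-sided 5% critical value of Student's t with 7 residual degrees of freedom (about 2.365, taken from a standard t table). Names are illustrative.

```python
T_CRITICAL = 2.365   # two-sided 5% critical value, Student's t with 7 df

# (Coef, SE Coef) pairs from the model 3 predictor table
predictors = {
    "Constant": (-18.37, 17.97),
    "X1": (2.0102, 0.2471),
    "X2": (4.7378, 0.9484),
}

t_stats = {name: coef / se for name, (coef, se) in predictors.items()}
significant = {name: abs(t) > T_CRITICAL for name, t in t_stats.items()}

for name, t in t_stats.items():
    print(f"{name}: T = {t:.2f}, significant at 5%: {significant[name]}")
```

Only the constant fails the test, which matches its reported p-value of 0.341; an insignificant intercept is usually not a reason to drop a predictor.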

Neither the raw data for X1 and X2 nor the correlation between them is supplied, so it is difficult to answer the question about the correlation coefficient r.
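A rough indication of r can still be backed out of the reported coefficients, assuming all three fits are ordinary least squares with an intercept on the same ten observations. In that case the simple-regression slope on X1 equals the multiple-regression slope on X1 plus the X2 slope times the slope of X2 regressed on X1 (and symmetrically for X2), and the product of the two predictor-on-predictor slopes is r². This is a sketch under those assumptions, not a substitute for computing r from the raw data, and the inputs are rounded values from the output.

```python
import math

# Slopes reported in the question
b1_simple, b2_simple = 1.9436, 4.321     # Y vs. X1 alone, Y vs. X2 alone
b1_multi, b2_multi = 2.0102, 4.7378      # Y vs. X1, X2

# b1_simple = b1_multi + b2_multi * (slope of X2 regressed on X1)
slope_x2_on_x1 = (b1_simple - b1_multi) / b2_multi
# b2_simple = b2_multi + b1_multi * (slope of X1 regressed on X2)
slope_x1_on_x2 = (b2_simple - b2_multi) / b1_multi

# The two slopes share the sign of r, and their product is r squared
r_squared = slope_x2_on_x1 * slope_x1_on_x2
r = math.copysign(math.sqrt(max(r_squared, 0.0)), slope_x2_on_x1)

print(f"implied r between X1 and X2: {r:.3f}")
```

Under these assumptions the implied |r| is small, which would point to little correlation between X1 and X2 and therefore no multicollinearity concern; the raw data would be needed to confirm this.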