Incarceration rates vary from state to state. We must remember that correlation
Question
Incarceration rates vary from state to state. We must remember that correlation is not causation, but we can ask: “What state-level factors might be associated with the state incarceration rate?”
Incarceration Rate per 100,000 Adults (2014)
Median State Income (2010-2013)
Total Crime Rate from Uniform Crime Reports (Violent crime and Property crime)
Before you look at the data (on the next page), state an appropriate null and research hypothesis concerning the relationship between Incarceration Rate and each of the two independent variables.
Which of these predicted relationships do you expect to be strongest? Why?
For each of these relationships, calculate:
The prediction equation (least squares line)
The Coefficient of Determination
Pearson’s r
Would multiple regression be a useful statistical tool for further analysis of these data? Why or why not? You do not need to do any calculations.
State            Incarceration Rate    Median Income   Violent Crime   Prop Crime   Total Crime
                 per 100,000 Adults                    Rate            Rate         Rate
                 2014                  2010-2013       2012            2012         2012
Alabama          890                   43330           449.9           3502.2       3952.1
Arkansas         1020                  40877           469.1           3660.1       4129.2
Connecticut      590                   67807           283             2140         2423
Florida          960                   47106           487.1           3276.7       3763.8
Idaho            910                   49952           207.9           1983.5       2191.4
Iowa             530                   53364           263.9           2271.8       2535.7
Louisiana        1380                  40844           496.9           3540.6       4037.5
Massachusetts    380                   64555           405.5           2153         2558.5
Mississippi      1120                  40338           260.8           2811         3071.8
Nebraska         600                   55107           259.4           2754.9       3014.3
New Jersey       510                   65321           290.2           2047.3       2337.5
North Carolina   710                   44254           353.4           3369.5       3722.9
Oklahoma         1310                  47282           469.3           3401         3870.3
Rhode Island     400                   55158           252.4           2572.3       2824.7
Tennessee        920                   42785           643.6           3371.4       4015
Vermont          390                   56175           142.6           2398.7       2541.3
West Virginia    670                   43361           316.3           2364.9       2681.2
Mean             781.76                50448.00        355.96          2801.11      3157.07
SD               313.49                9056.11         130.62          601.63       703.41
Explanation / Answer
Solution:
Here we test, separately, whether a significant linear relationship exists between the dependent variable (incarceration rate) and each of the two independent variables: median income and total crime rate.
For the first test, the null and alternative hypotheses are:
Null hypothesis: H0: There is no significant linear relationship between incarceration rate and median income.
Alternative hypothesis: Ha: There is a significant linear relationship between incarceration rate and median income.
For the second test, the null and alternative hypotheses are:
Null hypothesis: H0: There is no significant linear relationship between incarceration rate and total crime rate.
Alternative hypothesis: Ha: There is a significant linear relationship between incarceration rate and total crime rate.
For the first test, the regression analysis is given below:
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.72716723
R Square             0.52877218
Adjusted R Square    0.497356992
Standard Error       222.2582281
Observations         17

ANOVA
              df    SS             MS             F              Significance F
Regression    1     831466.2592    831466.2592    16.83173693    0.000941189
Residual      15    740980.7996    49398.71997
Total         16    1572447.059

                 Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept        2051.650004     314.1869314      6.530029732     9.52105E-06    1381.976414     2721.323593
Median income    -0.025172163    0.006135586      -4.102649989    0.000941189    -0.038249856    -0.012094471
The least squares equation for predicting the incarceration rate from median income is:
Incarceration rate = 2051.65 - 0.0252*Median income
The correlation coefficient (Pearson's r) between incarceration rate and median income is -0.73, which indicates a strong negative linear relationship: states with higher median income tend to have lower incarceration rates.
For this regression model, the p-value is 0.0009, which is less than alpha = 0.05, so we reject the null hypothesis of no significant relationship between incarceration rate and median income.
We conclude that there is sufficient evidence of a significant linear relationship between incarceration rate and median income.
The coefficient of determination (R square) is 0.5288, which means about 52.88% of the variation in incarceration rate is explained by median income.
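The figures above can be reproduced directly from the data table. The following is a minimal sketch in plain Python, assuming the 17 rows of the table are typed in as shown; the function name `simple_ols` is a hypothetical helper for illustration, not part of the original solution, which appears to have used a spreadsheet regression tool.

```python
# Data copied from the table in the question (17 states, in table order).
income = [43330, 40877, 67807, 47106, 49952, 53364, 40844, 64555, 40338,
          55107, 65321, 44254, 47282, 55158, 42785, 56175, 43361]
incarceration = [890, 1020, 590, 960, 910, 530, 1380, 380, 1120,
                 600, 510, 710, 1310, 400, 920, 390, 670]


def simple_ols(x, y):
    """Return (intercept, slope, r) for the least squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products of deviations from the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # Pearson's r
    return intercept, slope, r


intercept, slope, r = simple_ols(income, incarceration)
r_squared = r ** 2  # coefficient of determination for a one-predictor model

print(f"Incarceration rate = {intercept:.2f} + ({slope:.6f}) * Median income")
print(f"Pearson's r = {r:.4f}, R square = {r_squared:.4f}")
```

Passing the total crime column in place of `income` gives the second regression in the same way.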
Next, we check whether there is a significant linear relationship between incarceration rate and total crime rate.
The regression output is given below:
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.697872497
R Square             0.487026022
Adjusted R Square    0.452827756
Standard Error       231.8942897
Observations         17

ANOVA
              df    SS             MS             F              Significance F
Regression    1     765822.6352    765822.6352    14.24124933    0.001839335
Residual      15    806624.4236    53774.96157
Total         16    1572447.059

                    Coefficients    Standard Error   t Stat         P-value        Lower 95%       Upper 95%
Intercept           -200.1681151    266.209335       -0.75191997    0.463735236    -767.5798785    367.2436483
Total Crime rate    0.311026565     0.08241826       3.773757984    0.001839335    0.135356204     0.486696926
The least squares equation for predicting the incarceration rate from total crime rate is:
Incarceration rate = -200.168 + 0.3110*Total crime rate
The p-value for this regression model is 0.0018, which is less than alpha = 0.05, so we reject the null hypothesis of no significant relationship between incarceration rate and total crime rate. We conclude that there is a statistically significant linear relationship between incarceration rate and total crime rate.
The correlation coefficient (Pearson's r) is 0.70, a fairly strong positive linear relationship, and the coefficient of determination is 0.4870, which means about 48.70% of the variation in incarceration rate is explained by total crime rate.
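In a one-predictor regression, the t statistic for the slope can be recovered from Pearson's r and the sample size alone, which gives a quick check on both outputs. The sketch below assumes the r values and n = 17 reported above; `t_from_r` is a hypothetical helper name, and the critical value 2.131 is the two-sided 5% point of the t distribution with 15 degrees of freedom, taken from a standard t table.

```python
import math


def t_from_r(r, n):
    """t statistic for H0: no linear relationship, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


n = 17                      # number of states in the table
r_income = -0.72716723      # incarceration rate vs. median income
r_crime = 0.697872497       # incarceration rate vs. total crime rate

t_income = t_from_r(r_income, n)
t_crime = t_from_r(r_crime, n)

T_CRIT = 2.131  # two-sided 5% critical value for 15 df (from a t table)

for label, t in [("median income", t_income), ("total crime rate", t_crime)]:
    decision = "reject H0" if abs(t) > T_CRIT else "fail to reject H0"
    # With one predictor, the ANOVA F statistic equals t squared.
    print(f"{label}: t = {t:.4f}, F = {t ** 2:.4f}, {decision}")
```

Both t values should agree with the "t Stat" entries for the slopes, and their squares with the F values in the ANOVA tables.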
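Multiple regression would be a useful statistical tool for further analysis of these data, because both independent variables are significantly related to the incarceration rate, and a model containing both can estimate the effect of each while holding the other constant. One caution: median income and total crime rate are themselves correlated (r = -0.73), so the two predictors overlap in what they explain, and with only 17 observations the individual coefficients may be estimated imprecisely.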
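Although the question asks for no calculations, the likely payoff of a two-predictor model can be gauged from the three pairwise correlations alone. The following sketch uses the standard identity for R square with two predictors; the function name `r2_two_predictors` is a hypothetical helper, and the result is an illustration derived from the reported correlations, not part of the original solution.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """R square of y regressed on x1 and x2, from the pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


r_y_income = -0.72716723         # incarceration rate vs. median income
r_y_crime = 0.697872497          # incarceration rate vs. total crime rate
r_income_crime = -0.726196767    # median income vs. total crime rate

r2_both = r2_two_predictors(r_y_income, r_y_crime, r_income_crime)

# Variance inflation factor: how much predictor overlap inflates
# the variance of each estimated coefficient.
vif = 1 / (1 - r_income_crime ** 2)

print(f"R square with both predictors: {r2_both:.4f}")
print(f"Best single-predictor R square: {max(r_y_income ** 2, r_y_crime ** 2):.4f}")
print(f"VIF for each predictor: {vif:.3f}")
```

Under these figures the combined R square comes out at about 0.59, only modestly above the 0.53 from median income alone, which is what the overlap between the two predictors would lead one to expect.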
The correlation table is given below for reference:

                      Incarceration Rate    Median income    Total Crime rate
Incarceration Rate    1
Median income         -0.72716723           1
Total Crime rate      0.697872497           -0.726196767     1