Incarceration rates vary from state to state. We must remember that correlation
Question
Incarceration rates vary from state to state. We must remember that correlation is not causation, but we can ask: “What state-level factors might be associated with the state incarceration rate?”
I drew a sample of 17 states using a systematic random sampling method (Chapter 7). Using data from the Bureau of Justice Statistics, the Kaiser Foundation and the FBI, I gathered published data on:
Incarceration Rate per 100,000 Adults (2014) - *Dependent Variable*
Median State Income (2010-2013) - *Independent Variable #1*
Total Crime Rate from Uniform Crime Reports (Violent crime and Property crime) - *Independent Variable #2*
a.) Before you look at the data (on the next page), state an appropriate null and research hypothesis concerning the relationship between Incarceration Rate and each of the two independent variables (Median Income and Total Crime Rate).
b.) Which of these predicted relationships do you expect to be strongest? Why?
For each of these relationships, calculate the following (show work):
c.) The prediction equation (least squares line)
d.) The Coefficient of Determination
e.) Pearson’s r
f.) Would multiple regression be a useful statistical tool for further analysis of these data? Why or why not? You do not need to do any calculations.
**I am looking for the answers to questions "c,d,e" for median income and total crime rate**
| State | Incarceration Rate per 100,000 Adults (2014) | Median Income (2010-2013) | Violent Crime Rate (2012) | Prop. Crime Rate (2012) | Total Crime Rate (2012) |
|---|---|---|---|---|---|
| Alabama | 890 | 43330 | 449.9 | 3502.2 | 3952.1 |
| Arkansas | 1020 | 40877 | 469.1 | 3660.1 | 4129.2 |
| Connecticut | 590 | 67807 | 283 | 2140 | 2423 |
| Florida | 960 | 47106 | 487.1 | 3276.7 | 3763.8 |
| Idaho | 910 | 49952 | 207.9 | 1983.5 | 2191.4 |
| Iowa | 530 | 53364 | 263.9 | 2271.8 | 2535.7 |
| Louisiana | 1380 | 40844 | 496.9 | 3540.6 | 4037.5 |
| Massachusetts | 380 | 64555 | 405.5 | 2153 | 2558.5 |
| Mississippi | 1120 | 40338 | 260.8 | 2811 | 3071.8 |
| Nebraska | 600 | 55107 | 259.4 | 2754.9 | 3014.3 |
| New Jersey | 510 | 65321 | 290.2 | 2047.3 | 2337.5 |
| North Carolina | 710 | 44254 | 353.4 | 3369.5 | 3722.9 |
| Oklahoma | 1310 | 47282 | 469.3 | 3401 | 3870.3 |
| Rhode Island | 400 | 55158 | 252.4 | 2572.3 | 2824.7 |
| Tennessee | 920 | 42785 | 643.6 | 3371.4 | 4015 |
| Vermont | 390 | 56175 | 142.6 | 2398.7 | 2541.3 |
| West Virginia | 670 | 43361 | 316.3 | 2364.9 | 2681.2 |
| Mean | 781.76 | 50448.00 | 355.96 | 2801.11 | 3157.07 |
| SD | 313.49 | 9056.11 | 130.62 | 601.63 | 703.41 |
Explanation / Answer
1. Null Hypothesis: Median State Income and Total Crime Rate are not related to Incarceration Rate.
Alternate Hypothesis: Median State Income and Total Crime Rate are related to Incarceration Rate.
2. Comparing the correlation coefficients of the two independent variables: Median State Income is negatively correlated with Incarceration Rate (r ≈ -0.73), and Total Crime Rate is positively correlated with it (r ≈ 0.70). Median State Income has the larger absolute correlation, so it has the stronger relationship, though only by a small margin.
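The correlations quoted in item 2, along with the per-variable least squares line and coefficient of determination asked for in parts c, d, and e, can be recomputed from the table. The following is a minimal sketch in plain Python; the helper name `simple_fit` and the `ROWS` layout are illustrative choices, not part of the original answer.

```python
import math

# (state, incarceration rate, median income, total crime rate) from the table above
ROWS = [
    ("Alabama", 890, 43330, 3952.1),
    ("Arkansas", 1020, 40877, 4129.2),
    ("Connecticut", 590, 67807, 2423.0),
    ("Florida", 960, 47106, 3763.8),
    ("Idaho", 910, 49952, 2191.4),
    ("Iowa", 530, 53364, 2535.7),
    ("Louisiana", 1380, 40844, 4037.5),
    ("Massachusetts", 380, 64555, 2558.5),
    ("Mississippi", 1120, 40338, 3071.8),
    ("Nebraska", 600, 55107, 3014.3),
    ("New Jersey", 510, 65321, 2337.5),
    ("North Carolina", 710, 44254, 3722.9),
    ("Oklahoma", 1310, 47282, 3870.3),
    ("Rhode Island", 400, 55158, 2824.7),
    ("Tennessee", 920, 42785, 4015.0),
    ("Vermont", 390, 56175, 2541.3),
    ("West Virginia", 670, 43361, 2681.2),
]


def simple_fit(x, y):
    """Return (intercept, slope, r) for the least squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson's r
    return intercept, slope, r


y = [row[1] for row in ROWS]
income = [row[2] for row in ROWS]
crime = [row[3] for row in ROWS]

fit_income = simple_fit(income, y)
fit_crime = simple_fit(crime, y)

for label, (a, b, r) in (("Median income", fit_income), ("Total crime", fit_crime)):
    # r * r is the coefficient of determination for a one-predictor line
    print(f"{label}: Y = {a:.2f} + ({b:.5f})X, r = {r:.3f}, r^2 = {r * r:.3f}")
```

Each printed line gives the prediction equation, Pearson's r, and r² for one predictor taken on its own.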
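3. The least squares line, fitted with both predictors at once, is Incarceration Rate = 1090.514 - 0.01614 × Median State Income + 0.16012 × Total Crime Rate. Note that this is a two-predictor (multiple regression) equation rather than a separate line for each independent variable.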
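The equation in item 3 can be checked by solving the normal equations for two predictors. This is a sketch under the assumption that ordinary least squares with an intercept was used; the variable names are illustrative.

```python
import math

# (state, incarceration rate, median income, total crime rate) from the table above
ROWS = [
    ("Alabama", 890, 43330, 3952.1),
    ("Arkansas", 1020, 40877, 4129.2),
    ("Connecticut", 590, 67807, 2423.0),
    ("Florida", 960, 47106, 3763.8),
    ("Idaho", 910, 49952, 2191.4),
    ("Iowa", 530, 53364, 2535.7),
    ("Louisiana", 1380, 40844, 4037.5),
    ("Massachusetts", 380, 64555, 2558.5),
    ("Mississippi", 1120, 40338, 3071.8),
    ("Nebraska", 600, 55107, 3014.3),
    ("New Jersey", 510, 65321, 2337.5),
    ("North Carolina", 710, 44254, 3722.9),
    ("Oklahoma", 1310, 47282, 3870.3),
    ("Rhode Island", 400, 55158, 2824.7),
    ("Tennessee", 920, 42785, 4015.0),
    ("Vermont", 390, 56175, 2541.3),
    ("West Virginia", 670, 43361, 2681.2),
]

n = len(ROWS)
y = [row[1] for row in ROWS]
x1 = [row[2] for row in ROWS]  # median income
x2 = [row[3] for row in ROWS]  # total crime rate
my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n

# Centered sums of squares and cross products
s11 = sum((a - m1) ** 2 for a in x1)
s22 = sum((b - m2) ** 2 for b in x2)
s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
syy = sum((v - my) ** 2 for v in y)

# Solve the 2x2 normal equations for the two slopes (Cramer's rule)
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
b0 = my - b1 * m1 - b2 * m2

# Explained sum of squares divided by total sum of squares
r_squared = (b1 * s1y + b2 * s2y) / syy
multiple_r = math.sqrt(r_squared)

print(f"Y = {b0:.3f} + ({b1:.5f})*income + ({b2:.5f})*crime")
print(f"R^2 = {r_squared:.3f}, multiple R = {multiple_r:.3f}")
```

Centering the variables first keeps the system at 2x2, so no matrix library is needed for a data set this small.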
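4. The Coefficient of Determination for this two-predictor model is R² = 58.9%.
5. The correlation coefficient is the square root of the Coefficient of Determination: √0.589 ≈ 0.77. Because the model has two predictors, this is the multiple correlation R; the Pearson r for each individual predictor is the value given in item 2.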
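As a worked version of the square-root step, the relationship between the coefficient of determination and the correlation can be written out as follows. The symbols $R$ (multiple correlation), $r$ (bivariate Pearson correlation), and $b$ (slope of a one-predictor line) are notation introduced here for illustration.

```latex
% Multiple correlation from the two-predictor model's R^2
R = \sqrt{R^2} = \sqrt{0.589} \approx 0.767

% For a one-predictor line, Pearson's r takes the sign of the slope b
r = \operatorname{sign}(b)\,\sqrt{r^2}
```

The square root alone always gives a non-negative value, so for a single predictor with a negative slope, such as median income, the sign has to be restored from the slope.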
6. Because R² is fairly low (58.9%), the linear regression as fitted leaves a large share of the variation in incarceration rates unexplained. A better next step is to identify influential observations using Cook's distance, remove them from the model, and then rebuild the model to see whether R² improves.
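The Cook's distance check in item 6 can be sketched in plain Python for the two-predictor model. The 4/n cutoff used below is only a common rule of thumb and an assumption here, not something the original answer specifies; the variable names are likewise illustrative.

```python
# (state, incarceration rate, median income, total crime rate) from the table above
ROWS = [
    ("Alabama", 890, 43330, 3952.1),
    ("Arkansas", 1020, 40877, 4129.2),
    ("Connecticut", 590, 67807, 2423.0),
    ("Florida", 960, 47106, 3763.8),
    ("Idaho", 910, 49952, 2191.4),
    ("Iowa", 530, 53364, 2535.7),
    ("Louisiana", 1380, 40844, 4037.5),
    ("Massachusetts", 380, 64555, 2558.5),
    ("Mississippi", 1120, 40338, 3071.8),
    ("Nebraska", 600, 55107, 3014.3),
    ("New Jersey", 510, 65321, 2337.5),
    ("North Carolina", 710, 44254, 3722.9),
    ("Oklahoma", 1310, 47282, 3870.3),
    ("Rhode Island", 400, 55158, 2824.7),
    ("Tennessee", 920, 42785, 4015.0),
    ("Vermont", 390, 56175, 2541.3),
    ("West Virginia", 670, 43361, 2681.2),
]

n, p = len(ROWS), 3  # p counts the intercept plus two slopes
y = [row[1] for row in ROWS]
x1 = [row[2] for row in ROWS]
x2 = [row[3] for row in ROWS]
my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n

# Centered sums of squares and cross products
s11 = sum((a - m1) ** 2 for a in x1)
s22 = sum((b - m2) ** 2 for b in x2)
s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))

# Two-predictor least squares fit via the 2x2 normal equations
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
b0 = my - b1 * m1 - b2 * m2

residuals, leverages = [], []
for _, yi, inc, crime in ROWS:
    d1, d2 = inc - m1, crime - m2
    # Leverage: 1/n plus the quadratic form d' S^{-1} d on centered predictors
    h = 1 / n + (d1 * d1 * s22 - 2 * d1 * d2 * s12 + d2 * d2 * s11) / det
    leverages.append(h)
    residuals.append(yi - (b0 + b1 * inc + b2 * crime))

mse = sum(e * e for e in residuals) / (n - p)

# Cook's distance: D_i = e_i^2 * h_i / (p * MSE * (1 - h_i)^2)
cooks = [e * e * h / (p * mse * (1 - h) ** 2) for e, h in zip(residuals, leverages)]

threshold = 4 / n  # rule-of-thumb cutoff (assumption)
for (state, *_), d in zip(ROWS, cooks):
    flag = "  <- above 4/n" if d > threshold else ""
    print(f"{state:15s} D = {d:.3f}{flag}")
```

Any state flagged here is a candidate for closer inspection rather than automatic removal; with only 17 observations, dropping points changes the fit substantially.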