


Question

Imagine that you have a data set containing variables describing traffic stops and the drivers involved. The unit of analysis is the traffic stop, and the variables include the traffic stop disposition (warning, citation, or arrest: a nominal variable) and the race of the driver (a binary variable: white, non-white). You are an analyst at the SAPD, and you have been asked to examine whether non-whites are more likely to be arrested at a traffic stop than whites. One way to do this is to create a cross-tab between the disposition and race variables. Your task in this question is to create a ‘fake’ cross-tab containing data likely to support such a hypothesis. You do not have actual data to analyze; you are simply asked to create a cross-table, properly structured and filled with percentages of your choosing, suggesting that the suspicion that non-whites are more likely to be arrested might be true. Feel free to put any percentages in the table, as long as the distributions are congruent with this hypothesis. Also, include marginal percentages.

Explanation / Answer

The marginal percentages (as shares of all stops) are:

Whites, all dispositions (Warning, Citation and Arrest) = 30%

Non-whites, all dispositions (Warning, Citation and Arrest) = 70%

Total Warning cases = 50%

Total Citation cases = 30%

Total Arrest cases = 20%

We use a chi-square test of independence, treating the percentages as counts out of n = 100 stops. The null hypothesis is that disposition and race are independent.

DF = (r - 1) * (c - 1) = (2 - 1) * (3 - 1) = 2

Expected frequencies under independence are calculated from the row total nr, the column total nc, and the grand total n:

Er,c = (nr * nc) / n
E1,1 = (30 * 50) / 100 = 15
E1,2 = (30 * 30) / 100 = 9
E1,3 = (30 * 20) / 100 = 6
E2,1 = (70 * 50) / 100 = 35
E2,2 = (70 * 30) / 100 = 21
E2,3 = (70 * 20) / 100 = 14
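The expected frequencies above can be reproduced with a short sketch. It assumes, as the answer does, that the cell percentages are treated as counts out of n = 100 stops; the names `observed` and `expected_counts` are illustrative and not part of the original answer.

```python
# Observed cell percentages from the answer's table, treated as counts
# out of n = 100 stops (an assumption: the source gives only percentages).
observed = {
    "Whites":     {"Warning": 20, "Citation": 5,  "Arrest": 5},
    "Non-whites": {"Warning": 30, "Citation": 25, "Arrest": 15},
}


def expected_counts(table):
    """Expected counts under independence: E[r][c] = n_r * n_c / n."""
    row_totals = {r: sum(cells.values()) for r, cells in table.items()}
    columns = list(next(iter(table.values())))
    col_totals = {c: sum(table[r][c] for r in table) for c in columns}
    n = sum(row_totals.values())
    return {
        r: {c: row_totals[r] * col_totals[c] / n for c in columns}
        for r in table
    }


expected = expected_counts(observed)
for race, cells in expected.items():
    print(race, cells)
```

Each expected value is the product of its row and column margins divided by the grand total, so the six values match E1,1 through E2,3 listed above.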

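χ² = Σ [ (Or,c - Er,c)² / Er,c ]
χ² = (20 - 15)²/15 + (5 - 9)²/9 + (5 - 6)²/6 + (30 - 35)²/35 + (25 - 21)²/21 + (15 - 14)²/14

= 5.159

The chi-square critical value at the 0.10 significance level (90% confidence) with 2 degrees of freedom is 4.605.

As χ² = 5.159 > 4.605, we reject the null hypothesis of independence at the 0.10 level: disposition and race are associated. The test itself does not indicate direction; that comes from the within-group arrest rates, which are 5/30 ≈ 16.7% for whites and 15/70 ≈ 21.4% for non-whites, consistent with the hypothesis that non-whites are more likely to be arrested.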
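As a check on the arithmetic, the statistic and the decision can be sketched in plain Python. This again assumes the percentages are counts out of n = 100 (the statistic scales with n, so a different sample size would change the result). The closed-form critical value and p-value used here hold only for 2 degrees of freedom, where the chi-square distribution is an exponential distribution with mean 2; the function name is illustrative.

```python
import math

# Rows: Whites, Non-whites. Columns: Warning, Citation, Arrest.
# Percentages treated as counts out of n = 100 (an assumption).
observed = [[20, 5, 5], [30, 25, 15]]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


stat, df = chi_square_independence(observed)

# Valid for df = 2 only: the survival function is exp(-x / 2),
# so the critical value at level alpha is -2 * ln(alpha).
alpha = 0.10
critical = -2 * math.log(alpha)
p_value = math.exp(-stat / 2)

print(f"chi-square = {stat:.3f}, df = {df}")
print(f"critical value at alpha = {alpha}: {critical:.3f}")
print(f"p-value = {p_value:.4f}, reject H0: {stat > critical}")
```

For tables with other degrees of freedom, a statistics library would be needed for the critical value and p-value; the closed form is used here only to keep the sketch dependency-free.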

Cross-tab of race by disposition (percent of all stops):

             Warning   Citation   Arrest   Total
Whites         20%        5%        5%      30%
Non-whites     30%       25%       15%      70%
Total          50%       30%       20%     100%
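The margins of the table, and the within-race arrest rates that actually speak to the hypothesis, can be derived from the cell percentages. The following is a minimal sketch; the variable names are illustrative and the cell values are the made-up percentages from the table above.

```python
# Cell percentages (share of all stops) from the cross-tab.
cells = {
    "Whites":     {"Warning": 20, "Citation": 5,  "Arrest": 5},
    "Non-whites": {"Warning": 30, "Citation": 25, "Arrest": 15},
}
dispositions = ["Warning", "Citation", "Arrest"]

# Row margins: share of all stops involving each group.
row_margins = {race: sum(row.values()) for race, row in cells.items()}

# Column margins: share of all stops ending in each disposition.
col_margins = {d: sum(cells[race][d] for race in cells) for d in dispositions}

# Within-race arrest rate: arrests as a percentage of that group's stops.
arrest_rate = {
    race: 100 * cells[race]["Arrest"] / row_margins[race] for race in cells
}

print("Row margins:", row_margins)
print("Column margins:", col_margins)
for race, rate in arrest_rate.items():
    print(f"Arrest rate among {race}: {rate:.1f}%")
```

Comparing arrest rates within each group, rather than the raw cell percentages, is what controls for the groups making up different shares of all stops.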