Question
In the week 2 lab, we found the mean and the standard deviation for the HEIGHT variable for both males and females. Use those values, or follow these directions to calculate the numbers again.
The numbers from the week 2 assignment are:
Females: Mean 67.06, Standard deviation 3.11
Males: Mean 69.67, Standard deviation 3.31
The lab data is at the bottom of the assignment.
(From week 2 lab: Calculate descriptive statistics for the variable Height by Gender. Click on Insert and then Pivot Table. Click in the top box and select all the data (including labels) from Height through Gender. Also click on “new worksheet” and then OK. On the right of the new sheet, click on Height and Gender, making sure that Gender is in the Rows box and Height is in the Values box. Click on the down arrow next to Height in the Values box and select Value Field Settings. In the pop up box, click Average then OK. Write these down. Then click on the down arrow next to Height in the Values box again and select Value Field Settings. In the pop up box, click on StdDev then OK. Write these values down.)
You will also need the number of males and the number of females in the dataset. You can either use the same pivot table created above by selecting Count in the Value Field Settings, or you can actually count in the dataset.
Then in Excel (somewhere on the data file or in a blank worksheet), calculate the maximum error for the females and the maximum error for the males. To find the maximum error for the females, type =CONFIDENCE.T(0.05,stdev,#), using the females’ height standard deviation for “stdev” in the formula and the number of females in your sample for the “#”. Then you can use a calculator to add and subtract this maximum error from the average female height for the 95% confidence interval. Do this again with 0.01 as the alpha in the beginning of the formula to find the 99% confidence interval.
Find these same two intervals for the male data by using the same formula, but using the males’ standard deviation for “stdev” and the number of males in your sample for the “#”.
Give and interpret the 95% confidence intervals for males and females on the HEIGHT variable. Which is wider and why? (9 points)
Give and interpret the 99% confidence intervals for males and females on the HEIGHT variable. Which is wider and why? (9 points)
Find the mean and standard deviation of the DRIVE variable by using =AVERAGE(A2:A36) and =STDEV(A2:A36). Assuming that this variable is normally distributed, what percentage of data would you predict would be less than 40 miles? This would be based on the calculated probability. Use the formula =NORM.DIST(40, mean, stdev,TRUE). Now determine the percentage of data points in the dataset that fall within this range. To find the actual percentage in the dataset, sort the DRIVE variable and count how many of the data points are less than 40 out of the total 35 data points. That is the actual percentage. How does this compare with your prediction? (12 points)
Mean ______________ Standard deviation ____________________
Predicted percentage ______________________________
Actual percentage _____________________________
Comparison ___________________________________________________
______________________________________________________________
What percentage of data would you predict would be between 40 and 70, and what percentage would you predict would be more than 70 miles? Subtract the probabilities found through =NORM.DIST(70, mean, stdev, TRUE) and =NORM.DIST(40, mean, stdev, TRUE) for the “between” probability. To get the probability of over 70, use the same =NORM.DIST(70, mean, stdev, TRUE) and then subtract the result from 1 to get “more than”. Now determine the percentage of data points in the dataset that fall within each range, using the same strategy as above for counting data points in the dataset. How does each of these compare with your prediction, and why is there a difference? (12 points)
Predicted percentage between 40 and 70 ______________________________
Actual percentage _____________________________________________
Predicted percentage more than 70 miles ________________________________
Actual percentage ___________________________________________
Comparison ____________________________________________________
_______________________________________________________________
Why? __________________________________________________________
________________________________________________________________
Lab Data :
Explanation / Answer
Females:
Mean = 67.06
Standard deviation = 3.11
N = 17
Males:
Mean = 69.67
Standard deviation = 3.31
N = 18
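95% confidence interval for males on the HEIGHT variable: (68.02, 71.32), width ≈ 3.29
95% confidence interval for females on the HEIGHT variable: (65.46, 68.66), width ≈ 3.20
These use the maximum errors from =CONFIDENCE.T(0.05, 3.31, 18) ≈ 1.65 for males and =CONFIDENCE.T(0.05, 3.11, 17) ≈ 1.60 for females.
We are 95% confident that the true mean height of males lies between 68.02 and 71.32 and that the true mean height of females lies between 65.46 and 68.66. That is, about 95% of intervals built this way from repeated samples would contain the true mean.
The interval for males is wider because the males' standard deviation (3.31) is greater than the females' (3.11), and the slightly larger male sample (18 versus 17) is not enough to offset it.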
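99% confidence interval for males on the HEIGHT variable: (67.41, 71.93), width ≈ 4.52
99% confidence interval for females on the HEIGHT variable: (64.86, 69.26), width ≈ 4.41
These use the maximum errors from =CONFIDENCE.T(0.01, 3.31, 18) ≈ 2.26 for males and =CONFIDENCE.T(0.01, 3.11, 17) ≈ 2.20 for females.
We are 99% confident that the true mean height of males lies between 67.41 and 71.93 and that the true mean height of females lies between 64.86 and 69.26.
The interval for males is again wider because the males' standard deviation is greater than the females'. Both 99% intervals are wider than the corresponding 95% intervals because a higher confidence level requires a larger maximum error.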
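The maximum-error calculation can also be checked outside Excel. The following is a minimal Python sketch, not the lab's required method. It assumes the summary statistics quoted above and hard-codes two-tailed t critical values from a standard t-table, because the Python standard library has no t-distribution. The function names `t_margin`, `z_margin` and `interval` are illustrative. A normal-based margin (in the style of =CONFIDENCE.NORM) is printed alongside for comparison, since it gives somewhat narrower intervals for samples this small.

```python
import math
from statistics import NormalDist

# Summary statistics quoted in the answer (HEIGHT by gender).
groups = {
    "Females": {"mean": 67.06, "sd": 3.11, "n": 17},
    "Males": {"mean": 69.67, "sd": 3.31, "n": 18},
}

# Two-tailed Student's t critical values from a standard t-table,
# keyed by (alpha, degrees of freedom). Hard-coded as an assumption
# of this sketch; only the cases needed here are listed.
T_CRIT = {
    (0.05, 16): 2.120, (0.05, 17): 2.110,
    (0.01, 16): 2.921, (0.01, 17): 2.898,
}

def t_margin(alpha, sd, n):
    """Maximum error in the style of =CONFIDENCE.T(alpha, sd, n)."""
    return T_CRIT[(alpha, n - 1)] * sd / math.sqrt(n)

def z_margin(alpha, sd, n):
    """Normal-based maximum error, in the style of =CONFIDENCE.NORM."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sd / math.sqrt(n)

def interval(mean, margin):
    """Confidence interval as (lower, upper), rounded to 2 decimals."""
    return (round(mean - margin, 2), round(mean + margin, 2))

for alpha in (0.05, 0.01):
    for name, g in groups.items():
        t_ci = interval(g["mean"], t_margin(alpha, g["sd"], g["n"]))
        z_ci = interval(g["mean"], z_margin(alpha, g["sd"], g["n"]))
        print(f"{name} {100 * (1 - alpha):.0f}%: t-based {t_ci}, normal-based {z_ci}")
```

For both groups the t-based margin is larger than the normal-based one, and the male margin is larger than the female margin at each confidence level.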
DRIVE variable
Mean = 51.66
Standard deviation = 25.8
Predicted percentage (less than 40 miles) = 0.33 (33%)
Actual percentage (less than 40 miles) = 14/35 = 0.40 (40%)
Comparison = The predicted percentage is lower than the actual percentage.
Predicted percentage between 40 and 70 = 0.43 (43%)
Actual percentage = 8/35 = 0.23 (23%)
Comparison = The predicted percentage is higher than the actual percentage.
Predicted percentage more than 70 miles = 0.24 (24%)
Actual percentage = 13/35 = 0.37 (37%)
Comparison = The predicted percentage is lower than the actual percentage.
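The three =NORM.DIST predictions can be reproduced with Python's standard-library `statistics.NormalDist`. This is a sketch that assumes the rounded mean (51.66) and standard deviation (25.8) quoted above, so the last digit may differ slightly from values computed on the unrounded statistics. The observed counts are the ones reported in this answer.

```python
from statistics import NormalDist

# Normal model for DRIVE, using the rounded mean and standard deviation.
drive = NormalDist(mu=51.66, sigma=25.8)

# Equivalent of =NORM.DIST(x, mean, stdev, TRUE) is drive.cdf(x).
p_below_40 = drive.cdf(40)
p_40_to_70 = drive.cdf(70) - drive.cdf(40)
p_above_70 = 1 - drive.cdf(70)

# Observed counts reported in the answer, out of 35 data points.
n = 35
predicted = {"below 40": p_below_40, "40 to 70": p_40_to_70, "above 70": p_above_70}
observed = {"below 40": 14 / n, "40 to 70": 8 / n, "above 70": 13 / n}

for band in predicted:
    print(f"{band}: predicted {predicted[band]:.2f}, actual {observed[band]:.2f}")
```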
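Reason for difference: The sample is small (only 35 data points), so the sample percentages can differ noticeably from the theoretical ones by chance. The predictions also assume that DRIVE is exactly normally distributed, which the actual data may not be.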
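One rough way to gauge how much chance variation a sample of 35 allows is the standard error of a sample proportion, sqrt(p(1 - p)/n). The sketch below is an illustration and not part of the lab instructions. It uses the predicted proportions and observed counts quoted in this answer, and the helper name `standard_error` is hypothetical.

```python
import math

n = 35  # number of DRIVE data points

# band -> (predicted proportion, observed count), as quoted in the answer
bands = {
    "below 40": (0.33, 14),
    "40 to 70": (0.43, 8),
    "above 70": (0.24, 13),
}

def standard_error(p, n):
    """Standard error of a sample proportion when the true proportion is p."""
    return math.sqrt(p * (1 - p) / n)

for name, (p, count) in bands.items():
    se = standard_error(p, n)
    gap = count / n - p  # actual minus predicted
    print(f"{name}: gap {gap:+.2f}, standard error {se:.2f}, gap/SE {gap / se:+.1f}")
```

A gap of about one standard error is unremarkable for a sample this small. A gap of more than two standard errors is harder to attribute to chance alone and may suggest that the normal model fits the data only roughly.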