Question
Consider the data in problem 2.37 on page 63. This data consists of the caffeine content (in milligrams per ounce) for a random sample of 26 energy drinks. It is included in the data file, CAFFEINE, and is also listed below.
3.2 1.5 4.6 8.9 7.1 9.0 9.4 31.2 10.0 10.1 9.9 11.5
11.8 11.7 13.8 14.0 16.1 74.5 10.8 26.3 17.7 113.3 32.5
14.0 91.6 127.4
a) Construct a 95% confidence interval for the mean caffeine content for the population of energy drinks.
b) Statistical theory (the Central Limit Theorem) indicates that the interval created in part a) may not be reliable if the population shape is highly skewed. Using the data in the sample of 26 drinks, determine if you think this shape is highly skewed. (HINT: Methods for doing so were discussed in chapters 2 and 3. Ultimately, this is a subjective decision and so you just need to state your opinion and attempt to justify it with appropriate statistical evidence).
Explanation / Answer
(a) 95% confidence interval: $\bar{x} \pm t_{0.025,\,25} \cdot s/\sqrt{n}$
where the sample mean is $\bar{x} = (\text{sum of observations})/n = 691.9/26 = 26.6115$
and the sample standard deviation is $s = 34.4267$.
95% CI: $26.6115 \pm 2.060 \cdot (34.4267/\sqrt{26})$
95% CI: (12.7031, 40.5199)
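The arithmetic in part (a) can be checked with a short script. This is a minimal sketch using only the Python standard library; the critical value 2.060 is hard-coded from the answer above (a t-table lookup for 25 degrees of freedom) rather than computed, and the function name `t_confidence_interval` is just an illustrative choice.

```python
import math
import statistics

# Caffeine content (mg/oz) for the 26 sampled energy drinks, as listed in the question.
caffeine = [
    3.2, 1.5, 4.6, 8.9, 7.1, 9.0, 9.4, 31.2, 10.0, 10.1, 9.9, 11.5,
    11.8, 11.7, 13.8, 14.0, 16.1, 74.5, 10.8, 26.3, 17.7, 113.3, 32.5,
    14.0, 91.6, 127.4,
]

# Assumption: t_{0.025, 25} read from a t-table, as in the answer (not computed here).
T_CRIT = 2.060


def t_confidence_interval(data, t_crit):
    """Return (lower, upper) for mean +/- t_crit * s / sqrt(n)."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divisor n - 1)
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


lower, upper = t_confidence_interval(caffeine, T_CRIT)
print(f"n = {len(caffeine)}, mean = {statistics.mean(caffeine):.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

The printed interval should agree with the hand calculation to about four decimal places; small differences would come only from rounding the standard deviation and the critical value.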
(b) Skewness of the sample $= M_3/s^3$
where $M_3$ is the third moment about the sample mean $\bar{x}$ and $s$ is the sample standard deviation.
$M_3 = \frac{1}{n}\sum (x - \bar{x})^3 = 75435.33$
Skewness $= 75435.33/34.4267^3 \approx 1.849$
Hence the data is highly positively skewed, with a long tail to the right.
Skewness measures the asymmetry of a distribution. A symmetric dataset has a skewness of 0, so a normal distribution has a skewness of 0.
A common rule of thumb treats a skewness greater than 1 in absolute value as highly skewed, so this sample is clearly asymmetric.
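The skewness figure in part (b) can be reproduced in the same way. The sketch below follows the answer's definition (third central moment with divisor n, divided by the cube of the sample standard deviation with divisor n - 1); other textbooks and libraries use slightly different conventions, so their values may differ a little. The mean-versus-median comparison at the end is an extra check not made in the answer above, and the function name `skewness` is illustrative.

```python
import statistics

# Caffeine content (mg/oz) for the 26 sampled energy drinks, as listed in the question.
caffeine = [
    3.2, 1.5, 4.6, 8.9, 7.1, 9.0, 9.4, 31.2, 10.0, 10.1, 9.9, 11.5,
    11.8, 11.7, 13.8, 14.0, 16.1, 74.5, 10.8, 26.3, 17.7, 113.3, 32.5,
    14.0, 91.6, 127.4,
]


def skewness(data):
    """Skewness as M3 / s^3, using the convention from the answer above."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divisor n - 1)
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment (divisor n)
    return m3 / s ** 3


skew = skewness(caffeine)
mean = statistics.mean(caffeine)
median = statistics.median(caffeine)

print(f"skewness = {skew:.4f}")
# Extra check (an addition, not in the answer): in a right-skewed sample
# the mean is pulled above the median by the large values in the tail.
print(f"mean = {mean:.4f}, median = {median:.2f}")
```

A mean far above the median points the same way as the positive skewness coefficient: a handful of very high-caffeine drinks stretch the right tail.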