Question
Performing hypothesis tests using random samples is fundamental to statistical inference. The first part of this problem involves comparing two different diets. Using “ChickWeight” data available in the base R, “datasets” package, execute the following code to prepare a data frame for analysis:
# load "ChickWeight" dataset
data(ChickWeight)
# Create T | F vector indicating observations with Time == 21 and Diet == "1" OR "3"
index <- ChickWeight$Time == 21 & (ChickWeight$Diet == "1" | ChickWeight$Diet == "3")
# Create data frame, "result," with the weight and Diet of those observations with "TRUE" "index" values
result <- subset(ChickWeight[index, ], select = c(weight, Diet))
# Encode "Diet" as a factor
result$Diet <- factor(result$Diet)
The data frame, "result," will have chick weights for two diets, identified as diet "1" and "3." Use the data frame, "result," to complete the following item.
(a) (4 points) Use the "weight" data for the two diets to test the null hypothesis of equal population weights for the two diets. Test at the 95% confidence level with a two-sided t-test. This can be done using t.test() in R. Assume equal variances. Display the results.
# Conduct t-test of "weight", according to "Diet"
Working with paired data is another common statistical activity. The "ChickWeight" data will be used to illustrate how the weight gain from week 20 to 21 may be analyzed. Use the following code to prepare pre- and post-data from Diet "3" for analysis.
# load "ChickWeight" dataset
data(ChickWeight)
# Create T | F vector indicating observations with Diet == "3"
index <- ChickWeight$Diet == "3"
# Create vector of "weight" for observations where Diet == "3" and Time == 20
pre <- subset(ChickWeight[index, ], Time == 20, select = weight)$weight
# Create vector of "weight" for observations where Diet == "3" and Time == 21
post <- subset(ChickWeight[index, ], Time == 21, select = weight)$weight
(b) (6 points) Conduct a paired t-test and construct a two-sided, 95% confidence interval for the average weight gain from week 20 to week 21. Do not use t.test(). Write the code for determination of the confidence interval endpoints. Present the resulting interval.
# Determine two-sided, 95% confidence interval for the average weight gain, week 20 to 21
Explanation / Answer
(a)
data("ChickWeight")
index <- ChickWeight$Time == 21 & (ChickWeight$Diet == "1" | ChickWeight$Diet == "3")
result <- subset(ChickWeight[index,],select = c(weight, Diet))
result$Diet <- factor(result$Diet)
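# t.test() defaults to the Welch test (var.equal = FALSE); the problem asks for equal variances
with(result, t.test(weight ~ Diet, var.equal = TRUE))
With var.equal = TRUE the pooled test has df = 24 (16 + 10 - 2). The t statistic is approximately -3.60, the p-value is below 0.01, and the 95% confidence interval for the difference in means (Diet 1 minus Diet 3) is approximately (-145.7, -39.4). The null hypothesis of equal mean weights is rejected at the 5% level.
For comparison, the default call with(result, t.test(weight ~ Diet)), which does not assume equal variances, gives the results below -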
Welch Two Sample t-test
data: weight by Diet
t = -3.4293, df = 16.408, p-value = 0.003337
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-149.64644 -35.45356
sample estimates:
mean in group 1 mean in group 3
177.75 270.30
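As a cross-check on part (a), the equal-variance (pooled) statistic can be computed by hand and compared with t.test(..., var.equal = TRUE). The block below is a minimal sketch, not part of the original answer; the names w1, w3, sp2, se.diff, t.manual, p.manual, ci.manual and fit are introduced here purely for illustration.

```r
data("ChickWeight")
index <- ChickWeight$Time == 21 & (ChickWeight$Diet == "1" | ChickWeight$Diet == "3")
result <- subset(ChickWeight[index, ], select = c(weight, Diet))
result$Diet <- factor(result$Diet)

# Split the weights by diet (w1, w3 are illustrative names)
w1 <- result$weight[result$Diet == "1"]
w3 <- result$weight[result$Diet == "3"]
n1 <- length(w1)
n3 <- length(w3)

# Pooled variance and standard error of the difference in means
sp2 <- ((n1 - 1) * var(w1) + (n3 - 1) * var(w3)) / (n1 + n3 - 2)
se.diff <- sqrt(sp2 * (1 / n1 + 1 / n3))

# t statistic, degrees of freedom, two-sided p-value, and 95% interval
df.pooled <- n1 + n3 - 2
t.manual <- (mean(w1) - mean(w3)) / se.diff
p.manual <- 2 * pt(-abs(t.manual), df.pooled)
ci.manual <- (mean(w1) - mean(w3)) + c(-1, 1) * qt(0.975, df.pooled) * se.diff

# Built-in equal-variance test, used only to confirm the manual values
fit <- t.test(weight ~ Diet, data = result, var.equal = TRUE)
print(c(t = t.manual, df = df.pooled, p = p.manual))
print(ci.manual)
```

If the manual values and the components of fit (statistic, p.value, conf.int) agree, the pooled formula has been applied correctly.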
(b)
data("ChickWeight")
index <- ChickWeight$Diet == "3"
pre <- subset(ChickWeight[index,],Time == 20, select = weight)$weight
post <- subset(ChickWeight[index,],Time == 21, select = weight)$weight
# Taking mean
pre.mean <- mean(pre)
post.mean <- mean(post)
# Difference in means
mean.weight.gain <- post.mean - pre.mean
# Degrees of freedom
n <- length(pre)
df <- n-1
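# Calculating standard error of the mean difference
weight.gain <- post - pre
# Sample standard deviation of the differences (deviations taken about the mean gain)
std.dev <- sqrt(sum((weight.gain - mean.weight.gain)^2)/(n-1))
std.err <- std.dev/sqrt(n)
# Critical t values for a two-sided 95 % confidence interval
lower.t <- qt(0.025,df)
upper.t <- qt(0.975,df)
# Confidence interval
lower.conf <- mean.weight.gain + (std.err * lower.t)
upper.conf <- mean.weight.gain + (std.err * upper.t)
The resulting interval, rounded to two decimals, is (3.40, 19.40), centered on the mean gain of 11.4. Zero lies outside the interval, so the average weight gain from Time 20 to Time 21 is significant at the 5% level.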
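Part (b) also asks for the paired t-test itself. A minimal sketch of the test statistic and two-sided p-value, computed from the per-chick gains, might look like the following. The names gain, se.gain, t.paired, p.paired, ci.paired and check are illustrative and not from the original answer, and the final t.test() call is included only as a cross-check, since the exercise forbids it for the interval.

```r
data("ChickWeight")
index <- ChickWeight$Diet == "3"
pre <- subset(ChickWeight[index, ], Time == 20, select = weight)$weight
post <- subset(ChickWeight[index, ], Time == 21, select = weight)$weight

# Per-chick gains; assumes pre and post list the same chicks in the same order
gain <- post - pre
n <- length(gain)

# Paired t statistic: mean gain divided by its standard error
se.gain <- sd(gain) / sqrt(n)
t.paired <- mean(gain) / se.gain
p.paired <- 2 * pt(-abs(t.paired), df = n - 1)

# Two-sided 95% interval built from the same quantities
ci.paired <- mean(gain) + c(-1, 1) * qt(0.975, df = n - 1) * se.gain

# Cross-check only: built-in paired test
check <- t.test(post, pre, paired = TRUE)
print(c(t = t.paired, p = p.paired))
print(ci.paired)
```

Using sd() on the differences centers them on their mean before squaring, which is the step the paired standard error depends on.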