ID: 3203498 • Letter: H

Question

Hi, I need help with Stats (using R)

```{r functions-to-create}
## This function generates Y, X1, and X2
## Inputs: n, the number of data points
## cor, the correlation between X1 and X2
## Outputs: a data frame
generate.data <- function(n, cor){
  epsilon = rnorm(n)
  X = matrix(rnorm(2*n), nrow=n) %*%
    matrix(c(1, sqrt(cor), sqrt(cor), 1), 2, 2)

  beta = c(3, 2, 1)
  Y = cbind(1, X) %*% beta + epsilon
  df = data.frame(Y, X)
  return(df)
}

## This function finds confidence interval widths
## for the Linear model
## Inputs: n, the number of data points
## cor, the correlation between X1 and X2
## Outputs: avg, the average width of the confidence
## intervals for the regression
intervals <- function(n, cor){
  df = generate.data(n, cor)
  mdl = lm(Y~X1+X2, data=df)
  itvals = ## Get the confidence intervals for the bhats (95%)
  widths = ## Find the width of each interval
  avg = ## Get the average width of the intervals, excluding the intercept

  return(avg)
}

n = 250
cors = seq(.1, 1-1e-5, length.out = 25)
```

Can you please show how to write the lines indicated in bold and marked by ## in the code above?

Also,

Question: What is the marked line in `generate.data` doing? Why did I multiply by some matrix?

Thank you very much!

Explanation / Answer

Given that

intervals <- function(n, cor){
df = generate.data(n, cor)
mdl = lm(Y~X1+X2, data=df)
itvals = ## Get the confidences intervals for the bhats (95%)
widths = ## Find the width of each interval
avg = ## get the average width of the intervals excluding the intercept

return(avg)
}

now

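itvals = confint(mdl, level=0.95) # 95% intervals for all coefficients; confint returns a matrix with one row per coefficient and two columns (lower bound, upper bound)

widths = itvals[, 2] - itvals[, 1] # width of each interval: upper bound minus lower bound

avg = mean(widths[-1]) # average width after dropping the first entry, which is the intercept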
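
For reference, here is a minimal sketch of the completed function run over the `n` and `cors` values from the question. `generate.data` is copied from the question so the block runs on its own. The `set.seed` call and the name `avg.widths` are illustrative assumptions, not part of the assignment.

```r
## generate.data copied from the question so this sketch is self-contained
generate.data <- function(n, cor){
  epsilon = rnorm(n)
  X = matrix(rnorm(2*n), nrow=n) %*%
    matrix(c(1, sqrt(cor), sqrt(cor), 1), 2, 2)
  beta = c(3, 2, 1)
  Y = cbind(1, X) %*% beta + epsilon
  df = data.frame(Y, X)
  return(df)
}

intervals <- function(n, cor){
  df = generate.data(n, cor)
  mdl = lm(Y~X1+X2, data=df)
  itvals = confint(mdl, level=0.95)   # matrix: rows (Intercept), X1, X2
  widths = itvals[, 2] - itvals[, 1]  # upper bound minus lower bound
  avg = mean(widths[-1])              # drop the intercept, then average
  return(avg)
}

set.seed(1)  # assumption: fixed seed only so the run is reproducible
n = 250
cors = seq(.1, 1-1e-5, length.out = 25)

## hypothetical name: one average width per value in cors
avg.widths = sapply(cors, function(cr) intervals(n, cr))
```

Plotting `avg.widths` against `cors` should show the intervals for X1 and X2 getting wider as the two predictors become more strongly correlated.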

Question: What is the marked line in `generate.data` doing? Why did I multiply by some matrix?

X = matrix(rnorm(2*n),nrow=n) %*%
matrix(c(1, sqrt(cor),sqrt(cor),1), 2, 2)

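We want to create the two predictors X1 and X2. The call matrix(rnorm(2*n), nrow=n) first builds a matrix with n rows and 2 columns (not a single column), where n is the number of data points. Every entry is an independent standard normal draw, so at this stage the two columns, call them Z1 and Z2, are uncorrelated.

Next, this n x 2 matrix is multiplied by the 2x2 matrix

1 sqrt(cor)

sqrt(cor) 1

Each column of the product is a mix of the two original columns: X1 = Z1 + sqrt(cor)*Z2 and X2 = sqrt(cor)*Z1 + Z2. Because X1 and X2 now share the same underlying Z1 and Z2, they are correlated, and the larger cor is, the stronger that correlation. The resulting correlation works out to 2*sqrt(cor)/(1 + cor), which is not exactly equal to cor but increases with it and approaches 1 as cor approaches 1.

So the multiplication is there to control how collinear the two predictors are. The result is still a matrix; it only becomes part of a data frame in the later data.frame(Y, X) call. Correlated predictors could be generated in other ways, for example by drawing directly from a bivariate normal distribution. The question simply does it this way.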
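
The effect of the matrix multiplication can be checked numerically. The following is a small sketch, not part of the original assignment: the seed, the sample size, and the names `cr`, `Z`, `A`, `cor.Z`, `cor.X`, and `cor.theory` are all assumptions chosen for illustration.

```r
set.seed(42)  # assumption: fixed seed for reproducibility
n = 1e5       # large n so sample correlations sit close to their true values
cr = 0.5      # hypothetical value for the `cor` argument of generate.data

Z = matrix(rnorm(2*n), nrow=n)                 # n x 2, independent N(0, 1) columns
A = matrix(c(1, sqrt(cr), sqrt(cr), 1), 2, 2)  # the 2x2 mixing matrix
X = Z %*% A   # X1 = Z1 + sqrt(cr)*Z2, X2 = sqrt(cr)*Z1 + Z2

cor.Z = cor(Z[, 1], Z[, 2])          # correlation before mixing
cor.X = cor(X[, 1], X[, 2])          # correlation after mixing
cor.theory = 2*sqrt(cr) / (1 + cr)   # Cov = 2*sqrt(cr), Var = 1 + cr

print(c(before = cor.Z, after = cor.X, theory = cor.theory))
```

The "before" value should be near zero, while the "after" value should be close to the theoretical one, which confirms that the mixing matrix is what makes X1 and X2 correlated.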