I need to run an independent t-test on a categorical and continuous variable in R
Question
I need to run an independent t-test on a categorical and a continuous variable in R Commander. However, the categorical variable is not showing up in the Groups section of the independent t-test dialog. In fact, only three of my categorical variables show up there, and I have many more in the dataset. I have checked that the variables are factors, and they seem to be. They do not show up in numerical summaries, and if I try Convert numeric variables to factors, they are not offered as an option there either, so they must already be factors.
Is there something else I need to do to make sure they show up as a group for the t-test?
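[Screenshot: R Commander independent t-test dialog with Data and Options tabs, the lists "Response Variable (pick one)" and "Groups (pick one)", and the buttons Help, Reset, OK, Cancel, Apply. The variable names in the lists (including Gender, PCCScale and SelfAge) are only partly legible.]
Explanation / Answer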
There are several things you can do to make sure your data is in the format you want R to work with. Here I have assumed the dataset is named 'data'.
1. Check the data type of all the variables with the command str(data). Make sure the numerical variables are shown as 'num' (or 'int') and the categorical ones as either 'chr' or 'Factor'.
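2. To convert a variable to a factor, use data$var1 <- as.factor(data$var1). The result has to be assigned back to the column; calling as.factor(data$var1) on its own leaves the dataset unchanged.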
You can also use the Recode option to make a categorical variable a factor. To do so, click through the following menu selections: Data > Manage variables in active data set > Recode variables. When setting up the recoding, check the "Make each new variable a factor" box.
3. Additional checks to be done on data:
It is good practice to do an exploratory data analysis before you run your statistical tests. This lets you check the distributions of the variables and spot anomalies.
Check for constant variables (variables with only one value for all observations) and for NAs or missing values in the data. Also count the levels of each factor: as far as I know, the Groups list of the R Commander independent t-test dialog offers only factors with exactly two levels, so a factor with one level, or with three or more, would not appear there. This is a likely reason why only some of your categorical variables show up. Missing values are normally left out of numerical summaries, so they are easy to overlook. Data cleaning and outlier treatment should be done before performing the statistical tests.
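The checks above can be sketched in plain R as follows. This is a minimal illustration, not the asker's actual data: the data frame 'data' and the columns 'group', 'site' and 'score' are made up, and the two-level filter reflects my understanding of how R Commander fills the Groups list rather than anything confirmed in the question.

```r
# Hypothetical example data: 'group', 'site' and 'score' are illustrative names.
set.seed(1)
data <- data.frame(
  group = rep(c("a", "b"), each = 10),            # two groups, stored as character
  site  = rep(c("x", "y", "z", "x"), times = 5),  # three groups, stored as character
  score = rnorm(20),                              # continuous response
  stringsAsFactors = FALSE
)
data$score[3] <- NA                               # one missing value, for illustration

# Step 1: inspect the type of every column
str(data)

# Step 2: convert the categorical columns and assign the result back
data$group <- as.factor(data$group)
data$site  <- as.factor(data$site)

# Step 3: count levels per factor and missing values per column
is_fac    <- sapply(data, is.factor)
n_levels  <- sapply(data[is_fac], nlevels)
n_missing <- colSums(is.na(data))
print(n_levels)
print(n_missing)

# Factors with exactly two levels are the candidates for a two-group t-test
two_level <- names(n_levels)[n_levels == 2]
print(two_level)

# Independent-samples (Welch) t-test on a two-level factor;
# rows with a missing response are dropped by the formula interface
result <- t.test(score ~ group, data = data)
print(result)
```

If a factor you expect to use has more than two levels, one option is to subset the data to the two groups of interest and call droplevels() on the result before testing.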