Using loops to do Chi-Square Test in R

Posted 2019-07-18 12:26

Question:

I am new to R. I found the following code for running univariate logistic regression on a set of variables. What I would like to do is run a chi-square test for a list of variables against the dependent variable, similar to the logistic regression code below. I found a couple of solutions that involve creating all possible combinations of the variables, but I can't get them to work. Ideally, I want one of the variables (X) to stay the same in every test.

Chi Square Analysis using for loop in R

lapply(c("age","sex","race","service","cancer",
         "renal","inf","cpr","sys","heart","prevad",
         "type","frac","po2","ph","pco2","bic","cre","loc"),

       function(var) {

         formula    <- as.formula(paste("status ~", var))
         res.logist <- glm(formula, data = icu, family = binomial)

         summary(res.logist)
       })

Answer 1:

Are you sure that the strings in the vector you lapply over are column names of the icu dataset?

It works for me when I download the icu data:

system("wget http://course1.winona.edu/bdeppa/Biostatistics/Data%20Sets/ICU.TXT")
icu <- read.table('ICU.TXT', header=TRUE)

and change status to STA, which is a column in icu. Here is an example for some of your variables:

my.list <- lapply(c("Age","Sex","Race","Ser","Can"),         
       function(var) {
         formula    <- as.formula(paste("STA ~", var))
         res.logist <- glm(formula, data = icu, family = binomial)
         summary(res.logist)
       })

This gives me a list of summary.glm objects. For example, extracting the coefficient table from each:

lapply(my.list, coefficients)
[[1]]
               Estimate Std. Error   z value     Pr(>|z|)
(Intercept) -3.05851323 0.69608124 -4.393903 1.113337e-05
Age          0.02754261 0.01056416  2.607174 9.129303e-03

[[2]]
              Estimate Std. Error    z value     Pr(>|z|)
(Intercept) -1.4271164  0.2273030 -6.2784758 3.419081e-10
Sex          0.1053605  0.3617088  0.2912855 7.708330e-01

[[3]]
              Estimate Std. Error    z value   Pr(>|z|)
(Intercept) -1.0500583  0.4983146 -2.1072198 0.03509853
Race        -0.2913384  0.4108026 -0.7091933 0.47820450

[[4]]
              Estimate Std. Error   z value     Pr(>|z|)
(Intercept) -0.9465961  0.2310559 -4.096827 0.0000418852
Ser         -0.9469461  0.3681954 -2.571858 0.0101154495

[[5]]
                 Estimate Std. Error       z value     Pr(>|z|)
(Intercept) -1.386294e+00  0.1863390 -7.439638e+00 1.009615e-13
Can          7.523358e-16  0.5892555  1.276756e-15 1.000000e+00

If you want to do a chi-square test of each variable against STA instead:

my.list <- lapply(c("Age","Sex","Race","Ser","Can"),
                  function(var) chisq.test(icu$STA, icu[, var]))
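The list returned above holds one htest object per variable, so the p-values still have to be pulled out by hand. Below is a minimal sketch of one way to collect them into a table. It uses the built-in mtcars data as a stand-in for icu (with am playing the role of STA), so the column names are assumptions and not part of the original data. Note also that a chi-square test on a continuous column such as Age produces many sparse cells, and R will typically warn that the approximation may be incorrect.

```r
# Stand-in data (assumption): mtcars ships with R; 'am' plays the role of STA.
dat  <- mtcars
vars <- c("cyl", "vs", "gear")

# One chi-square test per predictor, always against the same variable.
# Naming the input vector keeps the results labelled by variable.
# suppressWarnings() hides the small-cell-count warning for this tiny data set.
tests <- lapply(setNames(vars, vars), function(v) {
  suppressWarnings(chisq.test(table(dat$am, dat[[v]])))
})

# Collect test statistic, degrees of freedom and p-value in one data frame.
res <- data.frame(
  variable  = vars,
  statistic = sapply(tests, function(t) unname(t$statistic)),
  df        = sapply(tests, function(t) unname(t$parameter)),
  p.value   = sapply(tests, function(t) t$p.value),
  row.names = NULL
)
print(res)
```

With the ICU data, the same pattern should apply after replacing dat with icu, am with STA, and vars with the categorical column names.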

or a chi-square test for all combinations of variables:

my.list.all <- apply(combn(colnames(icu), 2), 2,
                     function(x) chisq.test(icu[, x[1]], icu[, x[2]]))
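The all-combinations result is an unnamed list, which makes it hard to tell which test belongs to which pair. The following sketch labels each p-value with its pair of variables. Again it runs on mtcars as a stand-in, and the chosen columns are an assumption; with icu you would probably want to restrict the columns to the categorical ones first, since colnames(icu) also includes identifiers and continuous measurements.

```r
# Stand-in data (assumption): a few categorical-like columns from mtcars.
dat  <- mtcars
vars <- c("am", "cyl", "vs", "gear")

# combn() returns a 2 x choose(4, 2) matrix; each column is one pair.
pairs <- combn(vars, 2)

# Run the test for every pair and keep only the p-value.
p.values <- apply(pairs, 2, function(x) {
  suppressWarnings(chisq.test(table(dat[[x[1]]], dat[[x[2]]]))$p.value)
})

# Label each result with the pair it came from, e.g. "am x cyl".
names(p.values) <- apply(pairs, 2, paste, collapse = " x ")
print(round(p.values, 4))
```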

Does this work?



Tags: r chi-squared