I have a dataframe:
df = data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50), x4 = rnorm(50))
I would like to regress each variable versus all the other variables, for instance:
fit1 <- lm(x1 ~ ., data = df)
fit2 <- lm(x2 ~ ., data = df)
etc. (Of course, the real dataframe has a lot more variables).
I tried putting them in a loop, but it didn't work. I also tried using lapply
but couldn't produce the desired result either. Does anyone know the trick?
You can use reformulate
to dynamically build formuals
df = data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50), x4 = rnorm(50))
vars <- names(df)
result <- lapply(vars, function(resp) {
lm(reformulate(".",resp), data=df)
})
alternatively you could use do.call to get "prettier" formauls in each of the models
vars <- names(df)
result <- lapply(vars, function(resp) {
do.call("lm", list(reformulate(".",resp), data=quote(df)))
})
each of these methods returns a list. You can extract individual models with result[[1]]
, result[[2]]
, etc
Or you can try this...
df = data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50), x4 = rnorm(50))
models = list()
for (i in (1: ncol(df))){
formula = paste(colnames(df)[i], "~ .", sep="")
models[[i]] = lm(formula, data = df)
}
This will save all models as a list
To retrieve stored models:
eg : model regressed on x4
#retrieve model - replace modelName with the name of the required column
modelName = "x4"
out = models[[which( colnames(df)== modelName )]]
Output :
Call:
lm(formula = formula, data = df)
Coefficients:
(Intercept) x1 x2 x3
-0.17383 0.07602 -0.09759 -0.23920