R Multiple Regression Loop and Extract Coefficient

Posted 2019-05-30 19:46

Question:

I have to perform multiple linear regression for many vectors of dependent variables on the same matrix of independent variables.

For example, I want to create 3 models such that:

lm( d ~ a + b + c )
lm( e ~ a + b + c )
lm( f ~ a + b + c )

from the following matrix, where a, b, c are the independent variables and d, e, f are the dependent variables:

       [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,]    a1       b1       c1       d1       e1       f1
[2,]    a2       b2       c2       d2       e2       f2
[3,]    a3       b3       c3       d3       e3       f3

I then want to store the coefficients from the regression in another matrix (I have reduced the number of columns and vectors in my example for ease of explanation).

Answer 1:

Here's a method that is not very general, but it will work if you substitute your own dependent variable names in depvar, the independent variables common to all models in the inner lm() call, and your own dataset name. It is demonstrated here on mtcars, a built-in dataset supplied with R.

depvar <- c("mpg", "disp", "qsec")
regresults <- lapply(depvar, function(dv) {
    tmplm <- lm(get(dv) ~ cyl + hp + wt, data = mtcars)
    coef(tmplm)
})
# lapply() returns a list, where each element is a named vector of coefficients
# do.call(rbind, regresults) stacks those vectors, one row per model
allresults <- data.frame(depvar = depvar, 
                         do.call(rbind, regresults))
# tidy up name of intercept variable
names(allresults)[2] <- "intercept"
allresults
##   depvar  intercept        cyl          hp        wt
## 1    mpg   38.75179 -0.9416168 -0.01803810 -3.166973
## 2   disp -179.04186 30.3212049  0.21555502 59.222023
## 3   qsec   19.76879 -0.5825700 -0.01881199  1.381334
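
As a possible variation on the loop above, the formula can be built from character vectors with reformulate() instead of calling get() inside the formula, and sapply() can return the coefficients as a numeric matrix directly. This is only a sketch under the same mtcars setup; the name indepvar is an assumption introduced here and is not part of the original answer.

```r
depvar   <- c("mpg", "disp", "qsec")
indepvar <- c("cyl", "hp", "wt")   # assumed name, not in the original answer

coefmat <- t(sapply(depvar, function(dv) {
    # reformulate() builds a formula such as mpg ~ cyl + hp + wt
    f <- reformulate(indepvar, response = dv)
    coef(lm(f, data = mtcars))
}))
# one row per dependent variable, one column per coefficient
coefmat
```

Because sapply() uses the elements of depvar as names, the rows of coefmat are labelled by dependent variable without a separate identifier column.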

Edit based on suggestion by @Mike Wise:

If you want a purely numeric data frame but still want to keep the identifier, you can supply it through the row.names argument, like this:

allresults <- data.frame(do.call(rbind, regresults),
                         row.names = depvar)
# tidy up name of intercept variable
names(allresults)[1] <- "intercept"
allresults
##       intercept        cyl          hp        wt
## mpg    38.75179 -0.9416168 -0.01803810 -3.166973
## disp -179.04186 30.3212049  0.21555502 59.222023
## qsec   19.76879 -0.5825700 -0.01881199  1.381334
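
When the data really is a single numeric matrix, as in the question, another option is to pass a matrix of responses to lm(), which fits every column against the same predictors in one call. The following is a minimal sketch rather than part of the original answer: the matrix m is assembled from mtcars purely to imitate the question's layout, and the names m, X, Y, fit and coefmat are assumptions.

```r
# Hypothetical layout mirroring the question: predictors in columns 1-3,
# responses in columns 4-6 of one numeric matrix.
m <- as.matrix(mtcars[, c("cyl", "hp", "wt", "mpg", "disp", "qsec")])
X <- m[, 1:3]   # independent variables (a, b, c in the question)
Y <- m[, 4:6]   # dependent variables (d, e, f in the question)

# A matrix on the left-hand side fits one model per column of Y
fit <- lm(Y ~ X)

# coef() returns coefficients in rows and responses in columns;
# transpose so that each row holds the coefficients of one model
coefmat <- t(coef(fit))
coefmat
```

The coefficient columns are then named after the matrix term (for example Xcyl), so they may need renaming before being combined with the results above.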