Outputting regression results into a data frame in R

Posted 2019-07-06 19:37

Question:

I was wondering if there is any command that can output the results of an lm model into a data frame in R, like OUTEST in SAS. Any ideas? I am running multiple models and I want the result to look like this:

Model  |  alpha   | Beta | Rsquared | F |  df |
model0 |  8.4     | ...  | ....     | ..|  .. |
model1 |  ...     | ...  | ....     | ..|  .. |
model2 |  ...     | ...  | ....     | ..|  .. |

The data I have is 'ds', which looks like this:

X1 | X2 | Y1 |
.. | .. | .. |
.. | .. | .. |
.. | .. | .. |
.. | .. | .. |

And my code is just a few simple lm calls:

model0 <- lm(Y1 ~ X1, ds)
model1 <- lm(Y1 ~ 1, ds)
model2 <- lm(Y1 ~ X1 + X2, ds)

Answer 1:

I do exactly the same thing. The difficulty, of course, arises when the models have different numbers of coefficients: each model would then need a different number of columns, which is impossible in a data.frame. Every model has to produce the same set of columns.

I normally use it for glm (these code snippets are commented out) but I modified it for lm for you:

models <- c()

for (i in 1:10) {

    y <- rnorm(100) # generate some example data for lm
    x <- rnorm(100)
    m <- lm(y ~ x)

    # in case of glm:
    #m <- glm(y ~ x, data = data, family = "quasipoisson")
    #overdispersion <- 1/m$df.residual*sum((data$count-fitted(m))^2/fitted(m))

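    s <- summary(m)             # compute the summary once and reuse it
    coef_tab <- s$coefficients  # matrix: one row per term, columns estimate / std. error / t / p
    v.coef <- c(t(coef_tab))    # flatten row by row into a single vector
    names(v.coef) <- paste(rep(rownames(coef_tab), each = 4), c("coef", "stderr", "t", "p-value"))
    v.model_info <- c(r.squared = s$r.squared, F = s$fstatistic[1], df.res = s$df[2])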

    # in case of glm:
    #v.model_info <- c(overdisp = summary(m)$dispersion, res.deviance = m$deviance, df.res = m$df.residual, null.deviance = m$null.deviance, df.null = m$df.null)

    v.all <- c(v.coef, v.model_info)    
    models <- rbind(models, cbind(data.frame(model = paste("model", i, sep = "")), t(v.all)))

}

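I prefer to take the values from summary(m). To bundle them into a data.frame, use the cbind (column bind) and rbind (row bind) functions: cbind puts the model label next to the statistics for one model, and rbind appends that row to the results collected so far.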
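For models with different numbers of coefficients, one possible workaround is to pad each coefficient vector with NA so that every row has the same columns. The following is only a sketch under assumptions: it uses the built-in mtcars data as a stand-in for 'ds', and the names fits, all_terms and coef_mat are made up for illustration.

```r
# Three models with different numbers of coefficients (mtcars stands in for 'ds').
fits <- list(
  model0 = lm(mpg ~ wt, data = mtcars),
  model1 = lm(mpg ~ 1, data = mtcars),
  model2 = lm(mpg ~ wt + hp, data = mtcars)
)

# Union of all term names that appear in any model.
all_terms <- unique(unlist(lapply(fits, function(f) names(coef(f)))))

# Indexing a named vector by a name it does not contain yields NA,
# so every model contributes one row of the same length.
coef_mat <- do.call(rbind, lapply(fits, function(f) unname(coef(f)[all_terms])))
colnames(coef_mat) <- all_terms

# check.names = FALSE keeps the "(Intercept)" column name unchanged.
out <- data.frame(model = rownames(coef_mat), coef_mat,
                  check.names = FALSE, row.names = NULL)
print(out)
```

Terms that a model does not contain show up as NA in its row, so the rows can be stacked without any mismatch in the number of columns.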



Answer 2:

You can use the coefficients function:

out = coefficients(lm(mpg ~ wt, mtcars))
out
# (Intercept)          wt 
#   37.285126   -5.344472 
out[1]
# (Intercept) 
#    37.28513 

Or, for a group of lm objects:

library(plyr)
out = ldply(list(model0, model1, model2), coefficients)
rownames(out) = sprintf('model%d', 0:2)
out
# (output shown for three models that were all fit as lm(mpg ~ wt, mtcars))
#        (Intercept)        wt
# model0    37.28513 -5.344472
# model1    37.28513 -5.344472
# model2    37.28513 -5.344472

To extend my solution to what you need, you have to:

  1. Find out how to extract the other information you need from an lm object.
  2. Write a custom function which returns a one-row data.frame which contains all the information.
  3. Run it using the ldply syntax I showed.
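The three steps above could be sketched roughly as follows. This is not the answerer's code: it swaps ldply for base R's do.call(rbind, lapply(...)) so that no extra package is needed, uses mtcars as a stand-in for 'ds', and lm_row is a hypothetical helper name. Which statistics to keep is also an assumption based on the table in the question.

```r
# Hypothetical helper: summarise one fitted lm as a one-row data.frame.
lm_row <- function(fit, label) {
  s  <- summary(fit)
  f  <- s$fstatistic   # NULL for an intercept-only model
  cf <- coef(fit)
  data.frame(
    model     = label,
    alpha     = if ("(Intercept)" %in% names(cf)) unname(cf["(Intercept)"]) else NA_real_,
    r.squared = s$r.squared,
    F         = if (is.null(f)) NA_real_ else unname(f["value"]),
    df.res    = fit$df.residual,
    stringsAsFactors = FALSE
  )
}

fits <- list(
  model0 = lm(mpg ~ wt, data = mtcars),
  model1 = lm(mpg ~ 1, data = mtcars),
  model2 = lm(mpg ~ wt + hp, data = mtcars)
)

# One row per model, stacked into a single data.frame.
rows <- lapply(names(fits), function(nm) lm_row(fits[[nm]], nm))
out  <- do.call(rbind, rows)
print(out)
```

Because every call returns the same five columns, the rows stack cleanly even when the models differ in size; the intercept-only model simply gets NA for F. Slope coefficients are left out here since their number varies from model to model.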


Tags: r sas output lm