predict method for felm from lfe package

2019-02-11 15:52发布

问题:

Does anyone have a nice clean way to get predict behavior for felm models?

library(lfe)
model1 <- lm(data = iris, Sepal.Length ~ Sepal.Width + Species)
predict(model1, newdata = data.frame(Sepal.Width = 3, Species = "virginica"))
# Works

model2 <- felm(data = iris, Sepal.Length ~ Sepal.Width | Species)
predict(model2, newdata = data.frame(Sepal.Width = 3, Species = "virginica"))
# Does not work

回答1:

As a workaround, you could combine felm, getfe, and demeanlist as follows:

library(lfe)

lm.model <- lm(data=demeanlist(iris[, 1:2], list(iris$Species)), Sepal.Length ~ Sepal.Width)
fe <- getfe(felm(data = iris, Sepal.Length ~ Sepal.Width | Species))
predict(lm.model, newdata = data.frame(Sepal.Width = 3)) + fe$effect[fe$idx=="virginica"]

The idea is that you use demeanlist to center the variables, then lm to estimate the coefficient on Sepal.Width using the centered variables, giving you an lm object over which you can run predict. Then run felm+getfe to get the conditional mean for the fixed effect, and add that to the output of predict.



回答2:

This might not be the answer that you are looking for, but it seems that the author did not add any functionality to the lfe package in order to make predictions on external data by using the fitted felm model. The primary focus seems to be on the analysis of the group fixed effects. However, it's interesting to note that in the documentation of the package the following is mentioned:

The object has some resemblance to an 'lm' object, and some postprocessing methods designed for lm may happen to work. It may however be necessary to coerce the object to succeed with this.

Hence, it might be possible to coerce the felm object to an lm object in order to obtain some additional lm functionality (if all the required info is present in the object to perform the necessary computations).

The lfe package is intended to be run on very large datasets and effort was made to conserve memory: As a direct result of this, the felm object does not use/contain a qr decomposition, as opposed to the lm object. Unfortunately, the lm predict procedure relies on this information in order to compute the predictions. Hence, coercing the felm object and executing the predict method will fail:

> model2 <- felm(data = iris, Sepal.Length ~ Sepal.Width | Species)
> class(model2) <- c("lm","felm") # coerce to lm object
> predict(model2, newdata = data.frame(Sepal.Width = 3, Species = "virginica"))
Error in qr.lm(object) : lm object does not have a proper 'qr' component.
 Rank zero or should not have used lm(.., qr=FALSE).

If you really must use this package to perform the predictions then you could maybe write your own simplified version of this functionality by using the information that you have available in the felm object. For example, the OLS regression coëfficients are available via model2$coefficients.



回答3:

This should work for cases where you wish to ignore the group effects in the prediction, are predicting for new X's, and and only want confidence intervals. It first looks for a clustervcv attribute, then robustvcv, then vcv.

predict.felm <- function(object, newdata, se.fit = FALSE,
                         interval = "none",
                         level = 0.95){
  if(missing(newdata)){
    stop("predict.felm requires newdata and predicts for all group effects = 0.")
  }

  tt <- terms(object)
  Terms <- delete.response(tt)
  attr(Terms, "intercept") <- 0

  m.mat <- model.matrix(Terms, data = newdata)
  m.coef <- as.numeric(object$coef)
  fit <- as.vector(m.mat %*% object$coef)
  fit <- data.frame(fit = fit)

  if(se.fit | interval != "none"){
    if(!is.null(object$clustervcv)){
      vcov_mat <- object$clustervcv
    } else if (!is.null(object$robustvcv)) {
      vcov_mat <- object$robustvcv
    } else if (!is.null(object$vcv)){
      vcov_mat <- object$vcv
    } else {
      stop("No vcv attached to felm object.")
    }
    se.fit_mat <- sqrt(diag(m.mat %*% vcov_mat %*% t(m.mat)))
  }
  if(interval == "confidence"){
    t_val <- qt((1 - level) / 2 + level, df = object$df.residual)
    fit$lwr <- fit$fit - t_val * se.fit_mat
    fit$upr <- fit$fit + t_val * se.fit_mat
  } else if (interval == "prediction"){
    stop("interval = \"prediction\" not yet implemented")
  }
  if(se.fit){
    return(list(fit=fit, se.fit=se.fit_mat))
  } else {
    return(fit)
  }
}


回答4:

I think what you're looking for might be the lme4 package. I was able to get a predict to work using this:

library(lme4)
data(iris)

model2 <- lmer(data = iris, Sepal.Length ~ (Sepal.Width | Species))
predict(model2, newdata = data.frame(Sepal.Width = 3, Species = "virginica"))
       1 
6.610102 

You may have to play around a little to specify the particular effects you're looking for, but the package is well-documented so it shouldn't be a problem.



标签: r predict lfe