Run glm.mids on a subset of imputed data from mice

Posted 2019-08-03 09:51

Question:

I get an error when I try to run glm.mids on a subset of a mids imputation object:

library(mice)
imp2 = mice(nhanes)
glm.mids( (hyp==2)~bmi+chl, data=imp2, subset=(age==1) )

gives the cryptic error message

"Error in eval(expr, envir, enclos) :
..1 used in an incorrect context, no ... to look in"

even though the syntax works with regular glm on the original dataset:

glm( (hyp==2)~bmi+chl, data=nhanes, subset=(age==1) )

The documentation ?glm.mids doesn't specifically address subset but says that you can pass additional parameters onto glm. If I can't use subset with glm.mids, is there a good way to subset the mids list object directly?

Answer 1:

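I have taken the liberty of rewriting glm.mids. It is a bit kludgy. The issue seems to stem from the way glm handles subset: it captures the argument unevaluated and evaluates it later inside the data, so an expression forwarded through ... does not arrive intact.

Also see these posts: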

https://stat.ethz.ch/pipermail/r-help/2003-November/041537.html

http://r.789695.n4.nabble.com/Question-on-passing-the-subset-argument-to-an-lm-wrapper-td3009725.html

library(mice)

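# Modified glm.mids: the glm call is built with do.call so that arguments
# forwarded through ... (such as subset) reach glm as expressions.
glm.mids <- function (formula, family = gaussian, data, ...) 
{
  call <- match.call()
  if (!is.mids(data)) 
    stop("The data must have class mids")
  analyses <- as.list(1:data$m)
  for (i in 1:data$m) {
    # Fit the model on the i-th completed (imputed) data set
    data.i <- complete(data, i)
    analyses[[i]] <- do.call("glm", list(formula = quote(formula), family = quote(family), data = quote(data.i), ...))
  }
  # Package the fits as a "mira" object, unchanged from the original function
  object <- list(call = call, call1 = data$call, nmis = data$nmis, 
                 analyses = analyses)
  oldClass(object) <- c("mira", "glm", "lm")
  return(object)
}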

imp2 = mice(nhanes)
glm.mids( (hyp==2)~bmi+chl, data=imp2 ,subset=quote(age==1))

The only part I rewrote is the glm call inside glm.mids:

analyses[[i]] <- do.call("glm",list(formula=quote(formula),family=quote(family),data=quote(data.i),...))

In the old version it read:

analyses[[i]] <- glm(formula, family = family, data = data.i,...)

Note that with this version the subset expression has to be passed quoted, as in subset=quote(age==1).
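To make the do.call idea easier to try without mice, here is a minimal base-R sketch of the same pattern. The wrapper name fit_docall and the use of the built-in mtcars data are my own choices for illustration; they are not part of the original answer.

```r
# Hypothetical wrapper using the same do.call pattern as the rewritten glm.mids
fit_docall <- function(formula, family = gaussian, data, ...) {
  # list(...) evaluates the forwarded arguments, so subset must arrive quoted
  do.call("glm", list(formula = quote(formula), family = quote(family),
                      data = quote(data), ...))
}

# Quoted subset: glm receives the expression cyl == 4 and evaluates it in data
wrapped <- fit_docall(mpg ~ wt, data = mtcars, subset = quote(cyl == 4))

# Direct call for comparison
direct <- glm(mpg ~ wt, data = mtcars, subset = (cyl == 4))

# Unquoted subset: the expression is evaluated too early, outside the data
unquoted <- tryCatch(
  fit_docall(mpg ~ wt, data = mtcars, subset = (cyl == 4)),
  error = function(e) e
)
```

The quoted call should give the same coefficients as the direct glm call, while the unquoted one fails because cyl is looked up before glm ever sees the data.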



Answer 2:

The solution is to use with(), which evaluates the glm call inside each completed data set:

with(data=imp2, expr=glm((hyp==2)~bmi+chl, family=binomial , subset=(age==1) ))
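As a follow-up sketch, the result of with() can be checked and pooled in the usual mice way. The seed and printFlag arguments are my additions to make the run repeatable and quiet. With so few rows in the age==1 group, glm may emit warnings about fitted probabilities.

```r
library(mice)

# Impute as in the question; seed and printFlag are added here for repeatability
imp2 <- mice(nhanes, seed = 1, printFlag = FALSE)

# Fit the subset model on each completed data set
fit <- with(data = imp2,
            expr = glm((hyp == 2) ~ bmi + chl, family = binomial,
                       subset = (age == 1)))

# One glm fit per imputation is stored in fit$analyses
n_fits <- length(fit$analyses)

# Number of rows actually used in the first fit (only the age == 1 rows)
n_used <- nobs(fit$analyses[[1]])

# Combine the per-imputation estimates
pooled <- pool(fit)
summary(pooled)
```

Comparing n_used with sum(nhanes$age == 1) is a quick way to confirm that the subset was applied.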


(I think) the problem in your question is the use of ... within the glm.mids function. In the function signature, ... is there to allow “Additional parameters passed to glm”. However, when ... is forwarded to the glm call inside glm.mids, the arguments are not processed this way. ?glm describes ... as “For glm: arguments to be used to form the default control argument if it is not supplied directly.” So the additional arguments will not work as expected.

To see this, simplify the function:

f1 <- function (formula, family = binomial, data, ...) 
{
  glm(formula, family = family, data = data, ...)
}

f1(formula=((hyp==2)~bmi+chl), data=nhanes, subset=(age==2)) 
#Error in eval(expr, envir, enclos) : 
#  ..1 used in an incorrect context, no ... to look in

So the subset argument does not reach the glm call in a usable form.
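To look at what the inner function receives, here is a small base-R sketch. The functions inner and outer_wrapper are hypothetical stand-ins for glm and f1. The stand-in only records its call with match.call, which as far as I know is also the first thing glm does.

```r
# Hypothetical stand-in for glm: it only reports how it was called
inner <- function(formula, subset) match.call()

# Hypothetical wrapper that forwards extra arguments through ...
outer_wrapper <- function(formula, ...) inner(formula, ...)

cl <- outer_wrapper(y ~ x, subset = (age == 2))

print(cl)         # the call as recorded by inner
print(cl$subset)  # what inner sees in place of (age == 2)
```

On the R versions I have used, the recorded subset is the placeholder ..1 rather than the expression age == 2, which matches the "..1 used in an incorrect context" error above.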

Using the answer from "R: Pass argument to glm inside an R function", we can slightly alter the function:

f2 <- function (formula, family = binomial, data, ...) 
{
  eval(substitute(glm(formula, family = family, data = data, ...)))
}

# This now runs
f2(formula=((hyp==2)~bmi+chl), data=nhanes, subset=(age==2))

# check
glm((hyp==2)~bmi+chl, data=nhanes, family="binomial", subset=(age==2))

substitute replaces each argument name in the glm call, including ..., with the expression that the caller supplied, so the call that eval runs contains subset=(age==2) literally and glm can evaluate it inside the data.
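To see the substituted call itself, here is a base-R sketch that runs without mice. The helper names show_call and fit_wrapped and the mtcars data are assumptions made for illustration only.

```r
# Hypothetical helper: returns the substituted call instead of evaluating it
show_call <- function(formula, family = gaussian, data, ...) {
  substitute(glm(formula, family = family, data = data, ...))
}

# The call that eval() would run: argument names are replaced by expressions
built <- show_call(mpg ~ wt, data = mtcars, subset = (cyl == 4))
print(built)

# Same pattern as f2, on a built-in data set
fit_wrapped <- function(formula, family = gaussian, data, ...) {
  eval(substitute(glm(formula, family = family, data = data, ...)))
}

wrapped <- fit_wrapped(mpg ~ wt, data = mtcars, subset = (cyl == 4))
direct  <- glm(mpg ~ wt, data = mtcars, subset = (cyl == 4))
```

The wrapped and direct fits should agree, since both end up evaluating the same glm call.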



Tags: r subset r-mice