R: Grouped rolling window linear regression with r

Posted 2019-03-02 15:21

Question:

I have a data set with several grouping variables on which I want to run a rolling window linear regression. The ultimate goal is to extract the 10 linear regressions with the lowest slopes and average them to get a mean minimum rate of change. I have found examples that use rollapply to calculate rolling window linear regressions, but I have the added complication that I would like to apply these regressions to groups within the data set.

Here is a sample data set and my current code, which is close but isn't quite working.

dat<-data.frame(w=c(rep(1,27), rep(2,27),rep(3,27)), z=c(rep(c(1,2,3),27)), 
x=c(rep(seq(1,27),3)), y=c(rnorm(27,10,3), rnorm(27,3,2.2), rnorm(27, 6,1.3)))

where w and z are two grouping variables and x and y are the regression terms.

From my internet searches, here is basic R code for a rolling window linear regression where the window size is 6, sequential regressions are separated by 3 data points, and I extract only the slope, coef(lm...)[2]

library(zoo)    
slopeData<-rollapply(zoo(dat), width=6, function(Z) { 
coef(lm(formula=y~x, data = as.data.frame(Z), na.rm=T))[2]
}, by = 3, by.column=FALSE, align="right")

Now I wish to apply this rolling window regression to the groups specified by the two grouping variables w and z. So I tried something like this using ddply from the plyr package. First I tried to rewrite the code above as a function.

rolled<-function(df) {
    rollapply(zoo(df), width=6, function(Z) { 
    coef(lm(formula=y~x, data = as.data.frame(Z), na.rm=T))[2]
    }, by = 3, by.column=FALSE, align="right")
}

And then apply that function using ddply

groupedSlope <- ddply(dat, .(w,z), function(d) rolled(d))

This, however, doesn't work: I get a series of warnings and errors. I imagine that some of the errors relate to combining zoo objects with data frames, which makes things overly complicated. It's what I have been working on so far, but does anyone know of a way to get grouped, rolling window linear regressions, potentially simpler than this method?

Thanks for any assistance, Nate

Answer 1:

1) rollapply works on data frames too so it is not necessary to convert df to zoo.

2) lm uses na.action, not na.rm, and its default is na.omit so we can just drop this argument.

3) rollapplyr is a more concise way to write rollapply(..., align = "right").
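Point 2 can be checked in isolation with base R alone. The sketch below uses a small made-up data frame `d` (hypothetical, not the question's data) and assumes R's default `options(na.action = "na.omit")`. It suggests that lm already drops incomplete rows on its own, and that passing na.rm only produces a warning, which may account for some of the warnings seen in the question.

```r
# Toy data with one missing response value (hypothetical, for illustration only).
d <- data.frame(x = 1:8,
                y = c(2.1, 3.9, NA, 8.2, 9.8, 12.1, 13.9, 16.2))

# Default behaviour: rows containing NA are dropped (na.action = na.omit).
fit_default <- lm(y ~ x, data = d)

# The same model fitted on the complete cases only.
fit_complete <- lm(y ~ x, data = d[complete.cases(d), ])

# Both fits give the same coefficients.
same_coefs <- isTRUE(all.equal(coef(fit_default), coef(fit_complete)))

# na.rm is not an argument of lm(); capture the warning it triggers
# instead of letting it print.
na_rm_result <- tryCatch(lm(y ~ x, data = d, na.rm = TRUE),
                         warning = function(w) w)
got_warning <- inherits(na_rm_result, "warning")
```

If a different treatment of missing values is wanted, na.action is the argument to set, for example `na.action = na.exclude`.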

Assuming that rolled otherwise does what you want, and incorporating these changes into rolled, the ddply statement in the question should work. Alternatively, we could use by from base R, which we show below:

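library(zoo)

# Slope of y ~ x over each right-aligned window of 6 rows, computed every 3rd row
rolled <- function(df) {
    rollapplyr(df, width = 6, function(m) {
        coef(lm(formula = y ~ x, data = as.data.frame(m)))[2]
    }, by = 3, by.column = FALSE)
}

# Apply rolled to each (w, z) group and stack the results, one row per group
do.call("rbind", by(dat, dat[c("w", "z")], rolled))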
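To get from the grouped slopes to the question's stated goal (the mean of the 10 lowest slopes), something along these lines may work. It is only a sketch: the `set.seed` call and the names `slopes_by_group`, `all_slopes` and `mean_min_rate` are additions for illustration and do not come from the question or the answer above, and it assumes the zoo package is installed.

```r
library(zoo)

set.seed(1)  # added here only so that the random sample data is reproducible

# Sample data from the question: w and z are grouping variables.
dat <- data.frame(w = c(rep(1, 27), rep(2, 27), rep(3, 27)),
                  z = c(rep(c(1, 2, 3), 27)),
                  x = c(rep(seq(1, 27), 3)),
                  y = c(rnorm(27, 10, 3), rnorm(27, 3, 2.2), rnorm(27, 6, 1.3)))

# Rolling slope of y ~ x, as defined in the answer above.
rolled <- function(df) {
    rollapplyr(df, width = 6, function(m) {
        coef(lm(formula = y ~ x, data = as.data.frame(m)))[2]
    }, by = 3, by.column = FALSE)
}

# One set of rolling slopes per (w, z) group.
slopes_by_group <- by(dat, dat[c("w", "z")], rolled)

# Pool every slope from every group into a single numeric vector.
all_slopes <- unlist(slopes_by_group, use.names = FALSE)

# Average the 10 smallest slopes; if fewer than 10 slopes exist,
# head() returns all of them and the mean is taken over those.
mean_min_rate <- mean(head(sort(all_slopes), 10))
```

Here sort puts the most negative slopes first. If "lowest" is meant in absolute value instead, sorting `abs(all_slopes)` would be the corresponding change.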


Tags: r plyr rollapply