This question builds upon a previous one, which was nicely answered for me here: "R: Grouped rolling window linear regression with rollapply and ddply".
Wouldn't you know that the code doesn't quite work when extended to the real data rather than the example data?
I have a somewhat large dataset with the following characteristics.
str(T0_satData_reduced)
'data.frame': 45537 obs. of 5 variables:
$ date : POSIXct, format: "2014-11-17 08:47:35" "2014-11-17 08:47:36" "2014-11-17 08:47:37" ...
$ trial : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
$ vial : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
$ O2sat : num 95.1 95.1 95.1 95.1 95 95.1 95.1 95.2 95.1 95 ...
$ elapsed: num 20 20 20.1 20.1 20.1 ...
The previous question dealt with applying a rolling regression of O2sat as a function of elapsed, grouping the regressions by the factors trial and vial.
The following code is drawn from the answer to my previous question (simply modified for the complete dataset rather than the practice one):
rolled <- function(df) {
  rollapplyr(df, width = 600, function(m) {
    coef(lm(formula = O2sat ~ elapsed, data = as.data.frame(m)))
  }, by = 60, by.column = FALSE)
}
T0_slopes <- ddply(T0_satData_reduced, .(trial, vial), function(d) rolled(d))
However, when I run this code I get a series of warnings (the first two are shown here):
Warning messages:
1: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : - not meaningful for factors
I'm not sure where this error comes from, as I have shown that both elapsed and O2sat are numeric, so I am not regressing on factors. However, if I force them both to be numeric within the rolled function above, like this:
...
coef(lm(formula = as.numeric(O2sat) ~ as.numeric(elapsed), data = as.data.frame(m)))
...
I no longer get the errors. However, I don't know why this would solve the error. Additionally, the resulting regressions appear suspect because the intercept terms seem inappropriately small.
Any thoughts on why I am getting these errors and why using as.numeric seems to eliminate the errors (if potentially still providing inappropriate regression terms)?
Thank you
rollapply passes a matrix to the function, so only pass the numeric columns. Using rolled from my prior answer and the setup in that question:

Added
Another way to do it is to perform the rollapply over the row indexes instead of over the data frame itself. In this example we have also added the conditioning variables as extra output columns:
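Both suggestions above can be sketched roughly as follows. This is an illustrative sketch, not the code from the original answer. It assumes the zoo and plyr packages are installed, uses a small synthetic data frame (the hypothetical sat) in place of T0_satData_reduced, and uses a shorter width and by so the example stays small. The helper names rolled_numeric and rolled_ix are made up for this example.

```r
library(zoo)   # rollapplyr
library(plyr)  # ddply

# Hypothetical stand-in for T0_satData_reduced: 2 trials x 2 vials, 30 rows each.
set.seed(1)
sat <- expand.grid(i = 1:30, vial = factor(1:2), trial = factor(1:2))
sat$elapsed <- sat$i * 0.5
sat$O2sat <- 95 - 0.1 * sat$elapsed + rnorm(nrow(sat), sd = 0.01)
sat$i <- NULL

# Coercing a data frame with factor columns to a matrix gives a character
# matrix, so O2sat and elapsed stop being numeric once the factors go along.
is.character(as.matrix(sat))  # TRUE

# Approach 1: hand rollapplyr only the numeric columns.
rolled_numeric <- function(df, width = 10, by = 5) {
  out <- rollapplyr(df, width = width, by = by, by.column = FALSE,
                    FUN = function(m) {
                      coef(lm(O2sat ~ elapsed, data = as.data.frame(m)))
                    })
  data.frame(intercept = out[, 1], slope = out[, 2])
}
slopes1 <- ddply(sat, .(trial, vial),
                 function(d) rolled_numeric(d[c("elapsed", "O2sat")]))

# Approach 2: roll over the row indexes and subset the data frame inside FUN,
# so the data never has to pass through a matrix.
rolled_ix <- function(d, width = 10, by = 5) {
  slope <- rollapplyr(seq_len(nrow(d)), width = width, by = by,
                      FUN = function(ix) {
                        coef(lm(O2sat ~ elapsed, data = d[ix, ]))[[2]]
                      })
  data.frame(slope = slope)
}
slopes2 <- ddply(sat, .(trial, vial), rolled_ix)

head(slopes1)
head(slopes2)
```

In both cases ddply prepends the trial and vial columns to the result, so each row of slopes can be traced back to its group. For the real data, width = 600 and by = 60 from the question would replace the small defaults used here.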