Say I have a data frame like this:
X <- data_frame(
  x = rep(seq(from = 1, to = 10, by = 1), 3),
  y = 2*x + rnorm(length(x), sd = 0.5),
  g = rep(LETTERS[1:3], each = length(x)/3))
How can I fit a regression y~x grouped by variable g and add the values from the fitted and resid generic methods to the data frame?
I know I can do:
A <- X[X$g == "A",]
mA <- with(A, lm(y ~ x))
A$fit <- fitted(mA)
A$res <- resid(mA)
B <- X[X$g == "B",]
mB <- with(B, lm(y ~ x))
B$fit <- fitted(mB)
B$res <- resid(mB)
C <- X[X$g == "C",]
mC <- with(C, lm(y ~ x))
C$fit <- fitted(mC)
C$res <- resid(mC)
And then rbind(A, B, C). However, in real life I am not using lm (I'm using rqss in the quantreg package). The method occasionally fails, so I need error handling, where I'd like to place NA in all the rows that failed. Also, there are way more than 3 groups, so I don't want to just keep copying and pasting code for each group.
I tried using dplyr with do but didn't make any progress. I was thinking it might be something like:
make_qfits <- function(data) {
  data %>%
    group_by(g) %>%
    do(failwith(NULL, rqss), formula = y ~ qss(x, lambda = 3))
}
Would this be easy to do by that approach? Is there another way in base R?
For the lm models you could try:
This gives you the residual and fitted data in ".resid" and ".fitted" columns as well as a bunch of other fit data. By default the rownames will be prefixed with the letters from g.
With the rqss models that might fail:
You can use do on grouped data for this task, fitting the model in each group in do and putting the model residuals and fitted values into a data.frame. To add these to the original data, just include the . that represents the data going into do in the output data.frame.
In your simple case, this would look like this:
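A minimal sketch of that simple case, assuming dplyr is installed. X is rebuilt here with base data.frame and an arbitrary seed so the block runs on its own, and the column names fit and res are borrowed from the question rather than taken from any particular answer. Note that do() is marked as superseded in recent dplyr releases, although it still works.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the example is reproducible
x <- rep(1:10, 3)
X <- data.frame(
  x = x,
  y = 2 * x + rnorm(length(x), sd = 0.5),
  g = rep(LETTERS[1:3], each = 10)
)

fits <- X %>%
  group_by(g) %>%
  do({
    # Inside do(), `.` is the data for the current group
    mod <- lm(y ~ x, data = .)
    # Including `.` keeps the original columns next to the new ones
    data.frame(., fit = fitted(mod), res = resid(mod))
  }) %>%
  ungroup()
```

The result has one row per original observation, with fit and res added alongside x, y and g.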
Things will look more complicated if you need to catch errors. Here is what it would look like using try and filling the residuals and fitted columns with NA if the fit attempt for the group results in an error.
Here's a version that works with base R:
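One possible base R sketch, under a few assumptions: the data is split by g, each piece is fitted inside tryCatch, and a failed fit yields NA in both new columns. The helper names fit_group and flaky_lm are hypothetical. flaky_lm is a stand-in that fails on purpose for group "B" so the NA path is exercised. For the real problem you would pass a function wrapping rqss(y ~ qss(x, lambda = 3), data = d) from quantreg instead, assuming fitted() and resid() behave for it as they do for lm.

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible
x <- rep(1:10, 3)
X <- data.frame(
  x = x,
  y = 2 * x + rnorm(length(x), sd = 0.5),
  g = rep(LETTERS[1:3], each = 10)
)

# Hypothetical helper: fit one group's rows, returning NA columns on error.
fit_group <- function(d, fit_fun) {
  tryCatch({
    mod <- fit_fun(d)
    d$fit <- as.numeric(fitted(mod))
    d$res <- as.numeric(resid(mod))
    d
  }, error = function(e) {
    d$fit <- NA_real_
    d$res <- NA_real_
    d
  })
}

# Stand-in model (an assumption for illustration): fails for group "B".
flaky_lm <- function(d) {
  if (d$g[1] == "B") stop("simulated failure")
  lm(y ~ x, data = d)
}

pieces <- lapply(split(X, X$g), fit_group, fit_fun = flaky_lm)
result <- do.call(rbind, pieces)  # row names get the group prefix, e.g. "A.1"
```

Because every piece comes back with the same columns whether or not the fit succeeded, rbind can stack them without special cases, and this scales to any number of groups.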