In R, I would like to fit a GAM with a categorical variable. I thought I could do it the same way as with lm (cat is the categorical variable):
lm(data = df, formula = y ~ x1*cat + x2 + x3);
But I can't do things like:
gam(data = df, formula = y ~ s(x1)*cat + s(x2) + x3)
but the following works:
gam(data = df, formula = y ~ cat + s(x1) + s(x2) + x3)
How do I add a categorical variable to just one of the splines?
One of the comments has more or less told you how. Use a by variable:
s(x1, by = cat)
This sets up a "factor smooth" interaction, where a separate smooth function of x1 is created for each level of the factor. The smoothing parameters are also duplicated but not linked, so they are estimated independently. You can set
s(x1, by = cat, id = 0)
to use a single smoothing parameter for all "sub smooths".
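To make the effect of id concrete, here is a minimal sketch on simulated data. The data frame, the three levels of cat and the choice of method = "REML" are assumptions made for illustration only; they are not part of the question.

```r
library(mgcv)

set.seed(1)
n <- 300
# Hypothetical data: 'cat' has three levels, each with its own curve in x1
df <- data.frame(
  x1  = runif(n),
  cat = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
df$y <- with(df, ifelse(cat == "a", sin(2 * pi * x1),
                 ifelse(cat == "b", x1^2, -x1))) + rnorm(n, sd = 0.3)

# A separate smoothing parameter is estimated for each level of 'cat'
m_sep <- gam(y ~ cat + s(x1, by = cat), data = df, method = "REML")

# The per-level smooths are linked through 'id' and share one smoothing parameter
m_shared <- gam(y ~ cat + s(x1, by = cat, id = 0), data = df, method = "REML")

length(m_sep$sp)     # number of estimated smoothing parameters, unlinked
length(m_shared$sp)  # number of estimated smoothing parameters, linked
```

Comparing the two lengths shows whether the smoothing parameters were linked. Sharing one parameter forces the same degree of wiggliness penalty on every level, which is only sensible if you expect the curves to be about equally smooth.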
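Also note that no contrasts are applied to the factor inside the smooth, but each per-level smooth is still subject to a centering constraint. This means the smooths cannot capture differences in the mean level between groups, so you need to include the factor as a parametric (fixed-effect) term, too: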
s(x1, by = cat) + cat
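Putting this together for the model in the question, a sketch could look like the following. The simulated data frame, the two levels of cat and the true effect shapes are made up for illustration; only the model formula follows the question.

```r
library(mgcv)

set.seed(42)
n <- 400
# Hypothetical data with the variable names used in the question
df <- data.frame(
  x1  = runif(n),
  x2  = runif(n),
  x3  = rnorm(n),
  cat = factor(sample(c("a", "b"), n, replace = TRUE))
)
# Level "b" is shifted up by 2 and has a different shape in x1
df$y <- with(df, ifelse(cat == "a", sin(2 * pi * x1), 2 + x1^2) +
                 cos(2 * pi * x2) + 0.5 * x3) + rnorm(n, sd = 0.3)

# 'cat' enters as a parametric term (level offsets) and as a by variable
# (one centered smooth of x1 per level); s(x2) and x3 are common to all levels
fit <- gam(y ~ cat + s(x1, by = cat) + s(x2) + x3,
           data = df, method = "REML")

summary(fit)                        # parametric table includes the 'cat' offset
sapply(fit$smooth, `[[`, "label")   # labels of the fitted smooth terms
```

Calling plot(fit, pages = 1) afterwards draws one panel per level of cat for x1, plus one for s(x2), which is a quick way to check that the per-level curves really differ.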