How to generate all first-order interaction terms

Posted 2019-07-20 12:40

Question:

Is there a way to fit first-order interactions in glmnet?

For instance, if my X matrix was:

V1 V2 V3
0  1   0
1  0   1
1  0   0
...

Is there a way to specify something along the lines of `y ~ V1 + V2 + V3 + V1*V2 + V2*V3 + V1*V3` without manually creating the columns? My actual matrix is larger, and it would be a pain to create all first-order cross products by hand.

Answer 1:

The proper R syntax for such a formula is

y~(V1+V2+V3)^2

For example

set.seed(15)
dd <- data.frame(V1=runif(50), V2=runif(50), V3=runif(50), y=runif(50))
lm(y~(V1+V2+V3)^2, dd)

Call:
lm(formula = y ~ (V1 + V2 + V3)^2, data = dd)

Coefficients:
(Intercept)           V1           V2           V3        V1:V2        V1:V3        V2:V3  
    0.54169     -0.10030     -0.01226     -0.10150      0.38521     -0.03159      0.01200 

Or, if you want to model all variables other than y,

lm(y~(.)^2, dd)

Call:
lm(formula = y ~ (.)^2, data = dd)

Coefficients:
(Intercept)           V1           V2           V3        V1:V2        V1:V3        V2:V3  
    0.54169     -0.10030     -0.01226     -0.10150      0.38521     -0.03159      0.01200 

Both are the same as

lm(y~V1+V2+V3+V1:V2+V1:V3+V2:V3, dd)

Call:
lm(formula = y ~ V1 + V2 + V3 + V1:V2 + V1:V3 + V2:V3, data = dd)

Coefficients:
(Intercept)           V1           V2           V3        V1:V2        V1:V3        V2:V3  
    0.54169     -0.10030     -0.01226     -0.10150      0.38521     -0.03159      0.01200  
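
One way to confirm that the three formulas are equivalent without fitting anything is to compare their expanded term labels. This is a minimal sketch using base R's terms(); the helper name labels_of is only illustrative and is not part of the original answer.

```r
# Same toy data as in the answer above.
set.seed(15)
dd <- data.frame(V1 = runif(50), V2 = runif(50), V3 = runif(50), y = runif(50))

# Hypothetical helper: return the expanded term labels of a formula.
# `data` is needed so that `.` can be resolved to the columns of dd.
labels_of <- function(f) attr(terms(f, data = dd), "term.labels")

a <- labels_of(y ~ (V1 + V2 + V3)^2)
b <- labels_of(y ~ (.)^2)
d <- labels_of(y ~ V1 + V2 + V3 + V1:V2 + V1:V3 + V2:V3)

print(a)
print(identical(a, b))
print(identical(a, d))
```

Each formula should expand to the three main effects followed by the three pairwise interactions.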

You can use these formulas with model.matrix to create the numeric matrix that glmnet takes as its x argument:

model.matrix(y~(V1+V2+V3)^2,dd)
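
To tie this back to the original glmnet question, here is a hedged sketch of how the resulting matrix might be passed on. It assumes the glmnet package is installed (the fit is skipped otherwise) and drops the "(Intercept)" column, on the assumption that glmnet is left to fit its own intercept, which is its default. The variable names x and fit are illustrative.

```r
set.seed(15)
dd <- data.frame(V1 = runif(50), V2 = runif(50), V3 = runif(50), y = runif(50))

# Main effects plus all pairwise interactions. [, -1] removes the
# "(Intercept)" column that model.matrix adds as the first column.
x <- model.matrix(y ~ (V1 + V2 + V3)^2, dd)[, -1]

print(colnames(x))
print(dim(x))

# Each interaction column is just the elementwise product of its parents.
print(all.equal(unname(x[, "V1:V2"]), dd$V1 * dd$V2))

# Assumption: glmnet is available; skip the fit if it is not installed.
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::glmnet(x, dd$y)
}
```

For a larger matrix with many columns, y ~ (.)^2 in place of the explicit formula avoids typing every variable name.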