I have two factors. Factor A has 2 levels and factor B has 3 levels.
How can I create the following design matrix?
     factorA1 factorA2 factorB1 factorB2 factorB3
[1,]        1        0        1        0        0
[2,]        1        0        0        1        0
[3,]        1        0        0        0        1
[4,]        0        1        1        0        0
[5,]        0        1        0        1        0
[6,]        0        1        0        0        1
You have a couple of options:
Use base R and piece it together yourself:
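A minimal sketch of the base route, assuming the two factors live in a data frame. The names df, A and B are made up for illustration. Each factor gets its own model.matrix() call without an intercept, which yields one indicator column per level, and cbind() joins the pieces.

```r
# Hypothetical example data: A has 2 levels, B has 3 levels.
df <- data.frame(A = factor(rep(1:2, each = 3)),
                 B = factor(rep(1:3, times = 2)))

# With the intercept removed, a single factor is expanded to one
# indicator column per level.
mA <- model.matrix(~ 0 + A, data = df)
mB <- model.matrix(~ 0 + B, data = df)

# Piece the two indicator blocks together.
design <- cbind(mA, mB)
design
```

Calling model.matrix() once per factor matters here: in a single formula only the first factor would be fully expanded.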
Or use the ade4 package as follows:
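A hedged sketch of the ade4 route, assuming the package is installed. acm.disjonctif() builds a disjunctive (indicator) table from a data frame of factors; the data frame df below is made up for illustration.

```r
# Hypothetical example data: A has 2 levels, B has 3 levels.
df <- data.frame(A = factor(rep(1:2, each = 3)),
                 B = factor(rep(1:3, times = 2)))

# ade4 is an add-on package, so only call it if it is available.
if (requireNamespace("ade4", quietly = TRUE)) {
  # Every factor column is expanded into one indicator column per level.
  design <- ade4::acm.disjonctif(df)
  print(design)
}
```

The result is a data frame rather than a matrix, so wrap it in as.matrix() if a matrix is needed.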
Things get complicated with model.matrix() again if we add a covariate x and interactions of x with factors.
So mm has an intercept, but now the A:x interaction terms have an unwanted level A0:x.
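This behaviour can be reproduced with a small sketch. It assumes the extra-level trick used in this thread (each factor is declared with an unused level 0); the data frame df and the covariate values are made up for illustration. The last two steps preview the remedy discussed next: adding x as its own term, then dropping the intercept and the bare x column.

```r
# Hypothetical data; level 0 is declared but never observed.
df <- data.frame(A = factor(rep(1:2, each = 3), levels = 0:2),
                 B = factor(rep(1:3, times = 2), levels = 0:3),
                 x = c(0.5, 1.2, -0.3, 2.0, 0.7, -1.1))

# Interactions with x but no separate x term: A is fully expanded
# inside A:x, so a column for the unused level 0 shows up.
mm <- model.matrix(~ A + B + A:x + B:x, data = df)
colnames(mm)

# With x as its own term, the interactions are coded by contrasts,
# so the level-0 interaction column disappears.
mm2 <- model.matrix(~ x + A + B + A:x + B:x, data = df)
colnames(mm2)

# Drop the intercept and the bare x column.
mm3 <- mm2[, !(colnames(mm2) %in% c("(Intercept)", "x")), drop = FALSE]
colnames(mm3)
```

Selecting columns by name rather than by position keeps the last step valid if the formula is reordered.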
If we reintroduce x as a separate term, we will cancel that unwanted level.
We can get rid of the unwanted intercept and the unwanted bare x term.
Expanding and generalizing @Ferdinand.kraft's answer:
model.matrix() only allows what it calls "dummy" coding (one indicator column per level) for the first factor in a formula. If the intercept is present, it plays that role, and every factor is coded by contrasts instead. To get the desired effect of a redundant index matrix (where you have a 1 in the column for the corresponding factor level and 0 elsewhere), you can lie to model.matrix() and pretend there's an extra level. Then trim off the intercept column.
Assuming your data is in a data.frame called dat, let's say the two factors are given as in this example:
You can use outer to get a matrix as you showed, for each factor:
Now do the same for the second factor:
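A sketch of the outer-based steps, together with the final cbind step. The original example data is not shown here, so the data frame dat and the column names f1 and f2 below are assumptions made for illustration.

```r
# Hypothetical example data frame with two factors.
dat <- data.frame(f1 = factor(rep(c("a", "b"), each = 3)),
                  f2 = factor(rep(c("x", "y", "z"), times = 2)))

# outer() compares every observation with every level; multiplying
# the logical result by 1 turns TRUE/FALSE into 1/0.
m1 <- 1 * outer(as.character(dat$f1), levels(dat$f1), "==")
colnames(m1) <- paste0("f1", levels(dat$f1))

# Same for the second factor.
m2 <- 1 * outer(as.character(dat$f2), levels(dat$f2), "==")
colnames(m2) <- paste0("f2", levels(dat$f2))

# Combine both indicator blocks into one matrix.
result <- cbind(m1, m2)
result
```

This route does not go through model.matrix() at all, so no contrast coding or intercept handling is involved.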
And cbind them to get the final result:
model.matrix is the process that lm and others use in the background to convert factors for you. It creates the INTERCEPT variable as a column of 1's, but you can easily remove that if you need to.
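As a rough illustration, the sketch below compares the default model matrix with one where the intercept is removed in the formula. The data frame df is made up for this example.

```r
# Hypothetical example data: A has 2 levels, B has 3 levels.
df <- data.frame(A = factor(rep(1:2, each = 3)),
                 B = factor(rep(1:3, times = 2)))

# Default: an "(Intercept)" column of 1's, and the first level of
# each factor is absorbed into it.
with_int <- model.matrix(~ A + B, data = df)
colnames(with_int)

# Intercept removed in the formula: the first factor is expanded to
# all of its levels, while later factors still use contrasts.
no_int <- model.matrix(~ 0 + A + B, data = df)
colnames(no_int)
```

Note that dropping the intercept alone does not produce a full set of indicator columns for B, which is why the other answers build each factor's block separately or add an extra level.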
Edit: I now see that this is essentially the same as one of the other comments, but more concise.