I have a data.table out
like this (in reality it is much larger):
out <- code weights group
1: 2 0.387 1
2: 1 0.399 1
3: 2 1.610 1
4: 3 1.323 2
5: 2 0.373 2
6: 1 0.212 2
7: 3 0.316 3
8: 2 0.569 3
9: 1 0.120 3
10: 1 0.354 3
It has 3 groups with different codes (column 1). In group #1, the code 3 does not appear, while in the other it appears.
Then, I want to sum the weights for every group and code combination . I achieve this with this command:
sum.dt <- out[,.(sum(weights)), by=list(code,group)][order(-V1)]
This works well but it does not have the combination Group 1 with Code 3 because it is not in the out
table. I would like to have all possible combinations in sum.dt
, and if the combination does not occur in the source table, it should sum up to 0, meaning the column V1
should be 0 in this row.
Any idea how I could achieve this?
Using
CJ
(cross join) you can add the missing combinations:gives:
Or with
xtabs
as @alexis_laz showed in the comments:which gives:
If you want to get this output in a long-form dataframe, you can wrap the
xtabs
code in themelt
function of the reshape2 (or data.table) package:which gives:
You could also do this with a combination of dplyr and tidyr:
I had a similar problem, and
CJ
did not work for some reason. A relatively simple solution I ended up using is first callingdcast
and thenmelt
(similar to thextable
solution above)- this also conveniently lets you specify the fill value for the missing combinations.This gives