This may seem like a very basic R question, but I'd appreciate an answer. I have a data frame in the form of:
col1 col2
a g
a h
a g
b i
b g
b h
c i
I want to transform it into a table of counts like the one below. I've tried the table() function, but I only seem to get the counts for one column.
a b c
g 2 1 0
h 1 1 0
i 0 1 1
How do I do it in R?
I'm not really sure what you used, but table works fine for me!
Here's a minimal reproducible example:
df <- structure(list(V1 = c("a", "a", "a", "b", "b", "b", "c"),
                     V2 = c("g", "h", "g", "i", "g", "h", "i")),
                .Names = c("V1", "V2"), class = "data.frame",
                row.names = c(NA, -7L))
table(df)
# V2
# V1 g h i
# a 2 1 0
# b 1 1 1
# c 0 0 1
Notes:
- Try table(df[c(2, 1)]) (or table(df$V2, df$V1)) to swap the rows and columns.
- Use as.data.frame.matrix(table(df)) to get a data.frame as your output. (as.data.frame will create a long data.frame, not one in the output format you want.)
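The two notes above can be tried together in one short sketch. It rebuilds the same seven rows with data.frame() rather than structure(); the object names tab, swapped, wide and long are only illustrative.

```r
# Same seven rows as the example data above
df <- data.frame(V1 = c("a", "a", "a", "b", "b", "b", "c"),
                 V2 = c("g", "h", "g", "i", "g", "h", "i"),
                 stringsAsFactors = FALSE)

tab     <- table(df)            # V1 in rows, V2 in columns
swapped <- table(df[c(2, 1)])   # V2 in rows, V1 in columns

# as.data.frame.matrix() keeps the 3 x 3 layout of the table
wide <- as.data.frame.matrix(swapped)

# as.data.frame() gives one row per (V2, V1) pair plus a Freq column
long <- as.data.frame(swapped)

print(wide)
print(long)
```

Here wide has g, h, i as row names and a, b, c as columns, which is the layout asked for in the question.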
Using the data frame from @Ananda's answer (df there, called f here), you can also use dcast from the reshape2 package:
> library(reshape2)
> f <- df
> dcast(f, V1 ~ V2)
Using V2 as value column: use value.var to override.
Aggregation function missing: defaulting to length
V1 g h i
1 a 2 1 0
2 b 1 1 1
3 c 0 0 1
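The two messages in the output above appear because dcast had to guess both the value column and the aggregation function. A minimal sketch, assuming reshape2 is installed, that passes both explicitly so the call runs without those messages (f is rebuilt here so the snippet stands alone, and counts is just an illustrative name):

```r
library(reshape2)

# Same seven rows as in the answers above
f <- data.frame(V1 = c("a", "a", "a", "b", "b", "b", "c"),
                V2 = c("g", "h", "g", "i", "g", "h", "i"),
                stringsAsFactors = FALSE)

# Count rows per V1/V2 combination; naming value.var and fun.aggregate
# explicitly avoids the "Using V2 as value column" and
# "Aggregation function missing" messages
counts <- dcast(f, V1 ~ V2, value.var = "V2", fun.aggregate = length)
print(counts)
```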
That said, table is the simplest correct answer for this case. I mention dcast only in case you later need something more than counts, such as summing a value column for each combination:
set.seed(1)
f$var <- rnorm(7)
> f
V1 V2 var
1 a g -0.6264538
2 a h 0.1836433
3 a g -0.8356286
4 b i 1.5952808
5 b g 0.3295078
6 b h -0.8204684
7 c i 0.4874291
> dcast(f, V1 ~ V2, value.var = "var", fun.aggregate = sum)
V1 g h i
1 a -1.4620824 0.1836433 0.0000000
2 b 0.3295078 -0.8204684 1.5952808
3 c 0.0000000 0.0000000 0.4874291
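If you would rather stay in base R, the same per-combination sum can be sketched with xtabs. This is an alternative to the dcast call above, not part of either answer; sums and sums_na are illustrative names.

```r
# Same data and seed as above, rebuilt so the snippet stands alone
f <- data.frame(V1 = c("a", "a", "a", "b", "b", "b", "c"),
                V2 = c("g", "h", "g", "i", "g", "h", "i"),
                stringsAsFactors = FALSE)
set.seed(1)
f$var <- rnorm(7)

# Sum of var for each V1/V2 combination; combinations with no rows become 0
sums <- xtabs(var ~ V1 + V2, data = f)
print(sums)

# tapply gives the same sums but leaves empty combinations as NA
sums_na <- tapply(f$var, list(f$V1, f$V2), sum)
print(sums_na)
```

Which one fits depends on whether an empty combination should read as 0 or as missing.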