I have the following dataframe:
one <- c('one',NA,NA,NA,NA,'two',NA,NA)
group1 <- c('A','A','A','A','B','B','B','B')
group2 <- c('C','C','C','D','E','E','F','F')
df = data.frame(one, group1,group2)
> df
one group1 group2
1 one A C
2 <NA> A C
3 <NA> A C
4 <NA> A D
5 <NA> B E
6 two B E
7 <NA> B F
8 <NA> B F
I want to get the count of non-missing observations of `one` for each combination of `group1` and `group2`.
In Pandas, I would use `groupby(['group1','group2']).transform`, but how can I do that in R? The original dataframe is LARGE.
Expected output is:
> df
one group1 group2 count
1 one A C 1
2 <NA> A C 1
3 <NA> A C 1
4 <NA> A D 0
5 <NA> B E 1
6 two B E 1
7 <NA> B F 0
8 <NA> B F 0
Many thanks!
Let's not forget that a lot of things can be done in base R, although sometimes not as efficiently as with data.table or dplyr. The same grouped count can also be written with data.table.
The idea is to sum the TRUE values (each counts as 1 once converted to integer) where `one` is not NA, while grouping by `group1` and `group2`.
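As a minimal sketch of that idea (not necessarily the exact code originally posted), one way to do it in base R is `ave()`, which applies a function per group and returns a vector as long as its input, much like pandas' `transform`. The `dt` object and the `requireNamespace` guard are illustrative assumptions: the data.table part only runs if that package happens to be installed.

```r
# Rebuild the example data from the question.
one    <- c('one', NA, NA, NA, NA, 'two', NA, NA)
group1 <- c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B')
group2 <- c('C', 'C', 'C', 'D', 'E', 'E', 'F', 'F')
df <- data.frame(one, group1, group2)

# Base R: turn "is not NA" into 0/1, then sum within each group1 x group2 cell.
# ave() keeps the original row order and length, so the result can be
# assigned straight back as a new column.
df$count <- ave(as.integer(!is.na(df$one)), df$group1, df$group2, FUN = sum)
print(df)

# data.table variant (assumption: the package is installed; skipped otherwise).
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- as.data.table(df[, c("one", "group1", "group2")])
  # := adds the column by reference; the per-group sum is recycled to every row.
  dt[, count := sum(!is.na(one)), by = c("group1", "group2")]
  print(dt)
}
```

The `count` column should match the expected output in the question. Passing a numeric 0/1 vector to `ave()` rather than `df$one` itself avoids the result being coerced to the type of the original column. On a large dataframe the data.table form is likely the faster of the two, since it updates the table in place instead of copying it.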