I have a data frame with a grouping variable, and I want to sum the other columns by group. It's easy with dplyr:
library(dplyr)
library(magrittr)
data <- data.frame(group = c("a", "a", "b", "c", "c"), n1 = 1:5, n2 = 2:6)
data %>% group_by(group) %>%
summarise_all(sum)
# A tibble: 3 x 3
group n1 n2
<fctr> <int> <int>
1 a 3 5
2 b 3 4
3 c 9 11
But now I want a new column, ttl, with the total of n1 and n2 for each group, like this:
# A tibble: 3 x 4
group n1 n2 ttl
<fctr> <int> <int> <int>
1 a 3 5 8
2 b 3 4 7
3 c 9 11 20
How can I do that with dplyr?
EDIT: This is just an example; in practice I have a lot of variables.
I tried these two snippets, but the result does not have the right dimensions:
data %>% group_by(group) %>%
summarise_all(sum) %>%
summarise_if(is.numeric, sum)
data %>% group_by(group) %>%
summarise_all(sum) %>%
mutate_if(is.numeric, .funs = sum)
We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(data)), group by 'group', get the sum of each column in the Subset of Data.table (.SD), and then, with Reduce, get the row-wise sum of the columns of interest. The same result can also be obtained with base R or with dplyr.
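A minimal sketch of the data.table steps described above, using the question's example data. The names dt, res and num_cols are illustrative, and as.data.table() is used here instead of setDT() so that the original data frame is not modified by reference.

```r
library(data.table)

data <- data.frame(group = c("a", "a", "b", "c", "c"), n1 = 1:5, n2 = 2:6)

# Work on a data.table copy; setDT(data) would convert `data` in place instead.
dt <- as.data.table(data)

# Step 1: sum every non-grouping column within each group.
res <- dt[, lapply(.SD, sum), by = group]

# Step 2: add the row-wise total of the columns of interest.
# Reduce(`+`, .SD) adds the selected columns together element by element.
num_cols <- setdiff(names(res), "group")
res[, ttl := Reduce(`+`, .SD), .SDcols = num_cols]

print(res)
```

Because num_cols is computed from the column names, the same code should carry over to a data frame with many numeric columns, as long as "group" is the only non-numeric one.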
Base R
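One possible base R version, sketched under the assumption that aggregate() is used for the grouped sums and rowSums() for the total; the name res is illustrative.

```r
data <- data.frame(group = c("a", "a", "b", "c", "c"), n1 = 1:5, n2 = 2:6)

# Sum every non-grouping column within each group.
res <- aggregate(. ~ group, data = data, FUN = sum)

# Add the row-wise total of all columns except the grouping column.
res$ttl <- rowSums(res[, setdiff(names(res), "group")])

print(res)
```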
We can use apply together with the dplyr functions, or rowSums with the same strategy. The key is to use . to refer to the data frame, and [] with x:ncol(.) (where x is the index of the first column to include) to keep the columns you want.
You can use mutate after summarize. If you need to sum all numeric columns, you can use rowSums with select_if (to select the numeric columns) to sum the columns up.
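The suggestions above can be sketched on the question's data as follows. The intermediate names (sums, by_position, by_apply, by_type) are illustrative, and the sketch assumes the grouping column comes first, so that the columns to total are 2:ncol(.).

```r
library(dplyr)

data <- data.frame(group = c("a", "a", "b", "c", "c"), n1 = 1:5, n2 = 2:6)

# Per-group sums of every non-grouping column.
sums <- data %>%
  group_by(group) %>%
  summarise_all(sum)

# Variant 1: `.` is the piped data frame; keep columns 2..ncol(.) by position.
by_position <- sums %>%
  mutate(ttl = rowSums(.[, 2:ncol(.)]))

# Variant 2: same column selection, but with apply() over rows.
by_apply <- sums %>%
  mutate(ttl = apply(.[, 2:ncol(.)], 1, sum))

# Variant 3: select the numeric columns by type instead of by position.
by_type <- sums %>%
  mutate(ttl = rowSums(select_if(., is.numeric)))

print(by_type)
```

Note that rowSums() returns doubles, so ttl can be wrapped in as.integer() if an integer column is required.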