Similar to "mutate rowSums exclude one column", but in my case I really want to be able to use select
to remove a specific column or set of columns.
I'm trying to understand why something of this nature won't work.
library(dplyr)

d <- data.frame(
  Alpha = letters[1:26],
  Beta = rnorm(26),
  Epsilon = rnorm(26),
  Gamma = rnorm(26)
)
I thought this would work, but it's giving me a strange error:
# Total = Beta + Gamma
d <- mutate(d,Total = rowSums(select(d,-Epsilon,-Alpha)))
Error: All select() inputs must resolve to integer column positions.
The following do not:
* -structure(1:26, .Label = c("a", "b", "c", "d", "e", "f", "g", "h", "i...
In addition: Warning message:
In Ops.factor(1:26) : ‘-’ not meaningful for factors
I'd like to be able to do this in a long chain, and keep it "dplyr style"... it strikes me as odd that this is so difficult given that it's really straightforward without using typical dplyr syntax:
d$Total <- rowSums(select(d, -Alpha, -Epsilon)) # This works!
I'm only just learning dplyr, so perhaps it is because of version upgrades, but this does now work:
d %>% mutate(Total=rowSums(select(d,-Epsilon, -Alpha)))
These days, I usually see folks use the dot notation:
d %>% mutate(Total=rowSums(select(.,-Epsilon, -Alpha)))
A slightly more manageable example:
df2 = data.frame(A=sample(0:20,10), B=sample(0:20, 10), C=sample(0:20,10), D=LETTERS[1:10])
df2
    A  B  C D
1  19  0  9 A
2   6 10 14 B
3  13 20  6 C
4  20  4 15 D
5   9 14  8 E
6  11  1 18 F
7   4 15 13 G
8  17  5  0 H
9  16  3 16 I
10  2  6  1 J
df2 %>% mutate(total=rowSums(select(.,-D)))
    A  B  C D total
1  19  0  9 A    28
2   6 10 14 B    30
3  13 20  6 C    39
4  20  4 15 D    39
5   9 14  8 E    31
6  11  1 18 F    30
7   4 15 13 G    32
8  17  5  0 H    22
9  16  3 16 I    35
10  2  6  1 J     9
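One practical reason to prefer the dot over naming the data frame again: in a longer chain, `.` is the data as it arrives at mutate(), while the original name still points at the unmodified input. A minimal sketch, reusing the d from the question; the filter() step and the seed are only illustrative assumptions, not part of the original post:

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
d <- data.frame(
  Alpha = letters[1:26],
  Beta = rnorm(26),
  Epsilon = rnorm(26),
  Gamma = rnorm(26)
)

# `.` is the data frame after filter(), so rowSums() sees the same rows
# that mutate() is working on
res <- d %>%
  filter(Beta > 0) %>%
  mutate(Total = rowSums(select(., -Epsilon, -Alpha)))

# Total should equal Beta + Gamma for the rows that survived the filter
all.equal(unname(res$Total), res$Beta + res$Gamma)
```

Writing select(d, ...) in the same chain would compute row sums over all 26 original rows, which no longer lines up with the filtered data.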
NOTE:
The question you linked to has an updated answer showing yet another method, which uses a newer dplyr feature (select_if) to pick the columns by type instead of by name:
df2 %>% mutate(total=rowSums(select_if(., is.numeric)))
    A  B  C D total
1  19  0  9 A    28
2   6 10 14 B    30
3  13 20  6 C    39
4  20  4 15 D    39
5   9 14  8 E    31
6  11  1 18 F    30
7   4 15 13 G    32
8  17  5  0 H    22
9  16  3 16 I    35
10  2  6  1 J     9
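For what it's worth, the same type-based selection can probably be written with where() inside a plain select(). This is a sketch assuming dplyr 1.0.0 or later, where the where() helper is available; the three-row data frame below just reuses the first three rows of df2 from above so the result can be checked by hand:

```r
library(dplyr)

# First three rows of df2 from the example above
df2_small <- data.frame(
  A = c(19, 6, 13),
  B = c(0, 10, 20),
  C = c(9, 14, 6),
  D = c("A", "B", "C")
)

# where(is.numeric) keeps only the numeric columns, so D is dropped
# without having to name it
res <- df2_small %>%
  mutate(total = rowSums(select(., where(is.numeric))))

res$total
```

The totals should match the first three rows of the output above (28, 30, 39).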
@akrun already provided a relevant link about this problem. As for a dplyr
solution, I would actually use do:
d %>%
  do({
    .$Total <- rowSums(select(., -Epsilon, -Alpha))
    .
  })