Question:

How do I tell group_by to group the data by all columns except a given one?

With aggregate, it would be aggregate(x ~ ., ...).

I tried group_by(data, -x), but that groups by the negative-of-x (i.e. the same as grouping by x).
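
The behaviour described above can be reproduced on a toy data frame. This is only a minimal sketch: the `data`, `x`, `y` and `v` names are made up for illustration and are not from the original question. It shows that `group_by(data, -x)` groups by the computed values of `-x` rather than dropping `x` from the grouping set:

```r
library(dplyr)

# Hypothetical toy data, not from the original question
data <- data.frame(x = c(1, 1, 2, 2), y = c("a", "b", "a", "b"), v = 1:4)

by_neg_x <- group_by(data, -x)  # groups by the computed values of -x
by_x     <- group_by(data, x)   # groups by x itself

# Both groupings have one group per distinct value of x
n_groups(by_neg_x) == n_groups(by_x)

# y never becomes a grouping variable
"y" %in% group_vars(by_neg_x)
```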

Answer 1:

You can do this using standard evaluation (group_by_ instead of group_by):

library(dplyr)

# Fake data
set.seed(492)
dat = data.frame(value=rnorm(1000), g1=sample(LETTERS,1000,replace=TRUE),
                 g2=sample(letters,1000,replace=TRUE), g3=sample(1:10, replace=TRUE),
                 other=sample(c("red","green","black"),1000,replace=TRUE))

dat %>% group_by_(.dots=names(dat)[-grep("value", names(dat))]) %>%
  summarise(meanValue=mean(value))

       g1     g2    g3  other   meanValue
   <fctr> <fctr> <int> <fctr>       <dbl>
1       A      a     2  green  0.89281475
2       A      b     2    red -0.03558775
3       A      b     5  black -1.79184218
4       A      c    10  black  0.17518610
5       A      e     5  black  0.25830392
...

See this vignette for more on standard vs. non-standard evaluation in dplyr.

UPDATE for `dplyr` 0.7.0

To address @ÖmerAn's comment: It looks like group_by_at is the way to go in dplyr 0.7.0 (someone please correct me if I'm wrong about this). For example:

dat %>% 
  group_by_at(setdiff(names(dat), "value")) %>%
  summarise(meanValue=mean(value))

# Groups:   g1, g2, g3 [?]
       g1     g2    g3  other   meanValue
   <fctr> <fctr> <int> <fctr>       <dbl>
 1      A      a     2  green  0.89281475
 2      A      b     2    red -0.03558775
 3      A      b     5  black -1.79184218
 4      A      c    10  black  0.17518610
 5      A      e     5  black  0.25830392
 6      A      e     5    red -0.81879788
 7      A      e     7  green  0.30836054
 8      A      f     2  green  0.05537047
 9      A      g     1  black  1.00156405
10      A      g    10  black  1.26884303
# ... with 949 more rows

Let's confirm both methods give the same output (in dplyr 0.7.0):

new = dat %>% 
  group_by_at(setdiff(names(dat), "value")) %>%
  summarise(meanValue=mean(value))

old = dat %>% 
  group_by_(.dots=names(dat)[-grep("value", names(dat))]) %>%
  summarise(meanValue=mean(value))

identical(old, new)
# [1] TRUE
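
The two ways of building the vector of grouping names agree on `dat`, but they are not equivalent in general: `grep()` matches a pattern, while `setdiff()` matches the exact name. The sketch below uses hypothetical column names (not those of `dat`) to show where they diverge, and what happens to the negative-index form when the pattern matches nothing:

```r
# Hypothetical column names, chosen to show the difference
cols <- c("value", "value_sd", "g1", "g2")

by_grep    <- cols[-grep("value", cols)]  # pattern match: drops "value" and "value_sd"
by_setdiff <- setdiff(cols, "value")      # exact match: drops only "value"

by_grep
by_setdiff

# If the pattern matches nothing, negative indexing with an empty
# vector selects nothing instead of keeping everything
no_match <- cols[-grep("missing", cols)]
length(no_match)
```

For that reason `setdiff()` is probably the safer choice when the column to exclude is known by its exact name.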

Answer 2:

Building on @eipi10's dplyr 0.7.0 edit, group_by_at appears to be the right function for this job. If you simply want to omit one column, say "x", you can pass vars(-x) instead of building the vector of names yourself:

new2.0 <- dat %>%
  group_by_at(vars(-x)) %>%
  summarize(mean_value = mean(value))

Using @eipi10's example data:

library(dplyr)

# Fake data
set.seed(492)
dat <- data.frame(value = rnorm(1000),
             g1 = sample(LETTERS, 1000, replace = TRUE),
             g2 = sample(letters, 1000, replace = TRUE),
             g3 = sample(1:10, replace = TRUE),
             other = sample(c("red", "green", "black"), 1000, replace = TRUE))

new <- dat %>% 
  group_by_at(names(dat)[-grep("value", names(dat))]) %>%
  summarise(meanValue = mean(value))


new2.0 <- dat %>% 
  group_by_at(vars(-value)) %>% 
  summarize(meanValue = mean(value))

identical(new, new2.0)
# [1] TRUE
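
The same `vars()` syntax should extend to excluding more than one column. A minimal sketch, using a small hypothetical data frame `toy` rather than `dat` so that the result is easy to check by eye:

```r
library(dplyr)

# Hypothetical small data set, not the one used above
toy <- data.frame(value = c(1, 2, 3, 4),
                  g1 = c("A", "A", "B", "B"),
                  g2 = c("a", "a", "b", "b"),
                  other = c("red", "green", "red", "green"))

# Group by every column except value and other
res <- toy %>%
  group_by_at(vars(-value, -other)) %>%
  summarise(meanValue = mean(value))

names(res)
```

Only `g1` and `g2` remain as grouping columns here, so `res` has one row per (g1, g2) combination.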

Answer 3:

A small update on this question, since I ran into it myself and found an elegant solution with the current version of dplyr (0.7.4): inside group_by_at(), you can supply column names the same way as in select(), using vars(). This lets us group by everything but one column (hp in this example):

library(dplyr)
df <- as_tibble(mtcars, rownames = "car")
df %>% group_by_at(vars(-hp))
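
For readers on a newer dplyr than the versions discussed in these answers: from dplyr 1.0.0 onward the `_at` variants are superseded in favour of `across()`. The sketch below assumes dplyr 1.0.0 or later is installed and reuses the `mtcars` example; the `grouped` name is introduced here for illustration only:

```r
library(dplyr)

df <- as_tibble(mtcars, rownames = "car")

# Group by every column except hp, using across() inside group_by()
grouped <- df %>% group_by(across(-hp))

# Inspect which columns ended up as grouping variables
group_vars(grouped)
```

`group_vars()` is a convenient way to confirm that `hp` was left out of the grouping.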
