Using table() in dplyr chain

Posted 2019-02-25 09:50

Question:

Can someone explain why table() doesn't work inside a chain of dplyr/magrittr piped operations? Here's a simple reprex:

library(dplyr)

tibble(
  type = c("Fast", "Slow", "Fast", "Fast", "Slow"),
  colour = c("Blue", "Blue", "Red", "Red", "Red")
) %>% table(.$type, .$colour)

Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

But this works, of course:

df <- tibble(
  type = c("Fast", "Slow", "Fast", "Fast", "Slow"),
  colour = c("Blue", "Blue", "Red", "Red", "Red")
) 

table(df$type, df$colour)


       Blue Red
  Fast    1   2
  Slow    1   1

Answer 1:

This behavior is by design: https://github.com/tidyverse/magrittr/blob/00a1fe3305a4914d7c9714fba78fd5f03f70f51e/README.md#re-using-the-placeholder-for-attributes

Since you don't have a . on its own as an argument (it only appears inside .$type and .$colour), the tibble is still being passed as the first parameter, so it's really more like

... %>% table(., .$type, .$colour)

The official magrittr workaround is to wrap the call in curly braces, which stops the pipe from inserting the first argument:

... %>% {table(.$type, .$colour)}
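
To see the rule in action, here is a minimal sketch. It assumes magrittr is installed; count_args() is a hypothetical helper introduced only for illustration (it is not part of magrittr or dplyr), and the data frame is rebuilt with base R's data.frame() to keep the example small.

```r
library(magrittr)

# Same data as in the question, built with base R to keep the sketch small
df <- data.frame(
  type = c("Fast", "Slow", "Fast", "Fast", "Slow"),
  colour = c("Blue", "Blue", "Red", "Red", "Red")
)

# count_args() is a hypothetical helper: it only reports how many
# arguments it was called with
count_args <- function(...) length(list(...))

# The dot appears only inside nested calls (.$type), so magrittr still
# inserts df as the first argument
n_plain <- df %>% count_args(.$type, .$colour)

# Braces turn the right-hand side into a plain expression, so nothing
# is inserted and only the two columns are passed
n_braced <- df %>% { count_args(.$type, .$colour) }

print(c(plain = n_plain, braced = n_braced))

# With braces, table() receives just the two vectors and works as expected
tab <- df %>% { table(.$type, .$colour) }
print(tab)
```

The unbraced call receives one more argument than the braced one, and that extra argument is the whole data frame, which is what table() chokes on in the question.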


Answer 2:

The %>% operator in dplyr is actually imported from magrittr. With magrittr, we can also use the %$% operator, which exposes the columns of the left-hand side by name to the expression on the right:

library(tidyverse)
library(magrittr)

tibble(
  type = c("Fast", "Slow", "Fast", "Fast", "Slow"),
  colour = c("Blue", "Blue", "Red", "Red", "Red")
) %$% table(type, colour)

Output:

      colour
type   Blue Red
  Fast    1   2
  Slow    1   1


Answer 3:

I've taken to piping into with(table(...)) like this:

tibble(type = c("Fast", "Slow", "Fast", "Fast", "Slow"),
       colour = c("Blue", "Blue", "Red", "Red", "Red")) %>% 
  with(table(type, colour))

And, similar to the way we might read %>% as "and then", I would read that as "and then, with that data, make this table".
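
As a rough comparison of the three approaches in this thread, the sketch below runs them on the same data. It assumes magrittr is installed and uses base R's data.frame() in place of tibble(). The counts should agree, but the dimension labels differ: the %$% and with() versions pass bare column names, so table() labels the dimensions, while the brace version passes .$type and .$colour and leaves them unlabelled unless the arguments are named explicitly.

```r
library(magrittr)

# Same data as in the question
df <- data.frame(
  type = c("Fast", "Slow", "Fast", "Fast", "Slow"),
  colour = c("Blue", "Blue", "Red", "Red", "Red")
)

# Answer 1: curly braces around the call
tab_braces <- df %>% { table(.$type, .$colour) }

# Answer 2: exposition operator
tab_expose <- df %$% table(type, colour)

# Answer 3: with() as the pipe target
tab_with <- df %>% with(table(type, colour))

# Brace version with named arguments, so the dimensions get labels too
tab_named <- df %>% { table(type = .$type, colour = .$colour) }

# Dimension labels for each version
print(names(dimnames(tab_braces)))
print(names(dimnames(tab_expose)))
print(names(dimnames(tab_with)))
print(names(dimnames(tab_named)))

# Check that the cell counts agree across the approaches
print(all(as.vector(tab_braces) == as.vector(tab_expose)))
print(all(tab_expose == tab_with))
```

Which form to prefer is mostly a matter of taste; the named-argument variant is only needed if the labelled output is wanted while staying with plain %>%.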