There are a couple of issues about this on the dplyr GitHub repo already, and at least one related SO question, but none of them quite covers my question -- I think.
- "Adding multiple columns in a dplyr mutate call" is more or less what I want, but there's a special-case answer for that case (`tidyr::separate`) that doesn't (I think) work for me.
- This issue ("summarise or mutate with functions returning multiple values/columns") says "use `do()`".
Here's my use case: I want to compute exact binomial confidence intervals
dd <- data.frame(x=c(3,4),n=c(10,11))
get_binCI <- function(x,n) {
  rbind(setNames(c(binom.test(x,n)$conf.int),c("lwr","upr")))
}
with(dd[1,],get_binCI(x,n))
##             lwr       upr
## [1,] 0.06673951 0.6524529
I can get this done with `do()`, but I wonder if there's a more expressive way to do this (it feels like `mutate()` could have a `.n` argument, as is being discussed for `summarise()` ...)
library("dplyr")
dd %>% group_by(x,n) %>%
  do(cbind(.,get_binCI(.$x,.$n)))
## Source: local data frame [2 x 4]
## Groups: x, n
##
##   x  n        lwr       upr
## 1 3 10 0.06673951 0.6524529
## 2 4 11 0.10926344 0.6920953
Yet another variant, although I think we're all splitting hairs here.
Personally, if we're just going by readability, I find this preferable:
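One way such a variant might look is sketched below. This is not necessarily the answerer's exact code: the helper name `binCI_df` is made up for illustration, and it assumes a reasonably recent dplyr in which `do()` is still available.

```r
library(dplyr)

dd <- data.frame(x = c(3, 4), n = c(10, 11))

# Hypothetical helper: exact binomial CI as a one-row data frame
binCI_df <- function(x, n) {
  ci <- binom.test(x, n)$conf.int
  data.frame(lwr = ci[1], upr = ci[2])
}

# Each x/n group holds a single row here, so .$x and .$n are scalars
res <- dd %>%
  group_by(x, n) %>%
  do(binCI_df(.$x, .$n))
res
```

Because the helper already returns a data frame, `do()` only has to prepend the grouping columns, so no `cbind()` is needed.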
...but now we're really splitting hairs.
Here are some possibilities with `rowwise` and nesting, using:
- a data frame with repeated x/n combinations, for fun
- a version of the CI function that returns a data frame, like @Joran's
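
A minimal sketch of that setup, with assumed specifics: the particular repeated row and the name `get_binCI_df` are illustrative choices, not taken from the original answer.

```r
# Data with a repeated x/n combination (rows 1 and 3 are identical)
dd <- data.frame(x = c(3, 4, 3), n = c(10, 11, 10))

# Hypothetical name: CI helper that returns a one-row data frame
get_binCI_df <- function(x, n) {
  ci <- binom.test(x, n)$conf.int
  data.frame(lwr = ci[1], upr = ci[2])
}

get_binCI_df(3, 10)
```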
Grouping by `x` and `n` as before removes the duplicate.
Using `rowwise` keeps all the rows but removes `x` and `n` unless you put them back using `cbind(.` (like Ben does in his OP).
It feels like nesting could work more cleanly, but this is as good as I can get. Using `mutate` means I can use `x` and `n` directly instead of `.$x` and `.$n`, but `mutate` expects a single value, so it needs to be wrapped in `list`.
Finally, it looks like something like this is an open issue (as of 5 Oct 2017) for dplyr; see https://github.com/tidyverse/dplyr/issues/2326. If something like that is implemented, that will be the easiest way!
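
The grouping, `rowwise`, and nesting variants described above can be sketched roughly as follows. The data and the helper name `get_binCI_df` are assumptions for illustration, and the calls assume dplyr 1.0 or later and tidyr 1.0 or later. One caveat worth flagging: when a group contains more than one row, `.$x` has length greater than one, and `binom.test()` interprets a length-2 `x` as successes and failures, so the sketch takes the first element of each group.

```r
library(dplyr)
library(tidyr)

# Assumed example data with a repeated x/n combination
dd <- data.frame(x = c(3, 4, 3), n = c(10, 11, 10))

# Hypothetical helper returning a one-row data frame
get_binCI_df <- function(x, n) {
  ci <- binom.test(x, n)$conf.int
  data.frame(lwr = ci[1], upr = ci[2])
}

# group_by + do: one output row per distinct x/n pair, so the duplicate collapses.
# [1] guards against groups with several identical rows.
by_group <- dd %>%
  group_by(x, n) %>%
  do(get_binCI_df(.$x[1], .$n[1]))

# rowwise + do: every row is kept; x and n are put back with cbind()
by_row <- dd %>%
  rowwise() %>%
  do(cbind(as.data.frame(.), get_binCI_df(.$x, .$n)))

# rowwise + mutate: x and n are used directly, the result is wrapped in list(),
# and the list column is then unnested into lwr/upr columns
nested <- dd %>%
  rowwise() %>%
  mutate(ci = list(get_binCI_df(x, n))) %>%
  ungroup() %>%
  unnest(cols = ci)
```

The last form keeps all three rows together with `x` and `n`, without any `.$` references.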
This uses a "standard" dplyr workflow, but as @BenBolker notes in the comments, it requires calling `get_binCI` twice.
Yet another option could be to use the `purrr::map` family of functions.
If you replace `rbind` with `dplyr::bind_rows` in the `get_binCI` function, you can use `purrr::map2` with `tidyr::unnest`, or `purrr::map2_dfr` with `dplyr::bind_cols`.
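
Both the call-it-twice workflow and the purrr options can be sketched as below. This is an illustration under assumptions rather than the answerers' original code: the helper `get_binCI_tbl` is a hypothetical name, and it builds a one-row tibble directly instead of going through `dplyr::bind_rows`, which gives the same shape of result.

```r
library(dplyr)
library(purrr)
library(tidyr)

dd <- data.frame(x = c(3, 4), n = c(10, 11))

# Hypothetical helper: one-row tibble with the exact binomial CI
get_binCI_tbl <- function(x, n) {
  ci <- binom.test(x, n)$conf.int
  tibble(lwr = ci[1], upr = ci[2])
}

# "Standard" mutate workflow: the helper is called twice per row
twice <- dd %>%
  rowwise() %>%
  mutate(lwr = get_binCI_tbl(x, n)$lwr,
         upr = get_binCI_tbl(x, n)$upr) %>%
  ungroup()

# map2 builds a list column of one-row tibbles; unnest spreads it into columns
with_unnest <- dd %>%
  mutate(ci = map2(x, n, get_binCI_tbl)) %>%
  unnest(cols = ci)

# map2_dfr row-binds the results, which are then attached column-wise
with_bind <- bind_cols(dd, map2_dfr(dd$x, dd$n, get_binCI_tbl))
```

The purrr versions call the helper once per row, which avoids the duplicated `binom.test()` work of the first form.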
Here's a quick solution using the `data.table` package instead.
First, a little change to the function; then it is a simple `data.table` call.
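
A sketch of how this could look, assuming the "little change" is to have the function return a named list (the name `get_binCI_list` is hypothetical): data.table turns each element of a list returned from `j` into a column.

```r
library(data.table)

dd <- data.frame(x = c(3, 4), n = c(10, 11))

# Assumed change: return a named list so each element becomes a column
get_binCI_list <- function(x, n) {
  ci <- binom.test(x, n)$conf.int
  list(lwr = ci[1], upr = ci[2])
}

dt <- as.data.table(dd)

# Inside j, the grouping variables x and n are length-1 values for each group
res <- dt[, get_binCI_list(x, n), by = .(x, n)]
res
```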