How can I generate all the possible combinations o

Posted 2019-02-25 21:10

Question:

I have a vector, say A,B,C,D,E, and I want to generate all possible pairwise combinations of its elements. The desired output is given below.

B-A,C-A,D-A,E-A,C-B,D-B,E-B,D-C,E-C,E-D

Answer 1:

Try

combn(v1, 2, FUN=function(x) paste(rev(x), collapse="-"))
#[1] "B-A" "C-A" "D-A" "E-A" "C-B" "D-B" "E-B" "D-C" "E-C" "E-D"

If you want the default order:

combn(v1, 2, FUN=paste, collapse="-")
#[1] "A-B" "A-C" "A-D" "A-E" "B-C" "B-D" "B-E" "C-D" "C-E" "D-E"
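
To make it clearer what the two calls above are doing, here is a minimal sketch in base R. It assumes the same v1 as in the data section; the names m, default_order and reversed_order are only illustrative. Without FUN, combn returns a matrix with one pair per column, and FUN is applied to each of those columns in turn.

```r
v1 <- LETTERS[1:5]

# Without FUN, combn() returns a 2 x choose(5, 2) matrix: one pair per column.
m <- combn(v1, 2)
dim(m)  # 2 10

# Pasting the two rows together reproduces the default order ...
default_order <- paste(m[1, ], m[2, ], sep = "-")

# ... and swapping the rows reproduces the reversed order asked for.
reversed_order <- paste(m[2, ], m[1, ], sep = "-")
```

Because paste is vectorized over the rows of m, this variant also avoids calling a function once per column.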

Update

For a faster option, you can use combnPrim from the gRbase package.

library(gRbase)
apply(combnPrim(v1,2), 2, FUN=paste, collapse='-')
#[1] "A-B" "A-C" "B-C" "A-D" "B-D" "C-D" "A-E" "B-E" "C-E" "D-E"

data

v1 <- LETTERS[1:5]


Answer 2:

combn is the classic way to go, but it is generally slow (it is written entirely in R). Here is another option that I believe is much faster:

grep('(.)-\\1',unique(as.vector(outer(v1,v1,FUN=paste,sep='-'))),
     value=TRUE,invert=TRUE)

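If the elements have more than one character, the regex becomes:

  ^(.*)-\\1$

The anchors make the pattern match only when the text before the hyphen is identical to the text after it.

  1. I used the vectorized outer to create all ordered pairs.
  2. Then I removed the self-pairs (A-A, B-B, ...) with a regular expression.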
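
For multi-character elements the pattern has to match the whole string, otherwise a legitimate pair such as BA-A would be dropped because it contains the substring A-A. The following is a minimal sketch with hypothetical labels (v2 is not from the original question), using the anchors ^ and $ and cross-checking the result against an index-based filter that uses no regex at all.

```r
# Hypothetical multi-character labels; "A" is a suffix of "BA" on purpose.
v2 <- c("A", "BA", "CD")

all_pairs <- as.vector(outer(v2, v2, FUN = paste, sep = "-"))

# Anchored pattern: the text before the hyphen must equal the text after it.
kept <- grep("^(.*)-\\1$", all_pairs, value = TRUE, invert = TRUE)

# Regex-free cross-check: drop the entries where both indices coincide.
idx <- expand.grid(i = seq_along(v2), j = seq_along(v2))
expected <- paste(v2[idx$i], v2[idx$j], sep = "-")[idx$i != idx$j]

identical(kept, expected)
```

Note that either regex approach assumes the elements themselves contain no hyphen; the index-based filter does not have that limitation.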

This gives every combination together with its reverse (both A-B and B-A).
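
As a rough illustration of that point, the sketch below (base R only; the names pairs, both_orders and one_order are illustrative) shows that dropping only the self-pairs leaves 20 strings for five elements, and that keeping just the lower triangle of the outer matrix is one possible way to get each pair once, in the order asked for in the question.

```r
v1 <- LETTERS[1:5]

# outer() builds the full 5 x 5 table; entry [i, j] is paste(v1[i], v1[j], sep = "-").
pairs <- outer(v1, v1, FUN = paste, sep = "-")

# Dropping only the diagonal (the effect of the grep above) keeps both orientations.
both_orders <- pairs[row(pairs) != col(pairs)]
length(both_orders)  # 20, i.e. 5 * 4

# Keeping only the lower triangle (row index > column index) gives each pair once.
one_order <- pairs[lower.tri(pairs)]
length(one_order)    # 10, i.e. choose(5, 2)
```

Using upper.tri instead of lower.tri would give the A-B orientation, although the pairs then come out in column order rather than in combn's order.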

Edit: some benchmarks:

v1 <- LETTERS
fun_agstudy <- function()
grep('(.)-\\1',unique(as.vector(outer(v1,v1,FUN=paste,sep='-'))),
     value=TRUE,invert=TRUE)
fun_akrun <- function()
combn(v1, 2, FUN=function(x) paste(rev(x), collapse="-"))
library(microbenchmark)
microbenchmark(fun_agstudy(),fun_akrun())

          expr      min       lq     mean   median       uq      max neval
1 fun_agstudy()  692.149  707.509 1082.153  774.259  956.114 25818.46   100
2   fun_akrun() 6597.217 6681.982 8076.223 6765.040 9020.306 36364.15   100

Edit 2, with combnPrim included (installed using biocLite):

outer is still the fastest (median about 786 vs. 2036 for combnPrim):

summary(microbenchmark(fun_agstudy(),fun_akrun(),fun_combnPrim()))
             expr      min        lq      mean    median       uq       max neval
1   fun_agstudy()  700.114  735.3845  801.3568  786.2055  810.288  2572.896   100
2     fun_akrun() 6830.082 6929.2590 9030.9779 7040.5720 7498.717 62051.764   100
3 fun_combnPrim() 1928.534 1985.9925 2407.1335 2035.8655 2176.191 28514.237   100


Tags: r vector