Applying a function over consecutive pairs of a list

Posted 2019-07-04 23:03

Question:

I am trying to find an efficient way (i.e. one that avoids explicit loops) to apply a function that takes the current and previous (or next) elements of a list as arguments and returns a list of the results (which will necessarily be one element shorter than the input). As a concrete example,

I have a list of vertices defining a path in some graph

vlist <- c(1,2,7,12,17)

which come from a lattice graph constructed using the igraph function "lattice"

G <- graph.lattice(c(5,7))

I want to apply the function "get.edge.ids" over vlist so that the list returned yields the ids of the edges connecting the consecutive elements in vlist. E.g. I want the ids of edges 1-->2, 2-->7, 7-->12, 12-->17

This is trivial using a for loop,

    findEids <- function(G,vlist) {
        outlist=c()
        for (i in 1:(length(vlist)-1)) {
            outlist=append(outlist,get.edge.ids(G,c(vlist[i],vlist[i+1])))
        }
        return(outlist)
    }

but I would like to use a vectorized approach like apply() or reduce() to see if I can get it to work more quickly since I will need to call functions like this repeatedly from a script (for example, to compute the total stretch for a spanning tree of G).

Answer 1:

I use mapply for that. For example

a<-1:1000
mapply(function(x,y)x-y,a[-1000],a[-1])

It appears to be slightly faster than the for loop version:

> f <- function(x,y)x-y
> g <- function(){
+     o<-c();
+     for(i in a[-1000])o<-c(o,f(i,i+1))
+ }
>
> system.time( 
+     for(i in 1:1000){
+         mapply(f,a[-1000],a[-1])
+     }
+ )
   user  system elapsed 
  2.344   0.000   2.345 


> system.time(for(i in 1:1000)g())
   user  system elapsed 
  3.399   0.000   3.425 
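
The same pairing can be written without hard-coding the length, by dropping the last element with head() and the first with tail(). A minimal sketch in base R; `pairwise` is a hypothetical helper name, not something from the answer above:

```r
# Hypothetical helper: apply f to each (current, next) pair of v.
# head(v, -1) drops the last element, tail(v, -1) drops the first,
# so the two vectors line up as consecutive pairs.
pairwise <- function(v, f) {
  mapply(f, head(v, -1), tail(v, -1))
}

a <- 1:10
res <- pairwise(a, function(x, y) x - y)
print(res)
```

Note that mapply still calls the function once per pair internally, so any speed-up over a well-written loop is likely to be modest. For plain subtraction, `-diff(a)` gives the same values with no per-element function call.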


Answer 2:

This might work for you:

library(zoo)

findEids <- function(gr, v.list) {
  rollapply(v.list, width=2, FUN=function(x) {
    get.edge.ids(gr, x)
  })
}

findEids(G, vlist)
## [1]  1  4 13 22


Answer 3:

Well, actually, for this specific question, you can query the whole path at once with

as.vector(E(G, path=vlist))
# [1]  1  4 13 22

This is very readable, and seems to be faster than any other solution, although speed probably only matters if you have long paths.

v2 <- c(1,2,7,12,17,12,7,2)
vlist <- rep(v2, 100000)

system.time(get.edge.ids(G, vlist[c(1, rep(2:(length(vlist) - 1), each = 2), 
                                  length(vlist))]))
#   user  system elapsed 
#  0.218   0.014   0.232 

system.time(as.vector(E(G, path=vlist)))
#   user  system elapsed 
#  0.028   0.007   0.035 


Answer 4:

This is not a direct answer to the question in the title, but it is more specific to your request.

If you look at the description of the argument vp of the function get.edge.ids, you will see that

vp
The incident vertices, given as vertex ids or symbolic vertex names. They are interpreted pairwise, i.e. the first and second are used for the first edge, the third and fourth for the second, etc.

So in this case all you need is to create a new vector from vlist in which every element except the first and last is repeated twice. You can do that with vlist[c(1, rep(2:(length(vlist)-1), each = 2), length(vlist))]:

c(1, rep(2:(length(vlist) - 1), each = 2), length(vlist))
## [1] 1 2 2 3 3 4 4 5
vlist[c(1, rep(2:(length(vlist) - 1), each = 2), length(vlist))]
## [1]  1  2  2  7  7 12 12 17


get.edge.ids(G, vlist[c(1, rep(2:(length(vlist) - 1), each = 2), length(vlist))])
## [1]  1  4 13 22
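
The index expression above relies on 2:(length(vlist)-1) counting upward, which is no longer true when vlist has only two elements (2:1 counts down). A minimal sketch of an alternative way to build the same paired vector in base R; `pair_up` is a hypothetical helper name and is not part of igraph:

```r
# Hypothetical helper (not part of igraph): interleave each element of v
# with its successor, giving c(v[1], v[2], v[2], v[3], ..., v[n-1], v[n]).
pair_up <- function(v) {
  # rbind stacks "current" over "next"; as.vector reads the matrix
  # column by column, which yields the pairs in order.
  as.vector(rbind(head(v, -1), tail(v, -1)))
}

vlist <- c(1, 2, 7, 12, 17)
pairs <- pair_up(vlist)
print(pairs)

# With igraph loaded and G defined as in the question, the result could be
# passed on as: get.edge.ids(G, pair_up(vlist))
```

For the vlist from the question this produces the same vector as the index expression, and for a two-element path it returns just that one pair.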


Answer 5:

I recently learned to use dplyr, which can solve this via mutate / transmute and paste:

library(dplyr)

data.frame(x=vlist) %>% 
  mutate(y=lead(x)) %>%
  transmute(edge=paste(x,y,sep="-->"))

which yields

     edge
1   1-->2
2   2-->7
3  7-->12
4 12-->17
5 17-->NA
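
The last row pairs 17 with NA because lead() has no next element to offer. If only the labels for real consecutive pairs are wanted, a base R sketch along the same lines (an alternative, not the dplyr approach above) could look like this:

```r
vlist <- c(1, 2, 7, 12, 17)

# Pair every element except the last with its successor, so no NA appears.
edge_labels <- paste(head(vlist, -1), tail(vlist, -1), sep = "-->")
print(edge_labels)
```

Either way the result is a set of text labels, not igraph edge ids; those still have to be looked up in the graph.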