A straightforward way to find an edge between two vertices is:
graph.traversal().V(outVertex).bothE(edgeLabel).filter(__.otherV().is(inVertex))
I suspect that the filter step has to iterate through all edges, which would make it really slow for applications with a lot of edges.
Another way could be:
traversal = graph.traversal()
.V(outVertex)
.bothE(edgeLabel)
.as("x")
.otherV()
.is(inVertex) // uses index?
.select("x");
I'm assuming the second approach could be much faster than the first, since it would use the ID index.
Which one is faster and more efficient (in terms of IO)?
I'm using Titan, so you could also make your answer Titan specific.
Edit
In terms of time, it seems the first approach is faster (vertex b had 20k edges):
gremlin> clock(100000){g.V(b).bothE().filter(otherV().is(a))}
==>0.0016451789999999999
gremlin> clock(100000){g.V(b).bothE().as("x").otherV().is(a).select("x")}
==>0.0018231140399999999
How about IO?
In case one does not know the vertex IDs, another solution might be
This is also only unidirectional, so one needs to reformulate the query for the opposite direction.
I would expect the first query to be faster. However, a few things:
When using clock(), be sure to iterate() your traversals; otherwise you'll only measure how long it takes to do nothing.
These are the queries I would use to find an edge in both directions:
If you expect to get at most one edge:
This way you get rid of path calculations. How those queries perform will pretty much depend on the underlying graph database. Titan's query optimizer recognizes that query pattern and should return a result in almost no time.
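One reading of the approach above is: run a directed lookup from each endpoint instead of one bothE() scan, and only try the reverse direction when the first lookup finds nothing. That idea can be modeled outside Gremlin. The following is a minimal sketch in plain Java over a toy in-memory adjacency list. ToyGraph, Edge, directed and either are hypothetical names made up for this illustration, and none of it is the Titan or TinkerPop API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Toy in-memory model; every name here is hypothetical, not a Titan/TinkerPop type.
class ToyGraph {
    static final class Edge {
        final String label;
        final int out;
        final int in;

        Edge(String label, int out, int in) {
            this.label = label;
            this.out = out;
            this.in = in;
        }

        @Override
        public String toString() {
            return out + "-" + label + "->" + in;
        }
    }

    // Outgoing edges grouped by source vertex id.
    private final Map<Integer, List<Edge>> outEdges = new HashMap<>();

    void addEdge(String label, int out, int in) {
        outEdges.computeIfAbsent(out, k -> new ArrayList<>()).add(new Edge(label, out, in));
    }

    // One directed lookup: scan only the outgoing edges of `from`.
    Optional<Edge> directed(int from, String label, int to) {
        for (Edge e : outEdges.getOrDefault(from, Collections.<Edge>emptyList())) {
            if (e.label.equals(label) && e.in == to) {
                return Optional.of(e);
            }
        }
        return Optional.empty();
    }

    // Try a -> b first; fall back to b -> a only if nothing was found.
    Optional<Edge> either(int a, String label, int b) {
        Optional<Edge> forward = directed(a, label, b);
        return forward.isPresent() ? forward : directed(b, label, a);
    }

    public static void main(String[] args) {
        ToyGraph g = new ToyGraph();
        g.addEdge("knows", 1, 2);
        System.out.println(g.either(2, "knows", 1)); // Optional[1-knows->2]
        System.out.println(g.either(2, "knows", 3)); // Optional.empty
    }
}
```

The toy still scans one vertex's outgoing edges per lookup. Whether a real backend can avoid that scan depends on its optimizer and indexes, which this model does not attempt to represent.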
Now if you want to measure the runtime, do this:
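Whatever query is being timed, the measurement only means something if the traversal is actually drained. Here is a minimal sketch of that point in plain Java, with a lazy java.util.stream pipeline standing in for a Gremlin traversal. ClockSketch, traversal and clock are hypothetical names, and this clock only loosely mimics the console helper by returning average milliseconds per run.

```java
import java.util.Iterator;
import java.util.function.Supplier;
import java.util.stream.IntStream;

// Stand-in for a lazy traversal; not the TinkerPop API.
class ClockSketch {
    // Counts how many elements the filter has actually looked at.
    static int evaluated = 0;

    // Builds a lazy pipeline: nothing is filtered until the iterator is consumed.
    static Iterator<Integer> traversal(int edges, int target) {
        return IntStream.range(0, edges)
                .filter(v -> { evaluated++; return v == target; })
                .boxed()
                .iterator();
    }

    // Average milliseconds per run; drains each iterator only when `iterate` is true.
    static double clock(int loops, boolean iterate, Supplier<Iterator<Integer>> source) {
        long start = System.nanoTime();
        for (int i = 0; i < loops; i++) {
            Iterator<Integer> it = source.get();
            if (iterate) {
                while (it.hasNext()) {
                    it.next();
                }
            }
        }
        return (System.nanoTime() - start) / 1e6 / loops;
    }

    public static void main(String[] args) {
        double idle = clock(1000, false, () -> traversal(20000, 42));
        System.out.println("not iterated: " + idle + " ms, evaluated: " + evaluated);
        double real = clock(1000, true, () -> traversal(20000, 42));
        System.out.println("iterated: " + real + " ms, evaluated: " + evaluated);
    }
}
```

In the first run the counter stays at zero, so the reported time covers only building the pipeline. That is the situation to avoid when benchmarking a traversal.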