Gremlin: What's an efficient way of finding an

So obviously, a straight forward way to find an edge between two vertices is to:

graph.traversal().V(outVertex).bothE(edgeLabel).filter(__.otherV().is(inVertex))

I feel that filter step will have to iterate through all edges making really slow for some applications with a lot of edges.

Another way could be:

traversal = graph.traversal()
                  .V(outVertex)
                  .bothE(edgeLabel)
                  .as("x")
                  .otherV()
                  .is(outVertex) // uses index?
                  .select("x");

I'm assuming the second approach could be much faster since it will be using ID index which will make it faster than the first the approach.

Which one is faster and more efficient (in terms of IO)?

I'm using Titan, so you could also make your answer Titan specific.

Edit

In terms of time, seems like the first approach is faster (edges were 20k for vertex b

gremlin> clock(100000){g.V(b).bothE().filter(otherV().is(a))}
==>0.0016451789999999999
gremlin> clock(100000){g.V(b).bothE().as("x").otherV().is(a).select("x")}
==>0.0018231140399999999

How about IO?

标签： titan gremlin tinkerpop3

2条回答

The star\"

2楼-- · 2019-02-19 04:14

In case one does not know the vertex Id's, another solution might be

g.V().has('propertykey','value1').outE('thatlabel').as('e').inV().has('propertykey','value2').select('e')

This is also only unidirectional so one needs to reformulate the query for the opposite direction.

0人赞添加讨论(0) 举报

贪生不怕死

3楼-- · 2019-02-19 04:18

I would expect the first query to be faster. However, few things:

None of the queries is optimal, since both of them enable path calculations. If you need to find a connection in both directions, then use 2 queries (I will give an example below)
When you use clock(), be sure to iterate() your traversals, otherwise you'll only measure how long it takes to do nothing.

These are the queries I would use to find an edge in both directions:

g.V(a).outE(edgeLabel).filter(inV().is(b))
g.V(b).outE(edgeLabel).filter(inV().is(a))

If you expect to get at most one edge:

edge = g.V(a).outE(edgeLabel).filter(inV().is(b)).tryNext().orElseGet {
       g.V(b).outE(edgeLabel).filter(inV().is(a)).tryNext()
}

This way you get rid of path calculations. How those queries perform will pretty much depend on the underlying graph database. Titan's query optimizer recognizes that query pattern and should return a result in almost no time.

Now if you want to measure the runtime, do this:

clock(100) {
  g.V(a).outE(edgeLabel).filter(inV().is(b)).iterate()
  g.V(b).outE(edgeLabel).filter(inV().is(a)).iterate()
}

0人赞添加讨论(0) 举报

Gremlin: What's an efficient way of finding an

Edit

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间