Limiting depth of shortest path query using Gremlin

Posted 2019-05-28 21:25

Question:

I have a fairly large graph (currently 3806702 vertices and 7774654 edges, all edges with the same label) in JanusGraph. I am interested in shortest path searches in it. Gremlin recipes mention this query:

g.V(startId).until(hasId(targetId)).repeat(out().simplePath()).path().limit(1)

This immediately returns a path that I know to be correct, but then the console hangs (top shows JanusGraph and Scylla processing furiously, so I guess it is still working in the background, but it takes forever). It does the right thing and returns the first (correct) shortest path when used like this:

g.V(startId).until(hasId(targetId)).repeat(out().simplePath()).path().next()

I would like to limit this query so that Gremlin/JanusGraph stops searching for paths longer than, say, 100 hops (so basically I want a maximum depth of 100 edges). I have tried using .times(100) in multiple positions, but whenever .until() is used together with .times() in the same query, it crashes with a NullPointerException in the Gremlin traversal classes, e.g.:

java.lang.NullPointerException
        at org.apache.tinkerpop.gremlin.process.traversal.util.TraversalHelper.hasStepOfAssignableClassRecursively(TraversalHelper.java:351)
        at org.apache.tinkerpop.gremlin.process.traversal.strategy.optimization.RepeatUnrollStrategy.apply(RepeatUnrollStrategy.java:61)
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversalStrategies.applyStrategies(DefaultTraversalStrategies.java:86)
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.applyStrategies(DefaultTraversal.java:119)
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:198)
        at java_util_Iterator$next.call(Unknown Source)
...

Does anyone have any idea how I can apply such a limit? I need this to return the first result or fail, fast.

Thanks!

Answer 1:

Add another break condition in your until() and also make sure to limit() the result before you ask for paths:

g.V(startId).
  until(__.hasId(targetId).or().loops().is(100)).
    repeat(__.both().simplePath()).
  hasId(targetId).limit(1).path()

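The hasId(targetId) filter after repeat() discards traversers that left the loop only because they reached the 100-loop limit rather than the target. Calling tryNext() on this traversal will give you an Optional<Path>. If it's empty, then no path was found within the given distance.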
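
As a minimal sketch of how the result could be consumed in the Gremlin Console, assuming startId and targetId are already bound as in the question: the traversal is the one given above, and the variable name result and the printed messages are only illustrative. Note that both() follows edges in either direction; if only outgoing edges should count, as in the question's original query, out() would be the step to use instead.

```groovy
// Depth-limited shortest path search; tryNext() returns an Optional<Path>
// instead of throwing when the traversal yields no result.
result = g.V(startId).
  until(__.hasId(targetId).or().loops().is(100)).
    repeat(__.both().simplePath()).
  hasId(targetId).limit(1).path().tryNext()

if (result.isPresent()) {
  // A path to the target was found within the loop limit.
  println result.get()
} else {
  // No traverser reached targetId before the loop limit was hit.
  println "no path found within 100 hops"
}
```

Because tryNext() returns an empty Optional rather than throwing, the "return the first result or fail" requirement can be handled with a plain branch instead of catching an exception from next().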