Neo4j shortest path (BFS) distances query variants

I do not know Neo4j so, bear with me. I have a big (1M nodes) undirected, unweighted graph. Assume I magically import this graph to Neo4j. Can the Neo4j query engine (cypher) support those the following types of queries?

Range queries. Bring all me nodes (and their distances) that are within 3 hops distance from a specific node
Bring the shortest-path (BFS) distance (since the graph is undirected and unweighted) between a specific node and a set of nodes.
Bring the shortest-path (BFS) distance between a specific node and all other graph nodes.

If those types of queries are actually possible, without a recursive type implementation but straight from Cypher, What type of performance should I expect (few seconds, many seconds or minutes)?

标签： neo4j cypher

1条回答

Root（大扎）

2楼-- · 2019-05-27 06:05

Yes, cypher can do all of those things.

Range queries look something like this:

MATCH path=(a:MyNode { name: "Foo"})-[:myRelationshipType*1..20]->(b));

That gives you all "b" nodes that are between 1 and 20 hops from MyNode with the given relationship type. The matched path is a variable, which can have various cypher functions applied to it. So for each path, you can ask how long it is, what's in the middle, and so on. The cypher refcard shows functions that apply to paths to get a sense for what you can do with them.

Shortest-path searches can be found here and use the shortestPath and allShortestPaths functions in cypher.

If you wanted to get the shortest path from something to basically everything else in the graph, you could do that in one query; the shortest path would start with matching a "head" node, and a "tail" node. In the case of finding shortest path from one thing to everything else, the head node match would be your node of interest, and the tail node match would be "any node in the graph". E.g. MATCH (a:MyNodeOfInterest), (b), p=shortestPath((a)-[*]->(b)). So you could do this in one query, but, if you're trying to find shortest paths from one thing to everything else in a one million node graph, that's going to take some time, no matter what graph database you use.

In terms of performance, nobody can really accurately answer that question. It will depend on a lot of different factors like:

Total data volumes
Indexing strategy
Your use/abuse of node labels, and relationship types
Total path lengths
JVM/memory/cache configurations.

0人赞添加讨论(0) 举报

Neo4j shortest path (BFS) distances query variants

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间