Why is allShortestPaths so slow?

Posted 2019-08-25 10:12

Question:

I created a graph database with Python and the neo4j library. The graph has 50k nodes and 100k relationships.

How I create nodes:

CREATE (user:user {task_id: %s, id: %s, root: 1, private: 0})

How I create relationships:

MATCH (root_user), (friend_user)
WHERE root_user.id = %s
  AND root_user.task_id = %s
  AND friend_user.id = %s
  AND friend_user.task_id = %s
CREATE (root_user)-[r:FRIEND_OF]->(friend_user)
RETURN root_user, friend_user

How I search for all paths between nodes:

MATCH (start_user:user {id: %s, task_id: %s}), 
      (end_user:user {id: %s, task_id: %s}), 
      path = allShortestPaths((start_user)-[*..3]-(end_user)) RETURN path

It is very slow, around 30-60 minutes on the 50k-node graph, and I can't understand why. I tried creating an index like this:

CREATE INDEX ON :user(id, task_id)

but it did not help. Can you help me? Thanks.

Answer 1:

You should never generate a long Cypher query that contains N slight variations of essentially the same Cypher code. That is very slow and takes up a lot of memory.

Instead, you should be passing parameters to a much simpler Cypher query.

For example, when creating your nodes, you could pass a data parameter to the following Cypher code:

UNWIND $data AS d
CREATE (user:user {task_id: d.taskId, id: d.id, root: 1, private: 0})

The data parameter value that you pass would be a list of maps, and each map would contain a taskId and id. The UNWIND clause "unwinds" the data list into individual d maps. This would be much faster.
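From Python, the list of maps might be built and sent in batches along the following lines. This is only a minimal sketch, not the asker's actual code: the names `chunked` and `create_users` and the batch size of 1000 are hypothetical, and `session` is assumed to be an open session from the official neo4j Python driver (any object with a `run(query, **params)` method would do).

```python
from itertools import islice

# Same query as above; $data is bound by the driver on each call,
# so the query text never changes between batches.
CREATE_USERS = """
UNWIND $data AS d
CREATE (user:user {task_id: d.taskId, id: d.id, root: 1, private: 0})
"""


def chunked(rows, size):
    """Yield successive lists of at most `size` items from `rows`."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def create_users(session, users, batch_size=1000):
    """Create one :user node per (task_id, id) pair, one query per batch.

    `session` is assumed to be a neo4j driver session.
    Returns the number of queries sent.
    """
    sent = 0
    for batch in chunked(users, batch_size):
        # Build the list of maps that UNWIND will iterate over.
        data = [{"taskId": task_id, "id": user_id} for task_id, user_id in batch]
        session.run(CREATE_USERS, data=data)
        sent += 1
    return sent
```

With 50k nodes and a batch size of 1000, this would send 50 queries instead of 50k.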

Something similar needs to be done with your relationship-creation code.

In addition, in order to use any of your :user indexes, your MATCH clause MUST specify the :user label in the relevant node patterns. Otherwise, you are asking Cypher to scan all nodes, regardless of label, and that kind of processing would not be able to take advantage of indexes. For example, the relevant query should start with:

MATCH (root_user:user), (friend_user:user)
...
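Combining both suggestions, a batched relationship-creation step might look roughly like this. It is a sketch under assumptions rather than a definitive implementation: the map keys (`rootId`, `rootTaskId`, `friendId`, `friendTaskId`), the function name `create_friendships`, and the batch size are all hypothetical, and `session` is again assumed to be a neo4j Python driver session.

```python
# Both node patterns carry the :user label and match on id and task_id,
# which are the two properties covered by the index from the question.
CREATE_FRIENDSHIPS = """
UNWIND $data AS d
MATCH (root_user:user {id: d.rootId, task_id: d.rootTaskId})
MATCH (friend_user:user {id: d.friendId, task_id: d.friendTaskId})
CREATE (root_user)-[:FRIEND_OF]->(friend_user)
"""


def create_friendships(session, edges, batch_size=1000):
    """Create FRIEND_OF relationships, one query per batch.

    `edges` is a list of (root_task_id, root_id, friend_task_id, friend_id)
    tuples. Returns the number of queries sent.
    """
    sent = 0
    for start in range(0, len(edges), batch_size):
        # Slice out one batch and convert each tuple into a map for UNWIND.
        data = [
            {
                "rootId": root_id,
                "rootTaskId": root_task_id,
                "friendId": friend_id,
                "friendTaskId": friend_task_id,
            }
            for root_task_id, root_id, friend_task_id, friend_id
            in edges[start:start + batch_size]
        ]
        session.run(CREATE_FRIENDSHIPS, data=data)
        sent += 1
    return sent
```

Prefixing the query with EXPLAIN or PROFILE should show whether the planner uses an index seek rather than a scan over all nodes.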