Neo4j: merge two or more duplicate nodes

Posted 2019-06-21 03:39

Question:

I am populating my Neo4j database manually using Cypher, so I am prone to errors such as creating duplicate nodes.

Each duplicate node will have its own relationships to other nodes. Is there a built-in function to merge these nodes? Or should I do it manually?

Doing it manually in a Cypher script sounds possible, but complicated:

    1. Get the relationships of each duplicate node.
    2. Recreate them (with their properties) on the correct node (identified by its node id).
    3. Remove the relationships to the duplicate nodes.
    4. Finally, remove the duplicate nodes.

Answer 1:

To avoid this situation in the future, please look at the MERGE keyword in Cypher. Unfortunately, as far as I know, there is nothing in Cypher (yet) like:

MATCH (n:MyNode),(m:MyNode)
WHERE ID(n) <> ID(m) AND
PROPS(n) IN PROPS(m) AND PROPS(m) IN PROPS(n)
(...) DELETE (...)

The fictional function PROPS on the third line is not part of the Cypher language, and user-defined functions have not yet made it into Neo4j.

If you're not working with production instances, the easiest is probably to back up your data folder and try to start the insertion over (with MERGE).
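As a rough sketch of what re-inserting with MERGE could look like: the property `name` and the value 'Alice' are made up for illustration, and the labels and the FOO relationship type are just the placeholder names used elsewhere in this thread. The idea is to put only the identifying property in the MERGE pattern, so an existing node is matched instead of a second one being created.

```cypher
// Assumption: "name" identifies a MyNode (hypothetical property).
// MERGE matches an existing node with this name, or creates one if none exists.
MERGE (n:MyNode {name: 'Alice'})
ON CREATE SET n.created = timestamp()
RETURN n;

// Same idea for relationships: match both end nodes first,
// then MERGE the relationship so it is created at most once.
MATCH (a:MyNode {name: 'Alice'}), (o:OtherNode {id: 123})
MERGE (a)-[:FOO]->(o);
```

Running either statement a second time should leave the graph unchanged, which is what makes manual loading less error-prone.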

Otherwise, you can also try writing a traversal to collect the duplicates and delete them in batch (here is an example with the REST API).
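The "collect the duplicates" step can also be approximated in plain Cypher instead of a traversal over the REST API. This is only a sketch: it assumes that two MyNode nodes count as duplicates when they share the same value of a hypothetical `name` property, so adjust the grouping key to whatever identifies your nodes.

```cypher
// Assumption: nodes with the same "name" are duplicates (hypothetical property).
// Group the nodes by that key and keep only the groups with more than one member.
MATCH (n:MyNode)
WITH n.name AS name, collect(n) AS nodes
WHERE size(nodes) > 1
RETURN name, size(nodes) AS copies, nodes
ORDER BY copies DESC;
```

Reviewing this result before deleting anything is a cheap way to check that the grouping key really does identify duplicates.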



Answer 2:

Try this:

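// n is the node to keep, m is the duplicate, o is the node the new relationship points to.
// Note: n <> m alone matches every pair of MyNode nodes; add a condition that identifies the duplicates.
MATCH (n:MyNode),(m:MyNode),(o:OtherNode {id:123})
WHERE n <> m
MATCH (m)-[r:FOO]->()
CREATE (n)-[r2:FOO]->(o)
// Copy the properties of the old relationship onto the new one.
SET r2 = r
// This fails if m still has other relationships.
DELETE r,m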


Answer 3:

I think you can try:

apoc.refactor.mergeNodes(nodes, options)

For relationships:

apoc.refactor.mergeRelationships(rels, options)

Or:

apoc.periodic.iterate(cypherIterate, cypherAction, config)
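
For example, a minimal sketch of the first procedure, assuming the APOC plugin is installed and that duplicates share the same value of a hypothetical `name` property. The config keys `properties` and `mergeRels` are taken from the APOC documentation as I remember it, so check them against the version you run.

```cypher
// Assumption: nodes with the same "name" are duplicates (hypothetical property).
MATCH (n:MyNode)
WITH n.name AS name, collect(n) AS nodes
WHERE size(nodes) > 1
// Merge every group onto the first node in its list.
// properties: 'discard' keeps the first node's property values;
// mergeRels: true also merges relationships that become identical after the move.
CALL apoc.refactor.mergeNodes(nodes, {properties: 'discard', mergeRels: true})
YIELD node
RETURN node;
```

The relationships of the merged nodes are moved onto the surviving node, which covers the four manual steps listed in the question.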


Tags: neo4j cypher