I am populating my Neo4j database manually using Cypher, so it is prone to errors such as creating duplicate nodes.
Each duplicate node has its own relationships to other nodes.
Is there a built-in function to merge these nodes? Or should I do it manually?
It sounds possible, but complicated, with a Cypher script:
- Get the relationships of each duplicate node
- Recreate them (with their properties) on the correct node (identified by its node id)
- Remove relationships to the duplicate nodes
- and finally remove the duplicate nodes.
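The steps above could be sketched in Cypher roughly as follows. This is only a minimal sketch under stated assumptions: the parameters `$keepId` and `$dupId` are hypothetical, the label `MyNode` and relationship type `FOO` are placeholders, and only outgoing `FOO` relationships are handled. Plain Cypher cannot create a relationship whose type is chosen at runtime, so the first statement would have to be repeated for every relationship type and direction in use.

```cypher
// Hypothetical parameters: $keepId = id of the node to keep, $dupId = id of the duplicate.
// Copy each outgoing :FOO relationship (with its properties) to the kept node,
// then delete the original relationship.
MATCH (keep:MyNode), (dup:MyNode)
WHERE id(keep) = $keepId AND id(dup) = $dupId
MATCH (dup)-[r:FOO]->(target)
CREATE (keep)-[r2:FOO]->(target)
SET r2 = properties(r)
DELETE r;

// Remove the duplicate; DETACH DELETE also drops any relationships still attached to it.
MATCH (dup:MyNode)
WHERE id(dup) = $dupId
DETACH DELETE dup;
```

Because the second statement uses DETACH DELETE, any relationship that was not copied first is lost, so it is worth checking what is still attached to the duplicate before running it.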
To avoid this situation in the future, please look at the MERGE keyword in Cypher.
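As a small illustration of the difference, assuming a hypothetical `name` property that identifies a node: CREATE would add a new node on every run, while MERGE only creates the node if no match exists.

```cypher
// Running this statement twice still leaves a single :MyNode with name 'Alice'.
// The `name` and `created` properties are assumptions made for this example.
MERGE (p:MyNode {name: 'Alice'})
ON CREATE SET p.created = timestamp()
RETURN p;
```

Note that MERGE matches on the whole pattern given, so it helps to merge on the identifying property only and set the remaining properties with ON CREATE SET or ON MATCH SET.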
Unfortunately, as far as I know, there is nothing in Cypher (yet) like:
MATCH (n:MyNode),(m:MyNode)
WHERE ID(n) <> ID(m) AND
(...) DELETE (...)
The fictional function PROPS in the third line is not part of the Cypher language, and user-defined functions have not yet made it into Neo4j.
If you're not working with production instances, the easiest option is probably to back up your data folder and start the insertion over (with MERGE).
Otherwise, you can also try writing a traversal to collect the duplicates and delete them in batch (here is an example with the REST API).
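If the duplicates can be recognised by a known key rather than by comparing all properties, collecting them can also be sketched in Cypher itself. The `name` property below is an assumption made for illustration; substitute whatever identifies a node in your data.

```cypher
// Group :MyNode nodes by the assumed key and keep only groups with more than one member.
MATCH (n:MyNode)
WITH n.name AS name, collect(n) AS nodes
WHERE size(nodes) > 1
RETURN name, size(nodes) AS copies, nodes;
```

Each returned row is one group of duplicates, which can then be reviewed by hand or passed on to a merge or delete step.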
Try this:
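// n is the node to keep, m is a duplicate, o is the node they relate to
MATCH (n:MyNode),(m:MyNode),(o:OtherNode {id:123})
WHERE n <> m
// find the duplicate's FOO relationship to o
MATCH (m)-[r:FOO]->(o)
// recreate it from n and copy its properties
CREATE (n)-[r2:FOO]->(o)
SET r2 = properties(r)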
I think you can try:
apoc.refactor.mergeNodes(nodes, options)
For relationships:
apoc.refactor.mergeRelationships(rels, options)
To run the work in batches:
apoc.periodic.iterate(cypherIterate, cypherAction, config)
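For example, assuming the APOC plugin is installed and that duplicates share a hypothetical `name` property, a call to apoc.refactor.mergeNodes might look like the sketch below. The config values shown (`properties: 'discard'`, `mergeRels: true`) are one possible choice, not the only one.

```cypher
// Collect each group of duplicates and merge it into the first node of the list.
// 'discard' keeps the first node's property values; mergeRels: true merges
// relationships that end up with the same type and the same start and end nodes.
MATCH (n:MyNode)
WITH n.name AS name, collect(n) AS nodes
WHERE size(nodes) > 1
CALL apoc.refactor.mergeNodes(nodes, {properties: 'discard', mergeRels: true})
YIELD node
RETURN node;
```

The procedure moves the relationships of the other nodes onto the surviving node, so it is worth trying on a copy of the data first.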