Merge existing records in Neo4j and remove duplicates

Posted 2019-02-15 18:56

I imported millions of records using CREATE for performance reasons. Now I want to MERGE the duplicate records together and keep all the relationships intact.

Any ideas?

EDIT:

MATCH (c1:company), (c2:company) 
WITH c1, c2 
WHERE c1.name = c2.name 
SET c1=c2

This is the type of thing I'm looking for.

2 answers
虎瘦雄心在
#2 · 2019-02-15 19:22

If you want to merge nodes in Cypher, you can do something like this:

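// Labels are case-sensitive: use :company here if that is how the nodes were created.
MATCH (c:Company)
// Group nodes by name and keep only the names that occur more than once.
WITH c.name as name, collect(c) as companies, count(*) as cnt
WHERE cnt > 1
// Keep the first node of each group; the rest are treated as duplicates.
WITH head(companies) as first, tail(companies) as rest
// Process at most 1000 groups per run.
LIMIT 1000
UNWIND rest AS to_delete
// A duplicate with no WORKS_AT relationship produces no row here and is left in place.
MATCH (to_delete)<-[r:WORKS_AT]-(e:Employee)
// Re-create the relationship on the kept node, then drop the original.
MERGE (first)<-[:WORKS_AT]-(e)
DELETE r
// This fails if to_delete still has relationships of any other type.
DELETE to_delete
RETURN count(*);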

See: http://www.neo4j.org/graphgist?dropbox-14493611%2Fmerge_nodes.adoc
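
The query above only carries over WORKS_AT relationships, while the question asks to keep all relationships intact. One possible alternative, sketched here on the assumption that the APOC plugin is installed on the server, is the apoc.refactor.mergeNodes procedure, which moves relationships of every type onto the surviving node. The :Company label and the name property are taken from the query above; the batch size of 1000 is likewise just carried over, not a recommendation.

```cypher
// Assumption: the APOC plugin is available; it is not part of core Cypher.
MATCH (c:Company)
// Collect the nodes that share a name and keep only groups with duplicates.
WITH c.name AS name, collect(c) AS companies
WHERE size(companies) > 1
WITH companies
// Limit the number of groups handled in one transaction.
LIMIT 1000
// mergeNodes keeps the first node in the list and transfers the relationships
// of the others onto it, whatever their type or direction.
// properties: 'discard' keeps the first node's property values;
// mergeRels: true collapses relationships that would otherwise be duplicated.
CALL apoc.refactor.mergeNodes(companies, {properties: 'discard', mergeRels: true})
YIELD node
RETURN count(node) AS merged_groups;
```

Running it repeatedly until merged_groups comes back as 0 would work through the data in batches. It is worth trying on a copy of the database first, since the merge cannot be undone once committed.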

叼着烟拽天下
#3 · 2019-02-15 19:30

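It doesn't work that way. Cypher has no built-in operation to move a relationship to a different node or to coalesce existing nodes; the most you can do is create a replacement relationship and delete the original. You should use MERGE from the beginning, along with constraints and indexes to aid performance.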
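
As a rough illustration of that advice, the sketch below creates a uniqueness constraint and then loads data with MERGE instead of CREATE. The :Company and :Employee labels and the WORKS_AT type come from the other answer; the name property on :Employee, the sample values, and the constraint name in the commented-out variant are made up for the example. The constraint syntax depends on the Neo4j version, so check which form your server accepts.

```cypher
// Older syntax (Neo4j 3.x and 4.x). A uniqueness constraint also creates an
// index on the property, which is what makes MERGE lookups fast.
CREATE CONSTRAINT ON (c:Company) ASSERT c.name IS UNIQUE;

// Newer syntax (Neo4j 4.4 and 5); the constraint name is hypothetical.
// CREATE CONSTRAINT company_name_unique IF NOT EXISTS
// FOR (c:Company) REQUIRE c.name IS UNIQUE;

// MERGE matches an existing node or relationship, or creates it when none exists,
// so running this import twice does not produce duplicates.
MERGE (c:Company {name: 'Acme'})
MERGE (e:Employee {name: 'Alice'})
MERGE (e)-[:WORKS_AT]->(c);
```

The constraint cannot be created while duplicate names still exist, so on an already imported database the duplicates have to be cleaned up first. Run the constraint statement and the data statements in separate transactions.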
