I'm trying to import a medium-sized data set of about 500,000 nodes into Neo4j using Cypher. I am running neo4j-community-2.0.0-M05 locally on my 3.4 GHz i7 iMac with an SSD.
I am piping the Cypher into neo4j-shell, wrapping every 40k statements in a transaction.
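For context, the generated file looks roughly like this; begin and commit are neo4j-shell commands, and everything between one begin/commit pair runs as a single transaction (the statements and the file name are just illustrative):

    begin
    MATCH n:Artifact WHERE n.pathId = '...' CREATE UNIQUE n-[r:DEPENDS_ON]->(a:Artifact {pathId: '...'});
    ...about 40k statements...
    commit
    begin
    ...next 40k statements...
    commit

I then feed it to the shell with something like ./bin/neo4j-shell -file import.cql.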
I am using labels, and before I started the import I created an index on one property per label.
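For example, for the Artifact nodes in the query below, the index was created with the 2.0 schema-index syntax:

    CREATE INDEX ON :Artifact(pathId);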
When I left last night, the MATCH ... CREATE UNIQUE statements were taking about 15 ms each. This morning they are taking about 6,000 ms.
The slow queries look something like this:
    MATCH n:Artifact WHERE n.pathId = 'ZZZ'
    CREATE UNIQUE n-[r:DEPENDS_ON]->(a:Artifact {pathId: 'YYY'})
    RETURN a

which returns:

    1 row
    5719 ms
pathId is indexed.
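(If anyone else hits this, it's worth running the schema command in neo4j-shell to confirm the index is actually ONLINE rather than still POPULATING; an index that is still populating won't speed these lookups up:

    schema

)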
I understand this is a milestone build and probably not optimized for performance, but I'm less than a third of the way through my import and it is slowing down more and more.
Should I look at methods other than Cypher to import this data?
I just want to answer my own question in case someone else finds this. Thanks to Peter for suggesting the batch import project. I used the 2.0 tree.
My workflow ended up being to (1) load all the data into a relational database, (2) clean up duplicates, and then (3) write a script to export the data into CSV files.
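For reference, the batch importer takes tab-separated node and relationship files, where relationships refer to nodes by row number. The exact header conventions (property types, index names, labels) are documented in the project's README, so treat the shape below as a rough illustration from memory:

    nodes.csv (tab-separated, one node per row):
    pathId
    ZZZ
    YYY

    rels.csv (tab-separated; start and end are node row numbers):
    start   end   type
    1       2     DEPENDS_ON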
Using Cypher, I had the import running for 24 hours before I killed it. Using the Java import tool, the entire import took 11 seconds with neo4j-community-2.0.0-M06.
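From memory, running the importer amounted to building the jar from the 2.0 tree and pointing it at a fresh store directory plus the two files; check the project README for the exact command, as this is only approximate:

    mvn clean compile assembly:single
    java -server -Xmx4G -jar target/batch-import-jar-with-dependencies.jar graph.db nodes.csv rels.csv

The resulting graph.db directory then gets dropped into Neo4j's data directory.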
Bottom line: don't bother writing out Cypher to import large chunks of data. Spend an hour cleaning up your data if necessary, then export to CSV and use the Java batch import tool.