Neo4j CSV import being too slow

Posted 2019-09-11 21:33

Question:

I know this question has been asked several times but none of the answers solved my problem. I am using the following query to import the data:

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///C:/Users/Zo5/Documents/Neo4j/check/import/result1.csv' AS line1
MERGE (p:Person {forename: line1.forename, surname: line1.surname, nationality: line1.nationality, occupation: line1.occupation, title: line1.title})

but the process is too slow. The CSV file is about 700 MB, and it takes about 15 minutes to import 0.01 GB. I have tried the same query on a new database and the process is a lot faster. Does anyone know what might cause this problem? Note that I have an index on forename.

Answer 1:

What are the properties that uniquely identify a person? Use THOSE properties for the MERGE, then use ON CREATE SET for the remaining properties.

As your query is written now, each MERGE compares all of the given properties against the existing :Person nodes to check whether a matching node already exists. Narrowing down the properties used in your MERGE gives it less to compare, though the comparisons will still happen and your inserts will get steadily slower as the number of :Person nodes grows.
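
As a rough sketch of that suggestion, the query from the question could be reshaped as follows. It assumes, purely for illustration, that forename and surname together identify a person; the question does not say which properties are actually unique, so substitute your own key.

```cypher
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///C:/Users/Zo5/Documents/Neo4j/check/import/result1.csv' AS line1
// Assumption: forename + surname uniquely identify a person (hypothetical key).
// MERGE matches or creates the node using only those identifying properties.
MERGE (p:Person {forename: line1.forename, surname: line1.surname})
// The remaining properties are written only when the node is newly created.
ON CREATE SET p.nationality = line1.nationality,
              p.occupation = line1.occupation,
              p.title = line1.title
```

With this shape, an existing person keeps the properties it already has, because ON CREATE SET does not run when MERGE matches a node.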

If you know that the :Persons you are adding do not already exist, then use CREATE instead of MERGE.
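
A minimal sketch of the CREATE variant, reusing the file path and property names from the question. It assumes that none of the rows in the file duplicate a node already in the database, since CREATE performs no existence check.

```cypher
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///C:/Users/Zo5/Documents/Neo4j/check/import/result1.csv' AS line1
// Assumption: no row duplicates an existing :Person; CREATE always adds a new node.
CREATE (p:Person {forename: line1.forename, surname: line1.surname, nationality: line1.nationality, occupation: line1.occupation, title: line1.title})
```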