如何装入一个大的CSV文件导入的Neo4j(How to load a large csv file

2019-09-27 18:21发布

我试图加载一个大的CSV文件(1458644行 )到Neo4j的,但我仍然收到此错误:

Neo.TransientError.General.OutOfMemoryError: There is not enough memory to perform the current task. Please try increasing 'dbms.memory.heap.max_size' in the neo4j configuration (normally in 'conf/neo4j.conf' or, if you you are using Neo4j Desktop, found through the user interface) or if you are running an embedded installation increase the heap by using '-Xmx' command line flag, and then restart the database.

即使我改变dbms.memory.heap.max_size=1024m与M = megbite,再次出现同样的错误!

注:CSV的大小为195.888 KB

这是我的代码:

load csv with headers from "file:///train.csv" as line
create(pl:pickup_location{latitude:toFloat(line.pickup_latitude),longitude:toFloat(line.pickup_longitude)}),(pt:pickup_time{pickup:line.pickup_datetime}),(dl:dropoff_location{latitude:toFloat(line.dropoff_latitude),longitude:toFloat(line.dropoff_longitude)}),(dt:dropoff_time{dropoff:line.dropoff_datetime})
create (pl)-[:TLR]->(pt),(dl)-[:TLR]->(dt),(pl)-[:Trip]->(dl);

我该怎么办 ?

Answer 1:

You should use periodic commits to process the CSV data in batches. For example, this will process 10,000 lines at a time (the default batch size is 1000):

USING PERIOD COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:///train.csv" as line
CREATE (pl:pickup_location{latitude:toFloat(line.pickup_latitude),longitude:toFloat(line.pickup_longitude)}),(pt:pickup_time{pickup:line.pickup_datetime}),(dl:dropoff_location{latitude:toFloat(line.dropoff_latitude),longitude:toFloat(line.dropoff_longitude)}),(dt:dropoff_time{dropoff:line.dropoff_datetime})
CREATE (pl)-[:TLR]->(pt),(dl)-[:TLR]->(dt),(pl)-[:Trip]->(dl);


Answer 2:

我通过复制限排,解决问题解决了这里

所以这是我的解决方案:

USING PERIODIC COMMIT
load csv with headers from "file:///train.csv" as line
with line LIMIT 1458644
create   (pl:pickup_location{latitude:toFloat(line.pickup_latitude),longitude:toFloat(line.pickup_longitude)}),(pt:pickup_time{pickup:line.pickup_datetime}),(dl:dropoff_location{latitude:toFloat(line.dropoff_latitude),longitude:toFloat(line.dropoff_longitude)}),(dt:dropoff_time{dropoff:line.dropoff_datetime})
create (pl)-[:TLR]->(pt),(dl)-[:TLR]->(dt),(pl)-[:Trip]->(dl);

该解决方案的缺点是,你需要知道的(Excel无法打开大的CSV文件)你的大CSV文件的行数。



文章来源: How to load a large csv file into Neo4j
标签: neo4j cypher