Creating neo4j graph database from csv file using

Posted 2019-06-23 23:45

Question:

I am currently in a doctoral program and I am interested in py2neo, so I am using it to run some experiments on social graphs. However, I have run into some beginner's trouble. Excuse me for asking such simple questions.

I have an XML dataset containing data about the publications of a journal, which I have converted into a CSV table. There are about 700 records, and each record is composed of four fields: date, title, keywords, author. So my first question is how to create a graph from this table programmatically. I considered writing a Python script which loops over the CSV table, reads the fields of each row, and writes them into nodes.

Code:

   #!/usr/bin/env python
   #
   import csv
   from py2neo import neo4j, cypher
   from py2neo import node,  rel

   # calls database service of Neo4j
   #
   graph_db = neo4j.GraphDatabaseService("http://localhost:7474/db/data/")
   #
   # Create nodes and relationships from a csv table
   # since it's a csv table, a reader must be invoked


   ifile  = open('testeout5_cp.csv', "rb")
   reader = csv.reader(ifile)

   # clear database
   graph_db.clear()

   rownum = 0
   for row in reader:
        colnum = 0
        for col in row:
            titulo, autor, rel = graph_db.create(
            {"titulo": col[1]}, {"autor": col[3]}, (1, "eh_autor_de", 0)
            )
            print(titulo,  autor)  
   rownum += 1

   ifile.close()

I got this output (fragment):

Python 2.7.5 (default, Aug 22 2013, 09:31:58) [GCC 4.8.1 20130603 (Red Hat 4.8.1-1)] on aires2, Standard

    (Node('http://localhost:7474/db/data/node/10392'), Node('http://localhost:7474/db/data/node/10393'))
    (Node('http://localhost:7474/db/data/node/10394'), Node('http://localhost:7474/db/data/node/10395'))
    (Node('http://localhost:7474/db/data/node/10396'), Node('http://localhost:7474/db/data/node/10397'))
    (Node('http://localhost:7474/db/data/node/10398'), Node('http://localhost:7474/db/data/node/10399'))
    (Node('http://localhost:7474/db/data/node/10400'), Node('http://localhost:7474/db/data/node/10401'))
    (Node('http://localhost:7474/db/data/node/10402'), Node('http://localhost:7474/db/data/node/10403'))
    (Node('http://localhost:7474/db/data/node/10404'), Node('http://localhost:7474/db/data/node/10405'))

What is wrong?

Answer 1:

I am not a py2neo expert, so I can't help with that. However, have you tried using a different mechanism to create your graph? Since it is not very big, I would consider using a spreadsheet (I use that a lot); it's dead easy.

See http://blog.neo4j.org/2013/03/importing-data-into-neo4j-spreadsheet.html for some more info.

Hope it makes sense.

Rik



Answer 2:

I think there is nothing wrong; your code looks good.

You print the nodes and get proper py2neo node instances. Try print(titulo, autor, rel) to see if your relationship is also created.

Just check with the web interface at http://localhost:7474/webadmin/ whether your data is there. Since you don't have too many nodes, you could try a simple Cypher query to get all nodes and check that everything is OK.

START n=node(*) RETURN n;
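
One thing that may be worth double-checking in the question's loop: `for col in row` visits each field of a record, so `col` is a single string and `col[1]` / `col[3]` pick out single characters of that field rather than the title and author columns. The sketch below illustrates this with the standard `csv` module only (no py2neo, no database). The sample data, its values, and the helper names `pairs_per_field` and `pairs_per_row` are hypothetical; the column order (date, title, keywords, author) is taken from the question.

```python
import csv
import io

# Hypothetical two-record sample in the column order described in the
# question: date, title, keywords, author.
SAMPLE = (
    "2013-01-05,Graph theory basics,graphs;theory,Silva\n"
    "2013-02-11,Social networks,networks;social,Souza\n"
)


def pairs_per_field(text):
    """Mimic the question's nested loop: index into each field string."""
    out = []
    for row in csv.reader(io.StringIO(text)):
        for col in row:
            # col is one field (a string), so col[1] and col[3] are its
            # second and fourth characters, not the title and author.
            out.append((col[1], col[3]))
    return out


def pairs_per_row(text):
    """Index into the row itself: one (title, author) pair per record."""
    return [(row[1], row[3]) for row in csv.reader(io.StringIO(text))]


if __name__ == "__main__":
    print(pairs_per_field(SAMPLE))  # one pair of characters per field
    print(pairs_per_row(SAMPLE))    # one (title, author) pair per record
```

If that is what is happening, dropping the inner loop and passing `row[1]` and `row[3]` to `graph_db.create(...)` once per record would give one title node and one author node per row; the node properties can then be inspected in the web interface to confirm.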