I am currently using the embedded Python bindings for Neo4j. I have no performance problems at the moment because my graph is very small (sparse, with up to 100 nodes). The algorithm I am developing involves a lot of graph traversals, specifically DFS over the whole graph as well as over various subgraphs.
In the future I intend to run the algorithm on large graphs (supposedly sparse and with millions of nodes).
Having read several threads here about the performance of the Python/Neo4j bindings, I wonder whether I should switch to a REST API client for Python (such as bulbflow, py2neo or neo4jrestclient) now, before I am too far along to change all the code.
Unfortunately, I did not find any comprehensive source of information to compare different approaches.
Could anyone provide some further insight into this issue? Which criteria should I take into account when choosing one of the options?
Django is an MVC web framework so you may be interested in that if yours is to be a web application.
From the point of view of py2neo (of which I am the author), I am trying to focus hard on performance by using the batch execution mechanism automatically where appropriate as well as providing strong Cypher support. I have also recently put a lot of work into providing good options for uniqueness management within indexes - specifically, the get_or_create and add_if_none methods.
The easiest way to run algorithms from Python is to use Gremlin (https://github.com/tinkerpop/gremlin/wiki).
With Gremlin you can bundle everything into one HTTP request to reduce round-trip overhead.
Here's how to execute Gremlin scripts from Bulbs (http://bulbflow.com):
>>> from bulbs.neo4jserver import Graph
>>> g = Graph()
>>> script = "g.v(id).out('knows').out('knows')"
>>> params = dict(id=3)
>>> g.gremlin.execute(script, params)
The Bulbs Gremlin API docs are here: http://bulbflow.com/docs/api/bulbs/gremlin/
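To illustrate why the round-trip overhead matters for a DFS-heavy workload, here is a rough sketch in plain Python. It does not talk to Neo4j at all: CountingStore is a hypothetical stand-in for a remote graph that simply counts one "round trip" per neighbour lookup, and dfs_preorder and the toy adjacency dict are made up for the example. The assumption is that a DFS driven from the client side needs roughly one request per visited node, whereas a traversal sent to the server as a single script needs one request in total.

```python
def dfs_preorder(start, neighbors):
    """Iterative depth-first search; returns the nodes in preorder.

    `neighbors` is any callable that maps a node to an iterable of
    adjacent nodes (hypothetical interface, not a Neo4j or Bulbs API).
    """
    visited = set()
    order = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push in reverse so the first neighbour is explored first.
        stack.extend(reversed(list(neighbors(node))))
    return order


class CountingStore:
    """Hypothetical stand-in for a remote graph.

    Every neighbour lookup is counted as one simulated round trip.
    """

    def __init__(self, adjacency):
        self.adjacency = adjacency
        self.round_trips = 0

    def neighbors(self, node):
        self.round_trips += 1
        return self.adjacency.get(node, [])


# Toy directed graph: 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4
toy = {1: [2, 3], 2: [4], 3: [4], 4: []}
store = CountingStore(toy)
order = dfs_preorder(1, store.neighbors)

print(order)              # visit order
print(store.round_trips)  # one simulated round trip per visited node
```

With millions of nodes, one HTTP request per visited node adds up quickly, which is the reason to push the traversal to the server in a single script where possible.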
I am not an expert, so I am not really sure, but I think it also depends on what you expect from Django and how much of a framework you need. Py2neo is very pragmatic and slim, Bulbflow seems to build up a whole mapping stack, and neo4jrestclient concentrates on Django (though I may be wrong about that).