I'm trying to accomplish a pretty common task. I have a substantial dataset in a Neo4J database and, from a RESTful web service, i want to return the data in chunks of 25 nodes. My model is quite simple:
(:Tenant {Hash:''})-[:owns]->(:Asset {Hash:'', Name:''})
I have unique constraints on the Hash
properties on both labels.
If i wanted to obtain the 101th data page, my cypher query would look like this:
MATCH (:Tenant {Hash:'foo'})-[:owns]->(a:Asset)
RETURN a
ORDER BY a.Hash
SKIP 2500
LIMIT 25
My dataset consists of a single tenant, with ~75K assets. The above query takes ~30(!) seconds to complete. I also notice is that the further i advance in the data (ie. higher SKIP
) the longer it takes for the query to return.
I quickly figured out that the culprit of my performance issues is the ORDER BY a.Hash
. When i remove it, the query returns with sub-second results. This is actually quite a surprise, as i'd expect the index itself to also be ordered.
Obviously, in order to implement sensible pagination, i must have a consistent sort order.
- Any tips on making this query perform?
- Alternative suggestions for paging? I can see adding dedicated page nodes, but that will become difficult to maintain.
- What is the default sort order anyway, and is it consistent?
Hey @GeoffreyBraaf found some time this week to look at your issue, you are right, there were some implementation issue that made this unnecessarily slow.
I used Timmy's suggestion to implement a Java version which finished in 30ms. The Cypher version took 100 seconds. Working on implementation of the top-n select in Cypher improved it massively by a factor of 600. So Cypher now takes about 150ms for that query.
See: https://gist.github.com/jexp/9954509
The work is already merged in 2.0-maint and will be released as part of 2.0.2
See: https://github.com/neo4j/neo4j/pull/2230
Although I'm actually not a big fan of my own solution, but you get surprisingly good results when you code the retrieval and paging of nodes, even with relatively large datasets. If you're using java, could you give the following a spin?