I am fetching multiple entities 100+ from datastore using the below Query
return entity.query(ancestor = ancestorKey).filter(entity.year= myStartYear).order(entity.num).fetch()
Which was taking a long time (order of a few seconds) to load.
Trying to find an optimum way, I created exactly 100 entities, found that it takes anywhere between 750ms ~ 1000ms to fetch the 100 entities on local server, which is a lot of course. I am not sure how to get around a single line fetch to make it more efficient!
In a desperate attempt to optimize, I tried
- Removing the order part, still got the same results
- Removing the filter part, still got the same results
- Removing the order & filter part, still got the same results
So apparently it is something else. In a desperate attempt, I tried fetching for keys only then passing the keys to ndb.get_multi() function:
qKeys = entity.query(ancestor = ancestorKey).filter(entity.year= myStartYear).order(entity.num).fetch(keys_only=True)
return ndb.get_multi(qKeys)
To my surprise I get a better throughput! query results now loads in 450 ~ 550ms which is around ~40% better performance on average!
I am not sure why this happens, I would have thought that the fetch function already queries entities in the most optimum time.
Question: Any idea how I can optimize the single query line to load faster?
Side Question: Anyone knows what's the underlying mechanism for the fetch function, and why fetching keys only, then using ndb.get_multi() is faster?
FWIW, you shouldn't expect meaningful results from datastore performance tests performed locally, using either the development server or the datastore emulator - they're just emulators, they don't have the same performance (or even the 100% equivalent functionality) as the real datastore.
Credit goes to @snakecharmerb, who correctly identified the culprit, confirmed by OP: