I have hierarchical data stored in the datastore using a model which looks like this:
    from google.appengine.ext import db

    class ToolCategories(db.Model):
        name = db.StringProperty()
        parentKey = db.SelfReferenceProperty(collection_name="parent_category")
        ...
        ...
I want to print all the category names preserving the hierarchy, say in some form like this:
--Information Gathering
----OS Fingerprinting
----DNS
------dnstool
----Port Scanning
------windows
--------nmap
----DNS3
----wireless sniffers
------Windows
--------Kismet
To do the above I have used simple recursion using the back referencing capability:
    from google.appengine.ext import webapp

    class GetAllCategories(webapp.RequestHandler):
        def RecurseList(self, object, breaks):
            # One line for this category, then its children one level deeper.
            output = breaks + object.name + "<br/>"
            for cat in object.parent_category:
                output = output + self.RecurseList(cat, breaks + "--")
            return output

        def get(self):
            output = ""
            # Root categories are the ones without a parent.
            allCategories = ToolCategories.all().filter('parentKey =', None)
            for category in allCategories:
                output = output + self.RecurseList(category, "--")
            self.response.out.write(output)
As I am very new to App Engine programming (hardly 3 days since I started writing code), I am not sure if this is the most optimized way, from the datastore access standpoint, to do the desired job.
Is this the best way? If not, what is?
You have a very reasonable approach! My main caveat would be one having little to do with GAE and a lot with Python: don't build a string from pieces with `+` or `+=`. Rather, make a list of string pieces (with `append` or `extend` or list comprehensions &c) and, when you're all done, join it up for the final string result with `''.join(thelist)` or the like. Even though recent Python versions strive hard to optimize the intrinsically `O(N squared)` performance of the `+` or `+=` loops, in the end you're always better off building up lists of strings along the way and `''.join`ing them up at the very end!
The main disadvantage of your approach is that, because you're using the "adjacency list" way of representing trees, you have to do one datastore query for each branch of the tree. Datastore queries are fairly expensive (around 160ms each), so constructing the tree, particularly if it's large, could be rather expensive.
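To make both points concrete, here is a minimal sketch that runs outside App Engine. The `Category` class is a hypothetical in-memory stand-in for `ToolCategories`, and its `lookups` counter only simulates the query that iterating the `parent_category` back-reference would trigger in the datastore. The pieces are collected in a list and joined once at the end.

```python
class Category:
    """Hypothetical in-memory stand-in for a ToolCategories entity."""

    lookups = 0  # counts simulated "fetch my children" queries

    def __init__(self, name, parent=None):
        self.name = name
        self._children = []
        if parent is not None:
            parent._children.append(self)

    @property
    def parent_category(self):
        # In the datastore, iterating this back-reference would run a query.
        Category.lookups += 1
        return list(self._children)


def collect_lines(category, breaks, pieces):
    """Append one line per category to `pieces`, depth first."""
    pieces.append(breaks + category.name + "<br/>")
    for child in category.parent_category:
        collect_lines(child, breaks + "--", pieces)


root = Category("Information Gathering")
dns = Category("DNS", root)
Category("dnstool", dns)
Category("Port Scanning", root)

pieces = []
collect_lines(root, "--", pieces)
output = "".join(pieces)  # a single join at the very end

print(output)
print("simulated child queries:", Category.lookups)
```

In this stand-in every visited category costs one child lookup, leaves included, which is why the number of queries grows with the size of the tree.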
There's another approach, which is essentially the one taken by the datastore for representing entity groups: Instead of just storing the parent key, store the entire list of ancestors using a ListProperty:
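A minimal sketch of that layout, assuming the list is called `parents` (a name chosen here for illustration). With the old `db` API the field would presumably be declared as `db.ListProperty(db.Key)`; the code below uses a plain Python class with strings standing in for keys so that it runs anywhere.

```python
class Category:
    """Plain-Python stand-in for a datastore entity (hypothetical).

    Assumption: in the datastore model the field would be something like
    parents = db.ListProperty(db.Key); strings stand in for keys here.
    """

    def __init__(self, key, name, parent=None):
        self.key = key
        self.name = name
        # All ancestors, root first: the parent's ancestors plus the parent.
        self.parents = [] if parent is None else parent.parents + [parent.key]


root = Category("k1", "Information Gathering")
scanning = Category("k2", "Port Scanning", root)
windows = Category("k3", "windows", scanning)
nmap = Category("k4", "nmap", windows)

print(nmap.parents)
```

The cost of this layout is that the ancestor list has to be computed when an entity is created, and rewritten for a whole subtree if a category is ever moved.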
Then, to construct the tree, you can retrieve the entire thing in one single query:
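As a sketch under the same assumptions: an equality filter on a list property matches an entity if any element of the list equals the value, so a query along the lines of `ToolCategories.all().filter('parents =', root_key)` should return every descendant of the root at once. The `fetch_descendants` helper below is hypothetical and merely simulates that query on an in-memory list; the tree is then rebuilt locally without further lookups.

```python
from collections import defaultdict


class Category:
    """Plain-Python stand-in for an entity with an ancestor list (hypothetical)."""

    def __init__(self, key, name, parent=None):
        self.key = key
        self.name = name
        self.parents = [] if parent is None else parent.parents + [parent.key]


def fetch_descendants(all_entities, root_key):
    """Simulates the single query: entities whose ancestor list contains root_key."""
    return [e for e in all_entities if root_key in e.parents]


def render_tree(root, descendants):
    # Group descendants by direct parent, i.e. the last entry of the ancestor list.
    children = defaultdict(list)
    for entity in descendants:
        children[entity.parents[-1]].append(entity)

    pieces = []

    def walk(node, breaks):
        pieces.append(breaks + node.name + "<br/>")
        for child in children[node.key]:
            walk(child, breaks + "--")

    walk(root, "--")
    return "".join(pieces)


root = Category("k1", "Information Gathering")
dns = Category("k2", "DNS", root)
dnstool = Category("k3", "dnstool", dns)
scanning = Category("k4", "Port Scanning", root)
other = Category("k5", "Unrelated root")
store = [root, dns, dnstool, scanning, other]

html = render_tree(root, fetch_descendants(store, "k1"))
print(html)
```

Siblings come out in whatever order the query returns them, so a real handler would probably sort each group of children (by name, for instance) before rendering.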