What is a good way of hashing a hierarchy (similar to a file structure) in python?
I could convert the whole hierarchy into a dotted string and then hash that, but is there a better (or more efficient) way of doing this without going back and forth all the time?
An example of a structure I might want to hash is:
a -> b1 -> c -> 1 -> d
a -> b2 -> c -> 2 -> d
a -> c -> 1 -> d
If you have access to your hierarchy components as a tuple, just hash it - tuples are hashable. You may not gain a lot over conversion to and from a delimited string, but it's a start.
If this doesn't help, perhaps you could provide more information about how you store the hierarchy/path information.
How do you want to access your hierarchy?
If you're always going to be checking for a full path, then as suggested, use a tuple: eg:
However, if you're going to be doing things like "quickly find all the items below "a -> b1", it may make more sense to store it as a nested hashtable (otherwise you must iterate through all items to find those you're intereted in).
For this, a defaultdict is probably the easiest way to store. For example:
You can make any object hashable by implementing the
__hash__()
methodSo you can simply add a suitable
__hash__()
method to the objects storing your hierarchy, e.g. compute the hash recursively, etc.