I am struggling to process a nested dictionary, and return the nested Parent Keys, for a specific Value, when the Value may exist more than once in the nested dictionary.
For example:
example_dict = { 'key1' : 'value1',
'key2' : 'value2',
'key3' : { 'key3a': 'value3a' },
'key4' : { 'key4a': { 'key4aa': 'value4aa',
'key4ab': 'value4ab',
'key4ac': 'value1'},
'key4b': 'value4b'}
}
You will notice that 'value1' appears twice in the above dictionary, and I would like to create a function that returns either a single list, or a series of lists, that identify the different Parent Keys, which in this case would be 'key1' and ('key4', 'key4a', key4ac).
This type of problem was dealt with elsewhere on this site, when the Value one was looking for only appeared once, and was readily handled by the following recursive function:
def find_key(d,key):
for k,v in d.items():
if isinstance(v,dict):
p = find_key(v,key)
if p:
return [k] + p
elif v == key:
return [k]
print find_key(example_dict,'value4ac').
If you run the above code on the dictionary, I only get one answer for the parent keys.
Any help would be much appreciated,
Thanks!
Unless you're just doing a single search (or you're incredibly constrained on memory but have CPU time to burn…), you'll want to build a reverse-lookup dictionary, and then you can just use that.
To make that easier, I'm going to do it in two steps. First, convert a nested dictionary to a key-path dictionary:
def keypaths(nested):
for key, value in nested.iteritems():
if isinstance(value, collections.Mapping):
for subkey, subvalue in keypaths(value):
yield [key] + subkey, subvalue
else:
yield [key], value
Print out list(keypaths(example_dict))
if it isn't obvious what this does.
Now, how do you create a reverse dictionary? For a one-to-one-mapping, you can just do this:
reverse_dict = {value: keypath for keypath, value in keypaths(example_dict)}
But for a many-to-one mapping like yours, the reverse is one-to-many, so we need to map each value to a list of keys. So:
reverse_dict = {}
for keypath, value in keypaths(example_dict):
reverse_dict.setdefault(value, []).append(keypath)
And now you don't need anything fancy; just do a normal dict lookup on reverse_dict
:
>>> reverse_dict['value2']
[('key2',)]
>>> reverse_dict['value1']
[('key1',), ('key4', 'key4a', 'key4ac')]
>>> reverse_dict['value3']
KeyError: 'value3'
If you'd prefer the last one to return []
instead of raising a KeyError
, you can use a defaultdict(list)
instead of a plain dict
, and then you don't need setdefault
.
At any rate, the time taken to construct this reverse mapping is only a little longer than the time taken to do a single search by brute force, so if you're doing 100 searches, it'll be nearly 100x faster this way, as well as simpler.
Here is one solution:
from copy import copy
example_dict = { 'key1' : 'value1',
'key2' : 'value2',
'key3' : { 'key3a': 'value3a' },
'key4' : { 'key4a': { 'key4aa': 'value4aa',
'key4ab': 'value4ab',
'key4ac': 'value1'},
'key4b': 'value4b'}
}
result = []
path = []
def get_keys(d, target):
for k, v in d.iteritems():
path.append(k)
if isinstance(v, dict):
get_keys(v, target)
if v == target:
result.append(copy(path))
path.pop()
Result:
>>> get_keys(example_dict, 'value1')
>>> result
[['key1'], ['key4', 'key4a', 'key4ac']]