I have a dictionary like this:
{ "id" : "abcde",
  "key1" : "blah",
  "key2" : "blah blah",
  "nestedlist" : [
    { "id" : "qwerty",
      "nestednestedlist" : [
        { "id" : "xyz",
          "keyA" : "blah blah blah" },
        { "id" : "fghi",
          "keyZ" : "blah blah blah" } ],
      "anothernestednestedlist" : [
        { "id" : "asdf",
          "keyQ" : "blah blah" },
        { "id" : "yuiop",
          "keyW" : "blah" } ] } ] }
Basically a dictionary with nested lists, dictionaries and strings, of arbitrary depth.
What is the best way of traversing this to extract the values of every "id" key? I want to achieve the equivalent of an XPath query like "//id". The value of "id" is always a string.
So from my example, the output I need is basically:
["abcde", "qwerty", "xyz", "fghi", "asdf", "yuiop"]
Order is not important.
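To make the target behaviour concrete, here is a minimal sketch of one possible approach: a recursive generator that walks dicts and lists. It assumes Python 3.3+ (for yield from); the name find_ids and its key parameter are hypothetical, chosen only for illustration.

```python
def find_ids(node, key="id"):
    """Yield every value stored under `key` anywhere in nested dicts/lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Recurse into the value too; strings and numbers simply fall through.
            yield from find_ids(v, key)
    elif isinstance(node, list):
        # Accept lists at any level, including the top level.
        for item in node:
            yield from find_ids(item, key)


data = {
    "id": "abcde",
    "key1": "blah",
    "key2": "blah blah",
    "nestedlist": [
        {
            "id": "qwerty",
            "nestednestedlist": [
                {"id": "xyz", "keyA": "blah blah blah"},
                {"id": "fghi", "keyZ": "blah blah blah"},
            ],
            "anothernestednestedlist": [
                {"id": "asdf", "keyQ": "blah blah"},
                {"id": "yuiop", "keyW": "blah"},
            ],
        }
    ],
}

ids = list(find_ids(data))
print(ids)
```

Because it is a generator, the caller decides whether to collect everything with list() or stop after the first match with next().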
I just wanted to iterate on @hexerei-software's excellent answer using yield from and accepting top-level lists.
I found this Q/A very interesting, since it provides several different solutions to the same problem. I took all these functions and tested them with a complex dictionary object. I had to take two functions out of the test because they produced too many failures and did not support returning lists or dicts as values, which I find essential, since a function should be prepared for almost any data it may receive.
So I ran the other functions through 100,000 iterations with the timeit module, and the output was the following:
All functions searched for the same needle ('logging') in the same dictionary object, which is constructed like this:
All functions delivered the same result, but the time differences are dramatic! The function gen_dict_extract(k,o) is my function, adapted from the functions here. It is pretty much like the find function from Alfe, with the main difference that I check whether the given object has an iteritems function, in case strings are passed during recursion:
So this variant is the fastest and safest of the functions here. find_all_items is incredibly slow, far behind the second slowest, get_recursivley, while the rest, except dict_extract, are close to each other. The functions fun and keyHole only work if you are looking for strings.
Interesting learning aspect here :)
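As an illustration of the check described above, here is a hedged sketch, not the original gen_dict_extract. It assumes Python 3, so it tests for items rather than iteritems; the name extract_sketch and the sample dictionary are hypothetical, and the timeit call only shows how such a function could be timed, not the benchmark from this answer.

```python
import timeit


def extract_sketch(key, obj):
    # Only objects with an items() method (dict-like) are searched, so a
    # string passed in during recursion is skipped instead of raising.
    if hasattr(obj, "items"):
        for k, v in obj.items():
            if k == key:
                yield v
            if isinstance(v, dict):
                yield from extract_sketch(key, v)
            elif isinstance(v, list):
                for element in v:
                    yield from extract_sketch(key, element)


# Hypothetical test object: it mixes dicts, a list, and a bare string.
sample = {
    "logging": {"level": "INFO"},
    "handlers": [{"logging": "file"}, "plain string", {"other": {"logging": 1}}],
}

found = list(extract_sketch("logging", sample))

# Time repeated full traversals; list() forces the generator to run to the end.
elapsed = timeit.timeit(lambda: list(extract_sketch("logging", sample)), number=10_000)
print(found, f"{elapsed:.3f}s")
```

Note that matched values can be dicts or lists as well as strings, which is the property the comparison above treats as essential.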
Here is my stab at it:
Ex.:
Another variation, which includes the nested path to the found results (note: this version doesn't consider lists):
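Since the code for this variation is not shown here, the following is only a rough sketch of the idea, assuming Python 3.3+. The name find_with_path and the tuple representation of the path are hypothetical. Like the variation described, it walks dicts only and ignores lists.

```python
def find_with_path(obj, key, path=()):
    """Yield (path, value) pairs for every `key` found in nested dicts."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            current = path + (k,)  # keys followed from the root to this entry
            if k == key:
                yield current, v
            # Lists are deliberately not traversed in this sketch.
            yield from find_with_path(v, key, current)


nested = {"id": "abcde", "child": {"id": "qwerty", "inner": {"id": "xyz"}}}

for found_path, value in find_with_path(nested, "id"):
    print("/".join(found_path), "=", value)
```

Supporting lists would need an extra branch that extends the path with the list index.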