Continued to my older question: Writing nested dictionary (forest) of a huge depth to a text file
Now I want to write the forest traversal in the BFS style: I have a huge depth dictionary that represents forest (many non-binary trees) which I want to process the forest and create a text file with the sequences of (father, son) relations from the forest, i.e. given the dictionary:
{'a': {'b': {'c': {'x': {}}, 'd': {'p': {}}}, 'g': {}, 'f': {}},
't': {'r': {'o': {}}, 'y': {}}}
the generated text file will look like:
(ROOT,b) (ROOT,g) (ROOT,f) (b,c) (b,d) (c,x) (d,p) \n
(ROOT,r) (ROOT,y) (r,o) \n
Note that I replaces all the roots in the forest with the word "ROOT".
Here is a simple visualization of the forest:
The nested dictionary is big and iterating over it recursively makes a memory run-time error, so a "Generator style" solution as in the link at the beginning of this question would be the best.
It is easiest to traverse the structure recursively with a generator:
File output:
This will work on large dictionaries:
Output:
This outputs:
To perform breadth-first-search, we must keep a list of the current working nodes and the trees below them - I chose to store these in a tuple.
For instance, when we are working at the depth of the
c
andd
nodes, this list of trees will be:Now while there are still trees below us (
while len(trees):
), we need to step down to the layer below in our tree.The fist step is obviously to reset the
trees
list since we will be generating the next layer.We then iterate over our list of trees and, for each tree, we iterate over its children.
So taking the above example, on the first iteration, the node would be
'c'
and the children would be:{'x': {}}
and we now want to iterate over the children. So on the first iteration of that children loop, the first child node will be'x'
and its children (c
's child's children) are empty:{}
.Now, in this scope (the node's child's), if the child has children we want to add the child and its children (again, as a tuple) to the list of trees.
So to give an example where are there are children, when the current node is
b
, then one of its children isc
and sincec
has children, a tuple of (c
,c
's children) is appended to the trees list for the next layer.Finally, no matter whether this child had children or not, we want to the current line in the file the link between us and them. This is
(node, child_node)
.And that's almost it. Of course, when we finish a whole tree, we need to write a new-line to the file.
The only annoying detail is the issue of spaces between the tuples written to the file. If we always concatenated a space to the end of each tuple, we would end up with a stray space on the end of each line, as seen below which isn't ideal.
(where
S
represents a space)So to redeem this, we will always concatenate a space before each tuple so long as we are not the first on a new line (
line_first
). For this to work, at the start of each tree (line), we set theline_first
flag toTrue
, but then in the code, we immediately set it toFalse
on the first iteration (but skip writing the space), otherwise (future tuples) we write a space before.And that's it. Here's the completed code:
warning: uses
f-strings
which were introduced in version 3.6!And it produces the intended output: