可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
In short, I'd like to learn/develop an elegant method to save a binary tree to disk (a general tree, not necessarily a BST). Here is the description of my problem:
I'm implementing a game of "20-questions". I've written a binary tree whose internal nodes are questions and leaves are answers. The left child of a node is the path you'd follow if somebody answered "yes" to your current question, while the right child is a "no" answer. Note this is not a binary search tree, just a binary tree whose left child is "yes" and right is "no".
The program adds a node to a tree if it encounters a leaf that is null by asking the user to distinguish her answer from the one the computer was thinking of.
This is neat, because the tree builds itself up as the user plays. What's not neat is that I don't have a good way of saving the tree to disk.
I've thought about saving the tree as an array representation (for node i, left child is 2i+1, and 2i+2 right, (i-1)/2 for parent), but it's not clean and I end up with a lot of wasted space.
Any ideas for an elegant solution to saving a sparse binary tree to disk?
回答1:
You can store it recursively:
void encodeState(OutputStream out,Node n) {
if(n==null) {
out.write("[null]");
} else {
out.write("{");
out.write(n.nodeDetails());
encodeState(out, n.yesNode());
encodeState(out, n.noNode());
out.write("}");
}
}
Devise your own less texty output format. I'm sure I don't need to describe the method to read the resulting output.
This is depth-first traversal. Breadth-first works too.
回答2:
I would do a Level-order traversal. That is to say you are basically doing a Breadth-first search algorithm.
You have:
- Create a qeueue with the root element inserted into it
- Dequeue an element from the queue, call it E
- Add the left and right children of E into the queue. If there is no left or right, just put a null node representation.
- write node E to disk.
- Repeat from step 2.
Level-order traversal sequence: F, B, G, A, D, I, C, E, H
What you will store on disk: F, B, G, A, D, NullNode, I, NullNode, NullNode, C, E, H, NullNode
Loading it back from disk is even easier. Simply read from left to right the nodes you stored to disk. This will give you each level's left and right nodes. I.e. the tree will fill in from top to bottom left to right.
Step 1 reading in:
F
Step 2 reading in:
F
B
Step 3 reading in:
F
B G
Step 4 reading in:
F
B G
A
And so on ...
Note: Once you have a NULL node representation, you no longer need to list its children to disk. When loading back you will know to skip to the next node. So for very deep trees, this solution will still be efficient.
回答3:
A simple way to accomplish this is to traverse the tree outputting each element as you do so. Then to load the tree back, simply iterate through your list, inserting each element back into the tree. If your tree isn't self balancing, you may want to reorder the list in such a way that the final tree is reasonably balanced.
回答4:
Not sure it's elegant, but it's simple and explainable:
Assign a unique ID to each node, whether stem or leaf. A simple counting integer will do.
When saving to disk, traverse the tree, storing each node ID, "yes" link ID, "no" link ID, and the text of the question or answer. For null links, use zero as the null value. You could either add a flag to indicate whether question or answer, or more simply, check whether both links are null. You should get something like this:
1,2,3,"Does it have wings?"
2,0,0,"a bird"
3,4,0,"Does it purr?"
4,0,0,"a cat"
Note that if you use the sequential integers approach, saving the node's ID may be redundant, as shown here. You could just put them in order by ID.
To restore from disk, read a line, then add it to the tree. You will probably need a table or array to hold forward-referenced nodes, e.g. when processing node 1, you'll need to keep track of 2 and 3 until you can fill in those values.
回答5:
The most arbitrary simple way is just a basic format that can be used to represent any graph.
<parent>,<relation>,<child>
Ie:
"Is it Red", "yes", "does it have wings"
"Is it Red", "no" , "does it swim"
There isn't much redundancy here, and the formats mostly human readable, the only data duplication is that there must be a copy of a parent for every direct child it has.
The only thing you really have to watch is that you don't accidentally generate a cycle ;)
Unless that's what you want.
The problem here is rebuilding the
tree afterwards. If I create the "does
it have wings" object upon reading the
first line, I have to somehow locate
it when I later encounter the line
reading "does it have
wings","yes","Has it got a beak?"
This is why I traditionally just use graph structures in memory for such a thing with pointers going everywhere.
[0x1111111 "Is It Red" => [ 'yes' => 0xF752347 , 'no' => 0xFF6F664 ],
0xF752347 "does it have wings" => [ 'yes' => 0xFFFFFFF , 'no' => 0x2222222 ],
0xFF6F664 "does it swim" => [ 'yes' => "I Dont KNOW :( " , ... etc etc ]
Then the "child/parent" connectivity is merely metadata.
回答6:
In java if you were to make a class serializeable you can just write the class object to disc and read it back using input/output streams.
回答7:
I would store the tree like this:
<node identifier>
node data
[<yes child identfier>
yes child]
[<no child identifier>
no child]
<end of node identifier>
where the child nodes are just recursive instances of the above. The bits in [] are optional and the four identifiers are just constants/enum values.
回答8:
Here is the C++ code using PreOrder DFS:
void SaveBinaryTreeToStream(TreeNode* root, ostringstream& oss)
{
if (!root)
{
oss << '#';
return;
}
oss << root->data;
SaveBinaryTreeToStream(root->left, oss);
SaveBinaryTreeToStream(root->right, oss);
}
TreeNode* LoadBinaryTreeFromStream(istringstream& iss)
{
if (iss.eof())
return NULL;
char c;
if ('#' == (c = iss.get()))
return NULL;
TreeNode* root = new TreeNode(c, NULL, NULL);
root->left = LoadBinaryTreeFromStream(iss);
root->right = LoadBinaryTreeFromStream(iss);
return root;
}
In main()
, you can do:
ostringstream oss;
root = MakeCharTree();
PrintVTree(root);
SaveBinaryTreeToStream(root, oss);
ClearTree(root);
cout << oss.str() << endl;
istringstream iss(oss.str());
cout << iss.str() << endl;
root = LoadBinaryTreeFromStream(iss);
PrintVTree(root);
ClearTree(root);
/* Output:
A
B C
D E F
G H I
ABD#G###CEH##I##F##
ABD#G###CEH##I##F##
A
B C
D E F
G H I
*/
The DFS is easier to understand.
*********************************************************************************
But we can use level scan BFS using a queue
ostringstream SaveBinaryTreeToStream_BFS(TreeNode* root)
{
ostringstream oss;
if (!root)
return oss;
queue<TreeNode*> q;
q.push(root);
while (!q.empty())
{
TreeNode* tn = q.front(); q.pop();
if (tn)
{
q.push(tn->left);
q.push(tn->right);
oss << tn->data;
}
else
{
oss << '#';
}
}
return oss;
}
TreeNode* LoadBinaryTreeFromStream_BFS(istringstream& iss)
{
if (iss.eof())
return NULL;
TreeNode* root = new TreeNode(iss.get(), NULL, NULL);
queue<TreeNode*> q; q.push(root); // The parents from upper level
while (!iss.eof() && !q.empty())
{
TreeNode* tn = q.front(); q.pop();
char c = iss.get();
if ('#' == c)
tn->left = NULL;
else
q.push(tn->left = new TreeNode(c, NULL, NULL));
c = iss.get();
if ('#' == c)
tn->right = NULL;
else
q.push(tn->right = new TreeNode(c, NULL, NULL));
}
return root;
}
In main()
, you can do:
root = MakeCharTree();
PrintVTree(root);
ostringstream oss = SaveBinaryTreeToStream_BFS(root);
ClearTree(root);
cout << oss.str() << endl;
istringstream iss(oss.str());
cout << iss.str() << endl;
root = LoadBinaryTreeFromStream_BFS(iss);
PrintVTree(root);
ClearTree(root);
/* Output:
A
B C
D E F
G H I
ABCD#EF#GHI########
ABCD#EF#GHI########
A
B C
D E F
G H I
*/