I have a question which is part of my program.
For a tree T=(V,E) we need to find the node v in the tree that minimize the length of the longest path from v to any other node.
so how we find the center of the tree? Is there can be only one center or more?
If anyone can give me good algorithm for this so i can get the idea on how i can fit into my program.
Consider a tree with two nodes? Which is the center? Either one will suffice, ergo a tree can have more than one center.
Now, think about what it means to be the center. If all of the branches are the same height, the center is the root (all paths go through the root). If the branches are of different heights then the center must be either the root or in the branch with the greatest height otherwise the maximum path is greater than the height of the tallest branch and the root would be a better choice. Now, how far down the tallest branch do we need to look? Half the difference in height between the tallest branch (from the root) and the next tallest branch (if the difference is at most 1 the root will suffice). Why, because as we go down the tallest branch by one level we are lengthening the path to the deepest node of the next tallest branch by one and reducing the distance to the deepest node in the current branch by one. Eventually, they will be equal when you've traversed half the difference in the depth. Now as we go down the tallest branch, we simply need to choose at each level the tallest sub-branch. If more than one has the same height, we simply choose one arbitrarily.
Essentially, what you are doing is finding the longest path in the graph, which is the distance between the tallest branch of the tree and the next tallest branch, then finding the middle node on that branch. Because there may be multiple paths of the same length as the longest path and the length of the longest path may be even, you have multiple possible centers.
There are two approaches to do this (both runs in the same time):
- using BFS (which I will describe here)
- using FIFO queue.
Select any vertex v1
on your tree. Run BFS from this vertex. The last vertex (v2
) you will proceed will be the furthest vertex from v1
. Now run another BFS, this time from vertex v2
and get the last vertex v3
.
The path from v2
to v3
is the diameter of the tree and your center lies somewhere on it. More precisely it lies in the middle of it. If the path has 2n + 1
points, there will be only 1 center (in the position n + 1
). If the path has 2n
points, there will be two centers at the positions n
and n + 1
.
You only use 2 BFS calls which runs in 2 * O(V)
time.
Rather than do this homework problem for you, I'm going to ask you through the thought process that gets the answer...
1) What would you do with the graph a-b-c (three vertices, two edges, and definitely acyclic)? Imagine for a moment that you have to put some labels on some of the vertices, you know you're going to get the minimum of the longest path on the "center" vertex. (b, with eventual label "1") But doing that in one step requires psychic powers. So ask yourself what b is 1 step away from. If the longest path to b is 1, and we've just stepped one step backwards along that path, what's the length of our path so far? (longest path = 1, -1 for the back one step. Aha: 0). So that must be the label for the leaves.
2) What does this suggest as a first cut for an algorithm? Mark the leaves "0", mark their upstreams "1", mark their upstreams "2" and so on. Marching in from the leaves and counting the distance as we go...
3) Umm... We have a problem with the graph a-b-c-d. (From now on, a labelled vertex will be replaced with its label.) Labeling the leaves "0" gives 0-b-c-0... We can't get two centers... Heck, what do we do in the simpler condition 0-b-1? We want to label b with both "1" and "2"... Handle those in reverse order...
In 0-b-1, if we extend the path from b's left by one, we get a path of length 1. If we extend the path from b's right, we get 2. We want to track "the length of the longest path from v to any other node", so we want to keep track of the longest path to b. That means we mark b with a "2".
0-b-1 -> 0-2-1
In 0-b-c-0, the computer doesn't actually simultaneously update b and c. It updates one of them, giving either 0-1-c-0 or 0-b-1-0, and the next update gives 0-1-2-0 or 0-2-1-0. Both b and c are "center"s of this graph since each of them meets the demand "the node v in the tree that minimize the length of the longest path from v to any other node." (That length is 2.)
This leads to another observation: The result of the computation is not to label a graph, it's to find the last vertex we label and/or the vertex that ends up with the largest label. (It's possible that we won't find a good way to order labels, so we'll end up needing to find the max at the end. Or maybe we will. Who knows.)
4) So now we have something like: Make two copies of the graph -- the labelled copy and the burn-down copy. The first one will store the labels and it's the one that will have the final answer in it. The burn-down copy will get smaller and smaller as we delete unlabeled vertices from it (to find new labelable vertices). (There are other ways to organize this so that only one copy of the graph is used. When you get to the end of understanding this answer, you should find a way to reduce this waste.) Outline:
label = 0
while the burndown graph is nonempty
collect all the leaves in the burndown-graph into the set X
for each leaf in the set X
if the leaf does not have a label
set the leaf's label (to the current value of label)
delete the leaf from the burn-down graph (these leafs are two copies of the same leaf in the input graph)
label = label+1
find the vertex with the largest label and return it
5) If you actually watch this run, you'll notice several opportunities for short-cutting. Including replacing the search on the last line of the outline with a much quicker method for recognizing the answer.
And now for general strategy tips for algorithm problems:
* Do a few small examples by hand. If you don't understand how to do the small cases, there's no way you can jump straight in and tell the computer how to do the large cases.
* If any of the above steps seemed unmotivated or totally opaque, you will need to study much, much harder to do well in Computer Science. It may be the Computer Science is not for you...