I looked in to the C++0x standard and found the requirement that make_heap should do no more than 3*N comparisons.
I.e. heapify an unordered collection can be done in O(N)
/* @brief Construct a heap over a range using comparison functor.
Why is this?
The source gives me no clues (g++ 4.4.3)
The while (true) + __parent == 0 are not clues but rather a guess for O(N) behaviour
template<typename _RandomAccessIterator, typename _Compare>
void
make_heap(_RandomAccessIterator __first, _RandomAccessIterator __last,
_Compare __comp)
{
const _DistanceType __len = __last - __first;
_DistanceType __parent = (__len - 2) / 2;
while (true)
{
_ValueType __value = _GLIBCXX_MOVE(*(__first + __parent));
std::__adjust_heap(__first, __parent, __len, _GLIBCXX_MOVE(__value),
__comp);
if (__parent == 0)
return;
__parent--;
}
}
__adjust_heap looks like a log N method:
while ( __secondChild < (__len - 1) / 2)
{
__secondChild = 2 * (__secondChild + 1);
Is a bog standard log N to me.
template<typename _RandomAccessIterator, typename _Distance,
typename _Tp, typename _Compare>
void
__adjust_heap(_RandomAccessIterator __first, _Distance __holeIndex,
_Distance __len, _Tp __value, _Compare __comp)
{
const _Distance __topIndex = __holeIndex;
_Distance __secondChild = __holeIndex;
while (__secondChild < (__len - 1) / 2)
{
__secondChild = 2 * (__secondChild + 1);
if (__comp(*(__first + __secondChild),
*(__first + (__secondChild - 1))))
__secondChild--;
*(__first + __holeIndex) = _GLIBCXX_MOVE(*(__first + __secondChild));
__holeIndex = __secondChild;
}
if ((__len & 1) == 0 && __secondChild == (__len - 2) / 2)
{
__secondChild = 2 * (__secondChild + 1);
*(__first + __holeIndex) = _GLIBCXX_MOVE(*(__first
+ (__secondChild - 1)));
__holeIndex = __secondChild - 1;
}
std::__push_heap(__first, __holeIndex, __topIndex,
_GLIBCXX_MOVE(__value), __comp);
}
Any clues to why this is O <= 3N will be appreciated.
EDIT:
Experimental results:
This actual implementation uses
- <2N comparisons for heapifying heaps
- <1.5N for heapifying heaps in the reverse order.
A binary heap over n elements can be created in O(n) time using a clever algorithm and a clever analysis. In what follows I'm just going to talk about how this works assuming that you have explicit nodes and explicit left and right child pointers, but this analysis is still perfectly valid once you compress it into an array.
The algorithm works as follows. Start off by taking about half of the nodes and treating them as singleton max-heaps - since there's only one element, the tree containing just that element must automatically be a max-heap. Now, take these trees and pair them off with one another. For each pair of trees, take one of the values that you haven't used yet and execute the following algorithm:
Make the new node the root of the heap, having its left and right child pointers refer to the two max-heaps.
While this node has a child that's larger than it, swap the child with its larger child.
My claim is that this procedure ends up producing a new max heap containing the elements of the two input max-heaps, and it does so in time O(h), where h is the height of the two heaps. The proof is an induction on the height of the heaps. As a base case, if the subheaps have size zero, then the algorithm terminates immediately with a singleton max-heap, and it does so in O(1) time. For the inductive step, assume that for some h, this procedure works on any subheaps of size h and consider what happens when you execute it on two heaps of size h + 1. When we add a new root to join together two subtrees of size h + 1, there are three possibilities:
The new root is larger than the roots of both subtrees. Then in this case we have a new max-heap, since the root is larger than any of the nodes in either subtree (by transitivity)
The new root is larger than one child and smaller than the other. Then we swap the root with the larger subchild and recursively execute this procedure again, using the old root and the child's two subtrees, each of which are of height h. By the inductive hypothesis, this means that the subtree we swapped into is now a max-heap. Thus the overall heap is a max-heap, since the new root is larger than everything in the subtree we swapped with (since it's larger than the node we added and was already larger than everything in that subtree), and it's also larger than everything in the other subtree (since it's larger than the root and the root was larger than everything in the other subtree).
The new root is smaller than both its children. Then using a slightly modified version of the above analysis, we can show that the resulting tree is indeed a heap.
Moreover, since at each step the heights of the child heaps decreases by one, the overall runtime for this algorithm must be O(h).
At this point, we have a simple algorithm for making a heap:
- Take about half the nodes and create singleton heaps. (You can compute explicitly how many nodes will be needed here, but it's about half).
- Pair those heaps off, then merge them together by using one of the unused nodes and the above procedure.
- Repeat step 2 until a single heap remains.
Since at each step we know that the heaps we have so far are valid max-heaps, eventually this produces a valid overall max-heap. If we're clever with how we pick how many singleton heaps to make, this will end up creating a complete binary tree as well.
However, it seems like this should run in O(n lg n) time, since we do O(n) merges, each of which runs in O(h), and in the worst case the height of the trees we're merging is O(lg n). But this bound is not tight and we can do a lot better by being more precise with the analysis.
In particular, let's think about how deep all the trees we merge are. About half the heaps have depth zero, then half of what's left has depth one, then half of what's left has depth two, etc. If we sum this up, we get the sum
0 * n/2 + 1 * n/4 + 2 * n/8 + ... + nk/(2k) = Σk = 0⌈log n⌉ (nk / 2k) = n Σk = 0⌈log n⌉ (k / 2k+1)
This upper-bounds the number of swaps made. Each swap requires at most two comparisons. Therefore, if we multiply the above sum by two, we get the following summation, which upper-bounds the number of swaps made:
n Σk = 0∞ (k / 2k)
The summation here is the summation 0 / 20 + 1 / 21 + 2 / 22 + 3 / 23 + ... . This is a famous summation that can be evaluated in multiple different ways. One way to evaluate this is given in these lecture slides, slides 45-47. It ends up coming out to exactly 2n, which means that the number of comparisons that end up getting made is certainly bounded from above by 3n.
Hope this helps!
@templatetypedef has already given a good answer for why the asymptotic run time of build_heap
is O(n). There is also a proof in chapter 6 of CLRS, 2nd edition.
As for why the C++ standard requires that at most 3n comparisons are used:
From my experiments (see code below), it appears that actually less than 2n comparisons are needed. In fact, these lecture notes contain a proof that build_heap
only uses 2(n-⌈log n⌉) comparisons.
The bound from the standard seems to be more generous than required.
def parent(i):
return i/2
def left(i):
return 2*i
def right(i):
return 2*i+1
def heapify_cost(n, i):
most = 0
if left(i) <= n:
most = 1 + heapify_cost(n, left(i))
if right(i) <= n:
most = 1 + max(most, heapify_cost(n, right(i)))
return most
def build_heap_cost(n):
return sum(heapify_cost(n, i) for i in xrange(n/2, 1, -1))
Some results:
n 10 20 50 100 1000 10000
build_heap_cost(n) 9 26 83 180 1967 19960