Can someone help explain how can building a heap be O(n) complexity?
Inserting an item into a heap is O(log n)
, and the insert is repeated n/2 times (the remainder are leaves, and can't violate the heap property). So, this means the complexity should be O(n log n)
, I would think.
In other words, for each item we "heapify", it has the potential to have to filter down once for each level for the heap so far (which is log n levels).
What am I missing?
In case of building the heap, we start from height, logn -1 (where logn is the height of tree of n elements). For each element present at height 'h', we go at max upto (logn -h) height down.
While building a heap, lets say you're taking a bottom up approach.
@bcorso has already demonstrated the proof of the complexity analysis. But for the sake of those still learning complexity analysis, I have this to add:
The basis of your original mistake is due to a misinterpretation of the meaning of the statement, "insertion into a heap takes O(log n) time". Insertion into a heap is indeed O(log n), but you have to recognise that n is the size of the heap during the insertion.
In the context of inserting n objects into a heap, the complexity of the ith insertion is O(log n_i) where n_i is the size of the heap as at insertion i. Only the last insertion has a complexity of O (log n).
Proof of O(n)
The proof isn't fancy, and quite straightforward, I only proved the case for a full binary tree, the result can be generalized for a complete binary tree.
Basically, work is done only on non-leaf nodes while building a heap...and the work done is the amount of swapping down to satisfy heap condition...in other words (in worst case) the amount is proportional to the height of the node...all in all the complexity of the problem is proportional to the sum of heights of all the non-leaf nodes..which is (2^h+1 - 1)-h-1=n-h-1= O(n)
It would be O(n log n) if you built the heap by repeatedly inserting elements. However, you can create a new heap more efficiently by inserting the elements in arbitrary order and then applying an algorithm to "heapify" them into the proper order (depending on the type of heap of course).
See http://en.wikipedia.org/wiki/Binary_heap, "Building a heap" for an example. In this case you essentially work up from the bottom level of the tree, swapping parent and child nodes until the heap conditions are satisfied.