What are the benefits of a binary heap with root a

2020-01-30 02:48发布

问题:

I am writing a binary heap over an array arr.

Every node except the leaf nodes have two children.

The root can be at arr[0] or arr[1].

The accepted answer at Why in a heap implemented by array the index 0 is left unused? says arr[1] is faster.

But one comment below that answer says most implementation put root at arr[0].

What are the benefits of putting the root at arr[0]?

回答1:

I am the person who answered the question you linked.

Creating a binary heap that has the root at arr[1] in a language that has 0-based arrays is idiotic. Not because it wastes a trivial amount of space, but because it creates unnecessarily confusing code for no benefit.

Why is the code confusing? Because it breaks a fundamental rule that we as programmers have been working under for years: arrays start at 0. If you want an array that holds 100 items, you declare it that way:

int a[100];

Except for a binary heap. Because some idiot who converted the original binary heap code from Algol (whose arrays are 1-based) to C (0-based arrays) back in 1973 didn't have the brains to change the child and parent calculations, we've ended up with this one special case where to hold 100 items you have to allocate 101:

int a[101];

And when somebody called that person on the inconsistency, he came up with a specious performance argument.

Yes, there is an extra increment instruction in the code for computing the left child index, and an extra decrement instruction when computing a child's parent index. In the wider context of what a binary heap does, those two instructions will make no practical difference to the running time of any program that uses the heap. None. If the difference is even measurable, it will definitely be noisy. There are many other things happening on your computer that will have much larger effects on your program's performance.

If you're writing a program that requires a high performance priority queue, what the heck are you doing with a binary heap in the first place? If you're really going to store huge numbers of things in your priority queue, you probably should be using something like a Pairing heap, which will outperform binary heap, although at a higher memory cost.

The C++ STL priority_queue, the Java PriorityQueue, and python's heapq all use 0-based binary heaps. The people who wrote those packages understand performance considerations. If there was a significant performance gain to going with a 1-based binary heap, they would have done so. That they went with 0-based heaps should tell you that any performance gain from a 1-based heap is illusory.

See http://blog.mischel.com/2016/09/19/but-thats-the-way-weve-always-done-it/ for a more complete discussion.



标签: c++ heap