Median of medians algorithm: why divide the array

Posted 2019-06-19 23:02

In the median-of-medians algorithm, we need to divide the array into chunks of size 5. I am wondering how the inventors of the algorithm came up with the magic number '5' and not, say, 7 or 9 or something else?

3 answers
Luminary・发光体
#2 · 2019-06-19 23:28

I think it helps to check the "Proof of O(n) running time" section of the Wikipedia page for the median-of-medians algorithm:

The median-calculating recursive call does not exceed worst-case linear behavior because the list of medians is 20% of the size of the list, while the other recursive call recurses on at most 70% of the list, making the running time

T(n) <= T(n/5) + T(7n/10) + c n

The O(n) term c n is for the partitioning work (we visited each element a constant number of times, in order to form them into n/5 groups and take each median in O(1) time). From this, using induction, one can easily show that

T(n) <= 10 c n, so T(n) is in O(n)

That should help you understand why.
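To make the two recursive calls in that proof concrete, here is a minimal Python sketch of the selection procedure. The function name `select` and the `group_size` parameter are my own choices for illustration, and the sketch copies sublists instead of partitioning in place, so treat it as a reading aid rather than a tuned implementation. It assumes 0 <= k < len(items).

```python
def select(items, k, group_size=5):
    """Return the k-th smallest element of items (k is 0-based).

    Illustrative sketch only: group_size is assumed to be an odd number >= 3.
    """
    if group_size < 3:
        raise ValueError("group_size must be at least 3")
    if len(items) <= group_size:
        return sorted(items)[k]

    # Step 1: take the median of every group (constant work per group).
    medians = []
    for start in range(0, len(items), group_size):
        group = sorted(items[start:start + group_size])
        medians.append(group[len(group) // 2])

    # Step 2: first recursive call, on about n / group_size medians.
    pivot = select(medians, len(medians) // 2, group_size)

    # Step 3: partition around the pivot (the linear "c n" work).
    lower = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    higher = [x for x in items if x > pivot]

    # Step 4: second recursive call, on the side that contains rank k.
    if k < len(lower):
        return select(lower, k, group_size)
    if k < len(lower) + len(equal):
        return pivot
    return select(higher, k - len(lower) - len(equal), group_size)
```

With the default group size, step 2 is the call on 20% of the list and step 4 is the call on at most about 70% of it.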

smile是对你的礼貌
#3 · 2019-06-19 23:28

For the algorithm to run in linear time, the group size has to be larger than 3 (and odd, obviously, so that each group has a single middle element). 5 is the smallest odd number larger than 3, so 5 was chosen.
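A rough way to see where the threshold comes from is to compute, for an odd group size g, the share of the list handled by each recursive call. The sketch below uses the usual back-of-the-envelope count (about half of the groups each contribute (g + 1) / 2 elements that can be discarded) and ignores rounding and leftover groups; the helper name `recursion_fractions` is made up for this example.

```python
from fractions import Fraction


def recursion_fractions(g):
    """Return (medians_share, worst_partition_share) for odd group size g.

    Approximation: lower-order terms from rounding are ignored.
    """
    medians_share = Fraction(1, g)
    # About half of the n/g groups have a median on the pivot's side,
    # and each such group contributes (g + 1) / 2 elements.
    guaranteed_discarded = Fraction(g + 1, 4 * g)
    worst_partition_share = 1 - guaranteed_discarded
    return medians_share, worst_partition_share


for g in (3, 5, 7, 9):
    medians_share, partition_share = recursion_fractions(g)
    print(g, medians_share, partition_share, medians_share + partition_share)
```

For g = 3 the two shares add up to exactly 1, so the work per level of recursion does not shrink. For g = 5 they add up to 9/10, and for larger odd g the sum stays below 1 as well, which is what makes the linear bound go through.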

Explosion°爆炸
#4 · 2019-06-19 23:29

You can also use blocks of size 3 or 4, as shown in the paper "Select with groups of 3 or 4" by K. Chen and A. Dumitrescu (2015). The idea is to apply the "median of medians" step twice and partition only after that. This lowers the quality of the pivot but is faster.

So instead of:

T(n) <= T(n/3) + T(2n/3) + O(n)
T(n) = O(n log n)

one gets:

T(n) <= T(n/9) + T(7n/9) + O(n)
T(n) = Theta(n)
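If you want to see the difference between the two recurrences numerically, a small sketch like the following evaluates both with integer division and a unit cost of n per call. The helper `make_cost` and the base case T(n) = n for n <= 1 are assumptions made only for this illustration; they are not taken from the paper.

```python
from functools import lru_cache


def make_cost(a_num, a_den, b_num, b_den):
    """Build T(n) = T(a*n) + T(b*n) + n with a = a_num/a_den, b = b_num/b_den."""

    @lru_cache(maxsize=None)
    def cost(n):
        if n <= 1:
            return n  # assumed base case
        return cost(n * a_num // a_den) + cost(n * b_num // b_den) + n

    return cost


cost_thirds = make_cost(1, 3, 2, 3)  # T(n) = T(n/3) + T(2n/3) + n
cost_ninths = make_cost(1, 9, 7, 9)  # T(n) = T(n/9) + T(7n/9) + n

for n in (10**3, 10**6, 10**9):
    print(n, cost_thirds(n) / n, cost_ninths(n) / n)
```

The ratio T(n) / n keeps growing for the first recurrence, while for the second it stays below 9, which matches 1 / (1 - 1/9 - 7/9).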