Why is the median-of-medians algorithm described a

2019-04-03 23:54发布

问题:

Wikipedia lists the median-of-medians algorithm as requiring O(1) auxiliary space.

However, in the middle of the algorithm, we make a recursive call on a subarray of size n/5 to find the median of medians. When this recursive call returns, we use the returned median of medians as a pivot to partition the array.

Doesn't this algorithm push O(lg n) activation records onto the run-time stack as a part of the recursion? From what I can see, these recursive calls to find successive medians of medians cannot be tail-call optimized because we do extra work after the recursive calls return. Therefore, it seems like this algorithm requires O(lg n) auxiliary space (just like Quicksort, which Wikipedia lists as requiring O(lg n) auxiliary space due the space used by the run-time stack).

Am I missing something, or is the Wikipedia article wrong?

(Note: The recursive call I'm referring to is return select(list, left, left + ceil((right - left) / 5) - 1, left + (right - left)/10) on the Wikipedia page.)

回答1:

While I can't rule out that O(1) is possible, that Wikipedia information appears to be a mistake.

  • The implementation shown there takes O(log n) and not just O(1).
  • It's absolutely not obvious how to implement it with O(1) and there's no explanation/reference for it.
  • I asked the author who originally added that information and he replied that he doesn't remember and that it's probably a mistake.
  • A paper from 2005 devoted to solving the selection problem with O(n) time and O(1) extra space says BFPRT (aka Median of Medians) uses Θ(log n) extra space. On the other hand, the paper's main result is that O(n) time and O(1) extra space is possible, and one of the two algorithms presented as proof is some "emulation" of BFPRT. So in that sense it's possible, but I think the emulation rather makes it a different algorithm and the O(1) shouldn't be attributed to "regular" BFPRT. At least not without explanation.
    (Thanks to Yu-HanLyu for showing that paper and more in the comments)


回答2:

It is O(1).

Let us say we start with an array of length n, and we intend to find the kth element of the sorted list.

After the first call median of medians will spit out a smaller array, now we need to evaluate the ith element of this smaller array. Note that the ith element of this smaller array is the result, so i do not need to pass back any information to the earlier call.

In quick sort, i need to put back the sorted small arrays back into the correct position and so recursion occurs. With median of medians, after the loop (tail recursion), i will be left with the answer.

Recursion depth = O(1)