Why can't the median-of-medians algorithm use

2019-02-21 17:21发布

问题:

I am Working through the analysis of deterministic median finding under the assumption that the input is divided into 3 parts rather than 5 and the question is Where does it break down?

the deterministic median finding algorithm:

SELECT(i, n)

  1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.

  2. Recursively SELECT the median x of the ⎣n/5⎦ group medians to be the pivot.

  3. Partition around the pivot x. Let k = rank(x)

4.if i = k then return x

elseif i < k

then recursively SELECT the ith smallest element in the lower part

else recursively SELECT the (i–k)th smallest element in the upper part

I went through the analysis of the algorithm and I believe that Step 1 and 3 will take O(n) where it just takes just constant time to find the median of 5 elements and Step2 takes T(n/5).so at least 3/10 of elements are ≤ p, and at least 3/10 of the array is ≥ p, therefore,Step 4 will T(7n/10) and will get the recurrence. T(n) ≤ cn + T(n/5) + T(7n/10), but when I divide the elements in goroup of 3, let say the 9 elements and I divided them in group such that:

{1,2,10} {4,11,14}, {15,20,22}

I got the medians 2,11,20 and the p=11.

in general in group of five lets say g = n/5 groups and at least ⌈g/2⌉ of them (those groups whose median is ≤ p) at least three of the five elements are ≤ p. so the total number of elements ≤ p is at least 3⌈g/2⌉ ≥ 3n/10. BUT in group of 3 we can get all the three elements might be LESS than p. and here I think the algorithm will break down !!

Did I get the idea correct???

回答1:

In a group of 3, as for the groups of 5, about half of the groups will have their median element less than the median-of-medians, so in those groups you can discard elements less than their median. In your case, (1,2,10) has its median less than 11, so you can discard 1 and 2.

Where I think things break down for groups of 3 is in the costing. 3(floor(floor(n/5)/2 - 2) which is roughly 3n/10 becomes 2(floor(floor(n/3)/2 -2) or so, which is roughly n/3. This means that 7n/10 becomes 2n/3. floor(n/5) becomes floor(n/3), so instead of 7cn/10 + 2cn/10 = 9cn/10 you are going to get just 2cn/3 + cn/3 = cn, and instead of T(n) <= cn you are going to have something where you will have to look closely at the terms that don't involve c, and you might end up showing that it is not linear after all.

It looks like you actually get to throw away slightly more elements at each stage of the recursion, but the recursion divides the amount of work left by 3, not 5, and this isn't enough to break even.