How to find k nearest neighbors to the median of n

2019-03-11 05:37发布

I can use the median of medians selection algorithm to find the median in O(n). Also, I know that after the algorithm is done, all the elements to the left of the median are less that the median and all the elements to the right are greater than the median. But how do I find the k nearest neighbors to the median in O(n) time?

If the median is n, the numbers to the left are less than n and the numbers to the right are greater than n. However, the array is not sorted in the left or the right sides. The numbers are any set of distinct numbers given by the user.

The problem is from Introduction to Algorithms by Cormen, problem 9.3-7

11条回答
放荡不羁爱自由
2楼-- · 2019-03-11 05:42

You already know how to find the median in O(n)

if the order does not matter, selection of k smallest can be done in O(n) apply for k smallest to the rhs of the median and k largest to the lhs of the median

from wikipedia

 function findFirstK(list, left, right, k)
 if right > left
     select pivotIndex between left and right
     pivotNewIndex := partition(list, left, right, pivotIndex)
     if pivotNewIndex > k  // new condition
         findFirstK(list, left, pivotNewIndex-1, k)
     if pivotNewIndex < k
         findFirstK(list, pivotNewIndex+1, right, k)

don't forget the special case where k==n return the original list

查看更多
成全新的幸福
3楼-- · 2019-03-11 05:44

Four Steps:

  1. First find the median (Median of median) - O(n)
  2. Determine the absolute difference between median and each of the elements - O(n)
  3. Use the kth smallest element algorithm to get the result(Quickselect) - O(n)
  4. Now we need to pick the k closest out of the array - O(n)
查看更多
啃猪蹄的小仙女
4楼-- · 2019-03-11 05:44

First select the median in O(n) time, using a standard algorithm of that complexity. Then run through the list again, selecting the elements that are nearest to the median (by storing the best known candidates and comparing new values against these candidates, just like one would search for a maximum element).

In each step of this additional run through the list O(k) steps are needed, and since k is constant this is O(1). So the total for time needed for the additional run is O(n), as is the total runtime of the full algorithm.

查看更多
淡お忘
5楼-- · 2019-03-11 05:48

You could use a non-comparison sort, such as a radix sort, on the list of numbers L, then find the k closest neighbors by considering windows of k elements and examining the window endpoints. Another way of stating "find the window" is find i that minimizes abs(L[(n-k)/2+i] - L[n/2]) + abs(L[(n+k)/2+i] - L[n/2]) (if k is odd) or abs(L[(n-k)/2+i] - L[n/2]) + abs(L[(n+k)/2+i+1] - L[n/2]) (if k is even). Combining the cases, abs(L[(n-k)/2+i] - L[n/2]) + abs(L[(n+k)/2+i+!(k&1)] - L[n/2]). A simple, O(k) way of finding the minimum is to start with i=0, then slide to the left or right, but you should be able to find the minimum in O(log(k)).

The expression you minimize comes from transforming L into another list, M, by taking the difference of each element from the median.

m=L[n/2]
M=abs(L-m)

i minimizes M[n/2-k/2+i] + M[n/2+k/2+i].

查看更多
萌系小妹纸
6楼-- · 2019-03-11 05:48

Since all the elements are distinct, there can be atmost 2 elements with the same difference from the mean. I think it is easier for me to have 2 arrays A[k] and B[k] the index representing the absolute value of the difference from the mean. Now the task is to just fill up the arrays and choose k elements by reading the first k non empty values of the arrays reading A[i] and B[i] before A[i+1] and B[i+1]. This can be done in O(n) time.

查看更多
欢心
7楼-- · 2019-03-11 05:49
med=Select(A,1,n,n/2)   //finds the median

for i=1 to n
   B[i]=mod(A[i]-med)

q=Select(B,1,n,k) //get the kth smallest difference

j=0
for i=1 to n
   if B[i]<=q 
     C[j]=A[i] //A[i], the real value should be assigned instead of B[i] which is only the difference between A[i] and median.
       j++
return C
查看更多
登录 后发表回答