Equivalence testing of n objects

2019-02-15 21:58发布

问题:

Assume that we are given 'n' objects and a subroutine that takes two inputs and says if they are equivalent or not (e.g. it can give output as 1 if they are equal).

I need to come up with an algorithm that calls the above function O(n log n) times and decides if the input has more than 'n/2' items that are equivalent to each other.

回答1:

You can use the Boyer-Moore majority voting algorithm, which does at most n-1 comparisons.

function find_majority(A)
    majority = None
    count = 0
    for a in A:
        if count == 0
            majority = a
        else if a == majority
            count += 1
        else
            count -= 1
    return majority

It returns the most common element if appears more than n/2 times in the array.

If you need to know if there is a majority item, then you can make a second pass through the array counting how many times the value returned from the find_majority function appears. This adds another n comparisons.



回答2:

Here is a classical divide and conquer solution which gives O(n log n)

split into two subsets, A1 and A2, ..., and show T(n) is O(n log n). If A has a majority element v, v must also be a majority element of A1 or A2 or both. The equivalent contra-positive restatement is immediate: (If v is <= half in each, it is <= half in the total.) If both parts have the same majority element, it is automatically the majority element for A. If one of the parts has a majority element, count the number of repetitions of that element in both parts (in O(n) time) to see if it is a majority element. If both parts have a majority, you may need to do this count for each of the two candidates, still O(n). This splitting can be done recursively. The base case is when n = 1. A recurrence relation is T(n) = 2T(n/2) + O(n), so T(n) is O(n log n) by the Master Theorem.

http://anh.cs.luc.edu/363/handouts/MajorityProblem.pdf



回答3:

Considering that the sequence is ordered you can use binary search, which takes O(log n), and since you'll have to do it for every element, and you have n elements it will take O(n*log n).