This problem is 4-11 of Skiena. The known solution for finding a majority element (one repeated more than n/2 times) is the majority vote algorithm. Can we use this to find all numbers repeated more than n/4 times?
As you didn't mention a space constraint, one possible solution is to use a hash table that maps each element to its count; you then just increment the count whenever the element is found again.
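For example, a minimal Python sketch using `collections.Counter` (written for a general k; pass k=4 for this question):

```python
from collections import Counter

def elements_above_threshold(arr, k):
    """Return all elements appearing more than len(arr)/k times,
    using O(n) extra space for the counts."""
    counts = Counter(arr)
    return [x for x, c in counts.items() if c > len(arr) / k]

print(elements_above_threshold([3, 1, 2, 2, 1, 2, 3, 3], 4))  # [3, 2]
```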
Find the majority element that appears more than n/2 times using Moore's voting algorithm (see method 3 of the given link for Moore's voting algorithm: http://www.geeksforgeeks.org/majority-element/). Time: O(n).
Now, after finding the majority element, scan the array again and remove the majority element, or mark it as -1. Time: O(n).
Now apply Moore's voting algorithm to the remaining elements of the array (ignoring entries marked -1, as they have already been counted). The new majority element appears more than n/4 times. Time: O(n).
Total time: O(n). Extra space: O(1).
You can repeat this for elements appearing more than n/8, n/16, ... times.
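A minimal Python sketch of the steps above (function names are mine). Since up to three elements can occur more than n/4 times, a third round of voting would be needed to be exhaustive; each candidate is verified with a counting pass at the end:

```python
def vote(arr):
    """Moore's voting pass: O(n) time, O(1) space, returns a candidate."""
    candidate, count = None, 0
    for x in arr:
        if count == 0:
            candidate, count = x, 1
        elif x == candidate:
            count += 1
        else:
            count -= 1
    return candidate

def more_than_quarter(arr):
    """Vote, 'remove' the winner, vote again, then verify candidates."""
    first = vote(arr)
    second = vote([x for x in arr if x != first])
    return [c for c in (first, second)
            if c is not None and arr.count(c) > len(arr) / 4]

print(more_than_quarter([3, 1, 2, 2, 1, 2, 3, 3]))  # [3, 2]
```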
EDIT:
There may exist a case where there is no single majority element in the array. For example, if the input array is {3, 1, 2, 2, 1, 2, 3, 3}, then the output should be [2, 3]. The general problem is: given an array of size n and a number k, find all elements that appear more than n/k times. See this answer: https://stackoverflow.com/a/24642388/3714537
References:
http://www.cs.utexas.edu/~moore/best-ideas/mjrty/
See this paper for a solution that uses constant memory and runs in linear time; it finds 3 candidates for elements that occur more than n/4 times. Note that if you assume your data is given as a stream that you can only go through once, this is the best you can do: you have to go through the stream one more time to test each of the 3 candidates and see whether it occurs more than n/4 times. However, if you assume a priori that there are 3 elements that occur more than n/4 times, then you only need to go through the stream once, and you get a linear-time online algorithm (a single pass through the stream) that requires only constant storage.
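A sketch of that final verification pass (Python; the `candidates` argument stands in for the three candidates the paper's first pass would produce):

```python
def verify_candidates(stream, candidates, k=4):
    """Second pass: count each candidate's occurrences and keep those
    appearing more than n/k times. Uses O(k) memory."""
    counts = {c: 0 for c in candidates}
    n = 0
    for x in stream:
        n += 1
        if x in counts:
            counts[x] += 1
    return [c for c in candidates if counts[c] > n / k]
```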
Misra and Gries describe a couple of approaches. I don't entirely understand their paper, but a key idea is to use a bag.
Boyer and Moore's original majority algorithm paper has a lot of incomprehensible proofs and discussion of formal verification of FORTRAN code, but it has a very good start of an explanation of how the majority algorithm works. The key concept starts with the idea that if the majority of the elements are `A` and you remove, one at a time, a copy of `A` and a copy of something else, then in the end you will have only copies of `A`. Next, it should be clear that removing two different items, neither of which is `A`, can only increase the majority that `A` holds. Therefore it's safe to remove any pair of items, as long as they're different. This idea can then be made concrete. Take the first item out of the list and stick it in a box. Take the next item out and stick it in the box. If they're the same, let them both sit there. If the new one is different, throw it away, along with an item from the box. Repeat until all items are either in the box or in the trash. Since the box is only allowed to have one kind of item at a time, it can be represented very efficiently as a pair `(item type, count)`.
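A minimal sketch of that box as code (Python; only equality comparison of items is assumed):

```python
def majority_candidate(items):
    """The 'box', represented as the pair (item type, count)."""
    box_type, box_count = None, 0
    for x in items:
        if box_count == 0:
            box_type, box_count = x, 1   # box was empty: item goes in
        elif x == box_type:
            box_count += 1               # same kind: let it sit there
        else:
            box_count -= 1               # different: trash it plus one boxed item
    return box_type  # the majority element, if any majority exists
```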
The generalization to find all items that may occur more than n/k times is simple, but explaining why it works is a little harder. The basic idea is that we can find and destroy groups of k distinct elements without changing anything. Why? If w > n/k, then w - 1 > (n - k)/k. That is, if we take away one of the popular elements, and we also take away k-1 other elements, then the popular element remains popular!

Implementation: instead of only allowing one kind of item in the box, allow k-1 of them. Whenever you see a group of k different items show up (that is, there are k-1 types in the box, and the one arriving doesn't match any of them), you throw one of each type in the trash, including the one that just arrived. What data structure should we use for this "box"? Well, a bag, of course! As Misra and Gries explain, if the elements can be ordered, a tree-based bag with O(log k) basic operations will give the whole algorithm a complexity of O(n log k). One point to note is that the operation of removing one of each element is expensive (I think O(k log k)), but that cost is amortized over the arrivals of those elements, so it's no big deal. Of course, if your elements are hashable rather than orderable, you can use a hash-based bag instead, which under certain common assumptions will give even better asymptotic performance (but it's not guaranteed). If your elements are drawn from a small finite set, you can guarantee that. If they can only be compared for equality, then your bag gets much more expensive and I'm pretty sure you end up with something like O(nk) instead.