可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
Suppose we have a vector/array in C++ and we wish to count which of these N elements has maximum repetitive occurrences and output the highest count. Which algorithm is best suited for this job.
example:
int a = { 2, 456, 34, 3456, 2, 435, 2, 456, 2}
the output is 4 because 2 occurs 4 times. That is the maximum number of times 2 occurs.
回答1:
Sort the array and then do a quick pass to count each number. The algorithm has O(N*logN) complexity.
Alternatively, create a hash table, using the number as the key. Store in the hashtable a counter for each element you've keyed. You'll be able to count all elements in one pass; however, the complexity of the algorithm now depends on the complexity of your hasing function.
回答2:
Optimized for space:
Quicksort (for example) then iterate over the items, keeping track of largest count only.
At best O(N log N).
Optimized for speed:
Iterate over all elements, keeping track of the separate counts.
This algorithm will always be O(n).
回答3:
If you have the RAM and your values are not too large, use counting sort.
回答4:
A possible C++ implementation that makes use of STL could be:
#include <iostream>
#include <algorithm>
#include <map>
// functor
struct maxoccur
{
int _M_val;
int _M_rep;
maxoccur()
: _M_val(0),
_M_rep(0)
{}
void operator()(const std::pair<int,int> &e)
{
std::cout << "pair: " << e.first << " " << e.second << std::endl;
if ( _M_rep < e.second ) {
_M_val = e.first;
_M_rep = e.second;
}
}
};
int
main(int argc, char *argv[])
{
int a[] = {2,456,34,3456,2,435,2,456,2};
std::map<int,int> m;
// load the map
for(unsigned int i=0; i< sizeof(a)/sizeof(a[0]); i++)
m [a[i]]++;
// find the max occurence...
maxoccur ret = std::for_each(m.begin(), m.end(), maxoccur());
std::cout << "value:" << ret._M_val << " max repetition:" << ret._M_rep << std::endl;
return 0;
}
回答5:
a bit of pseudo-code:
//split string into array firts
strsplit(numbers) //PHP function name to split a string into it's components
i=0
while( i < count(array))
{
if(isset(list[array[i]]))
{
list[array[i]]['count'] = list + 1
}
else
{
list[i]['count'] = 1
list[i]['number']
}
i=i+1
}
usort(list) //usort is a php function that sorts an array by its value not its key, Im assuming that you have something in c++ that does this
print list[0]['number'] //Should contain the most used number
回答6:
The hash algorithm (build count[i] = #occurrences(i) in basically linear time) is very practical, but is theoretically not strictly O(n) because there could be hash collisions during the process.
An interesting special case of this question is the majority algorithm, where you want to find an element which is present in at least n/2 of the array entries, if any such element exists.
Here is a quick explanation, and a more detailed explanation of how to do this in linear time, without any sort of hash trickiness.
回答7:
If the range of elements is large compared with the number of elements, I would, as others have said, just sort and scan. This is time n*log n and no additional space (maybe log n additional).
THe problem with the counting sort is that, if the range of values is large, it can take more time to initialize the count array than to sort.
回答8:
Here's my complete, tested, version, using a std::tr1::unordered_map
.
I make this approximately O(n). Firstly it iterates through the n input values to insert/update the counts in the unordered_map
, then it does a partial_sort_copy
which is O(n). 2*O(n) ~= O(n).
#include <unordered_map>
#include <vector>
#include <algorithm>
#include <iostream>
namespace {
// Only used in most_frequent but can't be a local class because of the member template
struct second_greater {
// Need to compare two (slightly) different types of pairs
template <typename PairA, typename PairB>
bool operator() (const PairA& a, const PairB& b) const
{ return a.second > b.second; }
};
}
template <typename Iter>
std::pair<typename std::iterator_traits<Iter>::value_type, unsigned int>
most_frequent(Iter begin, Iter end)
{
typedef typename std::iterator_traits<Iter>::value_type value_type;
typedef std::pair<value_type, unsigned int> result_type;
std::tr1::unordered_map<value_type, unsigned int> counts;
for(; begin != end; ++begin)
// This is safe because new entries in the map are defined to be initialized to 0 for
// built-in numeric types - no need to initialize them first
++ counts[*begin];
// Only need the top one at this point (could easily expand to top-n)
std::vector<result_type> top(1);
std::partial_sort_copy(counts.begin(), counts.end(),
top.begin(), top.end(), second_greater());
return top.front();
}
int main(int argc, char* argv[])
{
int a[] = { 2, 456, 34, 3456, 2, 435, 2, 456, 2 };
std::pair<int, unsigned int> m = most_frequent(a, a + (sizeof(a) / sizeof(a[0])));
std::cout << "most common = " << m.first << " (" << m.second << " instances)" << std::endl;
assert(m.first == 2);
assert(m.second == 4);
return 0;
}
回答9:
It wil be in O(n)............ but the thing is the large no. of array can take another array with same size............
for(i=0;i
mar=count[o];
index=o;
for(i=0;i
then the output will be......... the element index is occured for max no. of times in this array........
here a[] is the data array where we need to search the max occurance of certain no. in an array.......
count[] having the count of each element..........
Note : we alrdy knw the range of datas will be in array..
say for eg. the datas in that array ranges from 1 to 100....... then have the count array of 100 elements to keep track, if its occured increament the indexed value by one........