Get the three most occurring words with their count values

Posted 2019-07-25 15:02


My code below gives me the most occurring word in a string. I want to get the three most occurring words from the vector, together with their count values. Any help?

I have used a vector and an unordered_map. In the last portion of the code I get the most occurring word from the vector.

#include <iostream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

int main(int argc, char *argv[])
{
    typedef std::unordered_map<std::string,int> occurrences;
    occurrences s1;
    std::string input = argv[1];

    std::istringstream iss(std::move(input));
    std::vector<std::string> most;
    int max_count = 0, second = 0, third = 0;

    // Here I get max_count, the 2nd highest and the 3rd highest count values
    while (iss >> input) {
        int tmp = ++s1[input];
        if (tmp == max_count) {
            most.push_back(input);
        } else if (tmp > max_count) {
            max_count = tmp;
            most.clear();
            most.push_back(input);
            third = second;
            second = max_count;
        } else if (tmp > second) {
            third = second;
            second = tmp;
        } else if (tmp > third) {
            third = tmp;
        }
    }

    // I have not used max_count, second and third below. I don't know how to access them for my purpose.

    // Print each word with its occurrence count. This works fine.
    for (occurrences::const_iterator it = s1.cbegin(); it != s1.cend(); ++it)
        std::cout << it->first << " : " << it->second << std::endl;

    // Prints the word which occurs the most times. **Here I want to print the 1st, 2nd and 3rd most occurring words with their occurrence counts. How do I do that?**
    std::cout << std::endl << "Maximum Occurrences" << std::endl;
    for (std::vector<std::string>::const_iterator it = most.cbegin(); it != most.cend(); ++it)
        std::cout << *it << std::endl;

    return 0;
}

Any idea how to get the 3 most occurring words?


I'd prefer to use a std::map<std::string, int> instead.

Use this as the source map and fill it by counting the words from a std::vector<std::string>.

Now create a std::multimap<int, std::string> that is a flipped version of the source map (count as key, word as value), with std::greater<int> as the comparator.

The first three entries of this final map are the most frequently used words, with their counts.

Example:


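#include <algorithm>
#include <functional>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

int main()
{
    std::vector<std::string> most { "lion","tiger","kangaroo", /* ... */ };

    // Count how often each word occurs.
    std::map<std::string, int> src;
    for (const auto& x : most)
        ++src[x];

    // Flipped map: count -> word, highest count first.
    std::multimap<int, std::string, std::greater<int>> dst;

    std::transform(src.begin(), src.end(), std::inserter(dst, dst.begin()),
                   [] (const std::pair<const std::string, int> &p) {
                       return std::pair<int, std::string>(p.second, p.first);
                   });

    // Print at most the first three entries (the most frequent words).
    auto it = dst.begin();
    for (int count = 0; count < 3 && it != dst.end(); ++it, ++count)
        std::cout << it->second << " : " << it->first << std::endl;
}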




It is easier and cleaner to use a heap to store the three most occurring words. It is also easily extensible to a larger number of most occurring words.
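
The answer above does not include code, so here is a minimal sketch of one way the heap idea could look. It assumes the words are first counted in an unordered_map and then pushed through a std::priority_queue used as a min-heap that never holds more than n entries. The helper name top_n_words and the (count, word) pair layout are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: returns up to n (count, word) pairs, most frequent first.
std::vector<std::pair<int, std::string>>
top_n_words(const std::vector<std::string>& words, std::size_t n)
{
    // Count how often each word occurs.
    std::unordered_map<std::string, int> counts;
    for (const auto& w : words)
        ++counts[w];

    using Entry = std::pair<int, std::string>;
    // Min-heap: the smallest (count, word) pair among the kept entries is on top.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (const auto& kv : counts) {
        heap.emplace(kv.second, kv.first);
        if (heap.size() > n)
            heap.pop();  // evict the current least frequent entry
    }

    // Popping yields ascending order, so fill the result from the back.
    std::vector<Entry> result(heap.size());
    for (std::size_t i = result.size(); i-- > 0; ) {
        result[i] = heap.top();
        heap.pop();
    }
    return result;
}
```

Calling top_n_words(words, 3) would give the three most occurring words; changing the second argument is all that is needed for a larger number. Words with equal counts are ordered by the pair comparison, so ties are broken by the word itself.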


If I wanted to know the n most occurring words, I'd have an n element array, iterate over the list of the words, and store the ones that make it into my top n into the array (dropping the lowest one).
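
As a rough sketch of the array idea described above, assuming the words are counted first and the array is kept sorted by descending count: each word is inserted at its position and the lowest entry falls off the end. The name top_words_array and the choice of a template parameter for n are hypothetical, not something given in the answer.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: keeps the N most frequent words in a fixed-size array,
// sorted by descending count. Unused slots keep an empty word and a count of 0.
template <std::size_t N>
std::array<std::pair<std::string, int>, N>
top_words_array(const std::vector<std::string>& words)
{
    // Count how often each word occurs.
    std::unordered_map<std::string, int> counts;
    for (const auto& w : words)
        ++counts[w];

    std::array<std::pair<std::string, int>, N> top{};  // all counts start at 0
    for (const auto& kv : counts) {
        // Find the first slot whose count is smaller than this word's count.
        auto pos = std::find_if(top.begin(), top.end(),
            [&](const std::pair<std::string, int>& e) { return e.second < kv.second; });
        if (pos == top.end())
            continue;  // this word does not make it into the top N
        // Shift lower entries down by one slot; the last (lowest) one is dropped.
        std::move_backward(pos, top.end() - 1, top.end());
        *pos = kv;
    }
    return top;
}
```

For example, top_words_array<3>(words) would return the three most occurring words with their counts. When several words share the same count, which of them survives depends on the iteration order of the unordered_map, so ties are not resolved in any particular way in this sketch.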

Tags: c++ vector