Easier way to count identical strings in a vector?

2019-03-01 11:19发布

问题:

I am trying to get an output of the number of all the identical strings in a vector as part of a much larger program. After a lot of research I have managed to put something together that works but it seems messy and I was wondering if there was a better way to do it.

#include <vector>
#include <string>
#include <map>
#include <algorithm>
#include <iostream>

using namespace std;

void setMap(string i);
void addMap(string i);
map<string, int> myMap;

int main()
{
    vector<string> myVector;
    string myArray[6]={"foo","foo","bar","roo","foo","bar"};
    for (int i=0; i<6; i++)
    {
        myVector.push_back(myArray[i]);
    }
    for_each (myVector.begin(), myVector.end(), setMap);
    for_each (myVector.begin(), myVector.end(), addMap);
    for (map<string, int, less< string >>::const_iterator iter = myMap.begin();
      iter != myMap.end(); ++iter )
      cout <<iter->first<<'\t'<<iter->second<<endl;
    return 0;
}

void setMap(string i)
{
    myMap[i]=0;
}

void addMap(string i)
{
    myMap[i]++;
}

This code works fine and gives me the output I was after but I'm not that keen on having to add 2 extra functions to make it work or having to make the map global. Any hints would be gratefully received.

回答1:

Well the simplest way to not have the extra functions and not have the map as global would be to not use for_each.

for_each (myVector.begin(), myVector.end(), setMap);
for_each (myVector.begin(), myVector.end(), addMap);

becomes

map<string, int> myMap;
for (vector<string>::iterator i = myVector.begin(); i != myVector.end(); ++i)
    myMap[*i]=0;
for (vector<string>::iterator i = myVector.begin(); i != myVector.end(); ++i)
    ++myMap[*i];

Once you done that you could also remove the first loop

map<string, int> myMap;
for (vector<string>::iterator i = myVector.begin(); i != myVector.end(); ++i)
    ++myMap[*i];

since the map values will be initialised to zero anyway.

What made you think you had to use for_each anyway?



回答2:

Your setMap function is unnecessary.

Consider what this function does, should the map's key not be present.

void addMap(string i)
{
    myMap[i]++;
}

The expression myMap[i] will add a new key to your map.

Since the value type is int, this new value will be int(), which is guaranteed to be 0.



回答3:

What about this? Encapsulate the counting mechanism in a separate function for reusability.

// Iterator pair based interface
template <class Iterator>
std::map<typename Iterator::value_type,int>
count(Iterator begin, Iterator end) {
    std::map<typename Iterator::value_type,int> counts;
    for (Iterator i = begin; i != end; ++i)
        counts[*i]++;
    return counts;
}

// Sequence interface
template <class Sequence>
inline std::map<typename Sequence::value_type,int>
count(Sequence seq) {
    return count(seq.begin(), seq.end());
}

Then simply use it like this:

// C++11
for (const auto & c : count(myVector))
    cout << c->first << '\t' << c->second << endl;

// C++03
std::map<string,int> counts = count(myVector);
for (std::map<string,int>::const_iterator c = counts.begin(), e = counts.end(); c != e; ++c)
    cout << c->first << '\t' << c->second << endl;

Simple demo



回答4:

Under C++11, you can do this:

#include <string>
#include <unordered_map>
#include <iostream>

int main() {

    std::string myArray[6] = {"foo","foo","bar","roo","foo","bar"};

    std::unordered_map<std::string, size_t> m;
    for (const auto& s : myArray)
        ++m[s];

    for (const auto& p : m)
        std::cout << p.first << "\t" << p.second << std::endl;

}

This prints:

foo     3
bar     2
roo     1

This works because m[s] will automatically insert s into m if not already there.

Using std::unordered_map (a hashtable) is likely to be cheaper than std::map (a balanced tree).


You can do something very similar under C++03, except the "for each" loops shown above would be replaced by the regular "for" loops.



回答5:

#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <map>

using namespace std;

int main (int argc, char * const argv[]) {

    string myArray[]={"foo","foo","bar","roo","foo","bar"};
    int arr_length = 6;
    vector<string> myVector(myArray, myArray + arr_length);

    //Print contents of vector:
    copy(myVector.begin(), 
         myVector.end(), 
         ostream_iterator<string>(cout, " ")
    );

    cout << endl;



    map<string, int> myMap;

    vector<string>::iterator pos;
    for (pos=myVector.begin(); pos<myVector.end(); ++pos)
    {
        myMap[*pos] += 1;
    }

    map<string, int>::iterator mapPos;
    for (mapPos=myMap.begin(); mapPos != myMap.end(); ++mapPos) {
        cout << "word: " << mapPos->first << "\t"
             << "count: " << mapPos->second << endl;
    }




        return 0;
}


--output:--
foo foo bar roo foo bar 
word: bar   count: 2
word: foo   count: 3
word: roo   count: 1