How do I get characters common to two vectors in C

2020-08-26 03:46发布

问题:

I am trying to compare two vector objects, and return a single vector containing all the chars which appear in both vectors.

How would I go about this without writing some horribly complex manual method which compares every char in the first vector to every char in the second vector and using an if to add it to a third vector (which would be returned) if they match.

Maybe my lack of real experience with vectors is making me imagine this will be harder than it really is, but I suspect there is some simplier way which I have been unable to find through searching.

回答1:

I think you're looking for std::set_intersection. The source vectors have to be sorted though. If you don't care about the order of your output vector, you could always run it on sorted copies of your source vectors.

And BTW, the manual naive way isn't horribly complex. Given two source vectors s1 and s2, and a destination vector dest, you could write something that looks like this:

for (std::vector<char>::iterator i = s1.begin(); i != s1.end(); ++i)
{
    if (std::find(s2.begin(), s2.end(), *i) != s2.end())
    {
        dest.push_back(*i);
    }
}

You have a lot of options for the find step depending on your choice of data structure.



回答2:

If I had to do this on two unsorted vectors (without library help), I think I'd add all the elements of one to a hashtable then iterate through the second looking up each--should be more efficient than sorting both lists first as well.



回答3:

int temp[5000]; // declare this globally if you're going to be 
                // doing a lot of set_intersection calls   

int main() {

  char x[]={'a','b','c','d','e'};
  char y[]={'b','c','g'};
  vector<char> v1(x,x+sizeof x/sizeof x[0]);
  vector<char> v2(y,y+sizeof y/sizeof y[0]);
  sort(v1.begin(),v1.end());
  sort(v2.begin(),v2.end());  // the vectors *must* be sorted!!!!!!

  vector<char> inter=vector<char>(temp,set_intersection(v1.begin(),v1.end(),v2.begin(),v2.end(),temp)); // inter contains {'b','c'}
  int cnt=set_intersection(v1.begin(),v1.end(),v2.begin(),v2.end(),temp) - temp;    // cnt=2

  for(int i = 0; i < (int)inter.size(); ++i) {
    cout<<inter[i]<<" ";
  }
  cout<<endl;

  return 0;
}


回答4:

Use set_intersection. Here's a working sample:

#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

int main()
{
    vector<string> v1;
    v1.push_back("Mary");
    v1.push_back("had");
    v1.push_back("a");

    vector<string> v2;
    v2.push_back("a");
    v2.push_back("little");
    v2.push_back("lamb");

    sort(v1.begin(), v1.end());
    sort(v2.begin(), v2.end());

    vector<string> v3;
    set_intersection(v1.begin(), v1.end(), v2.begin(), v2.end(), back_inserter(v3));

    copy(v3.begin(), v3.end(), ostream_iterator<string>(cout, "\r\n"));
    return 0;
}


回答5:

This doesn't extend well beyond standard char type (maybe to unicode, depending on application), but if you're interested in doing this in O(n) time, this should work.


#include <vector>
#include <string>
#include <iostream>

std::vector<char> intersect(const std::vector<bool>& x,
                            const std::vector<bool>& y)
{
    std::vector<char> rv;

    std::vector<bool>::const_iterator ix, iy;
    size_t i;

    for (i=0, ix = x.begin(), iy = y.begin();
         ix != x.end() && iy != y.end();
         ++i, ++ix, ++iy)
        if (*ix && *iy) rv.push_back( (char) i);

    return rv;
}

std::vector<bool> poll(const std::vector<char>& x)
{
    std::vector<bool> rv(256, false);

    for (std::vector<char>::const_iterator i = x.begin(); i != x.end(); ++i)
        rv[*i] = true;

    return rv;
}

std::vector<char> build(const std::string& val)
{
    std::vector<char> rv;

    for (size_t i = 0; i < val.size(); ++i)
        rv.push_back(val[i]);

    return rv;
}

int main(int argc, char *argv[])
{
    std::vector<char> x1 = build("The Quick Brown Fox Jumps Over The Lazy Dog");
    std::vector<char> x2 = build("Oh give me a home where the buffalo roam");

    std::vector<char> intersection = intersect(poll(x1), poll(x2));

    for (std::vector<char>::iterator i=intersection.begin();
            i != intersection.end(); ++i)
        std::cout << *i;

    std::cout << std::endl;

    return 0;
}


回答6:

Since it turns out from your later question you only actually care about 26 characters:

std::bitset<26> in;
for (std::vector<char>::iterator it = first.begin(); it != first.end(); ++it) {
    in[*it - 'a'] = true;
}
for (std::vector<char>::iterator it = second.begin(); it != second.end(); ++it) {
    if (in[*it - 'a']) {
        result.push_back(*it);
        // this line is only needed if 'second' can contain duplicates
        in[*it - 'a'] = false;
    }
}

In fact a bitset<UCHAR_MAX> is small on almost all architectures. Just watch out for those DSPs with 32-bit chars, and be cautious adapting this technique to wchar_t.

With BOOST_FOREACH, the code even looks reasonable:

assert(UCHAR_MAX <= 512 && "What kind of crazy machine is this?");
std::bitset<UCHAR_MAX> in;

BOOST_FOREACH(unsigned char c, first) {
    in[c] = true;
}

BOOST_FOREACH(unsigned char c, second) {
    if (in[c]) {
        result.push_back(c);
        // this line is only needed if 'second' can contain duplicates
        in[c] = false;
    }
}


回答7:

Maybe you should be using std::strings instead of vectors, if you have characters in them? Strings have plenty of functionality for searching etc.