Efficiently computing vector combinations

Published 2019-02-13 07:59

乱世女痞

Question:

I'm working on a research problem out of curiosity, and I don't know how to program the logic I have in mind. Let me explain:

I have four vectors, for example:

v1 = 1 1 1 1
v2 = 2 2 2 2
v3 = 3 3 3 3
v4 = 4 4 4 4

Now what I want to do is to add them combination-wise, that is,

v12 = v1+v2
v13 = v1+v3
v14 = v1+v4
v23 = v2+v3
v24 = v2+v4
v34 = v3+v4

Up to this step everything is fine. The problem is that I now want to add to each of these sums one vector from v1, v2, v3, v4 that it does not already contain. For example:

v3 and v4 haven't been added to v12, so I want to create v123 and v124. Similarly for all the other vectors:

v12 should become:
v123 = v12+v3
v124 = v12+v4

v13 should become:
v132 // This should not occur because I already have v123
v134

v14 should become:
v142 // Cannot occur because I've v124 already
v143 // Cannot occur

v23 should become:
v231 // Cannot occur
v234 ... and so on.

It is important that I do not do everything in one step at the start. For example, I could compute 4C3 (4 choose 3) directly and be done, but I want to build the combinations step by step, one iteration at a time.

How do I program this?

P.S.: I'm trying to work on a modified version of the Apriori algorithm from data mining.

Answer 1:

In C++, given the following routine:

#include <algorithm> // std::iter_swap, std::rotate

template <typename Iterator>
inline bool next_combination(const Iterator first,
                                   Iterator k,
                             const Iterator last)
{
   /* Credits: Thomas Draper */
   if ((first == last) || (first == k) || (last == k))
      return false;
   Iterator itr1 = first;
   Iterator itr2 = last;
   ++itr1;
   if (last == itr1)
      return false;
   itr1 = k;
   --itr2;
   while (first != itr1)
   {
      if (*--itr1 < *itr2)
      {
         Iterator j = k;
         while (!(*itr1 < *j)) ++j;
         std::iter_swap(itr1,j);
         ++itr1;
         ++j;
         itr2 = k;
         std::rotate(itr1,j,last);
         while (last != j)
         {
            ++j;
            ++itr2;
         }
         std::rotate(k,itr2,last);
         return true;
      }
   }
   std::rotate(first,k,last);
   return false;
}

You can then proceed to do the following:

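#include <algorithm> // std::sort
#include <cstddef>   // std::size_t
#include <iostream>  // std::cout

int main()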
{
   unsigned int vec_idx[] = {0,1,2,3,4};

   const std::size_t vec_idx_size = sizeof(vec_idx) / sizeof(unsigned int);

   {
      // All unique combinations of two vectors, for example, 5C2
      std::size_t k = 2;
      do
      {
         std::cout << "Vector Indices: ";
         for (std::size_t i = 0; i < k; ++i)
         {
           std::cout << vec_idx[i] << " ";
         }
         std::cout << "\n"; // one combination per line
      }
      while (next_combination(vec_idx,
                              vec_idx + k,
                              vec_idx + vec_idx_size));
   }

   std::sort(vec_idx,vec_idx + vec_idx_size);

   {
      // All unique combinations of three vectors, for example, 5C3
      std::size_t k = 3;
      do
      {
         std::cout << "Vector Indices: ";
         for (std::size_t i = 0; i < k; ++i)
         {
           std::cout << vec_idx[i] << " ";
         }
         std::cout << "\n"; // one combination per line
      }
      while (next_combination(vec_idx,
                              vec_idx + k,
                              vec_idx + vec_idx_size));
   }

   return 0;
}

Note 1: Because the next_combination routine has an iterator-oriented interface, any STL container whose iterators are bidirectional can also be used (the routine decrements its iterators), such as std::vector, std::deque and std::list, to name a few.

Note 2: This problem is well suited to memoization. You can create a map and fill it with the vector sums of given combinations. Before computing the sum of a given set of vectors, you can look up whether the sum of any subset has already been calculated and reuse that result. Summation is cheap and fast, but if the calculation were far more complex and time consuming, this technique would bring a significant performance improvement.

Answer 2:

I think this problem can be solved by marking which combinations have already occurred.

My first thought was to use a 3-dimensional array to mark which combinations have occurred, but that is not very good.

How about using a bit array (such as an integer) for flagging? For example:

Num 1 = 2^0 for vector 1
Num 2 = 2^1 for vector 2
Num 4 = 2^2 for vector 3
Num 8 = 2^3 for vector 4

When you combine vectors, just add up their representative numbers. For example, vector 124 will have the value 1 + 2 + 8 = 11. This value is unique for every combination.

This is just my thought. Hope it helps in some way.

EDIT: Maybe I wasn't clear enough about my idea. I'll try to explain it more clearly:

1) Assign each vector a representative number. This number is the vector's ID, and it is unique. Moreover, the sum of every subset of these numbers is unique, which means that if the sum of k representative numbers is M, we can easily tell which vectors took part in the sum.

We do that by assigning 2^0 to vector 1, 2^1 to vector 2, 2^2 to vector 3, and so on.

For every M = 2^x + 2^y + 2^z + ... = (2^x OR 2^y OR 2^z OR ...), we know that vectors (x + 1), (y + 1), (z + 1), ... take part in the sum. This can easily be checked by writing the number in binary.

For example, we know that:

2^0 = 1 (binary)
2^1 = 10 (binary)
2^2 = 100 (binary)
...

So if the sum is 10010 (binary), we know that the vector with number 10 (vector 2) and the vector with number 10000 (vector 5) take part in the sum.

Best of all, the sum can be calculated with the bitwise OR operator, which is also easy to see if you write the numbers in binary.

2) Using the above facts, every time before you compute the sum of your vectors, you can add/OR their representative numbers first and keep track of the results in something like a lookup array. If the value already exists in the lookup array, you can skip that combination. That solves the problem.

Answer 3:

Maybe I am misunderstanding, but isn't this equivalent to generating all subsets (power set) of 1, 2, 3, 4 and then for each element of the power set, summing the vector? For instance:

//This is pseudo C++ since I'm too lazy to type everything
//push back the vectors or pointers to vectors, etc.
vector< vector< int > > v = v1..v4;

//Populate a vector with 1 to 4
vector< int > n = 1..4;

//Function that generates the power set {nil, (1), (1,2), (1,3), (1,4), (1,2,3), etc.}
vector< vector < int > > power_vec = generate_power_set(n);

//One might want to make a string key by doing a Perl-style join of the subset together by a comma or something...
map< vector < int >,vector< int > > results;
//For each subset, we sum the original vectors together
for subset_iter over power_vec{
    //Assumes all the vectors are the same length; can be modified carefully if not.
    //Sized and zero-filled up front: reserve() alone would leave result empty.
    vector<int> result(length(v1), 0);
    for ii=0 to length(v1)-1{
        for iter over subset from subset_iter{
            //Subset entries are 1-based, v is 0-based
            result[ii]+=v[iter-1][ii];
        }
    }
    results[*subset_iter] = result;
}

If that is the idea you had in mind, you still need a power set function, but that code is easy to find if you search for "power set"; see, for example, "Obtaining a powerset of a set in Java".

Answer 4:

1) Maintain a list of all combinations obtained by choosing two values (the 4C2 elements).
2) Create a vector of sets, where each set consists of one of the 4C2 combinations from step 1 plus one element of the original vector. To do this, iterate over the original vectors and, for each one, add it to each combination from step 1 to create a set; add the set to the vector of sets only if it is not already present.
3) Sum up the vectors in each set obtained in step 2.

But as you indicated, the easiest is 4C3.

Here is something written in Python. You can adapt it to C++:

import itertools

l1 = ['v1','v2','v3','v4']
res = []
for e in itertools.combinations(l1,2):
    res.append(e)

fin = []
for e in res:
    for l in l1:
        aset = set((e[0],e[1],l))
        if aset not in fin and len(aset) == 3:
            fin.append(aset)
print([sorted(s) for s in fin])

This prints:

[['v1', 'v2', 'v3'], ['v1', 'v2', 'v4'], ['v1', 'v3', 'v4'], ['v2', 'v3', 'v4']]

This is the same result as 4C3.

Tags: c++ algorithm vector combinations
