I have several data that looks like this:
Vector1_elements = T,C,A
Vector2_elements = C,G,A
Vector3_elements = C,G,T
..... up to ...
VectorK_elements = ...
#Note also that the member of each vector is always 3.
What I want to do is to create all combination of elements in Vector1 through out VectorK. Hence in the end we hope to get this output (using Vector1,2,3):
TCC
TCG
TCT
TGC
TGG
TGT
TAC
TAG
TAT
CCC
CCG
CCT
CGC
CGG
CGT
CAC
CAG
CAT
ACC
ACG
ACT
AGC
AGG
AGT
AAC
AAG
AAT
The problem I am having now is that the following code of mine does that by hardcoding the loops. Since number of Vectors can be varied, we need a flexible way to get the same result. Is there any?
This code of mine can only handle up to 3 Vectors (hardcoded):
#include <iostream>
#include <vector>
#include <fstream>
#include <sstream>
using namespace std;
int main ( int arg_count, char *arg_vec[] ) {
vector <string> Vec1;
Vec1.push_back("T");
Vec1.push_back("C");
Vec1.push_back("A");
vector <string> Vec2;
Vec2.push_back("C");
Vec2.push_back("G");
Vec2.push_back("A");
vector <string> Vec3;
Vec3.push_back("C");
Vec3.push_back("G");
Vec3.push_back("T");
for (int i=0; i<Vec1.size(); i++) {
for (int j=0; j<Vec2.size(); j++) {
for (int k=0; k<Vec1.size(); k++) {
cout << Vec1[i] << Vec2[i] << Vec3[k] << endl;
}
}
}
return 0;
}
You can implement this like an odometer, which leads to the following (works for different-sized vectors):
Say you have K vectors in an array v:
v[0], v[1], ... v[K-1]
Keep an array of iterators
it
(size K) into your vectors, starting withit[i] = v[i].begin()
. Keep incrementingit[K-1]
in a loop. When any iterator hits theend()
of the corresponding vector, you wrap it around tobegin()
and increment the previous iterator also (so whenit[K-1]
wraps around, you incrementit[K-2]
). These increments may "cascade" so you should do them in a loop backwards. Whenit[0]
wraps around, you're done (so your loop condition could be something likewhile (it[0] != v[0].end())
Putting all that together, the loop that does the work (after setting up the iterators) should be something like:
If you're interested in complexity, the number of iterator increments that get performed is easy to calculate. For simplicity here I'll assume each vector is the same length N. The total number of combinations is NK. The last iterator gets incremented each time, so that is NK, and moving back through the iterators this count gets divided by N each time, so we have NK + NK-1 + ... N1; this sum equals N(NK - 1)/(N-1) = O(NK). This also means that the amortized cost per-combination is O(1).
Anyway, in short, treat it like an odometer spinning its digit wheels.
Combining three vectors is essentially the same as first combining two vectors, and then combining the third one with the result.
So it all boils down to writing a function that can combine two vectors.
And then something like:
and so on for every vector you need combined.
Note that it's more the "C++ way" to use input and output iterators i.s.o. passing vectors around, and much more efficient. In the above version the vector gets copied over and over...
I simply used vectors to stay closer to your original code and, hopefully, make more sense to you.
The basic difficulty with recursion here is that you need to keep track of the entire list of indices (or else construct the string incrementally, as another question points out).
An expedient way to handle this problem without constructing additional objects inside the loops is to hand your recursive function a vector of indices, of the same length as the vector of vectors: