I have an algorithm problem. I am trying to find all unique subset of values from a larger set of values.
For example say I have the set {1,3,7,9}
. What algorithm can I use to find these subsets of 3?
{1,3,7}
{1,3,9}
{1,7,9}
{3,7,9}
Subsets should not repeat, and order is unimportant, set {1,2,3} is the same as set {3,2,1} for these purposes. Psudocode (or the regular kind) is encouraged.
A brute force approach is obviously possible, but not desired.
For example such a brute force method would be as follows.
for i = 0 to size
for j = i + 1 to size
for k = j + 1 to size
subset[] = {set[i],set[j],set[k]}
Unfortunately this requires an additional loop for each element desired in the subset, which is undesirable if, for example, you want a subset of 8 elements.
Some Java code using recursion.
The basic idea is to try to swap each element with the current position and then recurse on the next position (but we also need startPos
here to indicate what the last position that we swapped with was, otherwise we'll get a simple permutation generator). Once we've got enough elements, we print all those and return.
static void subsets(int[] arr, int pos, int depth, int startPos)
{
if (pos == depth)
{
for (int i = 0; i < depth; i++)
System.out.print(arr[i] + " ");
System.out.println();
return;
}
for (int i = startPos; i < arr.length; i++)
{
// optimization - not enough elements left
if (depth - pos + i > arr.length)
return;
// swap pos and i
int temp = arr[pos];
arr[pos] = arr[i];
arr[i] = temp;
subsets(arr, pos+1, depth, i+1);
// swap pos and i back - otherwise things just gets messed up
temp = arr[pos];
arr[pos] = arr[i];
arr[i] = temp;
}
}
public static void main(String[] args)
{
subsets(new int[]{1,3,7,9}, 0, 3, 0);
}
Prints:
1 3 7
1 3 9
1 7 9
3 7 9
A more detailed explanation (through example):
First things first - in the above code, an element is kept in the same position by performing a swap with itself - it doesn't do anything, just makes the code a bit simpler.
Also note that at each step we revert all swaps made.
Say we have input 1 2 3 4 5
and we want to find subsets of size 3.
First we just take the first 3 elements - 1 2 3
.
Then we swap the 3
with 4
and 5
respectively,
and the first 3 elements gives us 1 2 4
and 1 2 5
.
Note that we've just finished doing all sets containing 1
and 2
together.
Now we want sets of the form 1 3 X
, so we swap 2
and 3
and get 1 3 2 4 5
. But we already have sets containing 1
and 2
together, so here we want to skip 2
. So we swap 2
with 4
and 5
respectively, and the first 3 elements gives us 1 3 4
and 1 3 5
.
Now we swap 2
and 4
to get 1 4 3 2 5
. But we want to skip 3
and 2
, so we start from 5
. We swap 3
and 5
, and the first 3 elements gives us 1 4 5
.
And so on.
Skipping elements here is perhaps the most complex part. Note that whenever we skip elements, it just involves continuing from after the position we swapped with (when we swapped 2
and 4
, we continued from after the 4
was). This is correct because there's no way an element can get to the left of the position we're swapping with without having been processed, nor can a processed element get to the right of that position, because we process all the elements from left to right.
Think in terms of the for-loops
It's perhaps the simplest to think of the algorithm in terms of for-loops.
for i = 0 to size
for j = i + 1 to size
for k = j + 1 to size
subset[] = {set[i],set[j],set[k]}
Each recursive step would represent a for-loop.
startPos
is 0
, i+1
and j+1
respectively.
depth
is how many for-loops there are.
pos
is which for-loop we're currently at.
Since we never go backwards in a deeper loop, it's safe to use the start of the array as storage for our elements, as long as we revert the changes when we're done with an iteration.
If you are interested only in subsets of size 3, then this can be done using three simple nested for loops.
for ( int i = 0; i < arr.size(); i++ )
for ( int j = i+1; j < arr.size(); j++ )
for ( int k = j+1; k < arr.size(); k++ )
std::cout << "{ " << arr[i] <<"," << arr[j] <<"," << arr[k] <<" }";
For a more general case you will have to use recursion.
void recur( set<int> soFar, set<int> remaining, int subSetSize ) {
if (soFar.size() == subSetSize) {
print soFar;
return;
}
for ( int I = 0; I < remaining.size(); I++ ) {
//take out Ith element from remaining and push it in soFar.
// recur( newSofar, newRemaining, subSetSize);
}
}