I'm trying to implement an algorithm to get all combinations of k elements out of a set of n elements where the difference between two consecutive combinations are maximized (so kind of reverse Gray codes). In other words, the combinations should be ordered to avoid elements from appearing twice in a row, and so that no element is unnecessarily discriminated.
Ideally, the algorithm would also NOT pre-calculate all combinations and store them into memory, but rather deliver combinations on demand. I have searched extensively for this and found a few detailed answers such as https://stackoverflow.com/a/127856/1226020, but I can't seem to apply this. Also, many of the articles linked in that answer are paid content.
To illustrate what I mean:
From a set of [0, 1, 2, 3, 4], find all combinations of two elements. Using a simple algorithm that tries to increment the right-most element until no longer possible, then moving left, incrementing the previous digit etc, I get the following results:
[0, 1]
[0, 2]
[0, 3]
[0, 4]
[1, 2]
[1, 3]
[1, 4]
[2, 3]
[2, 4]
[3, 4]
I produce this result with the following Java code:
public class CombinationGenerator {
private final int mNrElements;
private final int[] mCurrentCombination;
public CombinationGenerator(int n, int k) {
mNrElements = n;
mCurrentCombination = new int[k];
initElements(0, 0);
// fake initial state in order not to miss first combination below
mCurrentCombination[mCurrentCombination.length - 1]--;
}
private void initElements(int startPos, int startValue) {
for (int i = startPos; i < mCurrentCombination.length; i++) {
mCurrentCombination[i] = i + startValue - startPos;
}
}
public int[] getNextCombination() {
for (int i = 0; i < mCurrentCombination.length; i++) {
int pos = mCurrentCombination.length - 1 - i;
if (mCurrentCombination[pos] < mNrElements - 1 - i) {
initElements(pos, mCurrentCombination[pos] + 1);
return mCurrentCombination;
}
}
return null;
}
public static void main(String[] args) {
CombinationGenerator cg = new CombinationGenerator(5, 2);
int[] c;
while ((c = cg.getNextCombination()) != null) {
System.out.println(Arrays.toString(c));
}
}
}
This is not what I want, because I want each consecutive combination to be as different as possible from the previous one. Currently, element "1" appears four times in a row, and then never again. For this particular example, one solution would be:
[0, 1]
[2, 3]
[0, 4]
[1, 2]
[3, 4]
[0, 2]
[1, 3]
[2, 4]
[0, 3]
[1, 4]
I have indeed managed to accomplish this result for this particular case by applying a sorting algoritm after the combinations are generated, but this does not fulfill my requirement of on-demand combination generation as the entire set of combinations has to be generated at once, then sorted and kept in memory. I'm not sure it works for arbitrary k and n values either. And finally, I'm pretty sure it's not the most efficient way as the sorting algorithm basically loops through the set of combinations trying to find one sharing no elements with the previous combination. I also considered keeping a table of "hit counts" for each element and use that to always get the next combination containing the lowest combined hit count. My somewhat empirical conclusion is that it will be possible to avoid elements from completely appearing in two consecutive combinations if n > 2k. Otherwise, it should at least be possible to avoid elements from appearing more than twice in a row etc.
You could compare this problem to what is achieved for k = 2 using a standard round-robin scheme for football games etc, but I need a solution for arbitrary values of k. We can imagine this to be a tournament of some sort of game where we have n players that are to play against all other players in a set of games, each game holding k players. Players should as far as possible not have to play two games in a row, but should also not have to wait unnecessarily long between two game appearances.
Any pointers on how to solve this either with a reliable sorting algorithm post generation, or - preferably - on-demand, would be awesome!
Note: Let's typically assume that n <= 50, k <= 5
Thanks
While I really appreciate the efforts from @tucuxi and @David Eisenstadt, I found some issues with the Hamiltonian approach, namely that it didn't solve for certain values of n and k and certain elements were also unnecessarily discriminated.
I decided to give the ideas listed in my question a go (hit count table) and it seems to give pretty good results. This solution however also requires a post-generation sorting which does not fulfill the on-demand bonus requirement. For reasonable n and k this should be feasible though.
Admittedly, I've found that my algorithm sometimes seems to favour combinations that result in consecutive appearances for one element, which can probably be tuned. But as of now, my algorithm might be inferior to tucuxi's for specifically. It does however present a solution for every n, k and seems to discriminate elements less.
My code is listed below.
Thanks again.
Results for :
Results for :
Quick & dirty working code working on @DavidEisenstat's suggestion:
Output for above code (7,4); distance (as length - size_of_intersection) is always 3; trying to use 4 would lead to a disconnected graph:
Missing bits of code:
Distance: (not efficient at all, but dwarfed by cost of hamiltonian calculation)
And for finding the hamiltonian, I modified code from http://www.sanfoundry.com/java-program-find-hamiltonian-cycle-unweighted-graph/:
Be warned: this will be very slow for large numbers of combinations...