This is an algorithmic problem. If I miss any existing function in Python that helps, please give a shout.
Given a set s
of n
elements, we can use itertools.combinations()
function in Python to find all the unique k-element subsets. Let's call the set containing all these subsets S
. Notice that each such subset has k
distinct elements.
The problem is 2-step. First, given these k-distinct-element subsets, I want to compose (some of) them such that (a composition is simply a superset of some subsets):
The intersection between any two subsets within a composition is empty
The union between all the subsets in a composition gives exactly the original set s
Second, I want to find the compositions that:
Do not share any subset
Their union gives exactly the S
, i.e. the set of all the k
-element subsets
As a concrete example here, consider the original set s = {a, b, c, d}
and k = 2
, we then will have the following three compositions/supersets:
{{a, b}, {c, d}}, {{a, c}, {b, d}}, {{a, d}, {b, c}}
Obviously, the size of s
can be large and k >= 2
, so an efficient (especially the speed) algorithm is needed here.
P.S. I phrase the problem in 2 steps, but it may well be that an efficient algorithm attacks the problem from a different angle.
I implemented the integral maximum flow construction that's used to prove Baranyai's theorem. More details in your favorite textbook covering factors of a complete hypergraph.
from collections import defaultdict
from fractions import Fraction
from math import factorial
from operator import itemgetter
def binomial(n, k):
return factorial(n) // (factorial(k) * factorial(n - k))
def find_path(graph, s, t):
stack = [s]
predecessor = {s: t}
while stack:
v = stack.pop()
for u in graph[v]:
if u not in predecessor:
stack.append(u)
predecessor[u] = v
assert t in predecessor
path = [t]
while path[-1] != s:
path.append(predecessor[path[-1]])
path.reverse()
return path
def round_flow(flow):
while True:
capacities = []
for (u, v), x in flow.items():
z = x - x.numerator // x.denominator
if z:
capacities.append(((v, u), z))
capacities.append(((u, v), 1 - z))
if not capacities:
break
(t, s), delta = min(capacities, key=itemgetter(1))
graph = defaultdict(list)
for (v, u), z in capacities:
if (v, u) not in [(s, t), (t, s)]:
graph[v].append(u)
path = find_path(graph, s, t)
for i, v in enumerate(path):
u = path[i - 1]
if (u, v) in flow:
flow[(u, v)] += delta
else:
flow[(v, u)] -= delta
def baranyai(n, k):
m, r = divmod(n, k)
assert not r
M = binomial(n - 1, k - 1)
partition = [[()] * m for i in range(M)]
for l in range(n):
flow = defaultdict(Fraction)
for i, A_i in enumerate(partition):
for S in A_i:
flow[(i, S)] += Fraction(k - len(S), n - l)
round_flow(flow)
next_partition = []
for i, A_i in enumerate(partition):
next_A_i = []
for S in A_i:
if flow[(i, S)]:
next_A_i.append(S + (l,))
flow[(i, S)] -= 1
else:
next_A_i.append(S)
next_partition.append(next_A_i)
partition = next_partition
assert len(partition) == M
classes = set()
for A in partition:
assert len(A) == m
assert all(len(S) == k for S in A)
assert len({x for S in A for x in S}) == n
classes.update(map(frozenset, A))
assert len(classes) == binomial(n, k)
return partition
if __name__ == '__main__':
print(baranyai(9, 3))
print(baranyai(20, 2))