I've been using the random_element()
function provided by SAGE to generate random integer partitions for a given integer (N
) that are a particular length (S
). I'm trying to generate unbiased random samples from the set of all partitions for given values of N
and S
. SAGE's function quickly returns random partitions for N (i.e. Partitions(N).random_element()
).
However, it slows immensely when adding S
(i.e. Partitions(N,length=S).random_element()
). Likewise, filtering out random partitions of N
that are of length S
is incredibly slow.
However, and I hope this helps someone, I've found that in the case when the function returns a partition of N
not matching the length S
, that the conjugate partition is often of length S. That is:
S = 10
N = 100
part = list(Partitions(N).random_element())
if len(part) != S:
SAD = list(Partition(part).conjugate())
if len(SAD) != S:
continue
This increases the rate at which partitions of length S
are found and appears to produce unbiased samples (I've examined the results against entire sets of partitions for various values of N
and S
).
However, I'm using values of N (e.g. 10,000
) and S (e.g. 300
) that make even this approach impractically slow. The comment associated with SAGE's random_element()
function admits there is plenty of room for optimization. So, is there a way to more quickly generate unbiased (i.e. random uniform) samples of integer partitions matching given values of N
and S
, perhaps, by not generating partitions that do not match S
? Additionally, using conjugate partitions works well in many cases to produce unbiased samples, but I can't say that I precisely understand why.
Finally, I have a definitively unbiased method that has a zero rejection rate. Of course, I've tested it to make sure the results are representative samples of entire feasible sets. It's very fast and totally unbiased. Enjoy.
First, a function to find the smallest maximum addend for a partition of n with s parts
Next, A function that uses a cache and memoiziation to find the number of partitions of n with s parts having x as the largest part. This is fast, but I think there's a more elegant solution to be had. e.g., Often: P(N,S,max=K) = P(N-K,S-1) Thanks to ante (https://stackoverflow.com/users/494076/ante) for helping me with this: Finding the number of integer partitions given a total, a number of parts, and a maximum summand
Finally, a function to find uniform random partitions of n with s parts, with no rejection rate! Each randomly chosen number codes for a specific partition of n having s parts.
I ran into a similar problem when I was trying to calculate the probability of the strong birthday problem.
First off, the partition function explodes when given only modest amount of numbers. You'll be returning a LOT of information. No matter which method you're using N = 10000 and S = 300 will generate ridiculous amounts of data. It will be slow. Chances are any pure python implementation you use will be equally slow or slower. Look to making a CModule.
If you want to try python the approach I took as a combination of itertools and generators to keep memory usage down. I don't seem to have my code handy anymore, but here's a good impementation:
http://wordaligned.org/articles/partitioning-with-python
EDIT:
Found my code:
Simple approach: randomly assign the integers: