Apologies if the answer to this is readily found elsewhere. My math and stats are weak, and thus I don't even know the search terms for what I'm trying to do...
I have b anonymous indistinguishable buckets into which I'm putting i identical items. I want to know all possible distributions and their probabilities. For example, if I have 3 buckets and 3 items, the answer I want is:
- [3,0,0] -> 1/9
- [2,1,0] -> 6/9
- [1,1,1] -> 2/9
Notice that the buckets are anonymous and thus I want identical distributions to be combined as I have done above. For example, the [2,1,0] case is actually the sum of the [2,1,0], [0,2,1], etc. cases.
In addition, I have the constraint of a max bucket size. For example, 3 balls, 3 buckets, bucketsize=2 should return:
- [2,1,0] prob=7/9
- [1,1,1] prob=2/9
This can be seen looking at a probability tree:
Insert item 1 into [0,0,0] -> [1,0,0] p=1
Insert item 2 into [1,0,0] -> [2,0,0] p=1/3 OR [1,1,0] 2/3
Insert item 3 into [2,0,0] -> [2,1,0] p=1.0
Insert item 3 into [1,1,0] -> [2,1,0] p=2/3 OR [1,1,1] p=1/3
So state [2,1,0] has two paths to it: 1/3 × 1 + 2/3 × 2/3 = 3/9 + 4/9 = 7/9
So state [1,1,1] has one path to it: 2/3 × 1/3 = 2/9
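For anyone who wants to check this arithmetic, here is a quick sanity check with exact fractions (my own sketch; the states in the comments follow the tree above):

from fractions import Fraction

third = Fraction(1, 3)
# Path via [2,0,0]: p=1/3, then the full bucket forces [2,1,0] with p=1.
# Path via [1,1,0]: p=2/3, then [2,1,0] with p=2/3 (or [1,1,1] with p=1/3).
p_210 = third * 1 + (2 * third) * (2 * third)
p_111 = (2 * third) * third
print(p_210, p_111)  # 7/9 2/9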
Here's another view of the probabilities for each:
(image: bins_and_balls, http://bents.us/Pictures/bins_balls.png)
Thanks very much for reading my question and for any help you can provide. I have not cross-posted this on https://stats.stackexchange.com/ but if people think it is better there, then I'll delete this one and repost it there.
UPDATE
There has been some discussion in the comments about the correctness of some of the proposed algorithms. To help validate them, I wrote the following simulator:
#! /usr/bin/env python
from __future__ import division
import random

def simulate(num_bucks, items, bsize, iterations=50000):
    perms = dict()
    for n in range(iterations):
        buckets = [0] * num_bucks
        for i in range(items):
            while True:
                b = random.randint(0, num_bucks - 1)
                if buckets[b] < bsize:
                    break  # kludge: loop until we find an unfilled bucket
            buckets[b] += 1
        buckets.sort()
        buckets = tuple(reversed(buckets))
        try:
            perms[buckets]['count'] += 1
        except KeyError:
            perms[buckets] = {'perm': buckets, 'count': 1}
    for perm in perms.values():
        perm['prob'] = perm['count'] / iterations
    return perms

def main():
    perms = simulate(num_bucks=3, items=3, bsize=2)
    for perm in perms.values():
        print perm['perm'], perm['prob']

if __name__ == "__main__":
    main()
For 3 buckets, 3 balls, and a bucket size of 2, it produces output like:
(1, 1, 1) 0.22394
(2, 1, 0) 0.77606
Overview
When distributing e.g. 3 items over 4 buckets, you can put the first item into any of the 4 buckets, then for each of these 4 possibilities you can put the second item into any of the 4 buckets, and then for each of these 16 possibilities you can put the third item into any of the 4 buckets, to make a total of 64 possibilities:
(3,0,0,0) (2,1,0,0) (2,0,1,0) (2,0,0,1) (2,1,0,0) (1,2,0,0) (1,1,1,0) (1,1,0,1) (2,0,1,0) (1,1,1,0) (1,0,2,0) (1,0,1,1) (2,0,0,1) (1,1,0,1) (1,0,1,1) (1,0,0,2)
(2,1,0,0) (1,2,0,0) (1,1,1,0) (1,1,0,1) (1,2,0,0) (0,3,0,0) (0,2,1,0) (0,2,0,1) (1,1,1,0) (0,2,1,0) (0,1,2,0) (0,1,1,1) (1,1,0,1) (0,2,0,1) (0,1,1,1) (0,1,0,2)
(2,0,1,0) (1,1,1,0) (1,0,2,0) (1,0,1,1) (1,1,1,0) (0,2,1,0) (0,1,2,0) (0,1,1,1) (1,0,2,0) (0,1,2,0) (0,0,3,0) (0,0,2,1) (1,0,1,1) (0,1,1,1) (0,0,2,1) (0,0,1,2)
(2,0,0,1) (1,1,0,1) (1,0,1,1) (1,0,0,2) (1,1,0,1) (0,2,0,1) (0,1,1,1) (0,1,0,2) (1,0,1,1) (0,1,1,1) (0,0,2,1) (0,0,1,2) (1,0,0,2) (0,1,0,2) (0,0,1,2) (0,0,0,3)
You will notice that there are many duplicates in this list, e.g. (0,1,0,2) appears 3 times. That is because the 3 items are anonymous and identical, so there are several ways in which you can distribute the 3 items over the 4 buckets and arrive at the same result, e.g.:
(0,0,0,0) → (0,1,0,0) → (0,1,0,1) → (0,1,0,2)
(0,0,0,0) → (0,0,0,1) → (0,1,0,1) → (0,1,0,2)
(0,0,0,0) → (0,0,0,1) → (0,0,0,2) → (0,1,0,2)
Also, because the buckets are anonymous and identical, permutations like (2,0,1,0) and (0,1,2,0) are considered equal. Therefore, the 64 possibilities are grouped together to form these 3 classes:
(3,0,0,0) x4
(2,1,0,0) x36
(1,1,1,0) x24
The probability for each of these classes is then the number of possibilities that are part of the class, divided by the total number of possibilities:
(3,0,0,0) → 4 / 64 = 1 / 16
(2,1,0,0) → 36 / 64 = 9 / 16
(1,1,1,0) → 24 / 64 = 6 / 16
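If you want to verify these counts, a brute-force enumeration in Python (my own sketch, not part of the algorithm below) confirms them:

from collections import Counter
from itertools import product

b, i = 4, 3
classes = Counter()
for choice in product(range(b), repeat=i):  # each item independently picks a bucket
    buckets = [0] * b
    for c in choice:
        buckets[c] += 1
    classes[tuple(sorted(buckets, reverse=True))] += 1
print(classes)
# Counter({(2, 1, 0, 0): 36, (1, 1, 1, 0): 24, (3, 0, 0, 0): 4})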
An algorithm which generates all the possibilities and then groups them together into classes is not very efficient, because the number of possibilities quickly becomes huge for higher numbers of items and buckets, as you have experienced.
It is better to generate the classes (step 1 below), calculate the number of permutations per class (step 2 below), calculate the number of duplicate distributions per permutation (step 3 below), and then combine the results to get the probability for each class (step 4 below).
Step 1: generating the partitions
First, generate the partitions (or more precisely: partitions with restricted number of parts) of i with b parts:
(3,0,0,0) (2,1,0,0) (1,1,1,0)
The basic way to generate partitions is to use a recursive algorithm, where you choose all possible values for the first part, and then recurse with the rest, e.g.:
partition(4) :
• 4
• 3 + partition(1)
• 2 + partition(2)
• 1 + partition(3)
→ (4) (3,1) (2,2) (2,1,1) (1,3) (1,2,1) (1,1,2) (1,1,1,1)
In order to avoid duplicates, only generate non-increasing sequences of values. To achieve this, you have to tell every recursion what the maximum value of its parts is, e.g.:
partition(4, max=4) :
• 4
• 3 + partition(1, max=3)
• 2 + partition(2, max=2)
• 1 + partition(3, max=1)
→ (4) (3,1) (2,2) (2,1,1) (1,1,1,1)
When partitioning with restricted number of parts, the first part should not be less than the total value divided by the number of parts, e.g. when partitioning 4 into 3 parts, there's no partition with first part 1, because 1 < 4/3:
partition(4, parts=3, max=4) :
• 4,0,0
• 3 + partition(1, parts=2, max=3)
• 2 + partition(2, parts=2, max=2)
→ (4,0,0) (3,1,0) (2,2,0) (2,1,1)
When the number of parts is 2, this is the lowest recursion level. The partitions are then generated by iterating over the values for the first part, from the maximum down to half of the value, and putting the rest in the second part, e.g.:
partition(4, parts=2, max=3) :
• 3,1
• 2,2
→ (3,1) (2,2)
Since a recursive partitioning algorithm reduces each task to a number of tasks with smaller values and fewer parts, the same partitions with small values are generated many times. That's why techniques like memoization and tabulation can speed up the algorithm significantly.
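As a minimal Python sketch of step 1 (for brevity it bottoms out at one part instead of two, and omits the memoization mentioned above):

def partition(val, parts, max_part):
    # Non-increasing partitions of val into exactly `parts` parts (zeros allowed).
    if parts == 1:
        return [(val,)] if val <= max_part else []
    lo = -(-val // parts)  # first part must be at least ceil(val / parts)
    result = []
    for first in range(min(val, max_part), lo - 1, -1):
        for rest in partition(val - first, parts - 1, first):
            result.append((first,) + rest)
    return result

print(partition(3, 4, 3))  # [(3, 0, 0, 0), (2, 1, 0, 0), (1, 1, 1, 0)]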
Step 2: calculating the number of permutations
Then, find the number of permutations of each partition:
(3,0,0,0) (0,3,0,0) (0,0,3,0) (0,0,0,3) → 4
(2,1,0,0) (2,0,1,0) (2,0,0,1) (1,2,0,0) (0,2,1,0) (0,2,0,1) (1,0,2,0) (0,1,2,0) (0,0,2,1) (1,0,0,2) (0,1,0,2) (0,0,1,2) → 12
(1,1,1,0) (1,1,0,1) (1,0,1,1) (0,1,1,1) → 4
You don't need to generate all these permutations to calculate how many there are; you only have to check how many duplicate values there are in every partition generated in step 1:
(3,0,0,0) (2,1,0,0) (1,1,1,0)
Take the factorial of the number of buckets b (4!) and divide it by the product of the factorials of the number of times each distinct value occurs:
(3,0,0,0) → 4! / (1! × 3!) = 4
(2,1,0,0) → 4! / (1! × 1! × 2!) = 12
(1,1,1,0) → 4! / (3! × 1!) = 4
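A minimal Python sketch of this step (the helper name is mine):

from math import factorial
from collections import Counter

def permutation_count(part):
    # b! divided by the factorial of each value's multiplicity
    result = factorial(len(part))
    for dup in Counter(part).values():
        result //= factorial(dup)
    return result

print(permutation_count((2, 1, 0, 0)))  # 4! / (1! * 1! * 2!) = 12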
Step 3: calculating the number of distributions
Then, for each partition, we have to find the number of ways in which you can distribute the items in the buckets, e.g. for partition (2,1,0,0):
• first item in bucket 1, second item in bucket 1, third item in bucket 2 → (2,1,0,0)
• first item in bucket 1, second item in bucket 2, third item in bucket 1 → (2,1,0,0)
• first item in bucket 2, second item in bucket 1, third item in bucket 1 → (2,1,0,0)
This number can be calculated by taking the factorial of the number of items i (3!) and dividing it by the product of the factorials of the values (ignoring the zeros, because 0! = 1):
(3,0,0,0) → 3! / 3! = 1
(2,1,0,0) → 3! / (2! × 1!) = 3
(1,1,1,0) → 3! / (1! × 1! × 1!) = 6
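And a corresponding Python sketch for this step (again, the helper name is mine):

from math import factorial

def distribution_count(part):
    # i! divided by the factorial of each value (0! = 1, so zeros drop out)
    result = factorial(sum(part))
    for n in part:
        result //= factorial(n)
    return result

print(distribution_count((2, 1, 0, 0)))  # 3! / (2! * 1!) = 3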
Step 4: calculating the probability
Then, for each partition, multiply the number of permutations and the number of distributions, and sum these to find the total number of possibilities:
4×1 + 12×3 + 4×6 = 64
Or simply use this calculation:
buckets ^ items = 4 ^ 3 = 64
And then the probability per partition is the number of permutations times the number of distributions, divided by the total number of possibilities:
(3,0,0,0) → 4×1 / 64 = 4 / 64 = 1 / 16
(2,1,0,0) → 12×3 / 64 = 36 / 64 = 9 / 16
(1,1,1,0) → 4×6 / 64 = 24 / 64 = 6 / 16
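Putting steps 2–4 together as a self-contained Python sketch of my own (the JavaScript example below implements the full algorithm, including step 1):

from fractions import Fraction
from math import factorial
from collections import Counter

def class_probability(part, buckets):
    items = sum(part)
    perms = factorial(buckets)
    for dup in Counter(part).values():
        perms //= factorial(dup)      # step 2: permutations of the partition
    dists = factorial(items)
    for n in part:
        dists //= factorial(n)        # step 3: distributions of the items
    return Fraction(perms * dists, buckets ** items)  # step 4

for part in [(3, 0, 0, 0), (2, 1, 0, 0), (1, 1, 1, 0)]:
    print(part, class_probability(part, buckets=4))
# (3, 0, 0, 0) 1/16
# (2, 1, 0, 0) 9/16
# (1, 1, 1, 0) 3/8   (= 6/16)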
Code example
This is an example of the basic algorithm without memoization:
function partitions(items, buckets) {
    var p = [];
    partition(items, buckets, items, []);
    probability();
    return p;

    function partition(val, parts, max, prev) {
        for (var i = Math.min(val, max); i >= val / parts; i--) {
            if (parts == 2) p.push({part: prev.concat([i, val - i])});
            else partition(val - i, parts - 1, i, prev.concat([i]));
        }
    }
    function probability() {
        var total = Math.pow(buckets, items);
        for (var i = 0; i < p.length; i++) {
            p[i].prob = permutations(p[i].part) * distributions(p[i].part) / total;
        }
    }
    function permutations(partition) {
        var dup = 1, perms = factorial(buckets);
        for (var i = 1; i < buckets; i++) {
            if (partition[i] != partition[i - 1]) {
                perms /= factorial(dup);
                dup = 1;
            } else ++dup;
        }
        return perms / factorial(dup);
    }
    function distributions(partition) {
        var dist = factorial(items);
        for (var i = 0; i < buckets; i++) {
            dist /= factorial(partition[i]);
        }
        return dist;
    }
    function factorial(n) {
        var f = 1;
        while (n > 1) f *= n--;
        return f;
    }
}
var result = partitions(3,4);
for (var i in result) document.write(JSON.stringify(result[i]) + "<br>");
var result = partitions(7,5);
for (var i in result) document.write("<br>" + JSON.stringify(result[i]));
Additional restriction: bucket size
To adapt the results given by the algorithm to the situation where there is a maximum number of items any bucket can hold, you'd need to identify the invalid distributions, and then redistribute their probabilities among the valid distributions, e.g.:
i=5, b=4, s=5
5,0,0,0 → 1 / 256
4,1,0,0 → 15 / 256
3,2,0,0 → 30 / 256
3,1,1,0 → 60 / 256
2,2,1,0 → 90 / 256
2,1,1,1 → 60 / 256
i=5, b=4, s=4
4,1,0,0 → 16 / 256 ← (15 + 1)
3,2,0,0 → 30 / 256
3,1,1,0 → 60 / 256
2,2,1,0 → 90 / 256
2,1,1,1 → 60 / 256
i=5, b=4, s=3
3,2,0,0 → 35.333 / 256 ← (30 + 16 × 1/3)
3,1,1,0 → 70.666 / 256 ← (60 + 16 × 2/3)
2,2,1,0 → 90 / 256
2,1,1,1 → 60 / 256
i=5, b=4, s=2
2,2,1,0 → 172.444 / 256 ← (90 + 35.333 + 70.666 × 2/3)
2,1,1,1 → 83.555 / 256 ← (60 + 70.666 × 1/3)
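These redistributed values can be checked exactly. Since re-rolling a full bucket is equivalent to choosing uniformly among the buckets that still have room, the restricted probabilities also follow from inserting one item at a time over sorted states (my own sketch, independent of the redistribution procedure):

from fractions import Fraction

def restricted_probs(items, buckets, size):
    # Track sorted (descending) bucket states; the probability only depends on the multiset.
    states = {tuple([0] * buckets): Fraction(1)}
    for _ in range(items):
        nxt = {}
        for state, prob in states.items():
            open_idx = [i for i, n in enumerate(state) if n < size]
            for i in open_idx:  # each open bucket is equally likely
                child = list(state)
                child[i] += 1
                key = tuple(sorted(child, reverse=True))
                nxt[key] = nxt.get(key, Fraction(0)) + prob / len(open_idx)
        states = nxt
    return states

print(restricted_probs(5, 4, 2))
# {(2, 2, 1, 0): Fraction(97, 144), (2, 1, 1, 1): Fraction(47, 144)}
# 97/144 = 172.444/256 and 47/144 = 83.555/256, matching the values above.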
The diagram below shows which probabilities have to be recalculated when restricting a 7-item 3-bucket distribution to a bucket size of 4.
The algorithm for unrestricted bucket size calculates the bottom row in this diagram, without having to create the whole tree. You can adapt it to directly calculate the probability of any single distribution (see code examples below).
So you could run the original algorithm first and then, for each invalid distribution, find which distributions in the previous row lead to the invalid distribution, and redistribute the probabilities.
In the example shown in the diagram, that would mean redistributing the probabilities indicated in red, which is the sub-tree under (4,0,0) where 4 is the maximum bucket size.
So in general, to adapt the probabilities found with the original algorithm to a restricted bucket size of s, you start at (s,0,0), find all children of this distribution: (s+1,0,0) and (s,1,0), calculate their probability, take the probability of the invalid distributions and redistribute it among the valid distributions, and then move on to the next row, until you've recalculated all distributions in the final row that have the value s in them.
The code example below, based on the original algorithm, shows how to directly find the probability of any distribution with unrestricted bucket size. This would be a basic step in the new algorithm.
function probability(part) { // array of items per bucket
    var items = part.reduce(function(s, n) {return s + n}, 0);
    var buckets = part.length;
    var f = factorial(Math.max(items, buckets));
    var total = Math.pow(buckets, items);
    return permutations() * distributions() / total;

    function permutations() {
        var dup = 1, perms = f[buckets];
        for (var i = 1; i < buckets; i++) {
            if (part[i] != part[i - 1]) {
                perms /= f[dup];
                dup = 1;
            } else ++dup;
        }
        return perms / f[dup];
    }
    function distributions() {
        return part.reduce(function(t, n) {return t / f[n]}, f[items]);
    }
    function factorial(max) {
        var f = [1];
        for (var i = 1; i <= max; i++) f[i] = f[i - 1] * i;
        return f;
    }
}
document.write(probability([2,2,1,1,1]));
The following code example uses a different notation for the partitions: instead of an array whose values are the number of items per bucket, it uses an array where the value at index k is the number of buckets holding k items; so [0,3,2,0] means 3 buckets with 1 item and 2 buckets with 2 items, or [2,2,1,1,1] in the regular notation.
function probability(part) { // array of buckets per number of items
    var items = part.reduce(function(t, n, i) {return t + n * i}, 0);
    var buckets = part.reduce(function(t, n) {return t + n}, 0);
    var f = factorial(Math.max(items, buckets));
    var total = Math.pow(buckets, items);
    return permutations() * distributions() / total;

    function permutations() {
        return part.reduce(function(t, n) {return t / f[n]}, f[buckets]);
    }
    function distributions() {
        return part.reduce(function(t, n, i) {return t / Math.pow(f[i], n)}, f[items]);
    }
    function factorial(max) {
        var f = [1];
        for (var i = 1; i <= max; i++) f[i] = f[i - 1] * i;
        return f;
    }
}
document.write(probability([0,3,2,0,0,0,0,0]));
Below is an implementation of the complete algorithm using the alternative notation; it generates the valid distributions (taking into account the bucket size), calculates the probabilities they would have with unrestricted bucket size, and then redistributes the probabilities of the invalid distributions.
function Dictionary() {
    this.pair = {};
    this.add = function(array, value) {
        var key = array.join(',');
        if (key in this.pair) this.pair[key] += value;
        else this.pair[key] = value;
    }
    this.read = function() {
        for (var key in this.pair) {
            var prob = this.pair[key], array = key.split(',');
            array.forEach(function(v, i, a) {a[i] = parseInt(v);});
            delete this.pair[key];
            return {dist: array, prob: prob};
        }
    }
}
function partitions(items, buckets, size) {
    var dict = new Dictionary, init = [buckets], total = Math.pow(buckets, items), f = factorial();
    for (var i = 0; i < size; i++) init.push(0);
    partition(items, init, 0);
    if (size < items) redistribute();
    return dict;

    function partition(val, dist, min) {
        for (var i = min; i < size; i++) {
            if (dist[i]) {
                var d = dist.slice();
                --d[i]; ++d[i + 1];
                if (val - 1) partition(val - 1, d, i);
                else dict.add(d, probability(d));
            }
        }
        function probability(part) {
            return part.reduce(function(t, n, i) {return t / Math.pow(f[i], n);}, f[items])
                 * part.reduce(function(t, n) {return t / f[n];}, f[buckets]) / total;
        }
    }
    function redistribute() {
        var parents = new Dictionary, par;
        --init[0]; ++init[size];
        parents.add(init, 0);
        for (var i = size; i < items; i++) {
            var children = new Dictionary;
            while (par = parents.read()) {
                var options = buckets - par.dist[size];
                par.prob += probability(par.dist, i) * par.dist[size] / buckets;
                for (var j = 0, b = 0; j < options; j += par.dist[b++]) {
                    while (par.dist[b] == 0) ++b;
                    var child = par.dist.slice();
                    --child[b]; ++child[b + 1];
                    if (i == items - 1) dict.add(child, par.prob * par.dist[b] / options);
                    else children.add(child, par.prob * par.dist[b] / options);
                }
            }
            parents = children;
        }
        function probability(part, items) {
            return part.reduce(function(t, n, i) {return t / Math.pow(f[i], n);}, f[items])
                 * part.reduce(function(t, n) {return t / f[n];}, f[buckets]) / Math.pow(buckets, items);
        }
    }
    function factorial() {
        var f = [1], max = Math.max(items, buckets);
        for (var i = 1; i <= max; i++) f[i] = f[i - 1] * i;
        return f;
    }
}
var result = partitions(3,3,2);
for (var i in result.pair) document.write(i + " → " + result.pair[i] + "<br>");
var result = partitions(7,3,4);
for (var i in result.pair) document.write("<br>" + i + " → " + result.pair[i]);
(Below is a first idea; it turned out to be too inefficient.)
While working on this, I came up with a completely different version of the algorithm, using a dictionary. It is probably closer to what you were originally doing, but should be more efficient. Basically, I'm using a different way of storing a distribution; where I previously wrote:
(3,2,2,1,0)
for 5 buckets filled with 3, 2, 2, 1 and zero items, I now use:
(1,1,2,1)
for 1 empty bucket, 1 bucket with 1 item, 2 buckets with 2 items, and 1 bucket with 3 items. The length of this list is determined by the bucket size plus 1, and no longer by the number of buckets.
The advantage of this is that duplicates like:
(3,2,2,1,0) (3,1,2,0,2) (1,2,2,3,0) (0,1,2,2,3) ...
now all have the same notation.
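To illustrate the correspondence between the two notations, here is a small conversion helper (my own sketch):

def to_count_notation(part, size):
    # counts[k] = number of buckets holding exactly k items
    counts = [0] * (size + 1)
    for n in part:
        counts[n] += 1
    return counts

print(to_count_notation([3, 2, 2, 1, 0], size=3))  # [1, 1, 2, 1]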
The code example below uses the dictionary idea with the new notation. (Unfortunately, it turns out to be quite slow, so I'll continue to look into improving it.)
function Dictionary() {
    this.dict = [];
}
Dictionary.prototype.add = function(array, value) {
    var key = array.join(',');
    if (this.dict[key] == undefined) this.dict[key] = value;
    else this.dict[key] += value;
}
Dictionary.prototype.print = function() {
    for (var i in this.dict) document.write(i + " → " + this.dict[i] + "<br>");
}
function probability(items, buckets, size) {
    var dict = new Dictionary, dist = [buckets];
    for (var i = 0; i < size; i++) dist.push(0);
    p(dist, 0, 1);
    return dict;

    function p(dist, value, prob) {
        var options = buckets - dist[size];
        for (var i = 0, b = 0; i < options; i += dist[b++]) {
            while (dist[b] == 0) ++b;
            var d = dist.slice();
            --d[b]; ++d[b + 1];
            if (value + 1 < items) p(d, value + 1, prob * dist[b] / options);
            else dict.add(d, prob * dist[b] / options);
        }
    }
}
probability(3, 3, 3).print();
probability(3, 3, 2).print();
probability(5, 4, 5).print();
probability(5, 4, 2).print();