I need to generate a file filled with three "random" values per line (10 lines), but those values sum must equal 15.
The structure is: "INDEX A B C".
Example:
1 15 0 0
2 0 15 0
3 0 0 15
4 1 14 0
5 2 13 0
6 3 12 0
7 4 11 0
8 5 10 0
9 6 9 0
10 7 8 0
If the numbers can by any just use combinations:
from itertools import combinations
with open("rand.txt","w") as f:
combs = [x for x in combinations(range(16),3) if sum(x ) == 15 ][:10]
for a,b,c in combs:
f.write("{} {} {}\n".format(a,b,c))
If you want to avoid needing to create (or iterate through) the full space of satisfying permutations (which, for large N
is important), then you can solve this problem with sequential sample.
My first approach was to just draw a value uniformly from [0, N], call it x
. Then draw a value uniformly from [0, N-x
] and call it y, then set z = N - x - y
. If you then shuffle these three, you'll get back a reasonable draw from the space of solutions, but it won't be exactly uniform.
As an example, consider where N=3
. Then the probability of some permutation of (3, 0, 0) is 1/4, even though it is only one out of 10 possible triplets. So this privileges values that contain a high max.
You can perfectly counterbalance this effect by sampling the first value x
proportionally to how many values will be possible for y
conditioned on x
. So for example, if x
happened to be N
, then there is only 1 compatible value for y
, but if x
is 0, then there are 4 compatible values, namely 0 through 3.
In other words, let Pr(X=x)
be (N-x+1)/sum_i(N-i+1)
for i
from 0 to N
. Then let Pr(Y=y | X=x)
be uniform on [0, N-x].
This works out to P(X,Y) = P(Y|X=x) * P(X) = 1/(N-x+1) * [N - x + 1]/sum_i(N-i+1), which is seen to be uniform, 1/sum_i(N-i+1), for each candidate triplet.
Note that sum(N-i+1 for i in range(0, N+1))
gives the number of different ways to sum 3 non-negative integers to get N
. I don't know a good proof of this, and would happy if someone adds one to the comments!
Here's a solution that will sample this way:
import random
from collections import Counter
def discrete_sample(weights):
u = random.uniform(0, 1)
w_t = 0
for i, w in enumerate(weights):
w_t += w
if u <= w_t:
return i
return len(weights)-1
def get_weights(N):
vals = [(N-i+1.0) for i in range(0, N+1)]
totl = sum(vals)
return [v/totl for v in vals]
def draw_summing_triplet(N):
weights = get_weights(N)
x = discrete_sample(weights)
y = random.randint(0, N-x)
triplet = [x, y, N - x - y]
random.shuffle(triplet)
return tuple(triplet)
Much credit goes to @DSM in the comments for questioning my original answer and providing good feedback.
In this case, we can test out the sampler like this:
foo = Counter(draw_summing_triplet(3) for i in range(10**6))
print foo
Counter({(1, 2, 0): 100381,
(0, 2, 1): 100250,
(1, 1, 1): 100027,
(2, 1, 0): 100011,
(0, 3, 0): 100002,
(3, 0, 0): 99977,
(2, 0, 1): 99972,
(1, 0, 2): 99854,
(0, 0, 3): 99782,
(0, 1, 2): 99744})
This seems straight forward to me and it utilizes the random module.
import random
def foo(x):
a = random.randint(0,x)
b = random.randint(0,x-a)
c = x - (a +b)
return (a,b,c)
for i in range(100):
print foo(15)