Java: random integer with non-uniform distribution

2019-01-22 01:37发布

How can I create a random integer n in Java, between 1 and k with a "linear descending distribution", i.e. 1 is most likely, 2 is less likely, 3 less likely, ..., k least likely, and the probabilities descend linearly, like this:

enter image description here

I know that there are dosens of threads on this topic already, and I apologize for making a new one, but I can't seem to be able to create what I need from them. I know that using import java.util.*;, the code

Random r=new Random();
int n=r.nextInt(k)+1;

creates a random integer between 1 and k, distributed uniformly.

GENERALIZATION: Any hints for creating an arbitrarily distributed integer, i.e. f(n)=some function, P(n)=f(n)/(f(1)+...+f(k))), would also be appreciated, for example: enter image description here.

10条回答
等我变得足够好
2楼-- · 2019-01-22 01:56

The Cumulative Distribution Function is x^2 for a triangular distribution [0,1] with mode (highest weighted probability) of 1, as shown here.

Therefore, all we need to do to transform a uniform distribution (such as Java's Random::nextDouble) into a convenient triangular distribution weighted towards 1 is: simply take the square root Math.sqrt(rand.nextDouble()), which can then multiplied by any desired range.

For your example:

int a = 1; // lower bound, inclusive
int b = k; // upper bound, exclusive
double weightedRand = Math.sqrt(rand.nextDouble()); // use triangular distribution
weightedRand = 1.0 - weightedRand; // invert the distribution (greater density at bottom)
int result = (int) Math.floor((b-a) * weightedRand);
result += a; // offset by lower bound
if(result >= b) result = a; // handle the edge case 
查看更多
forever°为你锁心
3楼-- · 2019-01-22 02:02

Let me try another answer too, inspired by rlibby. This particular distribution is also the distribution of the smaller of two values chosen uniformly and random from the same range.

查看更多
啃猪蹄的小仙女
4楼-- · 2019-01-22 02:08

There is no need to simulate this with arrays and such, if your distribution is such that you can compute its cumulative distribution function (cdf). Above you have a probability distribution function (pdf). h is actually determined, since the area under the curve must be 1. For simplicity of math, let me also assume you're picking a number in [0,k).

The pdf here is f(x) = (2/k) * (1 - x/k), if I read you right. The cdf is just integral of the pdf. Here, that's F(x) = (2/k) * (x - x^2 / 2k). (You can repeat this logic for any pdf function if it's integrable.)

Then you need to compute the inverse of the cdf function, F^-1(x) and if I weren't lazy, I'd do it for you.

But the good news is this: once you have F^-1(x), all you do is apply it to a random value distribution uniformly in [0,1] and apply the function to it. java.util.Random can provide that with some care. That's your randomly sampled value from your distribution.

查看更多
家丑人穷心不美
5楼-- · 2019-01-22 02:08

The simplest thing to do it to generate a list or array of all the possible values in their weights.

int k = /* possible values */
int[] results = new int[k*(k+1)/2];
for(int i=1,r=0;i<=k;i++)
   for(int j=0;j<=k-i;j++)
       results[r++] = i;
// k=4 => { 1,1,1,1,2,2,2,3,3,4 }

// to get a value with a given distribution.
int n = results[random.nextInt(results.length)];

This best works for relatively small k values.ie. k < 1000. ;)

For larger numbers you can use a bucket approach

int k = 
int[] buckets = new int[k+1];
for(int i=1;i<k;i++)
   buckets[i] = buckets[i-1] + k - i + 1;

int r = random.nextInt(buckets[buckets.length-1]);
int n = Arrays.binarySearch(buckets, r);
n = n < 0 ? -n : n + 1;

The cost of the binary search is fairly small but not as efficient as a direct look up (for a small array)


For an arbitary distrubution you can use a double[] for the cumlative distrubution and use a binary search to find the value.

查看更多
老娘就宠你
6楼-- · 2019-01-22 02:14

So we need the following distribution, from least likely to most likely:

*
**
***
****
*****

etc.

Lets try mapping a uniformly distributed integer random variable to that distribution:

1
2  3
4  5  6
7  8  9  10
11 12 13 14 15

etc.

This way, if we generate a uniformly distributed random integer from 1 to, say, 15 in this case for K = 5, we just need to figure out which bucket it fits it. The tricky part is how to do this.

Note that the numbers on the right are the triangular numbers! This means that for randomly-generated X from 1 to T_n, we just need to find N such that T_(n-1) < X <= T_n. Fortunately there is a well-defined formula to find the 'triangular root' of a given number, which we can use as the core of our mapping from uniform distribution to bucket:

// Assume k is given, via parameter or otherwise
int k;

// Assume also that r has already been initialized as a valid Random instance
Random r = new Random();

// First, generate a number from 1 to T_k
int triangularK = k * (k + 1) / 2;

int x = r.nextInt(triangularK) + 1;

// Next, figure out which bucket x fits into, bounded by
// triangular numbers by taking the triangular root    
// We're dealing strictly with positive integers, so we can
// safely ignore the - part of the +/- in the triangular root equation
double triangularRoot = (Math.sqrt(8 * x + 1) - 1) / 2;

int bucket = (int) Math.ceil(triangularRoot);

// Buckets start at 1 as the least likely; we want k to be the least likely
int n = k - bucket + 1;

n should now have the specified distribution.

查看更多
虎瘦雄心在
7楼-- · 2019-01-22 02:15

This is called a triangular distribution, although yours is a degenerate case with the mode equal to the minimum value. Wikipedia has equations for how to create one given a uniformly distributed (0,1) variable.

查看更多
登录 后发表回答