Generating a normally distributed list of random i

2019-09-01 02:22发布

问题:

I need to generate a list of random integer values between a lower and an upper limit for a normal distribution using a given mean. For example: a bell distribution of 150 random values between 1 and 10, with a mean of 3. The mean can be approximated (e.g. a float between 2.8 and 3.2) but the numbers of the list need to be integers.

This thread seems to point to possible directions. With NORM.INV(RAND(),mean,dev) I can generate floats and then transform them to integers, but I don't know how to limit the minimum and maximum values. RANDBETWEEN() could solve this but it does not seem to work with NORM.INV

回答1:

The normal distribution is a continuous distribution with an infinite range, so you can't accomplish your stated goal. You can discretize it with rounding/ceiling/floor, but then you also have to reject any outcomes that are outside your desired range and repeat the attempt -- not the easiest thing to in Excel, and both the discretization and truncation will shift your mean.

What you can do is generate from a discrete distribution which has limit behaviors that converge to a normal. Two such distributions are the Poisson and the binomial. Since you want a fixed range, I'd suggest the binomial.

A binomial describes how many "successes" you will get from n independent trials, each of which has probability p of yielding "success." The range of a binomial is 0,...,n, and its expected value is n*p. If you want an actual range of 1,...,10 and a mean of 3, you should generate a binomial with n = 9 and a mean of 2, then shift it by adding 1. To get a mean of 2 with a sample size of 9, you need p = 2/9.

In summary, generate 150 instances of x = binomial(n = 9, p = 2/9) + 1. Analysis Tookpak or add-ins such as Crystal Ball or @RISK can generate binomials for you, or you can just sum 9 cells which contain 1 if RANDOM <= 2/9, 0 otherwise.



回答2:

RANDBETWEEN returns and takes integers as parameters which is not useful here.

You could use something like NORM.INV(RAND() * 0.8 + 0.1, mean, dev) which will omit the bottom and top 10% of the distribution.

If you need to know how to calibrate the amount of probability you want to omit then let me know. (It depends on the standard deviation as well as the mean).