How can I generate a random number that is in the range (1,n) but not in a certain list (i,j)?
Example: range is (1,500), list is [1,3,4,45,199,212,344].
Note: The list may not be sorted
Rejection Sampling

One method is rejection sampling:

1. Pick a random value x in the range (1, 500).
2. Is x in your list of disallowed values? (Can use a hash-set for this check.)
3. If yes, try again; if no, x is your random value, done.

This will work fine if your set of allowed values is significantly larger than your set of disallowed values: if there are G possible good values and B possible bad values, then the expected number of times you'll have to sample x from the G + B values until you get a good value is (G + B) / G (the expectation of the associated geometric distribution). (You can sense check this. As G goes to infinity, the expectation goes to 1. As B goes to infinity, the expectation goes to infinity.)
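A minimal sketch of rejection sampling in Python (the function and variable names here are illustrative, not part of the answer):

    import random

    def sample_rejecting(n, disallowed):
        """Keep drawing until we land on a value outside the disallowed set."""
        bad = set(disallowed)          # a hash-set makes the membership check O(1)
        while True:
            x = random.randint(1, n)   # candidate value in [1, n]
            if x not in bad:
                return x               # x is your random value, done

    print(sample_rejecting(500, [1, 3, 4, 45, 199, 212, 344]))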
Sampling a List

Another method is to make a list L of all of your allowed values, then sample L[rand(L.count)].
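In Python that could look like the sketch below; building L once means every later draw is a constant-time index:

    import random

    disallowed = {1, 3, 4, 45, 199, 212, 344}
    L = [v for v in range(1, 501) if v not in disallowed]   # every allowed value

    x = L[random.randrange(len(L))]   # i.e. L[rand(L.count)]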
Rejection sampling would be the simplest, if possible, as described already. However, if you didn't want to use that, you could convert the range and the disallowed values to sets and find the difference. Then you could choose a random value out of it.
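For instance, with Python sets (the names below are just placeholders):

    import random

    n = 500
    disallowed = [1, 3, 4, 45, 199, 212, 344]

    candidates = set(range(1, n + 1)) - set(disallowed)   # the range minus the disallowed values
    x = random.choice(list(candidates))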
Assuming you wanted the range to be in [1,n] but not the values in [i,j], and that you wanted the results uniformly distributed.

In Python:

(Note that this is not EXACTLY uniform, since in all likelihood max_rand % len(allowed) != 0, but in most practical applications it will be very close.)
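A sketch of what this might look like; the raw random-number source and the constant max_rand are assumptions made here to match the note above:

    import random

    n = 500
    excluded = [1, 3, 4, 45, 199, 212, 344]
    allowed = list(set(range(1, n + 1)) - set(excluded))

    max_rand = 2**31 - 1                   # assumed upper bound of the raw random draw
    raw = random.randint(0, max_rand)
    print(allowed[raw % len(allowed)])     # slightly biased whenever the draw range is not a multiple of len(allowed)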
I assume that you know how to generate a random number in [1, n) and also that your list is ordered like in the example above.

Let's say that you have a list with k elements. Make a map (O(log n)) structure, which will keep things fast if k grows. Put all the elements from the list into the map, where the element's value is the key and its "good" value is the value (the "good" value is explained below). Then find a random number in [1, n - k - p) (p is explained below as well), and if this number is in the map, replace it with its "good" value.

"Good" value: start from the k-th element. Its good value is its own value + 1, because the very next number is "good" for us. Now look at the (k-1)-th element. Assume its good value is again its own value + 1; if this value is equal to the k-th element, then the "good" value for the (k-1)-th element is the k-th element's "good" value + 1. You will also have to store the largest "good" value; if the largest value exceeds n, then p (from above) is p = largest - n.

Of course, I recommend this only if k is a big number; otherwise @Timothy Shields' method is perfect.
The technique I usually use when the list is length 1 is to generate a random integer r in [1,n-1], and if r is greater or equal to that single illegal value then increment r.
This can be generalised for a list of length k for small k, but it requires sorting that list (you can't do your compare-and-increment in random order). If the list is moderately long, then after the sort you can start with a bsearch, add the number of values skipped to r, and then recurse into the remainder of the list.
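A sketch of the compare-and-increment idea in Python, without the bsearch refinement; with a one-element list this reduces to the length-1 technique above:

    import random

    def random_excluding(n, illegal):
        """Uniformly pick a value in [1, n] that is not in `illegal`."""
        r = random.randint(1, n - len(illegal))
        for v in sorted(illegal):   # the compare-and-increment must run in sorted order
            if r >= v:              # skip past each illegal value at or below r
                r += 1
        return r

    print(random_excluding(500, [1, 3, 4, 45, 199, 212, 344]))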
For a list of length k, containing no value greater or equal to n-k, you can do a more direct substitution: generate random r in [1,n-k], and then iterate through the list testing if r is equal to list[i]. If it is then set r to n-k+i (this assumes list is zero-based) and quit.
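A sketch of the direct substitution; the substitute slot is written as n - k + 1 + i here so that it always lands on one of the top k values that r itself can never take:

    import random

    def random_excluding_small(n, illegal):
        """Uniform over [1, n] minus `illegal`; assumes every illegal value is at most n - len(illegal)."""
        k = len(illegal)
        r = random.randint(1, n - k)
        for i, v in enumerate(illegal):   # the list may be in any order here
            if r == v:
                r = n - k + 1 + i         # substitute the i-th reserved slot at the top of the range
                break
        return r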
That second approach fails if some of the list elements are in [n-k,n].
I could try to invent something clever at this point, but what I have so far seems sufficient for uniform distributions with values of k much less than n...

1. Split the illegal list into two parts: the values below n-k, and the other the rest (this can be done in place).
2. Generate a random r in [1,n-k].
3. Iterate through the lower list, and if r is list[i] then set r to n-k+i and go to step 5.
4. If r was not altered in step 3 then we're finished.
5. Otherwise, sort the upper list and apply the compare-and-increment technique to it, so that r settles on a legal value.
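A sketch of those steps in Python; step 5 is the compare-and-increment pass over the sorted upper list, and the boundary arithmetic is shifted by one as in the earlier sketches so that substituted values stay inside [1, n]:

    import random

    def random_excluding_general(n, illegal):
        """Uniform over [1, n] minus `illegal`, for any mix of low and high illegal values."""
        k = len(illegal)
        lower = [v for v in illegal if v <= n - k]   # step 1: split the illegal values in two
        upper = [v for v in illegal if v > n - k]
        r = random.randint(1, n - k)                 # step 2
        for i, v in enumerate(lower):                # step 3: substitute into the top region
            if r == v:
                r = n - k + 1 + i
                break
        else:
            return r                                 # step 4: r was not altered, we're finished
        for v in sorted(upper):                      # step 5: skip past the high illegal values
            if r >= v:
                r += 1
        return r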
Observations:

- The sort of the upper list is only needed when r is moved into the hazardous area.
- As k approaches n, the maximum size of the upper (sorted) list grows.
- For a given k, if more values appear in the upper list (the bigger the sort), the chance of getting a hit in the lower list shrinks, reducing the likelihood of needing to do the sort.

Refinement: Obviously things get very sorty for large k, but in such cases the list has comparatively few holes into which r is allowed to settle. This could surely be exploited.

I might suggest something different if many random values with the same list and limits were needed. I hope that the list of illegal values is not the list of results of previous calls to this function, because if it is then you wouldn't want any of this -- instead you would want a Fisher-Yates shuffle.
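If many values with the same list and limits were needed, a sketch of that shuffle-based alternative (random.shuffle performs a Fisher-Yates shuffle):

    import random

    def many_random_excluding(n, illegal, count):
        """Draw `count` distinct values from [1, n] minus `illegal`, in random order."""
        bad = set(illegal)
        allowed = [v for v in range(1, n + 1) if v not in bad]
        random.shuffle(allowed)    # Fisher-Yates shuffle
        return allowed[:count]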