Generate a number is range (1,n) but not in a list

2019-05-03 01:44发布

How can I generate a random number that is in the range (1,n) but not in a certain list (i,j)?

Example: range is (1,500), list is [1,3,4,45,199,212,344].

Note: The list may not be sorted

4条回答
孤傲高冷的网名
2楼-- · 2019-05-03 02:18

Rejection Sampling

One method is rejection sampling:

  1. Generate a number x in the range (1, 500)
  2. Is x in your list of disallowed values? (Can use a hash-set for this check.)
    • If yes, return to step 1
    • If no, x is your random value, done

This will work fine if your set of allowed values is significantly larger than your set of disallowed values:
if there are G possible good values and B possible bad values, then the expected number of times you'll have to sample x from the G + B values until you get a good value is (G + B) / G (the expectation of the associated geometric distribution). (You can sense check this. As G goes to infinity, the expectation goes to 1. As B goes to infinity, the expectation goes to infinity.)

Sampling a List

Another method is to make a list L of all of your allowed values, then sample L[rand(L.count)].

查看更多
Rolldiameter
3楼-- · 2019-05-03 02:19

Rejection sampling would be the simplest if possible as described already. However, if you didn't want use that, you could convert the range and disallowed values to sets and find the difference. Then, you could choose a random value out of there.

Assuming you wanted the range to be in [1,n] but not in [i,j] and that you wanted them uniformly distributed.

In Python

total = range(1,n+1)
disallowed = range(i,j+1)
allowed = list( set(total) - set(disallowed) )

return allowed[random.randrange(len(allowed))]

(Note that this is not EXACTLY uniform since in all likeliness, max_rand%len(allowed) != 0 but this will in most practical applications be very close)

查看更多
闹够了就滚
4楼-- · 2019-05-03 02:19

I assume that you know how to generate a random number in [1, n) and also your list is ordered like in the example above.

Let's say that you have a list with k elements. Make a map(O(logn)) structure, which will ensure speed if k goes higher. Put all elements from list in map, where element value will be the key and "good" value will be the value. Later on I'll explain about "good" value. So when we have the map then just find a random number in [1, n - k - p)(Later on I'll explain what is p) and if this number is in map then replace it with "good" value.

"GOOD" value -> Let's start from k-th element. It's good value is its own value + 1, because the very next element is "good" for us. Now let's look at (k-1)th element. We assume that its good value is again its own value + 1. If this value is equal to k-th element then the "good" value for (k-1)th element is k-th "good" value + 1. Also you will have to store the largest "good" value. If the largest value exceed n then p(from above) will be p = largest - n.

Of course I recommend you this only if k is big number otherwise @Timothy Shields' method is perfect.

查看更多
该账号已被封号
5楼-- · 2019-05-03 02:24

The technique I usually use when the list is length 1 is to generate a random integer r in [1,n-1], and if r is greater or equal to that single illegal value then increment r.

This can be generalised for a list of length k for small k but requires sorting that list (you can't do your compare-and-increment in random order). If the list is moderately long, then after the sort you can start with a bsearch, and add the number of values skipped to r, and then recurse into the remainder of the list.

For a list of length k, containing no value greater or equal to n-k, you can do a more direct substitution: generate random r in [1,n-k], and then iterate through the list testing if r is equal to list[i]. If it is then set r to n-k+i (this assumes list is zero-based) and quit.

That second approach fails if some of the list elements are in [n-k,n].

I could try to invest something clever at this point, but what I have so far seems sufficient for uniform distributions with values of k much less than n...

  1. Create two lists -- one of illegal values below n-k, and the other the rest (this can be done in place).
  2. Generate random r in [1,n-k]
  3. Apply the direct substitution approach for the first list (if r is list[i] then set r to n-k+i and go to step 5).
  4. If r was not altered in step 3 then we're finished.
  5. Sort the list of larger values and use the compare-and-increment method.

Observations:

  • If all values are in the lower list, there will be no sort because there is nothing to sort.
  • If all values are in the upper list, there will be no sort because there is no occasion on which r is moved into the hazardous area.
  • As k approaches n, the maximum size of the upper (sorted) list grows.
  • For a given k, if more value appear in the upper list (the bigger the sort), the chance of getting a hit in the lower list shrinks, reducing the likelihood of needing to do the sort.

Refinement: Obviously things get very sorty for large k, but in such cases the list has comparatively few holes into which r is allowed to settle. This could surely be exploited.

I might suggest something different if many random values with the same list and limits were needed. I hope that the list of illegal values is not the list of results of previous calls to this function, because if it is then you wouldn't want any of this -- instead you would want a Fisher-Yates shuffle.

查看更多
登录 后发表回答