How to pick a random choice using a custom probabi

2019-02-25 05:49发布

问题:

I have a list of US names and their respective names from the US census website. I would like to generate a random name from this list using the given probability. The data is here: US Census data

I have seen algorithms like the roulette wheel selection algorithm that are easy to implement, but I wanted to know if there was any way to generate random names in O(1). For histogram data this is easier, as you could create a hash of integers to birthdays, but I would like to do this for a continuous distribution.

If this is not possible, are there any python modules that take in probability distributions and generate random values based on those distributions?

回答1:

There is an O(1)-time method See this detailed description of Vose's "alias" method. Unfortunately, it suffers from high initialization cost. For comparative timings of simpler methods, see Eli Bendersky's blog post. More timings can be found in this from the Python issue tracker.



回答2:

These days it's practical to enumerate the entire US population (~317 million) if you really need O(1) lookup. Just pick a number up to 317 million and get the name from there. (317000000*4 bytes = 1.268GB)

I think there are lots of O(log n) ways. Is there a particular reason you need O(1) (They will use a lot less memory)