Choosing random numbers efficiently

2019-01-07 22:06发布

问题:

I have a method, which uses random samples to approximate a calculation. This method is called millions of times, so its very important that the process of choosing the random numbers is efficient.

I'm not sure how fast javas Random().nextInt really are, but my program does not seem to benefit as much as I would like it too.

When choosing the random numbers, I do the following (in semi pseudo-code):

// Repeat this 300000 times
Set set = new Set();
while(set.length != 5)
    set.add(randomNumber(MIN,MAX));

Now, this obviously has a bad worst-case running time, as the random-function in theory can add duplicated numbers for an eternity, thus staying in the while-loop forever. However, the numbers are chosen from {0..45}, so a duplicated value is for the most part unlikely.

When I use the above method, its only 40% faster than my other method, which does not approximate, but yields the correct result. This is ran ~ 1 million times, so I was expecting this new method to be at least 50% faster.

Do you have any suggestions for a faster method? Or maybe you know of a more efficient way of generation a set of random numbers.

To clarify, here is the two methods:

// Run through all combinations (1 million). This takes 5 seconds
 for(int c1 = 0; c1 < deck.length; c1++){
    for(int c2 = c1+1; c2 < deck.length; c2++){
     for(int c3 = c2+1; c3 < deck.length; c3++){
        for(int c4 = c3+1; c4 < deck.length; c4++){
         for(int c5 = c4+1; c5 < deck.length; c5++){
             enumeration(hands, cards, deck, c1, c2, c3, c4, c5);
         }
            } 
      }     
   }
   }

// Approximate (300000 combinations). This takes 3 seconds
Random rand = new Random();
HashSet<Integer> set = new HashSet<Integer>();
int[] numbers = new int[5];
while(enumerations < 300000){
set.clear();
while(set.size() != 5){
    set.add(rand.nextInt(deck.length));
}
Iterator<Integer> i = set.iterator();
int n = 0;
while(i.hasNext()){
    numbers[n] = i.next();
    n++;
}

After some testing and profiling, I found this method to be the most effective:

Random rand = new Random();
int[] numbers = new int[5];
ArrayList<Integer> list = new ArrayList<Integer>();
while(enumerations < 300000){
 while(list.size() != 5) {
     int i = rand.nextInt(deck.length);
        if(!list.contains(i)) list.add(i);
 }
 int index = 0;
 for(int i : list){ numbers[index] = i; index++; }
 enumeration(hands, cards, deck,numbers);
}

回答1:

You can try using an existing Java implementation (or this one) for a Mersenne Twister.

Keep in mind most MT's are not cryptographically secure.



回答2:

It looks like you want to select a k-combination from a set S without replacement, with S having n distinct values, k = 5 and n = 52. You can shuffle() the entire set and select k elements (as @Tesserex suggests), or pick() k elements while avoiding duplicates (as you've shown). You'll want to profile both in your particular environment and for your chosen generator. I often, but not always, see a modest edge for pick().

private static final Random rnd = new Random();
private static final int N = 52;
private static final int K = 5;
private static final List<Integer> S = new ArrayList<Integer>(N);
static {
    for (int i = 0; i < N; i++) {
        S.add(i + 1);
    }
}
private final List<Integer> combination = new ArrayList<Integer>(K);

...

private void shuffle() {
    Collections.shuffle(S, rnd);
    combination.addAll(S.subList(0, K));
}

private void pick() {
    for (int i = 0; i < K; i++) {
        int v = 0;
        do {
            v = rnd.nextInt(N) + 1;
        } while (combination.contains(v));
        combination.add(v);
    }
}


回答3:

A common technique is to start with a list of all the possible inputs, and randomly select from that, deleting ones as you go. That way there's no risk of selecting a duplicate and having to loop for an unknown amount of time. Of course this method only works with discrete data, but fortunately integers are. Also remember that your list (or other data structure) selection and deletion should be O(1) if possible, since you're focusing on speed.



回答4:

You could use linear congruence as a random generator: http://en.wikipedia.org/wiki/Linear_congruential_generator [yet consider their statistical disadvantages]

You only need a calculation of (x + c) % m for each number. Yet, in my experience the creation of objects (like you might do with every call of new Set and add, depending on which implementation you use) might cost you more speed than a call to nextInt(). Maybe you should try a profiler like e.g. this one: http://www.eclipse.org/tptp/



回答5:

I don't have any input on your actual problem, and I don't know too much Java (just poking around). However it seems to me that you are trying to build a hand evaluator for poker and this thread http://pokerai.org/pf3/viewtopic.php?f=3&t=16 contains some extremely fast java hand evaluators. Hopefully some of this code could be of help.



回答6:

If you're being slowed down by the fact that you have to skip over duplicates, you could solve that problem by creating a list of all the card values, and then removing from the list as cards are selected and choosing a random number in a smaller range the next time. Something like this:

// Assuming we're just numbering all the cards 0 to 51. This could be more sophisticated, of course.
ArrayList cards=new ArrayList(52);
for (int x=0;x<52;++x)
  cards=new Integer(x);

Integer[] hand=new Integer[5];
for (int h=0;h<5;++h)
{
  // Pick a card from those remaining
  int n=random.nextInt(cards.size());
  hand[h]=cards.get(n);
  // Remove the picked card from the list
  cards.remove(n);
}

For the first draw, cards.get(n) will return n, no matter what n is. But from then on, values will be removed so cards.get(3) might return 7, etc.

Creating the list and removing from it adds a bunch of overhead. My guess would be that if you're only picking 5 cards at a time, the probabilty of collisions is small enough that eliminating duplices after you find them would be faster than preventing them. Even on the last draw, the probability of a duplicate is only 4/52=1/13, so you'd rarely hit a duplicate at all and the probability that 2 draws in a row would both be duplicates would be tiny. It all depends on how long it takes to generate a random number compared to how long it takes to set up the array and do the removes. The easiest way to tell would be to do some experiments and measure. (Or profile!)



回答7:

Never guess, always measure.

 long time = System.getCurrentMilliseconds();
 Random().nextInt()
 System.out.println(System.getCurrentMilliseconds() - time);

Also, you should never rely on how infrequent a known bug will happen, just code defensivley so it doesn't. Detect a duplicate, and if it is a duplicate then don't add it, and skip the iteration with a continue statement.

As for fastest methods and random numbers... You can't get random numbers in Java's Math.random(). You can only get pseudo random numbers. How fast you want this to be comes at the sacrifice of how seemingly random you need to them appear. The fastest way to generate a pseudo-random number would involve bit shifting and addition based on the a seed value such as System.getCurrentMilliSeconds() Also, pseudo-random number generation is already pretty fast since it is just raw CPU arithmetic anyway, so you will probably be happy enough once you see how many milliseconds it takes to generate one with Math.random().



回答8:

Don't try to develop your known random num generator. Use a known one like SecureRandom instead:

http://www.owasp.org/index.php/Using_the_Java_Cryptographic_Extensions