Performance of choice vs randint

Posted 2019-04-21 02:26

Question:

I want to pick a random integer between a and b, inclusive.

I know 3 ways of doing it. However, their performance seems very counter-intuitive:

import timeit

t1 = timeit.timeit("n=random.randint(0, 2)", setup="import random", number=100000)
t2 = timeit.timeit("n=random.choice([0, 1, 2])", setup="import random", number=100000)
t3 = timeit.timeit("n=random.choice(ar)", setup="import random; ar = [0, 1, 2]", number=100000)

for t in [t1, t2, t3]:
    print(t)

On my machine, this gives:

0.29744589625620965
0.19716156798482648
0.17500512311108346

Using an online interpreter, this gives:

0.23830216699570883
0.16536146598809864
0.15081614299560897

Note how the most direct version (#1), which uses the function dedicated to exactly this task, takes roughly 60-70% longer in the runs above than the strangest version (#3), which pre-defines a list and then chooses randomly from it.

What's going on?

Answer 1:

It's just implementation details. randint delegates to randrange, so it has another layer of function call overhead, and randrange goes through a lot of argument checking and other crud. In contrast, choice is a really simple one-liner.
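One way to see how much each layer costs is to time the layers separately. The following is a minimal sketch, not a rigorous benchmark: `time_layers` is a hypothetical helper name (not part of `random` or `timeit`), and it takes the minimum over a few repeats to damp noise. Absolute numbers will differ by machine and Python version.

```python
import timeit

def time_layers(number=100000, repeat=3):
    """Time each layer separately; returns seconds per `number` calls."""
    statements = {
        "randint": "random.randint(0, 2)",      # randint -> randrange -> random()
        "randrange": "random.randrange(0, 3)",  # skips the randint wrapper
        "choice": "random.choice(ar)",          # index into a prebuilt list
        "inline": "int(random.random() * 3)",   # no wrapper functions at all
    }
    setup = "import random; ar = [0, 1, 2]"
    # Taking min() over several repeats reduces scheduling noise.
    return {
        name: min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
        for name, stmt in statements.items()
    }
```

Calling `time_layers()` and printing the resulting dictionary puts the four variants side by side, which shows how much of the gap comes from the extra wrapper call and how much from the argument checking.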

Here's the code path randint goes through for this call in Python 2's random module (hence the 1L literal), with comments and unexecuted code stripped out:

def randint(self, a, b):
    return self.randrange(a, b+1)

def randrange(self, start, stop=None, step=1, _int=int, _maxwidth=1L<<BPF):
    istart = _int(start)
    if istart != start:
        pass  # not executed
    if stop is None:
        pass  # not executed

    istop = _int(stop)
    if istop != stop:
        pass  # not executed
    width = istop - istart
    if step == 1 and width > 0:
        if width >= _maxwidth:
            pass  # not executed
        return _int(istart + _int(self.random()*width))
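The excerpt above reflects one particular version of the standard library, and the implementation has changed between releases. Since `random` is written in Python, you can print the code path for whatever interpreter you are running. A small sketch using `inspect.getsource`, assuming the standard library source files are installed alongside the interpreter (they normally are):

```python
import inspect
import random

# random.Random is a plain Python class, so its methods have readable source.
randint_source = inspect.getsource(random.Random.randint)
choice_source = inspect.getsource(random.Random.choice)

print(randint_source)
print(choice_source)
```

Comparing the two printouts shows directly how many calls and checks sit between your call and the underlying random number.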

And here's the code path choice goes through:

def choice(self, seq):
    return seq[int(self.random() * len(seq))]
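If the overhead really matters, the same `int(random() * n)` trick can be applied to an integer range directly, without building a list. This is only a sketch: `fast_randint` is a hypothetical helper, it assumes `a <= b`, and it skips all the argument checking that `randrange` performs. For very large ranges the floating-point multiplication introduces a slight bias, which, as far as I know, is why newer Python 3 versions use an integer-based helper inside `randrange` and `choice` instead.

```python
import random

def fast_randint(a, b):
    """Hypothetical helper: random integer in [a, b], both ends inclusive.

    Assumes a <= b and a range small enough that float rounding
    bias is negligible. No argument validation is done.
    """
    # random.random() is in [0.0, 1.0), so the int() result is in [0, b - a].
    return a + int(random.random() * (b - a + 1))

samples = [fast_randint(0, 2) for _ in range(10)]
print(samples)
```

For everyday code, `random.randint(a, b)` remains the clearer choice; a shortcut like this is only worth considering in a hot loop where profiling shows the call overhead is significant.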