python 2 vs python 3 performance of random, partic

The issue of the performance of the python random module, and in particular, random.sample and random.shuffle came up in this question. On my computer, I get the following results:

> python  -m timeit -s 'import random' 'random.randint(0,1000)'
1000000 loops, best of 3: 1.07 usec per loop
> python3 -m timeit -s 'import random' 'random.randint(0,1000)'
1000000 loops, best of 3: 1.3 usec per loop

That's more than a 20% degradation of performance in python3 vs python2. It gets much worse.

> python  -m timeit -s 'import random' 'random.shuffle(list(range(10)))'
100000 loops, best of 3: 3.85 usec per loop
> python3 -m timeit -s 'import random' 'random.shuffle(list(range(10)))'
100000 loops, best of 3: 8.04 usec per loop

> python  -m timeit -s 'import random' 'random.sample(range(10),3)'
100000 loops, best of 3: 2.4 usec per loop
> python3 -m timeit -s 'import random' 'random.sample(range(10),3)'
100000 loops, best of 3: 6.49 usec per loop

That represents a 100% degradation in performance for random.shuffle, and almost a 200% degradation for random.sample. That is quite severe.

I used python 2.7.9 and python 3.4.2 in the above tests.

My question: What happened to the random module in python3?

标签： python python-3.x optimization random python-internals

1条回答

成全新的幸福

2楼-- · 2019-03-19 03:32

----------- What Changed -----------------------------------------------

Several things happened:

Integers became less efficient in the int/long unification. That is also why integers are 28 bytes wide now instead of 24 bytes on 64-bit Linux/MacOS builds.
Shuffle became more accurate by using _randbelow. This eliminated a subtle bias in the previous algorithm.
Indexing became slower because the special case for integer indices was removed from ceval.c primarily because it was harder to do with the newer integers and because a couple of the core devs didn't think the optimization was worth it.
The range() function was replaced with xrange(). This is relevant because the OP's timings both use range() in the inner-loop.

The algorithms for shuffle() and sample() were otherwise unchanged.

Python 3 made a number of changes like unicode-everywhere that made the internals more complex, a little slower, and more memory intensive. In return, Python 3 makes it easier for users to write correct code.

Likewise, int/long unification made the language simpler but at a cost of speed and space. The switch to using _randbelow() in the random module had a runtime cost but benefited in terms of accuracy and correctness.

----------- Conclusion --------------------------------------------------

In short, Python 3 is better in some ways that matter to many users and worse in some ways that people rarely notice. Engineering is often about trade-offs.

----------- Details ---------------------------------------------------------

Python2.7 code for shuffle():

def shuffle(self, x, random=None):
    if random is None:
        random = self.random
    _int = int
    for i in reversed(xrange(1, len(x))):
        # pick an element in x[:i+1] with which to exchange x[i]
        j = _int(random() * (i+1))
        x[i], x[j] = x[j], x[i]

Python3.6 code for shuffle():

def shuffle(self, x, random=None):
    if random is None:
        randbelow = self._randbelow
        for i in reversed(range(1, len(x))):
            # pick an element in x[:i+1] with which to exchange x[i]
            j = randbelow(i+1)              # <-- This part changed
            x[i], x[j] = x[j], x[i]
    else:
        _int = int
        for i in reversed(range(1, len(x))):
            # pick an element in x[:i+1] with which to exchange x[i]
            j = _int(random() * (i+1))
            x[i], x[j] = x[j], x[i]

Python 2.7 integer size:

>>> import sys
>>> sys.getsizeof(1)
24

Python 3.6 integer size:

>>> import sys
>>> sys.getsizeof(1)
28

Relative speed of indexed lookups (binary subscriptions with integer arguments indexing into a list):

$ python2.7 -m timeit -s 'a=[0]' 'a[0]'
10000000 loops, best of 3: 0.0253 usec per loop
$ python3.6 -m timeit -s 'a=[0]' 'a[0]'
10000000 loops, best of 3: 0.0313 usec per loop

Python 2.7 code in ceval.c with an optimization for indexed lookups:

    TARGET_NOARG(BINARY_SUBSCR)
    {
        w = POP();
        v = TOP();
        if (PyList_CheckExact(v) && PyInt_CheckExact(w)) {
            /* INLINE: list[int] */
            Py_ssize_t i = PyInt_AsSsize_t(w);
            if (i < 0)
                i += PyList_GET_SIZE(v);
            if (i >= 0 && i < PyList_GET_SIZE(v)) {
                x = PyList_GET_ITEM(v, i);
                Py_INCREF(x);
            }
            else
                goto slow_get;
        }
        else
          slow_get:
            x = PyObject_GetItem(v, w);
        Py_DECREF(v);
        Py_DECREF(w);
        SET_TOP(x);
        if (x != NULL) DISPATCH();
        break;
    }

Python 3.6 code in ceval.c without the optimization for indexed lookups:

    TARGET(BINARY_SUBSCR) {
        PyObject *sub = POP();
        PyObject *container = TOP();
        PyObject *res = PyObject_GetItem(container, sub);
        Py_DECREF(container);
        Py_DECREF(sub);
        SET_TOP(res);
        if (res == NULL)
            goto error;
        DISPATCH();
    }

0人赞添加讨论(0) 举报

python 2 vs python 3 performance of random, partic

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间