Why is hash() slower under Python 3.4 than under Python 2.7?

Posted 2019-02-17 17:16

I was doing some performance evaluation using timeit and discovered a performance degradation between Python 2.7.10 and Python 3.4.3. I narrowed it down to the hash() function:

python 2.7.10:

>>> import timeit
>>> timeit.timeit('for x in xrange(100): hash(x)', number=100000)
0.4529099464416504
>>> timeit.timeit('hash(1000)')
0.044638872146606445

python 3.4.3:

>>> import timeit
>>> timeit.timeit('for x in range(100): hash(x)', number=100000)
0.6459149940637872
>>> timeit.timeit('hash(1000)')
0.07708719989750534

That's a degradation of roughly 40%! It doesn't seem to matter whether integers, floats, or strings (unicode or bytearrays) are being hashed; the degradation is about the same. In both cases hash() returns a 64-bit integer. The above was run on my Mac; on an Ubuntu box I saw a smaller degradation (about 20%).

I also ran the Python 2.7 tests with PYTHONHASHSEED=random. In some cases, restarting Python for each "case", hash() performance got a bit worse, but it was never as slow as under Python 3.4.

Does anyone know what's going on here? Was a more secure but slower hash function chosen for Python 3?
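For anyone repeating the PYTHONHASHSEED experiment, here is a minimal sketch (Python 3) of one way to restart the interpreter per "case". The helper name child_hash is hypothetical. It relies only on the documented behaviour that PYTHONHASHSEED is read at interpreter startup, so each seed needs a fresh process.

```python
import os
import subprocess
import sys


def child_hash(seed, text="abc"):
    """Return hash(text) as computed by a fresh interpreter started with
    PYTHONHASHSEED=seed (hypothetical helper, for illustration only)."""
    # Copy the current environment and override only the hash seed.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash(%r))" % text],
        env=env,
    )
    return int(out.decode().strip())


if __name__ == "__main__":
    # A fixed seed gives the same value in every new process.
    print("seed 0:     ", child_hash(0), child_hash(0))
    # With "random", each new process picks its own seed.
    print("seed random:", child_hash("random"), child_hash("random"))
```

A fixed seed such as 0 makes the string hash reproducible across processes, which helps when comparing timings between runs.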

Tags: python python-3.4

1 answer

再贱就再见

#2 · 2019-02-17 17:37

There are two changes to the hash() function between Python 2.7 and Python 3.4:

Adoption of SipHash
Hash randomization enabled by default

References:

Since Python 3.4, SipHash is used as the hash function for str and bytes objects. Read: Python adopts SipHash
Since Python 3.3, hash randomization is enabled by default. Reference: object.__hash__ (last line of that section). Setting PYTHONHASHSEED to 0 disables hash randomization.
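To check which of these changes applies to a given interpreter, a small sketch like the following may help (Python 3.4 or later, since sys.hash_info.algorithm was added in 3.4). The helper names describe_hash and time_hash are made up for illustration, and the timings will vary by machine and build.

```python
import sys
import timeit


def describe_hash():
    """Report the str/bytes hash algorithm and the hash width in bits."""
    info = sys.hash_info
    return {"algorithm": info.algorithm, "width": info.width}


def time_hash(stmt, setup="pass", number=1000000):
    """Time a hash() statement with timeit; returns seconds for `number` runs."""
    return timeit.timeit(stmt, setup=setup, number=number)


if __name__ == "__main__":
    print(describe_hash())
    # Integer hashing does not go through the str/bytes algorithm.
    print("int:", time_hash("hash(1000)"))
    # CPython caches a str object's hash after the first call, so this
    # mostly measures call overhead rather than the hash algorithm itself.
    print("str:", time_hash("hash(s)", setup="s = 'x' * 32"))
```

Running the same script under different interpreter builds gives a like-for-like comparison of hash() call cost.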
