Positive integer from Python hash() function

Published 2019-04-18 11:23

Question:

I want to use the Python hash() function to get integer hashes from objects. But built-in hash() can give negative values, and I want only positive. And I want it to work sensibly on both 32-bit and 64-bit platforms.

I.e. on 32-bit Python, hash() can return an integer in the range -2**31 to 2**31 - 1. On 64-bit systems, hash() can return an integer in the range -2**63 to 2**63 - 1.

But I want a hash in the range 0 to 2**32-1 on 32-bit systems, and 0 to 2**64-1 on 64-bit systems.

What is the best way to convert the hash value to its equivalent positive value within the range of the 32- or 64-bit target platform?

(Context: I'm trying to make a new random.Random style class. According to the random.Random.seed() docs, the seed "optional argument x can be any hashable object." So I'd like to duplicate that functionality, except that my seed algorithm can't handle negative integer values, only positive.)

Answer 1:

Using sys.maxsize:

>>> import sys
>>> sys.maxsize
9223372036854775807L
>>> hash('asdf')
-618826466
>>> hash('asdf') % ((sys.maxsize + 1) * 2)
18446744073090725150L

Alternative using ctypes.c_size_t:

>>> import ctypes
>>> ctypes.c_size_t(hash('asdf')).value
18446744073090725150L
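
On Python 3 the same idea can be wrapped in a small helper. This is only a minimal sketch, assuming a CPython 3 build where hash() spans -sys.maxsize - 1 to sys.maxsize; the names HASH_MODULUS and unsigned_hash are hypothetical and not part of the standard library.

```python
import ctypes
import sys

# Number of distinct hash values: 2**32 on 32-bit builds, 2**64 on 64-bit builds.
HASH_MODULUS = 2 * (sys.maxsize + 1)


def unsigned_hash(obj):
    """Map hash(obj) to its two's-complement value in 0 .. HASH_MODULUS - 1."""
    # Python's % always returns a non-negative result for a positive modulus.
    return hash(obj) % HASH_MODULUS


h = unsigned_hash("asdf")
assert 0 <= h < HASH_MODULUS
# The ctypes route reinterprets the same bits, so both approaches agree.
assert h == ctypes.c_size_t(hash("asdf")).value
```

Non-negative hashes pass through unchanged, and negative ones wrap around into the upper half of the range.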


Answer 2:

Just using sys.maxsize is wrong for obvious reasons (it being 2**n - 1 and not 2**n, where n is 31 or 63), but the fix is easy enough:

import sys

h = hash(obj)
h += sys.maxsize + 1

For performance reasons you may want to split the sys.maxsize + 1 into two separate assignments to avoid temporarily creating a long integer for most negative numbers, although I doubt this is going to matter much.
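
The offset shifts the whole hash range upward instead of wrapping the negative half around, so ordering is preserved but non-negative hashes change as well. A small sketch on raw integers, assuming hash() spans -sys.maxsize - 1 to sys.maxsize; shift_hash is a hypothetical name used only for illustration:

```python
import sys


def shift_hash(h):
    """Shift a signed hash value into 0 .. 2 * sys.maxsize + 1."""
    return h + sys.maxsize + 1


lowest = -sys.maxsize - 1   # smallest value hash() can return
highest = sys.maxsize       # largest value hash() can return

assert shift_hash(lowest) == 0
assert shift_hash(highest) == 2 * sys.maxsize + 1
# Adding sys.maxsize alone would leave the smallest hash negative.
assert lowest + sys.maxsize == -1
```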



Answer 3:

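How about:

import sys

h = hash(o)
if h < 0:
    # Add sys.maxsize + 1, not sys.maxsize: the most negative hash,
    # -sys.maxsize - 1, would otherwise end up as -1.
    h += sys.maxsize + 1

This uses sys.maxsize to be portable between 32- and 64-bit systems. The result lies in the range 0 to sys.maxsize, so a negative hash can land on the same value as a different non-negative one.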



Answer 4:

(Edit: at first I thought you always wanted a 32-bit value)

Simply AND it with a mask of the desired size. Generally sys.maxsize will already be such a mask, since it's a power of 2 minus 1.

import sys
assert (sys.maxsize & (sys.maxsize + 1)) == 0  # checks that maxsize + 1 is a power of 2

new_hash = hash(obj) & sys.maxsize  # obj is the object being hashed
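
Masking with sys.maxsize clears the sign bit, so the result lies in 0 to sys.maxsize (31 or 63 bits) rather than the full 32- or 64-bit range. The sketch below contrasts that with a full-width mask that keeps every bit; the names SIGN_CLEARED_MASK, FULL_WIDTH_MASK and masked_hash are hypothetical, and it assumes CPython, where small integers other than -1 hash to themselves.

```python
import sys

SIGN_CLEARED_MASK = sys.maxsize            # 0b0111...1, drops the sign bit
FULL_WIDTH_MASK = (sys.maxsize << 1) | 1   # 0b1111...1, keeps all 32 or 64 bits


def masked_hash(obj, mask=FULL_WIDTH_MASK):
    """Return hash(obj) ANDed with mask; the result is never negative."""
    # Python ints behave like infinite two's complement, so & with a
    # non-negative mask always gives a non-negative value.
    return hash(obj) & mask


# hash(-5) == -5 in CPython, which makes the arithmetic easy to follow.
assert masked_hash(-5) == 2 * (sys.maxsize + 1) - 5
assert masked_hash(-5, SIGN_CLEARED_MASK) == sys.maxsize - 4
```

With the full-width mask the result matches the modulo approach; with sys.maxsize as the mask, two hashes that differ only in the sign bit collapse to the same value.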