Why aren't Python sets hashable?

2019-01-08 13:05发布

问题:

I stumbled across a blog post detailing how to implement a powerset function in Python. So I went about trying my own way of doing it, and discovered that Python apparently cannot have a set of sets, since set is not hashable. This is irksome, since the definition of a powerset is that it is a set of sets, and I wanted to implement it using actual set operations.

>>> set([ set() ])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'

Is there a good reason Python sets are not hashable?

回答1:

Generally, only immutable objects are hashable in Python. The immutable variant of set() -- frozenset() -- is hashable.



回答2:

Because they're mutable.

If they were hashable, a hash could silently become "invalid", and that would pretty much make hashing pointless.



回答3:

From the Python docs:

hashable
An object is hashable if it has a hash value which never changes during its lifetime (it needs a hash() method), and can be compared to other objects (it needs an eq() or cmp() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

All of Python’s immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal, and their hash value is their id().



回答4:

In case this helps... if you really need to convert unhashable things into hashable equivalents for some reason you might do something like this:

from collections import Hashable, MutableSet, MutableSequence, MutableMapping

def make_hashdict(value):
    """
    Inspired by https://stackoverflow.com/questions/1151658/python-hashable-dicts
     - with the added bonus that it inherits from the dict type of value
       so OrderedDict's maintain their order and other subclasses of dict() maintain their attributes
    """
    map_type = type(value)

    class HashableDict(map_type):
        def __init__(self, *args, **kwargs):
            super(HashableDict, self).__init__(*args, **kwargs)
        def __hash__(self):
            return hash(tuple(sorted(self.items())))

    hashDict = HashableDict(value)

    return hashDict


def make_hashable(value):
    if not isinstance(value, Hashable):
        if isinstance(value, MutableSet):
            value = frozenset(value)
        elif isinstance(value, MutableSequence):
            value = tuple(value)
        elif isinstance(value, MutableMapping):
            value = make_hashdict(value)

        return value

my_set = set()
my_set.add(make_hashable(['a', 'list']))
my_set.add(make_hashable({'a': 1, 'dict': 2}))
my_set.add(make_hashable({'a', 'new', 'set'}))

print my_set

My HashableDict implementation is the simplest and least rigorous example from here. If you need a more advanced HashableDict that supports pickling and other things, check the many other implementations. In my version above I wanted to preserve the original dict class, thus preserving the order of OrderedDicts. I also use AttrDict from here for attribute-like access.

My example above is not in any way authoritative, just my solution to a similar problem where I needed to store some things in a set and needed to "hashify" them first.



标签: python hash set