From a recent SO question (see Create a dictionary in python which is indexed by lists) I realized I probably had a wrong conception of the meaning of hashable and immutable objects in python.
- What does hashable mean in practice?
- What is the relation between hashable and immmutable?
- Are there mutable objects that are hashable or immutable objects that are not hashable?
just because this is the top Google hit, here's a simple way to make a mutable object hashable:
I actually found a use for something like this when creating a class to cast SQLAlchemy records into something mutable and more useful to me, while maintaining their hashability for use as dict keys.
From the Python Glossary:
Dicts and sets must use a hash for efficient lookup in a hash table; the hash values must be immutable, because changing the hash will mess up the data structures and cause the dict or set to fail. The easiest way to make the hash value immutable is to make the whole object immutable, which is why the two are often mentioned together.
While none of the built-in mutable objects are hashable, it is possible to make a mutable object with a hash value that's not mutable. It's common for only a portion of the object to represent its identity, while the rest of the object contains properties that are free to change. As long as the hash value and comparison functions are based on the identity but not the mutable properties, and the identity never changes, you've met the requirements.
In Python, tuple is immutable, but it is hashable only if all its elements are hashable.
Hashable Types
In Python they're mostly interchangeable; since the hash is supposed to represent the content, so it's just as mutable as the object, and having an object change the hash value would make it unusable as a dict key.
In other languages, the hash value is more related to the objects 'identity', and not (necessarily) to the value. Thus, for a mutable object, the pointer could be used to start the hashing. Assuming, of course, that an object doesn't move in memory (as some GC do). This is the approach used in Lua, for example. This makes a mutable object usable as a table key; but creates several (unpleasant) surprises for newbies.
In the end, having an immutable sequence type (tuples) makes it nicer for 'multi-value keys'.
Hashing is the process of converting some large amount of data into a much smaller amount (typically a single integer) in a repeatable way so that it can be looked up in a table in constant-time (
O(1)
), which is important for high-performance algorithms and data structures.Immutability is the idea that an object will not change in some important way after it has been created, especially in any way that might change the hash value of that object.
The two ideas are related because objects which are used as hash keys must typically be immutable so their hash value doesn't change. If it was allowed to change then the location of that object in a data structure such as a hashtable would change and then the whole purpose of hashing for efficiency is defeated.
To really grasp the idea you should try to implement your own hashtable in a language like C/C++, or read the Java implementation of the
HashMap
class.Technically, hashable means that the class defines __hash__(). According to the docs:
I think that for the python builtin types, all hashable types are also immutable. It would be difficult but perhaps not impossible to have a mutable object that nonetheless defined __hash__().