What is the best way to generate a set of bitarray-like objects so that I can test for membership efficiently? The naive way doesn't work as I expect:
>>> from bitarray import bitarray
>>>
>>> bitarray_set = set([bitarray('0000'), bitarray('0001')])
>>> bitarray_set
set([bitarray('0001'), bitarray('0000')])
>>>
>>> bitarray('0000') in bitarray_set
False
A workaround is to keep a separate set of strings (or other more hash-friendly objects) as keys, then convert each bitarray to a string and test membership against that set instead. But that seems cumbersome. Is there a better solution?
It appears that bitarray does not maintain the hash invariant:
>>> x = bitarray('0000')
>>> y = bitarray('0000')
>>> x == y
True
>>> hash(x) == hash(y)
False
This is a violation of the API for __hash__, as documented:
The only required property is that objects which compare equal have the same hash value
This means that bitarrays are effectively unhashable and will not work reliably in sets or as dictionary keys.
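To see why a broken invariant defeats set membership, here is a minimal sketch that uses two made-up classes rather than bitarray itself. `IdentityHashed` and `ValueHashed` are hypothetical names; the first imitates the behaviour observed above (equality by value, hash by identity), the second hashes the same data it compares.

```python
class IdentityHashed:
    """Hypothetical stand-in: compares by value but hashes by identity."""

    def __init__(self, bits):
        self.bits = bits

    def __eq__(self, other):
        return isinstance(other, IdentityHashed) and self.bits == other.bits

    def __hash__(self):
        # Ignores the value, so two equal objects get different hashes.
        return id(self)


class ValueHashed:
    """Hypothetical stand-in: hash is derived from the compared value."""

    def __init__(self, bits):
        self.bits = bits

    def __eq__(self, other):
        return isinstance(other, ValueHashed) and self.bits == other.bits

    def __hash__(self):
        # Equal objects always produce equal hashes.
        return hash(self.bits)


broken = {IdentityHashed('0000')}
found_broken = IdentityHashed('0000') in broken   # lookup uses a different hash

fixed = {ValueHashed('0000')}
found_fixed = ValueHashed('0000') in fixed        # lookup uses the same hash

print(found_broken, found_fixed)  # False True
```

A set only calls `__eq__` on entries whose stored hash matches the hash of the object being looked up, so an equal object with a different hash is never compared at all.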
I would regard this as a bug in the bitarray library. I had never heard of bitarray before, and it doesn't seem to have much documentation. As far as I can see it doesn't even say how equality is supposed to be defined for bitarrays, nor whether they are supposed to be hashable, but it seems that it implements equality and hashing in incompatible ways.
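In the meantime, the string-key workaround from the question can at least be hidden behind a small wrapper so the conversion is not repeated at every call site. This is only a sketch: `BitKeySet` is a hypothetical name, and it assumes every item exposes a `to01()` method returning its bits as a string of '0' and '1' characters, as bitarray objects do.

```python
class BitKeySet:
    """Set of bit sequences, keyed by their '0'/'1' string form.

    Assumption: each item has a to01() method (bitarray provides one).
    """

    def __init__(self, items=()):
        self._keys = set()
        for item in items:
            self.add(item)

    def add(self, item):
        # Store the hashable string form instead of the object itself.
        self._keys.add(item.to01())

    def __contains__(self, item):
        return item.to01() in self._keys

    def __len__(self):
        return len(self._keys)


# Intended use with the real library (not executed here):
#   from bitarray import bitarray
#   s = BitKeySet([bitarray('0000'), bitarray('0001')])
#   bitarray('0000') in s
```

Membership then depends only on the bit content, because the strings compare and hash consistently. The cost is one string conversion per insertion and per lookup.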