Borrowing from the documentation of __contains__:
print set.__contains__.__doc__
x.__contains__(y) <==> y in x.
This seems to work fine for primitive objects such as int, basestring, etc. But for user-defined objects that define the __ne__ and __eq__ methods, I get unexpected behavior. Here is some sample code:
class CA(object):
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if self.name == other.name:
            return True
        return False

    def __ne__(self, other):
        return not self.__eq__(other)

obj1 = CA('hello')
obj2 = CA('hello')
theList = [obj1,]
theSet = set(theList)

# Test 1: list
print (obj2 in theList)  # return True

# Test 2: set weird
print (obj2 in theSet)  # return False unexpected

# Test 3: iterating over the set
found = False
for x in theSet:
    if x == obj2:
        found = True
print found  # return True

# Test 4: Typecasting the set to a list
print (obj2 in list(theSet))  # return True
So is this a bug or a feature?
A set hashes its elements to allow fast lookup. You have to override the __hash__ method so that an element can be found. Lists don't use hashing, but compare each element like your for loop does.

This is because CA doesn't implement __hash__, so it falls back to the default hash inherited from object, which is based on identity rather than on name. A sensible implementation would be:
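As a minimal sketch of such a __hash__, assuming name is the only field that defines equality: the class below hashes the same value that __eq__ compares, so two equal objects land in the same bucket. The isinstance check and the NotImplemented handling are illustrative additions, not part of the class in the question. Under Python 3 a class that defines __eq__ without __hash__ becomes unhashable, so the explicit __hash__ is needed there as well.

```python
class CA(object):
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Assumption: only CA instances are comparable with each other.
        if not isinstance(other, CA):
            return NotImplemented
        return self.name == other.name

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Equal objects must hash equally, so hash the field __eq__ compares.
        return hash(self.name)


obj1 = CA('hello')
obj2 = CA('hello')
the_set = set([obj1])

found_in_set = obj2 in the_set
print(found_in_set)  # True
```

With this in place, the set lookup agrees with the list lookup and with the manual loop from the question.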
For sets and dicts, you need to define __hash__. Any two objects that are equal should hash the same in order to get consistent, expected behavior in sets and dicts.

I would recommend using a _key method, and then just referencing that anywhere you need the part of the item to compare, just as you call __eq__ from __ne__ instead of reimplementing it: