The documentation for -hash says it must not change while a mutable object is stored in a collection, and similarly the documentation for -isEqual: says the -hash value must be the same for equal objects.
Given this, does anybody have any suggestions for the best way to implement -hash such that it meets both these conditions and yet is actually calculated intelligently (i.e. doesn't just return 0)? Does anybody know how the mutable versions of framework-provided classes do this?
The simplest thing to do is of course just forget the first condition (about it not changing) and just make sure I never accidentally mutate an object while it's in a collection, but I'm wondering if there's any solution that's more flexible.
EDIT: I'm wondering here whether it's possible to maintain the 2 contracts (where equal objects have equal hashes, and hashes don't change while the object is in a collection) when I'm mutating the internal state of the object. My inclination is to say "no", unless I do something stupid like always return 0 for the hash, but that's why I'm asking this question.
Interesting question, but I think what you want is logically impossible. Say you start with 2 objects, A and B. They're both different, and they start with different hash codes. You add both to some hash table. Now, you want to mutate A, but you can't change the hash code because it's already in the table. However, it's possible to change A in such a way that it .equals() B.
In this case, you have 2 choices, neither of which works:
- Change the hashcode of A to equal B.hashcode, which violates the constraint of not changing hash codes while in a hash table.
- Don't change the hashcode, in which case A.equals(B) but they don't have the same hashcodes.
It seems to me that there's no possible way to do this without using a constant as a hashcode.
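The dilemma can be made concrete with a small sketch. `Counter` is a hypothetical class invented for this illustration (it is not from any framework). Its hash is derived from its state, so mutating one instance until it equals another necessarily moves its hash.

```objc
#import <Foundation/Foundation.h>

// Hypothetical mutable value class; the name and property are illustrative only.
@interface Counter : NSObject
@property (nonatomic) NSInteger value;
- (instancetype)initWithValue:(NSInteger)value;
@end

@implementation Counter
- (instancetype)initWithValue:(NSInteger)value {
    if ((self = [super init])) {
        _value = value;
    }
    return self;
}

// Value-based equality: two counters are equal when their values match.
- (BOOL)isEqual:(id)other {
    if (other == self) return YES;
    if (![other isKindOfClass:[Counter class]]) return NO;
    return self.value == ((Counter *)other).value;
}

// State-based hash: equal objects get equal hashes,
// but the hash moves whenever the value changes.
- (NSUInteger)hash {
    return (NSUInteger)self.value;
}
@end

// Mutates `a` so that it equals `b`, and reports whether a's hash changed.
static BOOL HashChangedAfterMakingEqual(Counter *a, Counter *b) {
    NSUInteger before = [a hash];
    a.value = b.value;              // from here on, [a isEqual:b] is YES
    return [a hash] != before;
}
```

With `a` holding 1 and `b` holding 2, `HashChangedAfterMakingEqual(a, b)` returns YES: the equality contract is kept, and the stability requirement is the one that gives way.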
My reading of the documentation is that a mutable object's value for hash can (and probably should) change when it is mutated, but should not change when the object hasn't been mutated. The portion of the documentation to which to refer, therefore, is saying, "Don't mutate objects that are stored in a collection, because that will cause their hash value to change."
To quote directly from the NSObject documentation for hash:
If a mutable object is added to a
collection that uses hash values to
determine the object’s position in the
collection, the value returned by the
hash method of the object must not
change while the object is in the
collection. Therefore, either the hash
method must not rely on any of the
object’s internal state information or
you must make sure the object’s
internal state information does not
change while the object is in the
collection.
(Emphasis mine.)
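One way to see why a mutable object's hash has to follow its state: a mutable string must hash the same as any immutable string it is currently equal to, both before and after a mutation. The helper below is a hypothetical name used only for this sketch.

```objc
#import <Foundation/Foundation.h>

// Returns YES if `mutable` equals `expected` and the two report the same hash,
// which is what the -isEqual:/-hash contract requires of equal objects.
static BOOL HashMatchesEqualString(NSMutableString *mutable, NSString *expected) {
    return [mutable isEqual:expected] && [mutable hash] == [expected hash];
}
```

Calling it with a mutable string containing "abc" and the literal @"abc", then appending "def" and calling it again with @"abcdef", should return YES both times. A hash that stayed frozen across the append could not, in general, satisfy both calls.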
The question here isn't how to meet both of these requirements, but rather which one you should meet. In Apple's documentation, it is clearly stated that:
a mutable dictionary can be put in a hash table but you must not change it while it is in there.
That said, meeting the equality requirement seems more important. Equal objects must always return equal hashes (the converse need not hold: unequal objects may share a hash). If equal objects can ever return different hashes, the method is not a valid hash function.
Just to finish up my answer, I'll give an example of a good hash implementation. Let's say you are writing the implementation of -hash on a collection that you have created. This collection stores an array of NSObjects as pointers. Since all NSObjects implement the hash function, you can use their hashes in calculating the collection's hash:
- (NSUInteger)hash {
    NSUInteger theHash = 0;
    for (NSObject * aPtr in self) { // fast enumeration
        theHash ^= [aPtr hash];
    }
    return theHash;
}
This way, two collection objects containing the same pointers will have the same hash. Because XOR is commutative, the order of the elements does not affect the result.
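One property of the XOR approach is worth spelling out: it ignores element order. If order should influence the hash, a common alternative (an assumption on my part, not something the answer above proposes) is a multiply-and-add combination. Both function names below are hypothetical, and both variants still give equal hashes to collections with equal contents in the same order.

```objc
#import <Foundation/Foundation.h>

// Order-insensitive: XOR is commutative, so any permutation yields the same value.
static NSUInteger XORHash(NSArray *items) {
    NSUInteger h = 0;
    for (id item in items) {
        h ^= [item hash];
    }
    return h;
}

// Order-sensitive sketch: each step scales the running value before adding,
// so swapping two elements usually (not always) changes the result.
static NSUInteger OrderedHash(NSArray *items) {
    NSUInteger h = 17;
    for (id item in items) {
        h = h * 31 + [item hash];   // unsigned arithmetic wraps on overflow
    }
    return h;
}
```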
Since you are already overriding -isEqual: to do a value-based comparison, are you sure you really need to bother with -hash?
I can't guess what exactly you need this for, of course, but if you want value-based comparison without deviating from the expectation that -isEqual: returns YES only when the hashes are identical, a better approach might be to mimic NSString's -isEqualToString: and create your own -isEqualToFoo: method instead of using or overriding -isEqual:.
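A minimal sketch of that pattern, assuming a hypothetical class `Foo` with a single `name` property (both invented for illustration). The typed method compares by value, while -isEqual: and -hash keep the identity-based behavior inherited from NSObject.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class; `Foo` and `name` are placeholders for illustration.
@interface Foo : NSObject
@property (nonatomic, copy) NSString *name;
- (instancetype)initWithName:(NSString *)name;
- (BOOL)isEqualToFoo:(Foo *)other;
@end

@implementation Foo
- (instancetype)initWithName:(NSString *)name {
    if ((self = [super init])) {
        _name = [name copy];
    }
    return self;
}

// Typed, value-based comparison. -isEqual: and -hash are deliberately
// not overridden, so collections keep treating instances by identity.
- (BOOL)isEqualToFoo:(Foo *)other {
    if (other == nil) return NO;
    if (other == self) return YES;
    return [self.name isEqualToString:other.name];
}
@end
```

The trade-off is that hash-based collections will treat two instances with the same name as distinct objects, because they only consult -isEqual: and -hash.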
The answer to this question, and the key to avoiding many Cocoa bugs, is this:
Read the documentation carefully. Place every word and punctuation mark on a golden scale and weigh it as if it were the world's last grain of wheat.
Let's read the documentation again:
If a mutable object is added to a collection that uses hash values to determine the object’s position in the collection, [...]
(emphasis mine).
What the writer of the docs, in his or her eternal wisdom, means by this is that when you are implementing a collection, such as a dictionary, you shouldn't use the hash for positioning, since it can change. In other words, it has little to do with implementing -hash on mutable Cocoa objects (which is what all of us thought, assuming the documentation has not changed in the ~10 years since the question was asked).
That is why dictionaries always copy their keys - so they can guarantee that the hash value won't change.
You will then ask the question: But, good sir, how do NSMapTable and similar classes handle this?
The answer to this is according to the documentation:
"Its keys or values may be copied on input or may use pointer identity for equality and hashing."
(emphasis mine again).
Since we were so easily fooled by the documentation last time, let's run a little experiment to see for ourselves how stuff actually works:
NSMutableString *string = [NSMutableString stringWithString:@"so lets mutate this"];
NSString *originalString = string.copy; // immutable snapshot of the initial contents

NSMapTable *mutableStrings = [NSMapTable strongToStrongObjectsMapTable];
[mutableStrings setObject:originalString forKey:string];

// Mutate the key while it is stored in the map table.
[string appendString:@" into a larger string"];

// Look up by the mutated key, then by a string equal to the key's original contents.
if ([mutableStrings objectForKey:string] == nil)
    NSLog(@"not found!");
if ([mutableStrings objectForKey:originalString] == nil)
    NSLog(@"Not even the original string is found?");

// Enumerate the keys held by the table and try to look each one up again.
for (NSString *inCollection in mutableStrings)
{
    NSLog(@"key '%@' : is '%@' (null)", inCollection, [mutableStrings objectForKey:inCollection]);
}

// Enumerate the values directly, bypassing key lookup.
for (NSString *value in NSAllMapTableValues(mutableStrings))
{
    NSLog(@"value exists: %@", value);
}
Surprise!
So, instead of using pointer equality, they focus on the word "may" here, which in this case means "may not", and simply copy the hash value when adding stuff to the collection.
(All this is actually good, since it would be quite difficult to implement NSHashMap, or -hash, otherwise).
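For comparison, NSMapTable can be asked explicitly for pointer identity through its key options. The sketch below does not describe what strongToStrongObjectsMapTable does; it only illustrates the "pointer identity" alternative the documentation mentions. The function name is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Stores a value under a mutable key in a table that hashes and compares keys
// by pointer, mutates the key in place, then looks the value up again.
static id LookupAfterMutation(void) {
    NSMapTable *table =
        [NSMapTable mapTableWithKeyOptions:(NSPointerFunctionsStrongMemory |
                                            NSPointerFunctionsObjectPointerPersonality)
                              valueOptions:NSPointerFunctionsStrongMemory];
    NSMutableString *key = [NSMutableString stringWithString:@"key"];
    [table setObject:@"value" forKey:key];
    [key appendString:@" (mutated)"];   // contents change, the pointer does not
    return [table objectForKey:key];
}
```

Because the key's address never changes, the lookup should still succeed after the mutation. The price is that a different string object with identical contents will not find the entry.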
In Java, most mutable classes simply don’t override Object.hashCode(), so the default implementation returns a value that is based on the address of the object and doesn’t change. It might just be the same with Objective-C.
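On the Objective-C side, NSObject's default -isEqual: compares pointers and its default -hash is tied to object identity, so a class that overrides neither behaves the same way. A minimal sketch, with `Plain` as a hypothetical class made up for the example:

```objc
#import <Foundation/Foundation.h>

// Hypothetical mutable class that overrides neither -isEqual: nor -hash.
@interface Plain : NSObject
@property (nonatomic) NSInteger value;
@end

@implementation Plain
@end

// Returns YES if the hash is unchanged and the set still finds the object
// after its state was mutated while it was stored in the set.
static BOOL StillFoundAfterMutation(void) {
    Plain *p = [[Plain alloc] init];
    NSMutableSet *set = [NSMutableSet setWithObject:p];
    NSUInteger before = [p hash];
    p.value = 42;                       // mutate while in the collection
    return [p hash] == before && [set containsObject:p];
}
```

This keeps both contracts, but only because two distinct instances are never considered equal, whatever their contents.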