Hash value of NSDictionary

2019-03-15 14:53发布

问题:

I ran into an issue, where I got the same hash value for different dictionaries. Maybe I'm doing something obvious wrong, but I thought, objects with different content (a.k.a. not equal objects) should have different hash values.

NSDictionary *dictA = @{ @"foo" : @YES };
NSDictionary *dictB = @{ @"foo" : @NO };

BOOL equal = [dictA hash] == [dictB hash];

NSAssert(!equal, @"Assuming, that different dictionaries have different hash values.");

Any thoughts?

回答1:

There is no guarantee that two different objects will have different hash values.

In the latest open-source version of CoreFoundation, the hash of a CFDictionary (which is equivalent to an NSDictionary) is defined as:

static CFHashCode __CFDictionaryHash(CFTypeRef cf) {
    return __CFBasicHashHash((CFBasicHashRef)cf);
}

and __CFBasicHashHash is defined as:

__private_extern__ CFHashCode __CFBasicHashHash(CFTypeRef cf) {
    CFBasicHashRef ht = (CFBasicHashRef)cf;
    return CFBasicHashGetCount(ht);
}

which is simply the number of entries stored in the collection. In the other words, both [dictA hash] and [dictB hash]'s hash value are 1, the number of entries in the dictionaries.

While it is a very bad hash algorithm, CF didn't do anything wrong here. If you need to have a more accurate hash value, you can provide one yourself in an Obj-C category.



回答2:

With a dictionary with only integers, strings etc. I would use dict.description.hash as a quick code.



回答3:

A solution based on igor-kulagin's answer which is not order dependent:

@implementation NSDictionary (Extensions)

- (NSUInteger) hash
{
    NSUInteger prime = 31;
    NSUInteger result = 1;
    for (NSObject *key in [[self allKeys] sortedArrayUsingSelector:@selector(compare:)]) {
        result = prime * result + [key hash];
        result = prime * result + [self[key] hash];
    }
    return result;
}
@end

However, there is still a possibility of collision if the dictionary contains other dictionaries as values.



回答4:

The function 'hash' is not a real hash function. It gives different values for strings (all predictable) but for collections (arrays and dictionaries) it just returns the count. If you want a unique hash you can calculate it yourself using primes, or the functions srandom() and random() or explore a real hash function like SHA1 available in CommonCrypto/CommonDigest.h



回答5:

I created NSDictionary category and overridden hash method based on this answer: Best practices for overriding isEqual: and hash

@implementation NSDictionary (Extensions)

- (NSUInteger) hash {
    NSUInteger prime = 31;
    NSUInteger result = 1;

    NSArray *sortedKeys = [self.allKeys sortedArrayUsingSelector: @selector(compare:)];
    for (NSObject *key in sortedKeys) {
        result = prime * result + key.hash;
        id value = self[key];
        if ([value conformsToProtocol: @protocol(NSObject)] == YES) {
            result = prime * result + [value hash];
        }
    }

    return result;
}

@end

And Swift implementation.

extension Dictionary where Key: Comparable, Value: Hashable {
    public var hashValue: Int {
        let prime = 31
        var result = 1

        let sortedKeys = self.keys.sorted()
        for (key) in sortedKeys {
            let value = self[key]!
            result = Int.addWithOverflow(Int.multiplyWithOverflow(prime, result).0, key.hashValue).0
            result = Int.addWithOverflow(Int.multiplyWithOverflow(prime, result).0, value.hashValue).0
        }

        return result
    }
}

Perfectly this also requires to implement Equatable protocol for Dictionary so you can also add Hashable protocol conformance.