Why don't NSSet/NSMutableSet/NSCountedSet forc

Posted 2020-05-03 00:54

Question:

NSDictionary keys are id<NSCopying> but the value for a set is just id, and the docs indicate their values are retained. According to the Set Fundamentals of the Collection Programming Topics docs:

You can, however, modify individual objects themselves (if they support modification).

If you modify an object, this could change the object's hash value, which would affect lookups. I assumed that an NSSet uses the hash for fast lookups?

Here's an example that shows how things break if you mutate objects:

    NSMutableString *str = [NSMutableString stringWithString: @"AWESOME"];
    NSCountedSet *countedSet = [[NSCountedSet alloc] init];
    [countedSet addObject: str];
    [countedSet addObject: str];

    NSLog(@"%@", @([countedSet countForObject: @"AWESOME"]));

    [str appendString: @" NOT AWESOME"];
    NSLog(@"%@", @([countedSet countForObject: @"AWESOME NOT AWESOME"]));
    NSLog(@"%@", @([countedSet countForObject: @"AWESOME"]));
    NSLog(@"%@", @([countedSet countForObject: str]));

    // Ask the set for the count of each object it hands back during enumeration
    for (NSString *s in countedSet) {
        NSLog(@"%@ - %@", s, @([countedSet countForObject: s]));
    }

    NSSet *set = [NSSet setWithArray: @[ str ]];
    NSLog(@"Set Contains string, %@", @([set containsObject: str]));
    [str appendString: @"asdf"];
    NSLog(@"Set Contains string, %@", @([set containsObject: str]));
    NSLog(@"%@", set);

And output with my interpretation:

[64844:303] 2          // Count is 2
[64844:303] 0          // Count should be 2 - if it looks for the literal string
[64844:303] 0          // Count should be 0, but can't find original object either
[64844:303] 0          // Count should be 2 - asking for actual object that's in there
[64844:303] AWESOME NOT AWESOME - 0   // Should be 2 - asking for actual object that it just retrieved
[64844:303] Set Contains string, 1    // Correct, pre-mutation
[64844:303] Set Contains string, 0    // Should be true, object is in there
[65070:303] {(
    "AWESOME NOT AWESOMEasdf"   // see?  It's in there
)}

My take:

The set likely buckets objects by hash value. When an object's hash changes behind the set's back, the set still has the object filed under the old hash, so lookups break. The documentation is lacking in this area.
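
A minimal sketch of that hypothesis, assuming (as is typical for a hash table, though not something the NSSet documentation spells out) that the set files each object under the hash it had when it was added. The variable names are illustrative only:

    #import <Foundation/Foundation.h>

    int main(void) {
        @autoreleasepool {
            NSMutableString *str = [NSMutableString stringWithString:@"AWESOME"];
            // Hash the set would see at insertion time
            NSUInteger hashBefore = [str hash];

            [str appendString:@" NOT AWESOME"];
            // Hash used by any lookup made after the mutation
            NSUInteger hashAfter = [str hash];

            // NSString derives its hash from its characters, so the two values
            // will normally differ and a lookup would probe a different bucket.
            NSLog(@"before: %lu, after: %lu",
                  (unsigned long)hashBefore, (unsigned long)hashAfter);
        }
        return 0;
    }

If the two values differ, the object is still physically in the set, which is why enumeration and the description can show it while containsObject: and countForObject: cannot find it.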

My question restated: Docs say you can mutate objects, which is not intuitive. Mutating objects breaks sets. WTF?

Answer 1:

That line from the docs is confusing. However, note that three paragraphs down it goes on to say:

If mutable objects are stored in a set, either the hash method of the objects shouldn’t depend on the internal state of the mutable objects or the mutable objects shouldn’t be modified while they’re in the set. For example, a mutable dictionary can be put in a set, but you must not change it while it is in there. (Note that it can be difficult to know whether or not a given object is in a collection).

What your code demonstrates is a known property of the hash-based collection classes. It can affect dictionaries, too, if a key's class implements copying by returning the original object even though that object is mutable.
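
One way to stay within that rule when a change is unavoidable is to take the object out of the set, mutate it, and add it back. This is a sketch using the question's string, not something the quoted documentation prescribes:

    #import <Foundation/Foundation.h>

    int main(void) {
        @autoreleasepool {
            NSMutableString *str = [NSMutableString stringWithString:@"AWESOME"];
            NSMutableSet *set = [NSMutableSet setWithObject:str];

            // Remove the object while its hash still matches where it was filed...
            [set removeObject:str];
            [str appendString:@" NOT AWESOME"];
            // ...then add it back so it is filed under its new hash.
            [set addObject:str];

            NSLog(@"Set contains string, %@", @([set containsObject:str]));
        }
        return 0;
    }

This only works with a mutable set, and only if you know every set the object currently belongs to, which is exactly the difficulty the quoted note points out.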

There's no real way to test whether an object is mutable, so the collection can't enforce immutability.

Also, as alluded to in the quote above, it's possible to make a mutable class whose hash and equality are not affected by mutations.
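
As a rough illustration of such a class, here is a sketch in which hash and isEqual: depend only on an identifier fixed at creation. Account, identifier, and name are hypothetical names invented for this example:

    #import <Foundation/Foundation.h>

    // Hypothetical class: identity comes from an immutable identifier,
    // while the display name stays freely mutable.
    @interface Account : NSObject
    @property (nonatomic, copy, readonly) NSString *identifier;
    @property (nonatomic, copy) NSString *name;
    - (instancetype)initWithIdentifier:(NSString *)identifier;
    @end

    @implementation Account
    - (instancetype)initWithIdentifier:(NSString *)identifier {
        if ((self = [super init])) {
            _identifier = [identifier copy];
        }
        return self;
    }

    // Equality and hash ignore the mutable name entirely.
    - (BOOL)isEqual:(id)other {
        if (other == self) return YES;
        if (![other isKindOfClass:[Account class]]) return NO;
        return [self.identifier isEqualToString:((Account *)other).identifier];
    }

    - (NSUInteger)hash {
        return [self.identifier hash];
    }
    @end

    int main(void) {
        @autoreleasepool {
            Account *account = [[Account alloc] initWithIdentifier:@"42"];
            NSSet *set = [NSSet setWithObject:account];

            // Mutating the name does not touch the hash, so lookups keep working.
            account.name = @"Renamed";
            NSLog(@"Set contains account, %@", @([set containsObject:account]));
        }
        return 0;
    }

Because the identifier never changes after init, instances of a class like this can be modified while they sit in a set.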

Finally, it would too severely limit the utility of those collection classes if they could only be used with copyable classes and made copies of the elements (like dictionaries make copies of their keys). The collections are used to represent relationships, among other things, and it wouldn't do if you tried to establish a relationship between objects but instead established a relationship to a separate copy.



Answer 2:

Since the only reliable way of ensuring an object's immutability in Objective-C is to make a copy, Cocoa designers had two choices:

  • Make NSSet copy the objects - That would be safe, but it would severely restrict the use of NSSet due to increased memory usage.
  • Use retained objects - That would keep memory usage to a bare minimum, but it would give users a way to shoot themselves in the foot by mutating an object inside an NSSet.

The designers picked the second approach over the first because the only danger it introduces is one that can be avoided by proper coding technique. In contrast, the first approach would be "binding" on everybody, in the sense that inserting a new object would always make a copy.

Currently, users have a choice of inserting copies of objects that they create manually, thus emulating the first approach. However, an implementation that forces a copy cannot emulate an implementation that retains objects, making it a less flexible choice.
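
A short sketch of emulating the first approach by hand, reusing the string from the question. The set stores an immutable snapshot made with copy, so later changes to the original cannot reach it:

    #import <Foundation/Foundation.h>

    int main(void) {
        @autoreleasepool {
            NSMutableString *str = [NSMutableString stringWithString:@"AWESOME"];
            NSMutableSet *set = [NSMutableSet set];

            // -copy on an NSMutableString returns an immutable NSString,
            // so the set holds a snapshot rather than the mutable original.
            [set addObject:[str copy]];

            // Mutating the original no longer affects what the set holds.
            [str appendString:@" NOT AWESOME"];

            NSLog(@"Set contains snapshot, %@", @([set containsObject:@"AWESOME"]));
        }
        return 0;
    }

The cost is the one described above: the set now refers to a separate object, not to the one the caller keeps mutating.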