Why are immutable objects in hashmaps so effective

2019-01-17 00:04发布

So I read about HashMap. At one point it was noted:

"Immutability also allows caching the hashcode of different keys which makes the overall retrieval process very fast and suggest that String and various wrapper classes (e.g., Integer) provided by Java Collection API are very good HashMap keys."

I don't quite understand... why?

6条回答
爱情/是我丢掉的垃圾
2楼-- · 2019-01-17 00:12

Think of the hashmap as a big array of numbered boxes. The number is the hashcode, and the boxes are ordered by number.

Now if the object can't change, the hash function will always reproduce the same value. Therefore the object will always stay in it's box.

Now suppose a changeable object. It is changed after adding it to the hash, so now it is sitting in the wrong box, like a Mrs. Jones which happened to marry Mister Doe, and which is now named Doe too, but in many registers still named Jones.

查看更多
ゆ 、 Hurt°
3楼-- · 2019-01-17 00:13

Basically immutability is achieved in Java by making the class not extendable and all the operations in the object will ideally not change the state of the object. If you see the operations of String like replace(), it does not change the state of the current object with which you are manipulating rather it gives you a new String object with the replaced string. So ideally if you maintain such objects as keys the state doesn't change and hence the hash code also remains unchanged. So caching the hash code will be performance effective during retrievals.

查看更多
我欲成王,谁敢阻挡
4楼-- · 2019-01-17 00:15

Quoting the linked blog entry:

final object with proper equals () and hashcode () implementation would act as perfect Java HashMap keys and improve performance of Java hashMap by reducing collision.

I fail to see how both final and equals() have anything to do with hash collisions. This sentence raises my suspicion about the credibility of the article. It seems to be a collection of dogmatic Java "wisdoms".

Immutability also allows caching there hashcode of different keys which makes overall retrieval process very fast and suggest that String and various wrapper classes e.g Integer provided by Java Collection API are very good HashMap keys.

I see two possible interpretations of this sentence, both of which are wrong:

  • HashMap caches hash codes of immutable objects. This is not correct. The map doesn't have the possibility to find out if an object is "immutable".
  • Immutability is required for an object to cache its own hash code. Ideally, an object's hash value should always just rely on non-mutating state of the object, otherwise the object couldn't be sensibly used as a key. So in this case, too, the author fails to make a point: If we assume that our object is not changing its state, we also don't have to recompute the hash value every time, even if our object is mutable!

Example

So if we are really crazy and actually decide to use a List as a key for a HashMap and make the hash value dependent on the contents, rather than the identity of the list, we could just decide to invalidate the cached hash value on every modification, thus limiting the number of hash computations to the number of modifications to the list.

查看更多
相关推荐>>
5楼-- · 2019-01-17 00:19

Hash tables will only work if the hash code of an object can never change while it is stored in the table. This implies that the hash code cannot take into account any aspect of the object which could change while it's in the table. If the most interesting aspects of an object are mutable, that implies that either:

  • The hash code will have to ignore most of the interesting aspects of the object, thus causing many hash collisions, or...

  • The code which owns the hash table will have to ensure that the objects therein are not exposed to anything that might change them while they are stored in the hash table.

If Java hash tables allowed clients to supply an EqualityComparer (the way .NET dictionaries do), code which knows that certain aspects of the objects in a hash table won't unexpectedly change could use a hash code which took those aspects into account, but the only way to accomplish that in Java would be to wrap each item stored in the hashcode in a wrapper. Such wrapping may not be the most evil thing in the world, however, since the wrapper would be able to cache hash values in a way which an EqualityComparer could not, and could also cache further equality-related information [e.g. if the things being stored were nested collections, it might be worthwhile to compute multiple hash codes, and confirm that all hash codes match before doing any detailed inspection of the elements].

查看更多
可以哭但决不认输i
6楼-- · 2019-01-17 00:32

It's very simple. Since an immutable object doesn't change over time, it only needs to perform the calculation of the hash code once. Calculating it again will yield the same value. Therefore it is common to calculate the hash code in the constructor (or lazily) and store it in a field. The hashcode function then returns just the value of the field, which is indeed very fast.

查看更多
Viruses.
7楼-- · 2019-01-17 00:35

String#hashCode:

private int hash;

...

public int hashCode() {
    int h = hash;
    if (h == 0 && count > 0) {
        int off = offset;
        char val[] = value;
        int len = count;

        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

Since the contents of a String never change, the makers of the class chose to cache the hash after it had been calculated once. This way, time is not wasted recalculating the same value.

查看更多
登录 后发表回答