Double checked locking with regular HashMap

2019-03-27 23:56发布

问题:

Back to concurrency. By now it is clear that for the double checked locking to work the variable needs to be declared as volatile. But then what if double checked locking is used as below.

class Test<A, B> {

    private final Map<A, B> map = new HashMap<>();

    public B fetch(A key, Function<A, B> loader) {
        B value = map.get(key);
        if (value == null) {
            synchronized (this) {
                value = map.get(key);
                if (value == null) {
                    value = loader.apply(key);
                    map.put(key, value);
                }
            }
        }
        return value;
    }

}

Why does it really have to be a ConcurrentHashMap and not a regular HashMap? All map modification is done within the synchronized block and the code doesn't use iterators so technically there should be no "concurrent modification" problems.

Please avoid suggesting the use of putIfAbsent/computeIfAbsent as I am asking about the concept and not the use of API :) unless using this API contributes to HashMap vs ConcurrentHashMap subject.

Update 2016-12-30

This question was answered by a comment below by Holger "HashMap.get doesn’t modify the structure, but your invocation of put does. Since there is an invocation of get outside of the synchronized block, it can see an incomplete state of a put operation happening concurrently." Thanks!

回答1:

This question is muddled on so many counts that its hard to answer.

If this code is only ever called from a single thread, then you're making it too complicated; you don't need any synchronization. But clearly that's not your intention.

So, multiple threads will call the fetch method, which delegates to HashMap.get() without any synchronization. HashMap is not thread-safe. Bam, end of story. Doesn't even matter if you're trying to simulate double-checked locking; the reality is that calling get() and put() on a map will manipulate the internal mutable data structures of the HashMap, without consistent synchronization on all code paths, and since you can be calling these concurrently from multiple threads, you're already dead.

(Also, you probably think that HashMap.get() is a pure read operation, but that's wrong too. What if the HashMap is actually a LinkedHashMap (which is a subclass of HashMap.) LinkedHashMap.get() will update the access order, which involves writing to internal data structures -- here, concurrently without synchronization. But even if get() is doing no writing, your code here is still broken.)

Rule of thumb: when you think you have a clever trick that lets you avoid synchronizing, you're almost certainly wrong.