I have a piece of code that can be executed by multiple threads that needs to perform an I/O-bound operation in order to initialize a shared resource that is stored in a ConcurrentMap
. I need to make this code thread safe and avoid unnecessary calls to initialize the shared resource. Here's the buggy code:
private ConcurrentMap<String, Resource> map;
// .....
String key = "somekey";
Resource resource;
if (map.containsKey(key)) {
resource = map.get(key);
} else {
resource = getResource(key); // I/O-bound, expensive operation
map.put(key, resource);
}
With the above code, multiple threads may check the ConcurrentMap
and see that the resource isn't there, and all attempt to call getResource()
which is expensive. In order to ensure only a single initialization of the shared resource and to make the code efficient once the resource has been initialized, I want to do something like this:
String key = "somekey";
Resource resource;
if (!map.containsKey(key)) {
synchronized (map) {
if (!map.containsKey(key)) {
resource = getResource(key);
map.put(key, resource);
}
}
}
Is this a safe version of double checked locking? It seems to me that since the checks are called on ConcurrentMap
, it behaves like a shared resource that is declared to be volatile
and thus prevents any of the "partial initialization" problems that may happen.
Why not use the putIfAbsent() method on ConcurrentMap?
Conceivably you might call getResource() more than once, but it won't happen a bunch of times. Simpler code is less likely to bite you.
No need for that - ConcurrentMap supports this as with its special atomic putIfAbsent method.
Don't reinvent the wheel: Always use the API where possible.
If you can use external libraries, take a look at Guava's MapMaker.makeComputingMap(). It's tailor-made for what you're trying to do.
In general, double-checked locking is safe if the variable you're synchronizing on is marked volatile. But you're better off synchronizing the entire function:
The performance hit will be tiny, and you'll be certain that there will be no sync problems.
Edit:
This is actually faster than the alternatives, because you won't have to do two calls to the map in most cases. The only extra operation is the null check, and the cost of that is close to zero.
Second edit:
Also, you don't have to use ConcurrentMap. A regular HashMap will do it. Faster still.
yes it' safe.
If
map.containsKey(key)
is true, according to doc,map.put(key, resource)
happens before it. ThereforegetResource(key)
happens beforeresource = map.get(key)
, everything is safe and sound.The verdict is in. I timed 3 different solutions in nanosecond accuracy, since after all the initial question was about performance:
Fully synching the function on a regular HashMap:
first invocation: 15,000 nanoseconds, subsequent invocations: 700 nanoseconds
Using the double check lock with a ConcurrentHashMap:
first invocation: 15,000 nanoseconds, subsequent invocations: 1500 nanoseconds
A different flavor of double checked ConcurrentHashMap:
first invocation: 15,000 nanoseconds, subsequent invocations: 1000 nanoseconds
You can see that the biggest cost was on the first invocation, but was similar for all 3. Subsequent invocations were the fastest on the regular HashMap with method sync like user237815 suggested but only by 300 NANO seocnds. And after all we are talking about NANO seconds here which means a BILLIONTH of a second.