I have a Map which is to be modified by several threads concurrently.
There seem to be three different synchronized Map implementations in the Java API:
Hashtable
Collections.synchronizedMap(Map)
ConcurrentHashMap
From what I understand, Hashtable is an old implementation (extending the obsolete Dictionary class), which was later adapted to fit the Map interface. While it is synchronized, it seems to have serious scalability issues and is discouraged for new projects.
But what about the other two? What are the differences between Maps returned by Collections.synchronizedMap(Map) and ConcurrentHashMaps? Which one fits which situation?
Besides what has been suggested, it is worth looking at how SynchronizedMap is implemented.

To make a Map thread safe, we can call Collections.synchronizedMap and pass the map instance as the parameter. The implementation of synchronizedMap in Collections simply wraps the input Map object in a SynchronizedMap object.

What SynchronizedMap does can be summarized as adding a single lock around each primary method of the input Map object. No method guarded by that lock can be executed by multiple threads at the same time, so normal operations like put and get can be executed by only one thread at a time, for all data in the Map object.

This makes the Map object thread safe, but performance may become an issue in some scenarios.

ConcurrentHashMap is far more complicated in its implementation; see "Building a better HashMap" for details. In a nutshell, it is implemented with both thread safety and performance in mind.

The "scalability issues" of Hashtable are present in exactly the same way in Collections.synchronizedMap(Map): both use very simple synchronization, which means that only one thread can access the map at a time.

This is not much of an issue when you have simple inserts and lookups (unless you do them extremely intensively), but it becomes a big problem when you need to iterate over the entire Map, which can take a long time for a large Map: while one thread does that (holding the map's lock for the whole traversal, as safe iteration requires), all others have to wait if they want to insert or look up anything.
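To make the iteration point concrete, here is a minimal sketch using Collections.synchronizedMap. The class name SynchronizedMapIteration, the helper sumValues, and the key names are made up for illustration. Individual put calls take the map-wide lock on their own, but a traversal has to be wrapped in a synchronized block on the returned map by the caller, and every other thread is blocked for as long as that block runs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapIteration {

    // Hypothetical helper: sums all values while holding the wrapper's lock,
    // so no other thread can modify the map in the middle of the traversal.
    static int sumValues(Map<String, Integer> syncMap) {
        int sum = 0;
        synchronized (syncMap) { // the caller must lock manually when iterating
            for (int value : syncMap.values()) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Integer> map = Collections.synchronizedMap(new HashMap<>());

        // Two writers; every put() acquires the single map-wide lock.
        Thread a = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                map.put("a" + i, 1);
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                map.put("b" + i, 1);
            }
        });
        a.start();
        b.start();
        a.join();
        b.join();

        System.out.println(map.size());     // 2000 distinct keys
        System.out.println(sumValues(map)); // 2000
    }
}
```

With a large map, the synchronized block in sumValues is exactly the long pause described above: writers such as the two threads here would stall until the traversal finishes.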
ConcurrentHashMap uses very sophisticated techniques to reduce the need for synchronization and to allow parallel read access by multiple threads without synchronization. More importantly, it provides an Iterator that requires no synchronization and even allows the Map to be modified during iteration (though it makes no guarantee about whether elements inserted during iteration will be returned).

The main difference between these two is that ConcurrentHashMap locks only the portion of the data that is being updated, while the other portions can still be accessed by other threads. Collections.synchronizedMap(), however, locks all the data while updating; other threads can access the data only once the lock is released. If there are many update operations and a relatively small number of read operations, you should choose ConcurrentHashMap.

Another difference is that ConcurrentHashMap does not preserve the order of the elements of the Map passed in. It is similar to HashMap in how it stores data, so there is no guarantee that element order is preserved, whereas Collections.synchronizedMap() preserves the element order of the Map passed in. For example, if you copy a TreeMap into a ConcurrentHashMap, the element order in the ConcurrentHashMap may not be the same as in the TreeMap, but Collections.synchronizedMap() will preserve the order.

Furthermore, ConcurrentHashMap guarantees that no ConcurrentModificationException is thrown while one thread is updating the map and another thread is traversing an iterator obtained from the map. Collections.synchronizedMap() gives no such guarantee.

There is one post which demonstrates the differences between these two and also ConcurrentSkipListMap.

Regarding the locking mechanism: Hashtable locks the whole object, while ConcurrentHashMap locks only the bucket.

In general, if you want to use ConcurrentHashMap, make sure you are ready to miss "updates" (i.e. printing the contents of the map does not ensure that it prints the up-to-date Map), and use APIs like CyclicBarrier to ensure consistency across your program's lifecycle.

In ConcurrentHashMap, the lock is applied to a segment instead of the entire Map. Each segment manages its own internal hash table. The lock is applied only for update operations. (This describes the implementation up to Java 7; since Java 8, updates lock individual buckets instead.) Collections.synchronizedMap(Map) synchronizes the entire map.
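Two of the differences discussed above, iterator behaviour and element order, can be checked with a small single-threaded sketch. The class name IterationAndOrderDemo, the helper failsWhenModifiedDuringIteration, and the sample keys are made up for illustration; the sketch only assumes the documented behaviour that HashMap iterators are fail-fast while ConcurrentHashMap iterators never throw ConcurrentModificationException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

public class IterationAndOrderDemo {

    // Hypothetical helper: adds a key while iterating over the key set and
    // reports whether the iterator reacted with ConcurrentModificationException.
    static boolean failsWhenModifiedDuringIteration(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // Structural change on the first pass; later passes only
                // overwrite the value of the same key.
                map.put("added-during-iteration", 0);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The synchronized wrapper hands out the backing HashMap's fail-fast iterator.
        System.out.println(failsWhenModifiedDuringIteration(
                Collections.synchronizedMap(new HashMap<String, Integer>()))); // true

        // ConcurrentHashMap iterators tolerate modification during traversal.
        System.out.println(failsWhenModifiedDuringIteration(
                new ConcurrentHashMap<String, Integer>())); // false

        Map<String, Integer> sorted = new TreeMap<>();
        sorted.put("pear", 1);
        sorted.put("apple", 2);
        sorted.put("mango", 3);

        // The wrapper delegates to the TreeMap, so its sorted order is kept.
        Map<String, Integer> wrapped = Collections.synchronizedMap(sorted);
        System.out.println(new ArrayList<>(wrapped.keySet())); // [apple, mango, pear]

        // Copying into a ConcurrentHashMap re-hashes the entries; the
        // resulting iteration order is unspecified.
        Map<String, Integer> copied = new ConcurrentHashMap<>(sorted);
        System.out.println(new ArrayList<>(copied.keySet()));
    }
}
```

If sorted order is needed together with concurrent access, ConcurrentSkipListMap, mentioned above, is the variant that keeps keys ordered.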