What's the difference between ConcurrentHashMa

Posted 2018-12-31 12:29

大哥的爱人

Question:

I have a Map which is to be modified by several threads concurrently.

There seem to be three different synchronized Map implementations in the Java API:

Hashtable
Collections.synchronizedMap(Map)
ConcurrentHashMap

From what I understand, Hashtable is an old implementation (extending the obsolete Dictionary class), which has been adapted later to fit the Map interface. While it is synchronized, it seems to have serious scalability issues and is discouraged for new projects.

But what about the other two? What are the differences between Maps returned by Collections.synchronizedMap(Map) and ConcurrentHashMaps? Which one fits which situation?

Answer 1:

For your needs, use ConcurrentHashMap. It allows several threads to modify the Map concurrently without blocking one another. Collections.synchronizedMap(map) creates a blocking Map, which degrades performance but ensures consistency (if used properly).

Use the second option if you need to ensure data consistency and each thread needs an up-to-date view of the map. Use the first if performance is critical and each thread only inserts data into the map, with reads happening less frequently.
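A minimal sketch of the first recommendation, assuming a shared hit counter that several threads bump at once. The class name SharedCounterDemo, the method countHits and the key "page" are made up for illustration; merge is the standard Map method, which ConcurrentHashMap performs atomically.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SharedCounterDemo {
    // Starts `threads` workers that each increment the same key `perThread` times.
    static int countHits(int threads, int perThread) {
        Map<String, Integer> hits = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // Atomic read-modify-write; no external lock is needed.
                    hits.merge("page", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return hits.get("page");
    }

    public static void main(String[] args) {
        // 4 threads x 10,000 increments; no update is lost.
        System.out.println(countHits(4, 10_000)); // 40000
    }
}
```

Replacing merge with a separate get followed by put would reintroduce a race, even on a ConcurrentHashMap.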

Answer 2:

╔═══════════════╦═══════════════════╦═══════════════════╦═════════════════════╗
║   Property    ║     HashMap       ║    Hashtable      ║  ConcurrentHashMap  ║
╠═══════════════╬═══════════════════╬═══════════════════╩═════════════════════╣ 
║      Null     ║     allowed       ║              not allowed                ║
║  values/keys  ║                   ║                                         ║
╠═══════════════╬═══════════════════╬═════════════════════════════════════════╣
║Is thread-safe ║       no          ║                  yes                    ║
╠═══════════════╬═══════════════════╬═══════════════════╦═════════════════════╣
║     Lock      ║       not         ║ locks the whole   ║ locks a portion     ║
║  mechanism    ║    applicable     ║       map         ║ of the map          ║
╠═══════════════╬═══════════════════╩═══════════════════╬═════════════════════╣
║   Iterator    ║               fail-fast               ║ weakly consistent   ║ 
╚═══════════════╩═══════════════════════════════════════╩═════════════════════╝

Regarding locking mechanism: Hashtable locks the object, while ConcurrentHashMap locks only the bucket.

Answer 3:

The "scalability issues" for Hashtable are present in exactly the same way in Collections.synchronizedMap(Map): both use very simple synchronization, which means that only one thread can access the map at a time.

This is not much of an issue when you have simple inserts and lookups (unless you do them extremely intensively), but it becomes a big problem when you need to iterate over the entire Map, which can take a long time for a large Map. While one thread does that, all others have to wait if they want to insert or look up anything.

ConcurrentHashMap uses very sophisticated techniques to reduce the need for synchronization and to allow parallel read access by multiple threads without synchronization. More importantly, it provides an Iterator that requires no synchronization and even allows the Map to be modified during iteration (though it makes no guarantee about whether elements inserted during iteration will be returned).
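The iteration difference described above can be sketched as follows. The class and method names (IterationDemo, sumLocked, drainNegatives) are hypothetical; the synchronized block around iteration is the pattern the Collections.synchronizedMap documentation asks callers to follow.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class IterationDemo {
    // Synchronized wrapper: the caller must hold the map's lock while iterating,
    // which blocks every other reader and writer until the loop finishes.
    static int sumLocked(Map<String, Integer> syncMap) {
        int sum = 0;
        synchronized (syncMap) {
            for (int value : syncMap.values()) {
                sum += value;
            }
        }
        return sum;
    }

    // ConcurrentHashMap: no lock is taken, and the map may be modified
    // (here: by the iterating thread itself) while the iteration is running.
    static int drainNegatives(ConcurrentHashMap<String, Integer> map) {
        int removed = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            if (entry.getValue() < 0 && map.remove(entry.getKey(), entry.getValue())) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, Integer> syncMap = Collections.synchronizedMap(new HashMap<>());
        syncMap.put("a", 1);
        syncMap.put("b", 2);
        System.out.println(sumLocked(syncMap));

        ConcurrentHashMap<String, Integer> chm = new ConcurrentHashMap<>();
        chm.put("a", 1);
        chm.put("b", -2);
        chm.put("c", -3);
        chm.put("d", 4);
        System.out.println(drainNegatives(chm) + " removed, " + chm.size() + " left");
    }
}
```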

Answer 4:

ConcurrentHashMap is preferred when you can use it - though it requires at least Java 5.

It is designed to scale well when used by multiple threads. Performance may be marginally poorer when only a single thread accesses the Map at a time, but significantly better when multiple threads access the map concurrently.

I found a blog entry that reproduces a table from the excellent book Java Concurrency In Practice, which I thoroughly recommend.

Collections.synchronizedMap really only makes sense if you need to wrap a map with some other characteristics, perhaps some sort of ordered map, like a TreeMap.
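As a rough illustration of that last case, here is a sketch that wraps a TreeMap. It uses Collections.synchronizedSortedMap, the sorted-map sibling of synchronizedMap, so that firstKey stays available; the class name SortedWrapperDemo and the sample data are invented.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

final class SortedWrapperDemo {
    // Builds a thread-safe map that still iterates in sorted key order.
    static SortedMap<String, Integer> newScores() {
        SortedMap<String, Integer> scores =
                Collections.synchronizedSortedMap(new TreeMap<String, Integer>());
        scores.put("carol", 70);
        scores.put("alice", 90);
        scores.put("bob", 80);
        return scores;
    }

    public static void main(String[] args) {
        SortedMap<String, Integer> scores = newScores();
        System.out.println(scores.firstKey()); // alice
        System.out.println(scores);            // {alice=90, bob=80, carol=70}
    }
}
```

The wrapper only adds locking; ordering still comes entirely from the TreeMap underneath.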

Answer 5:

The main difference between the two is that ConcurrentHashMap locks only the portion of the data being updated, while other portions remain accessible to other threads. Collections.synchronizedMap(), however, locks all the data during an update; other threads can access the data only once the lock is released. If there are many update operations and a relatively small number of read operations, you should choose ConcurrentHashMap.

Another difference is that ConcurrentHashMap does not preserve the order of the elements in the Map passed in. Like HashMap, it makes no guarantee about element order. Collections.synchronizedMap(), by contrast, preserves the element order of the Map passed in. For example, if you pass a TreeMap to ConcurrentHashMap, the order of the elements in the ConcurrentHashMap may differ from the order in the TreeMap, but Collections.synchronizedMap() preserves it.

Furthermore, ConcurrentHashMap guarantees that no ConcurrentModificationException is thrown while one thread is updating the map and another is traversing an iterator obtained from it. Collections.synchronizedMap() gives no such guarantee.

There is one post which demonstrates the differences between these two, and also ConcurrentSkipListMap.
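The ordering point can be checked with a small sketch like the one below (this is not the post referred to above; OrderingDemo, sample and keys are hypothetical names). It builds the three maps from the same TreeMap and lists their keys in iteration order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

final class OrderingDemo {
    // A small sorted source map.
    static Map<String, Integer> sample() {
        Map<String, Integer> sorted = new TreeMap<>();
        sorted.put("pear", 3);
        sorted.put("apple", 1);
        sorted.put("fig", 2);
        return sorted;
    }

    // Keys in the map's own iteration order.
    static List<String> keys(Map<String, Integer> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> source = sample();
        // Wrapper around the same TreeMap: sorted order is kept.
        System.out.println(keys(Collections.synchronizedMap(source)));
        // Copy into a concurrent skip list: also sorted by key.
        System.out.println(keys(new ConcurrentSkipListMap<String, Integer>(source)));
        // Copy into a concurrent hash map: same keys, order unspecified.
        System.out.println(keys(new ConcurrentHashMap<String, Integer>(source)));
    }
}
```

If both sorted order and lock-free concurrency are needed, ConcurrentSkipListMap is the usual candidate.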

Answer 6:

In ConcurrentHashMap (as implemented up to Java 7), the lock is applied to a segment instead of the entire Map. Each segment manages its own internal hash table. The lock is applied only for update operations. Since Java 8 the segments are gone and updates lock individual buckets instead. Collections.synchronizedMap(Map) synchronizes the entire map.

Answer 7:

Hashtable and ConcurrentHashMap do not allow null keys or null values.
Collections.synchronizedMap(Map) synchronizes all operations (get, put, size, etc).
ConcurrentHashMap supports full concurrency of retrievals, and adjustable expected concurrency for updates.

As usual, there are concurrency--overhead--speed tradeoffs involved. You really need to consider the detailed concurrency requirements of your application to make a decision, and then test your code to see if it's good enough.
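The null-handling point in the list above is easy to verify. A small sketch, with NullKeyDemo and acceptsNullKey as made-up names:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class NullKeyDemo {
    // Returns true if the map accepts a null key, false if it throws NullPointerException.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new Hashtable<String, String>()));         // false
        System.out.println(acceptsNullKey(new ConcurrentHashMap<String, String>())); // false
        // The wrapper inherits whatever the backing map allows; HashMap allows null.
        System.out.println(acceptsNullKey(
                Collections.synchronizedMap(new HashMap<String, String>())));        // true
    }
}
```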

Answer 8:

You are right about Hashtable; you can forget about it.

Your article mentions the fact that while Hashtable and the synchronized wrapper class provide basic thread-safety by only allowing one thread at a time to access the map, this is not 'true' thread-safety, since many compound operations still require additional synchronization, for example:

synchronized (records) {
  Record rec = records.get(id);
  if (rec == null) {
      rec = new Record(id);
      records.put(id, rec);
  }
  return rec;
}

However, don't think that ConcurrentHashMap is a simple alternative for a HashMap with a typical synchronized block as shown above. Read this article to understand its intricacies better.
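One way to express the same get-or-create step on a ConcurrentHashMap, sketched under the assumption of Java 8 or later, is computeIfAbsent, which the class performs atomically. The Record class here is a stand-in with a single id field, and RecordCacheDemo is a hypothetical wrapper; neither comes from the answer.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class RecordCacheDemo {
    // Stand-in for the Record type used in the snippet above.
    static final class Record {
        final int id;

        Record(int id) {
            this.id = id;
        }
    }

    private final ConcurrentMap<Integer, Record> records = new ConcurrentHashMap<>();

    // Atomic get-or-create: the constructor runs at most once per absent key.
    Record getOrCreate(int id) {
        return records.computeIfAbsent(id, Record::new);
    }

    public static void main(String[] args) {
        RecordCacheDemo cache = new RecordCacheDemo();
        Record first = cache.getOrCreate(7);
        System.out.println(first == cache.getOrCreate(7)); // true: same instance
    }
}
```

Compound operations that span several keys still need their own coordination; computeIfAbsent covers only the single-key case.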

Answer 9:

Here are a few:

1) ConcurrentHashMap locks only a portion of the Map, but SynchronizedMap locks the whole Map.
2) ConcurrentHashMap performs better than SynchronizedMap and is more scalable.
3) With multiple readers and a single writer, ConcurrentHashMap is the best choice.

This text is from Difference between ConcurrentHashMap and hashtable in Java

Answer 10:

We can achieve thread safety by using ConcurrentHashMap, a synchronized HashMap, or Hashtable, but their architectures differ a lot.

Synchronized HashMap and Hashtable

Both maintain the lock at the object level. To perform any operation such as put or get, a thread must first acquire that lock, and meanwhile no other thread can perform any operation. Only one thread can operate on the map at a time, so waiting time increases and performance is relatively low compared with ConcurrentHashMap.

ConcurrentHashMap

It maintains the lock at the segment level. In the implementation used up to Java 7, it has 16 segments and a default concurrency level of 16, so up to 16 threads can update a ConcurrentHashMap at the same time. Moreover, read operations don't require a lock, so any number of threads can perform a get on it.

If thread1 wants to perform a put in segment 2 and thread2 wants to perform a put in segment 4, both are allowed to proceed. This means 16 threads can perform update (put/delete) operations on a ConcurrentHashMap at a time, provided they hit different segments. (Java 8 and later lock individual buckets rather than segments.)

So the waiting time is lower, and performance is relatively better than with a synchronized HashMap or Hashtable.

Answer 11:

ConcurrentHashMap

You should use ConcurrentHashMap when you need very high concurrency in your project.
It is thread-safe without synchronizing the whole map.
Reads are very fast, while writes are done with a lock.
There is no locking at the object level.
The locking is at a much finer granularity, at the hash map bucket level.
ConcurrentHashMap doesn’t throw a ConcurrentModificationException if one thread tries to modify it while another is iterating over it.
ConcurrentHashMap uses a multitude of locks.

SynchronizedHashMap

Synchronization is at the object level.
Every read/write operation needs to acquire the lock.
Locking the entire collection is a performance overhead.
This essentially gives only one thread access to the entire map and blocks all the other threads.
It may cause contention.
SynchronizedHashMap returns an Iterator, which fails fast on concurrent modification.

source

Answer 12:

Synchronized Map:

A synchronized Map is not very different from Hashtable and provides similar performance in concurrent Java programs. The only difference between Hashtable and SynchronizedMap is that SynchronizedMap is not a legacy class, and you can wrap any Map to create its synchronized version by using the Collections.synchronizedMap() method.

ConcurrentHashMap:

The ConcurrentHashMap class provides a concurrent version of the standard HashMap. This is an improvement on the synchronizedMap functionality provided in the Collections class.

Unlike Hashtable and a synchronized Map, it never locks the whole Map; instead it divides the map into segments and locks those. It performs better if the number of reader threads is greater than the number of writer threads.

By default, ConcurrentHashMap (as implemented up to Java 7) is separated into 16 regions, each guarded by its own lock. This default number can be changed when initializing a ConcurrentHashMap instance. When setting data in a particular segment, the lock for that segment is obtained. This means that two updates can still execute simultaneously and safely if they affect separate segments, thus minimizing lock contention and maximizing performance.
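Setting that number at construction time could look like the sketch below. The three-argument constructor (initialCapacity, loadFactor, concurrencyLevel) is part of the class; SizingDemo, newMap and the parameter names are invented. Note that, according to the class documentation, Java 8 and later treat concurrencyLevel only as a sizing hint.

```java
import java.util.concurrent.ConcurrentHashMap;

final class SizingDemo {
    // expectedWriters is passed as concurrencyLevel: the estimated number of
    // threads that will update the map at the same time.
    static ConcurrentHashMap<String, Integer> newMap(int expectedEntries, int expectedWriters) {
        return new ConcurrentHashMap<>(expectedEntries, 0.75f, expectedWriters);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = newMap(1_000, 32);
        map.put("k", 1);
        System.out.println(map);
    }
}
```

A non-positive concurrencyLevel is rejected with an IllegalArgumentException.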

ConcurrentHashMap doesn’t throw a ConcurrentModificationException

ConcurrentHashMap doesn’t throw a ConcurrentModificationException if one thread tries to modify it while another is iterating over it

Difference between synchronizedMap and ConcurrentHashMap

Collections.synchronizedMap(HashMap) returns a collection that is almost equivalent to Hashtable, where every operation on the Map locks the whole Map object. In ConcurrentHashMap, thread safety is achieved by dividing the whole Map into partitions based on the concurrency level and locking only a particular portion instead of the whole Map.

ConcurrentHashMap does not allow null keys or null values, while a synchronized HashMap allows one null key.

Answer 13:

ConcurrentHashMap is optimized for concurrent access.

Accesses don't lock the whole map but use a finer-grained strategy, which improves scalability. There are also functional enhancements specifically for concurrent access, e.g. concurrent iterators.

Answer 14:

There is one critical feature to note about ConcurrentHashMap other than the concurrency it provides: its fail-safe iterator (the Javadoc calls it weakly consistent). I have seen developers use ConcurrentHashMap just because they want to edit the entry set (put/remove) while iterating over it. Collections.synchronizedMap(Map) does not provide a fail-safe iterator; it provides a fail-fast iterator instead. A fail-fast iterator records the map's modification count when it is created and throws ConcurrentModificationException if the map is structurally modified during iteration.
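The contrast is visible even in a single thread. In this sketch (FailFastDemo and throwsOnModification are hypothetical names) the loop adds a key while iterating over the key set:

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class FailFastDemo {
    // Adds a new key while iterating and reports whether the iterator objected.
    static boolean throwsOnModification(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                map.put("added-during-iteration", 0);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Fail-fast: the wrapped HashMap's iterator notices the structural change.
        System.out.println(throwsOnModification(
                Collections.synchronizedMap(new HashMap<String, Integer>()))); // true
        // Weakly consistent: iteration simply carries on.
        System.out.println(throwsOnModification(
                new ConcurrentHashMap<String, Integer>()));                    // false
    }
}
```

Whether the ConcurrentHashMap loop also visits the newly added key is not guaranteed either way.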

Answer 15:

If data consistency is highly important, use Hashtable or Collections.synchronizedMap(Map).
If speed/performance is highly important and seeing every update immediately is not essential, use ConcurrentHashMap.

Answer 16:

The Collections.synchronizedMap() method synchronizes all the methods of the HashMap and effectively reduces it to a data structure that only one thread can enter at a time, because every method locks on a common lock.

In ConcurrentHashMap, synchronization is done a little differently. Rather than locking every method on a common lock, ConcurrentHashMap uses separate locks for separate buckets, thus locking only a portion of the Map. By default there are 16 such partitions (called segments in the implementation used up to Java 7), each with its own lock, so the default concurrency level is 16. That means that, in theory, 16 threads can update a ConcurrentHashMap at any given time if they all go to separate partitions.

Answer 17:

In general, if you want to use ConcurrentHashMap, make sure you are ready to miss 'updates' (i.e. printing the contents of the map does not ensure it will print the up-to-date Map), and use APIs like CyclicBarrier to ensure consistency across your program's lifecycle.
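One possible reading of the CyclicBarrier suggestion, sketched with invented names (BarrierSnapshotDemo, sizeSeenAtBarrier): each writer fills its share of the map and then waits at the barrier, and the barrier action reads the map only once every writer has arrived.

```java
import java.util.Map;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

final class BarrierSnapshotDemo {
    // Returns the map size observed by the barrier action.
    static int sizeSeenAtBarrier(int writers, int perWriter) {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        AtomicInteger seen = new AtomicInteger(-1);
        // The action runs once, after all writers have called await().
        CyclicBarrier barrier = new CyclicBarrier(writers, () -> seen.set(map.size()));
        Thread[] threads = new Thread[writers];
        for (int w = 0; w < writers; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                for (int i = 0; i < perWriter; i++) {
                    map.put(id + "-" + i, i);
                }
                try {
                    barrier.await(); // writes above are visible to the barrier action
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
            });
            threads[w].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(sizeSeenAtBarrier(4, 250)); // 1000
    }
}
```

A size() call made while the writers are still running could return any intermediate value; the barrier is what makes the reading complete.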

Answer 18:

Besides what has been suggested, I'd like to post the source code related to SynchronizedMap.

To make a Map thread-safe, we can call Collections.synchronizedMap and pass the map instance as the parameter.

The implementation of synchronizedMap in Collections looks like this:

    public static <K,V> Map<K,V> synchronizedMap(Map<K,V> m) {
        return new SynchronizedMap<>(m);
    }

As you can see, the input Map object is wrapped by the SynchronizedMap object.
Let's dig into the implementation of SynchronizedMap:

 private static class SynchronizedMap<K,V>
        implements Map<K,V>, Serializable {
        private static final long serialVersionUID = 1978198479659022715L;

        private final Map<K,V> m;     // Backing Map
        final Object      mutex;        // Object on which to synchronize

        SynchronizedMap(Map<K,V> m) {
            this.m = Objects.requireNonNull(m);
            mutex = this;
        }

        SynchronizedMap(Map<K,V> m, Object mutex) {
            this.m = m;
            this.mutex = mutex;
        }

        public int size() {
            synchronized (mutex) {return m.size();}
        }
        public boolean isEmpty() {
            synchronized (mutex) {return m.isEmpty();}
        }
        public boolean containsKey(Object key) {
            synchronized (mutex) {return m.containsKey(key);}
        }
        public boolean containsValue(Object value) {
            synchronized (mutex) {return m.containsValue(value);}
        }
        public V get(Object key) {
            synchronized (mutex) {return m.get(key);}
        }

        public V put(K key, V value) {
            synchronized (mutex) {return m.put(key, value);}
        }
        public V remove(Object key) {
            synchronized (mutex) {return m.remove(key);}
        }
        public void putAll(Map<? extends K, ? extends V> map) {
            synchronized (mutex) {m.putAll(map);}
        }
        public void clear() {
            synchronized (mutex) {m.clear();}
        }

        private transient Set<K> keySet;
        private transient Set<Map.Entry<K,V>> entrySet;
        private transient Collection<V> values;

        public Set<K> keySet() {
            synchronized (mutex) {
                if (keySet==null)
                    keySet = new SynchronizedSet<>(m.keySet(), mutex);
                return keySet;
            }
        }

        public Set<Map.Entry<K,V>> entrySet() {
            synchronized (mutex) {
                if (entrySet==null)
                    entrySet = new SynchronizedSet<>(m.entrySet(), mutex);
                return entrySet;
            }
        }

        public Collection<V> values() {
            synchronized (mutex) {
                if (values==null)
                    values = new SynchronizedCollection<>(m.values(), mutex);
                return values;
            }
        }

        public boolean equals(Object o) {
            if (this == o)
                return true;
            synchronized (mutex) {return m.equals(o);}
        }
        public int hashCode() {
            synchronized (mutex) {return m.hashCode();}
        }
        public String toString() {
            synchronized (mutex) {return m.toString();}
        }

        // Override default methods in Map
        @Override
        public V getOrDefault(Object k, V defaultValue) {
            synchronized (mutex) {return m.getOrDefault(k, defaultValue);}
        }
        @Override
        public void forEach(BiConsumer<? super K, ? super V> action) {
            synchronized (mutex) {m.forEach(action);}
        }
        @Override
        public void replaceAll(BiFunction<? super K, ? super V, ? extends V> function) {
            synchronized (mutex) {m.replaceAll(function);}
        }
        @Override
        public V putIfAbsent(K key, V value) {
            synchronized (mutex) {return m.putIfAbsent(key, value);}
        }
        @Override
        public boolean remove(Object key, Object value) {
            synchronized (mutex) {return m.remove(key, value);}
        }
        @Override
        public boolean replace(K key, V oldValue, V newValue) {
            synchronized (mutex) {return m.replace(key, oldValue, newValue);}
        }
        @Override
        public V replace(K key, V value) {
            synchronized (mutex) {return m.replace(key, value);}
        }
        @Override
        public V computeIfAbsent(K key,
                Function<? super K, ? extends V> mappingFunction) {
            synchronized (mutex) {return m.computeIfAbsent(key, mappingFunction);}
        }
        @Override
        public V computeIfPresent(K key,
                BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
            synchronized (mutex) {return m.computeIfPresent(key, remappingFunction);}
        }
        @Override
        public V compute(K key,
                BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
            synchronized (mutex) {return m.compute(key, remappingFunction);}
        }
        @Override
        public V merge(K key, V value,
                BiFunction<? super V, ? super V, ? extends V> remappingFunction) {
            synchronized (mutex) {return m.merge(key, value, remappingFunction);}
        }

        private void writeObject(ObjectOutputStream s) throws IOException {
            synchronized (mutex) {s.defaultWriteObject();}
        }
    }

What SynchronizedMap does can be summarized as adding a single lock to each primary method of the input Map object. No method guarded by the lock can be executed by multiple threads at the same time. That means normal operations like put and get can be executed by only one thread at a time, for all data in the Map object.

This makes the Map object thread-safe, but performance may become an issue in some scenarios.
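Because the wrapper in the listing above also overrides merge under the same mutex, a read-modify-write through the wrapper stays correct across threads, at the cost of running one call at a time. A small sketch, with SyncWrapperDemo and countHits as made-up names and CompletableFuture used only to run the tasks concurrently:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

final class SyncWrapperDemo {
    // Runs `tasks` concurrent tasks that each increment the same key `perTask` times.
    static int countHits(int tasks, int perTask) {
        Map<String, Integer> hits = Collections.synchronizedMap(new HashMap<>());
        CompletableFuture<?>[] running = new CompletableFuture<?>[tasks];
        for (int t = 0; t < tasks; t++) {
            running[t] = CompletableFuture.runAsync(() -> {
                for (int i = 0; i < perTask; i++) {
                    // Each call holds the wrapper's single mutex for its whole duration.
                    hits.merge("page", 1, Integer::sum);
                }
            });
        }
        CompletableFuture.allOf(running).join(); // wait for every task to finish
        return hits.get("page");
    }

    public static void main(String[] args) {
        System.out.println(countHits(4, 5_000)); // 20000
    }
}
```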

ConcurrentHashMap is far more complicated in its implementation; we can refer to Building a better HashMap for details. In a nutshell, it's implemented with both thread safety and performance in mind.

Tags: java dictionary concurrency