What's the difference between ConcurrentHashMa

2018-12-31 12:37发布

I have a Map which is to be modified by several threads concurrently.

There seem to be three different synchronized Map implementations in the Java API:

  • Hashtable
  • Collections.synchronizedMap(Map)
  • ConcurrentHashMap

From what I understand, Hashtable is an old implementation (extending the obsolete Dictionary class), which has been adapted later to fit the Map interface. While it is synchronized, it seems to have serious scalability issues and is discouraged for new projects.

But what about the other two? What are the differences between Maps returned by Collections.synchronizedMap(Map) and ConcurrentHashMaps? Which one fits which situation?

18条回答
怪性笑人.
2楼-- · 2018-12-31 12:48

Besides what has been suggested, I'd like to post the source code related to SynchronizedMap.

To make a Map thread safe, we can use Collections.synchronizedMap statement and input the map instance as the parameter.

The implementation of synchronizedMap in Collections is like below

   public static <K,V> Map<K,V> synchronizedMap(Map<K,V> m) {
        return new SynchronizedMap<>(m);
    }

As you can see, the input Map object is wrapped by the SynchronizedMap object.
Let's dig into the implementation of SynchronizedMap ,

 private static class SynchronizedMap<K,V>
        implements Map<K,V>, Serializable {
        private static final long serialVersionUID = 1978198479659022715L;

        private final Map<K,V> m;     // Backing Map
        final Object      mutex;        // Object on which to synchronize

        SynchronizedMap(Map<K,V> m) {
            this.m = Objects.requireNonNull(m);
            mutex = this;
        }

        SynchronizedMap(Map<K,V> m, Object mutex) {
            this.m = m;
            this.mutex = mutex;
        }

        public int size() {
            synchronized (mutex) {return m.size();}
        }
        public boolean isEmpty() {
            synchronized (mutex) {return m.isEmpty();}
        }
        public boolean containsKey(Object key) {
            synchronized (mutex) {return m.containsKey(key);}
        }
        public boolean containsValue(Object value) {
            synchronized (mutex) {return m.containsValue(value);}
        }
        public V get(Object key) {
            synchronized (mutex) {return m.get(key);}
        }

        public V put(K key, V value) {
            synchronized (mutex) {return m.put(key, value);}
        }
        public V remove(Object key) {
            synchronized (mutex) {return m.remove(key);}
        }
        public void putAll(Map<? extends K, ? extends V> map) {
            synchronized (mutex) {m.putAll(map);}
        }
        public void clear() {
            synchronized (mutex) {m.clear();}
        }

        private transient Set<K> keySet;
        private transient Set<Map.Entry<K,V>> entrySet;
        private transient Collection<V> values;

        public Set<K> keySet() {
            synchronized (mutex) {
                if (keySet==null)
                    keySet = new SynchronizedSet<>(m.keySet(), mutex);
                return keySet;
            }
        }

        public Set<Map.Entry<K,V>> entrySet() {
            synchronized (mutex) {
                if (entrySet==null)
                    entrySet = new SynchronizedSet<>(m.entrySet(), mutex);
                return entrySet;
            }
        }

        public Collection<V> values() {
            synchronized (mutex) {
                if (values==null)
                    values = new SynchronizedCollection<>(m.values(), mutex);
                return values;
            }
        }

        public boolean equals(Object o) {
            if (this == o)
                return true;
            synchronized (mutex) {return m.equals(o);}
        }
        public int hashCode() {
            synchronized (mutex) {return m.hashCode();}
        }
        public String toString() {
            synchronized (mutex) {return m.toString();}
        }

        // Override default methods in Map
        @Override
        public V getOrDefault(Object k, V defaultValue) {
            synchronized (mutex) {return m.getOrDefault(k, defaultValue);}
        }
        @Override
        public void forEach(BiConsumer<? super K, ? super V> action) {
            synchronized (mutex) {m.forEach(action);}
        }
        @Override
        public void replaceAll(BiFunction<? super K, ? super V, ? extends V> function) {
            synchronized (mutex) {m.replaceAll(function);}
        }
        @Override
        public V putIfAbsent(K key, V value) {
            synchronized (mutex) {return m.putIfAbsent(key, value);}
        }
        @Override
        public boolean remove(Object key, Object value) {
            synchronized (mutex) {return m.remove(key, value);}
        }
        @Override
        public boolean replace(K key, V oldValue, V newValue) {
            synchronized (mutex) {return m.replace(key, oldValue, newValue);}
        }
        @Override
        public V replace(K key, V value) {
            synchronized (mutex) {return m.replace(key, value);}
        }
        @Override
        public V computeIfAbsent(K key,
                Function<? super K, ? extends V> mappingFunction) {
            synchronized (mutex) {return m.computeIfAbsent(key, mappingFunction);}
        }
        @Override
        public V computeIfPresent(K key,
                BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
            synchronized (mutex) {return m.computeIfPresent(key, remappingFunction);}
        }
        @Override
        public V compute(K key,
                BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
            synchronized (mutex) {return m.compute(key, remappingFunction);}
        }
        @Override
        public V merge(K key, V value,
                BiFunction<? super V, ? super V, ? extends V> remappingFunction) {
            synchronized (mutex) {return m.merge(key, value, remappingFunction);}
        }

        private void writeObject(ObjectOutputStream s) throws IOException {
            synchronized (mutex) {s.defaultWriteObject();}
        }
    }

What SynchronizedMap does can be summarized as adding a single lock to primary method of the input Map object. All method guarded by the lock can't be accessed by multiple threads at the same time. That means normal operations like put and get can be executed by a single thread at the same time for all data in the Map object.

It makes the Map object thread safe now but the performance may become an issue in some scenarios.

The ConcurrentMap is far more complicated in the implementation, we can refer to Building a better HashMap for details. In a nutshell, it's implemented taking both thread safe and performance into consideration.

查看更多
看淡一切
3楼-- · 2018-12-31 12:50

The "scalability issues" for Hashtable are present in exactly the same way in Collections.synchronizedMap(Map) - they use very simple synchronization, which means that only one thread can access the map at the same time.

This is not much of an issue when you have simple inserts and lookups (unless you do it extremely intensively), but becomes a big problem when you need to iterate over the entire Map, which can take a long time for a large Map - while one thread does that, all others have to wait if they want to insert or lookup anything.

The ConcurrentHashMap uses very sophisticated techniques to reduce the need for synchronization and allow parallel read access by multiple threads without synchronization and, more importantly, provides an Iterator that requires no synchronization and even allows the Map to be modified during interation (though it makes no guarantees whether or not elements that were inserted during iteration will be returned).

查看更多
长期被迫恋爱
4楼-- · 2018-12-31 12:50

The main difference between these two is that ConcurrentHashMap will lock only portion of the data which are being updated while other portion of data can be accessed by other threads. However, Collections.synchronizedMap() will lock all the data while updating, other threads can only access the data when the lock is released. If there are many update operations and relative small amount of read operations, you should choose ConcurrentHashMap.

Also one other difference is that ConcurrentHashMap will not preserve the order of elements in the Map passed in. It is similar to HashMap when storing data. There is no guarantee that the element order is preserved. While Collections.synchronizedMap() will preserve the elements order of the Map passed in. For example, if you pass a TreeMap to ConcurrentHashMap, the elements order in the ConcurrentHashMap may not be the same as the order in the TreeMap, but Collections.synchronizedMap() will preserve the order.

Furthermore, ConcurrentHashMap can guarantee that there is no ConcurrentModificationException thrown while one thread is updating the map and another thread is traversing the iterator obtained from the map. However, Collections.synchronizedMap() is not guaranteed on this.

There is one post which demonstrate the differences of these two and also the ConcurrentSkipListMap.

查看更多
看风景的人
5楼-- · 2018-12-31 12:52
╔═══════════════╦═══════════════════╦═══════════════════╦═════════════════════╗
║   Property    ║     HashMap       ║    Hashtable      ║  ConcurrentHashMap  ║
╠═══════════════╬═══════════════════╬═══════════════════╩═════════════════════╣ 
║      Null     ║     allowed       ║              not allowed                ║
║  values/keys  ║                   ║                                         ║
╠═══════════════╬═══════════════════╬═════════════════════════════════════════╣
║Is thread-safe ║       no          ║                  yes                    ║
╠═══════════════╬═══════════════════╬═══════════════════╦═════════════════════╣
║     Lock      ║       not         ║ locks the whole   ║ locks the portion   ║        
║  mechanism    ║    applicable     ║       map         ║                     ║ 
╠═══════════════╬═══════════════════╩═══════════════════╬═════════════════════╣
║   Iterator    ║               fail-fast               ║ weakly consistent   ║ 
╚═══════════════╩═══════════════════════════════════════╩═════════════════════╝

Regarding locking mechanism: Hashtable locks the object, while ConcurrentHashMap locks only the bucket.

查看更多
初与友歌
6楼-- · 2018-12-31 12:54

In general, if you want to use the ConcurrentHashMap make sure you are ready to miss 'updates'
(i.e. printing contents of the HashMap does not ensure it will print the up-to-date Map) and use APIs like CyclicBarrier to ensure consistency across your program's lifecycle.

查看更多
残风、尘缘若梦
7楼-- · 2018-12-31 12:55

In ConcurrentHashMap, the lock is applied to a segment instead of an entire Map. Each segment manages its own internal hash table. The lock is applied only for update operations. Collections.synchronizedMap(Map) synchronizes the entire map.

查看更多
登录 后发表回答