Why ConcurrentHashMap is better than Hashtable and just as good as a HashMap

February 5th, 2009 | Tags:

ConcurrentHashMap is a pretty ignored class. Not many people know about it and not many people care to use it. The class offers a very robust and fast (comparatively, we all know java concurrency isn’t the fastest) method of synchronizing a Map collection.

I have read a few comparisons of HashMap and ConcurrentHashMap on the web. Let me just say that they’re totally wrong. There is no way you can compare the two, one offers synchronized methods to access a map while the other offers no synchronization whatsoever. What most of us fail to notice is that while our applications, web applications especially, work fine during the development & testing phase, they usually go tits up under heavy (or even moderately heavy) load. This is due to the fact that we expect our HashMap’s to behave a certain way but under load they usually misbehave.

Hashtable’s offer concurrent access to their entries, with a small caveat, the entire map is locked to perform any sort of operation. While this overhead is ignorable in a web application under normal load, under heavy load it can lead to delayed response times and overtaxing of your server for no good reason.

This is where ConcurrentHashMap’s step in. They offer all the features of Hashtable with a performance almost as good as a HashMap. ConcurrentHashMap’s accomplish this by a very simple mechanism. Instead of a map wide lock, the collection maintains a list of 16 locks by default, each of which is used to guard (or lock on) a single bucket of the map. This effectively means that 16 threads can modify the collection at a single time (as long as they’re all working on different buckets). Infact there is no operation performed by this collection that locks the entire map. The concurrency level of the collection, the number of threads that can modify it at the same time without blocking, can be increased. However a higher number means more overhead of maintaining this list of locks.

Retrieval operations on a ConcurrentHashMap do not block unless the entry is not found in the bucket or if the value of the entry is null. In such a case the map synchronizes on the bucket and then tries to look for the entry again just in case the entry was put or removed right after the get in synchronized mode.

Removal operations do require a bit of overhead. All removal operations require the chain of elements before and after to be cloned and joined without the removed element. Since the value of the map key is volatile (not really, the value of the inner Entry class is volatile) if a thread already traversing the bucket from which a value is removed reaches the removed element, it automatically sees a null value and knows to ignore such a value.

Traversal in a ConcurrentHashMap does not synchronize on the entire map either. Infact traversal does not synchronize at all except under one condition. The internal LinkedList implementation is aware of the changes to the underlying collection. If it detects any such changes during traversal it synchronizes itself on the bucket it is traversing and then tries to re-read the values. This always insures that while the values recieved are always fresh, there is minimalistic locking if any.

Iteration over a ConcurrentHashMap are a little different from those offered by other collections. The iterators are not fail-fast in the sense that they do not throw a ConcurrentModificationException. They also do not guarantee that once the iterator is created it will list/show all elements that are added after its creation. The iterators do however guarantee that any updates or removal of items will be reflected correctly in their behaviour. They also guarantee that no element will be returned more than once while traversal.

In conclusion, give it a try, replace some Hashtable’s in your application with ConcurrentHashMap and see how they perform under load. The two are interchangeable so it shouldn’t be hard to update your app. Happy changing!

  1. rajeshksv
    September 14th, 2009 at 12:19
    Quote | #1

    Nice tutorial.It clearly explains why Concurrent hashmap is to be used
    I am using Concurrent Hashmap for this code— Is this threadsafe??
    Why cant I use hashmap atleast in this case where no two threads access the same key(since I am using threadId itself as hashmap key) and no thread iterates over the map and all the threads will only put/get values

    private static Map<String,Context> threadData = new ConcurrentHashMap<String,Context>();
    public static Context getContext() {
    String threadId = String.valueOf(Thread.currentThread().getId());
    if (threadData.get(threadId) == null) {
    Context context = new Context();
    threadData.put(threadId, context);
    }
    return (threadData.get(threadId));
    }

    Thanks in Advance

    • Shrini
      March 4th, 2010 at 10:48
      Quote | #2

      @rajeshksv

      How about storing context in thread local ? But this does not answer your question though ;)

    • Vivek
      February 7th, 2011 at 22:21
      Quote | #3

      Well,
      Your code snippet is not thread-safe if seen from the actual implementation point of view. Bcoz the entire operation of search & insertion of Key should have been an atomic operation but is not.

      But however, you still are safe from any threading issues with regard to the above implementation bcoz, the threadId in your case is specific to the calling Thread. So, at any given point in time the Thread will check its own identity in the map and hence you are left safe.

  2. January 28th, 2010 at 23:28
    Quote | #4

    Im wondering if the ConcurrentHashMap is only fast if it comes to multiple thread operations. I tried it in a single threaded application and i performed worse than a hashtable.. So what if there are only two threads accessing the datastructure, is Hastable or CHM faster?

    Regards

  3. December 13th, 2010 at 00:57
    Quote | #5

    In the 5th para you do explain about how CHMs could have null. I wrote an article about why CHMs dont offer null values (http://anshuiitk.blogspot.com/2010/12/why-concurrenthashmap-does-not-support.html). However the best tradeof is to use putIfAbsent in case of CHMs. Its very important. Nice and informative article.

  4. January 8th, 2011 at 12:33
    Quote | #6

    ConcurrentHashMap is indeed best choice in case of multithreaded environment if numbers of reader is much greater than number of writer to avoid contention and to increase throughput and performance but deciding between SynchrnozedHashMap and ConcurrentHashMap is still requires understanding of usecases and actual environment.

    Thanks
    Javin
    FIX Protocol tutorial

  5. Aaron Li
    September 17th, 2011 at 08:43
    Quote | #7

    If I am not wrong the maximum number of lock is 2 power 16, not 16 only

  6. Rags
    November 25th, 2011 at 16:59
    Quote | #8

    Better explanation is here: http://stackoverflow.com/questions/2836267/concurrenthashmap-in-java specifically “All operations are threadsafe, but there is no happens before promise like there would be with a synchronized map. What you see when you look in a ConcurrentHashMap is the results of the most recently completed operations. Some of which may have begun significantly later in time than when you started trying to look. It works rather like a read-committed database isolation, whereas a synchronized map works more like a serializable database isolation. – Affe May 14 ’10 at 17:46″

  7. Rohit
    September 18th, 2012 at 15:51
    Quote | #9
  8. Satish
    October 28th, 2012 at 23:34

    See http://javaopensourcecode.blogspot.com/2012/06/concurrenthashmap.html
    Article describes in detail about the internals of HashMap and CocurrentHashMap

  9. Yukti Kaura
    January 27th, 2013 at 09:15

    Thanks for this lucid and well explained detail.This is better than JavaDocs!

  10. May 2nd, 2013 at 21:35

    I pay a quick visit daily a few web sites and websites to read content, but this webpage presents quality based posts.

Comments are closed.