Performance Improvements for HashMap in Java 8

As we know, Hash collision degrades the performance of HashMap significantly. With this new approach, existing applications can expect performance improvements in case they are using HashMaps having large number of elements by simply upgrading to Java 8.

Hash collisions have negative impact on the lookup time of HashMap. When multiple keys end up in the same bucket, then values along with their keys are placed in a linked list. In case of retrieval, linked list has to be traversed to get the entry. In worst case scenario, when all keys are mapped to the same bucket, the lookup time of HashMap increases from O(1) to O(n).

Java 8 has come with the following?improvements/changes?of HashMap objects in case of high collisions.

  1. The alternative String hash function added in Java 7 has been removed.
  2. Buckets containing a large number of colliding keys will store their entries in a balanced tree instead of a linked list after certain threshold is reached.

Above changes ensure performance of O(log(n)) in worst case scenarios (hash function is not distributing keys properly) and O(1) with proper?hashCode().

How linked list is replaced with binary tree?

In Java 8, HashMap replaces linked list with a binary tree when the number of elements in a bucket reaches certain threshold. While converting the list to binary tree, hashcode is used as a branching variable. If there are two different hashcodes in the same bucket, one is considered bigger and goes to the right of the tree and other one to the left. But when both the hashcodes are equal, HashMap assumes that the keys are comparable, and compares the key to determine the direction so that some order can be maintained. It is a good practice to make the keys of HashMap comparable.

This JDK 8 change applies only to?HashMap, LinkedHashMap?and?ConcurrentHashMap.

Based on a simple experiment of creating HashMaps of different sizes and performing put and get operations by key, the following results have been recorded.

1. HashMap.get() operation with proper hashCode() logic

Number Of RecordsJava 5Java 6Java 7Java 810,0004 ms3 ms4 ms2 ms100,0007 ms6 ms8 ms4 ms1,000,00099 ms15 ms14 ms13 ms2. HashMap.get() operation with broken (hashCode is same for all Keys) hashCode() logic

No alt text provided for this image
No alt text provided for this image

2. HashMap.put() operation with proper hashCode() logic

No alt text provided for this image
No alt text provided for this image

3. HashMap.put() operation with broken (hashCode is same for all Keys) hashCode() logic

No alt text provided for this image
No alt text provided for this image

4. HashMap.put() operation with broken (hashCode is same for all Keys) hashCode() logic

No alt text provided for this image
No alt text provided for this image

The above experiment demonstrates the power of this changed approach in improving the performance of existing Java applications. While this is an internal detail, simply upgrading to JDK 8 from older JDK versions will allow certain applications to see performance improvements if they use HashMaps heavily and have a large amount of entries in the HashMaps.

Sumit Kumar

Lead-Technology at Iris Software Inc.

1 年

Thanks for explaining in detail. Very informative and knowledgeable

要查看或添加评论,请登录

Ravi Kant Sharma的更多文章

  • Biased locking removed from Java 15 - does it impact you?

    Biased locking removed from Java 15 - does it impact you?

    What is biased locking? Biased locking lowers the cost of uncontended/synchronization. It enables a technique for…

  • The OWASP SAMM Model and Structure

    The OWASP SAMM Model and Structure

    At the highest level, SAMM defines five business functions. Each business function is a category of activities that any…

    1 条评论

社区洞察

其他会员也浏览了