Performance Improvements for HashMap in Java 8
Ravi Kant Sharma
Senior Software Architect @ Ericsson | Principle Security Master | Domain Architect | Senior System Manager | Security Researcher | Cloud | Containers
As we know, Hash collision degrades the performance of HashMap significantly. With this new approach, existing applications can expect performance improvements in case they are using HashMaps having large number of elements by simply upgrading to Java 8.
Hash collisions have negative impact on the lookup time of HashMap. When multiple keys end up in the same bucket, then values along with their keys are placed in a linked list. In case of retrieval, linked list has to be traversed to get the entry. In worst case scenario, when all keys are mapped to the same bucket, the lookup time of HashMap increases from O(1) to O(n).
Java 8 has come with the following?improvements/changes?of HashMap objects in case of high collisions.
Above changes ensure performance of O(log(n)) in worst case scenarios (hash function is not distributing keys properly) and O(1) with proper?hashCode().
How linked list is replaced with binary tree?
In Java 8, HashMap replaces linked list with a binary tree when the number of elements in a bucket reaches certain threshold. While converting the list to binary tree, hashcode is used as a branching variable. If there are two different hashcodes in the same bucket, one is considered bigger and goes to the right of the tree and other one to the left. But when both the hashcodes are equal, HashMap assumes that the keys are comparable, and compares the key to determine the direction so that some order can be maintained. It is a good practice to make the keys of HashMap comparable.
This JDK 8 change applies only to?HashMap, LinkedHashMap?and?ConcurrentHashMap.
Based on a simple experiment of creating HashMaps of different sizes and performing put and get operations by key, the following results have been recorded.
1. HashMap.get() operation with proper hashCode() logic
Number Of RecordsJava 5Java 6Java 7Java 810,0004 ms3 ms4 ms2 ms100,0007 ms6 ms8 ms4 ms1,000,00099 ms15 ms14 ms13 ms2. HashMap.get() operation with broken (hashCode is same for all Keys) hashCode() logic
2. HashMap.put() operation with proper hashCode() logic
3. HashMap.put() operation with broken (hashCode is same for all Keys) hashCode() logic
4. HashMap.put() operation with broken (hashCode is same for all Keys) hashCode() logic
The above experiment demonstrates the power of this changed approach in improving the performance of existing Java applications. While this is an internal detail, simply upgrading to JDK 8 from older JDK versions will allow certain applications to see performance improvements if they use HashMaps heavily and have a large amount of entries in the HashMaps.
Lead-Technology at Iris Software Inc.
1 年Thanks for explaining in detail. Very informative and knowledgeable