Amortized O(1) in HashMap. What is it?
When we think of a HashMap, the first thing that comes to mind is its ability to fetch and add entries in constant time.
It is well established that a HashMap's internal data structure is an array of buckets (segments), distinguished by their hash values, with each bucket holding a linked list to resolve hash collisions.
Consider the example:
Assume the hash is computed with a prime number, say 31, so the hash function is simply key % 31.
In the first case, 1 % 31 results in 1, so the key-value pair is added to the bucket with hash 1. Since that bucket has no linked list yet, one is created with {1, John} as its first node. Further key-value pairs are added in the same way, and the map is populated.
In the case of a hash collision, such as when the key 62 arrives and key 31 already sits in the bucket with hash value 0 (both 62 % 31 and 31 % 31 equal 0), the collision is resolved by adding {62, David} as the next node after {31, Ron}.
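To make the mechanics concrete, here is a minimal sketch of such a chained map, assuming a fixed array of 31 buckets and the key % 31 index from the example. The class and method names are illustrative; this is not how java.util.HashMap is actually implemented.

```java
// A toy chained map mirroring the example: 31 buckets, index = key % 31.
// Illustrative only; NOT the internals of java.util.HashMap.
class SimpleChainedMap {

    // One node of a bucket's linked list.
    static class Node {
        final int key;
        String value;
        Node next;

        Node(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    Node[] buckets = new Node[31]; // one slot per possible hash value 0..30
    int size = 0;                  // total number of entries in the map

    // Bucket index for a key, as in the example (non-negative keys assumed).
    int indexFor(int key) {
        return key % buckets.length;
    }

    void put(int key, String value) {
        int i = indexFor(key);
        if (buckets[i] == null) {              // empty bucket: start a new list
            buckets[i] = new Node(key, value);
            size++;
            return;
        }
        Node n = buckets[i];
        while (true) {
            if (n.key == key) {                // key already present: overwrite
                n.value = value;
                return;
            }
            if (n.next == null) break;
            n = n.next;
        }
        n.next = new Node(key, value);         // collision: append to the chain
        size++;
    }

    String get(int key) {
        int i = indexFor(key);                 // O(1): jump straight to the bucket
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.key == key) {
                return n.value;                // walk the chain until the key matches
            }
        }
        return null;                           // key not present
    }
}
```

The put method covers the insertions just described, and get is the lookup discussed next.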
Now consider an observation:
If I want to fetch {1, John}, my hash function tells me the entry is in the segment with hash value 1. Then I iterate through the underlying linked list and reach the entry with key 1. This takes O(1) time to find the hash segment and O(1) time to find the entry since {1, John} is the first node in the linked list.
Great. Now let’s visualize the situation with {62, David}. It still takes O(1) time to find the hash segment, but since {62, David} is not the first entry, I keep iterating through the linked list until I reach it by its key. So here it takes O(1) to find the segment plus two steps along the list.
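Running the example keys through the sketch above traces exactly this difference; the comments note how many list nodes each lookup touches after the bucket is found.

```java
// Demo for the SimpleChainedMap sketch above (assumed to be in the same package).
class HashMapDemo {
    public static void main(String[] args) {
        SimpleChainedMap map = new SimpleChainedMap();
        map.put(1, "John");    // bucket 1, first node
        map.put(31, "Ron");    // bucket 0, first node
        map.put(62, "David");  // 62 % 31 == 0: appended after {31, Ron}

        System.out.println(map.get(1));   // bucket found in O(1); 1 node visited
        System.out.println(map.get(62));  // bucket found in O(1); 2 nodes visited
    }
}
```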
As more hash collisions occur, the underlying linked lists grow, and fetching an entry is no longer strictly O(1): it costs O(1) to find the hash segment plus O(n) to traverse the list, where n is the length of that list. In the worst case, every key lands in the same segment and a lookup degrades to O(n). With a reasonable hash function, however, the entries are spread across segments, the chains stay short, and the cost averaged over many operations remains constant. That is what is meant by amortized (average-case) O(1) rather than guaranteed O(1).
How can we bring this back to O(1)? It is achieved by spreading keys across enough segments through a well-designed hash function, so that collisions stay rare, or by keeping a threshold on how full the map is allowed to get and rehashing into more segments once that threshold is crossed, as sketched below.
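One hypothetical way to enforce such a threshold on the sketch above: once the entry count crosses a load factor, rehash everything into twice as many buckets so the chains stay short. The 0.75 factor mirrors the default in java.util.HashMap; everything else here is illustrative.

```java
// Hypothetical extension of SimpleChainedMap: resize once a load-factor
// threshold is crossed, keeping chains short.
class ResizingChainedMap extends SimpleChainedMap {

    static final double LOAD_FACTOR = 0.75;

    @Override
    void put(int key, String value) {
        if (size + 1 > buckets.length * LOAD_FACTOR) {
            resize(2 * buckets.length);        // double the bucket count first
        }
        super.put(key, value);
    }

    // Re-insert every entry under the larger bucket array; indexFor (key % length)
    // now spreads the keys over more buckets, shortening each chain.
    void resize(int newCapacity) {
        Node[] oldBuckets = buckets;
        buckets = new Node[newCapacity];
        size = 0;
        for (Node head : oldBuckets) {
            for (Node n = head; n != null; n = n.next) {
                super.put(n.key, n.value);
            }
        }
    }
}
```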
#hashmap #datastructures #timecomplexity