Amortized O(1) in HashMap. What is it?
When we think of a HashMap, the first thing that comes to mind is its ability to fetch and add entries in constant time.
It is well established that a HashMap's internal data structure is an array of buckets (segments), distinguished by their hash values, with each bucket holding a linked list to resolve hash collisions.
Consider the example:
Assume the hash is computed with a prime number, say 31, so the hash function is simply key % 31.
In the first case, 1 % 31 results in 1, so the key-value pair is added to the bucket with hash 1. Since that bucket has no linked list yet, one is created with {1, John} as its first node. Further key-value pairs are added in the same way, and the map is populated.
In the case of a hash collision, such as when the key 62 arrives and key 31 already sits in the bucket with hash value 0 (both 62 % 31 and 31 % 31 equal 0), the collision is resolved by adding {62, David} as the next node after {31, Ron}.
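To make the mechanics concrete, here is a minimal sketch of such a chained map, assuming a fixed array of 31 buckets and the key % 31 index from the example. The class and method names are illustrative; this is not how java.util.HashMap is actually implemented.

```java
// A toy chained map mirroring the example: 31 buckets, index = key % 31.
// Illustrative only; NOT the internals of java.util.HashMap.
class SimpleChainedMap {

    // One node of a bucket's linked list.
    static class Node {
        final int key;
        String value;
        Node next;

        Node(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    Node[] buckets = new Node[31]; // one slot per possible hash value 0..30
    int size = 0;                  // total number of entries in the map

    // Bucket index for a key, as in the example (non-negative keys assumed).
    int indexFor(int key) {
        return key % buckets.length;
    }

    void put(int key, String value) {
        int i = indexFor(key);
        if (buckets[i] == null) {              // empty bucket: start a new list
            buckets[i] = new Node(key, value);
            size++;
            return;
        }
        Node n = buckets[i];
        while (true) {
            if (n.key == key) {                // key already present: overwrite
                n.value = value;
                return;
            }
            if (n.next == null) break;
            n = n.next;
        }
        n.next = new Node(key, value);         // collision: append to the chain
        size++;
    }

    String get(int key) {
        int i = indexFor(key);                 // O(1): jump straight to the bucket
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.key == key) {
                return n.value;                // walk the chain until the key matches
            }
        }
        return null;                           // key not present
    }
}
```

The put method covers the insertions just described, and get is the lookup discussed next.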
Now consider an observation:
If I want to fetch {1, John}, my hash function tells me the entry is in the segment with hash value 1. Then I iterate through the underlying linked list and reach the entry with key 1. This takes O(1) time to find the hash segment and O(1) time to find the entry since {1, John} is the first node in the linked list.
Great. Now let’s visualize the situation with {62, David}. It still takes O(1) time to find the hash segment, but since {62, David} is not the first entry, I keep iterating through the linked list until I reach it by its key. So here it takes O(1) to find the segment plus two steps along the list.
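Running the example keys through the sketch above traces exactly this difference; the comments note how many list nodes each lookup touches after the bucket is found.

```java
// Demo for the SimpleChainedMap sketch above (assumed to be in the same package).
class HashMapDemo {
    public static void main(String[] args) {
        SimpleChainedMap map = new SimpleChainedMap();
        map.put(1, "John");    // bucket 1, first node
        map.put(31, "Ron");    // bucket 0, first node
        map.put(62, "David");  // 62 % 31 == 0: appended after {31, Ron}

        System.out.println(map.get(1));   // bucket found in O(1); 1 node visited
        System.out.println(map.get(62));  // bucket found in O(1); 2 nodes visited
    }
}
```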
As more hash collisions occur, the underlying linked lists grow, and fetching an entry is no longer strictly O(1): it costs O(1) to find the hash segment plus O(n) to traverse the list, where n is the length of that list. In the worst case, every key lands in the same segment and a lookup degrades to O(n). With a reasonable hash function, however, the entries are spread across segments, the chains stay short, and the cost averaged over many operations remains constant. That is what is meant by amortized (average-case) O(1) rather than guaranteed O(1).
How can we bring this back to O(1)? It is achieved by spreading keys across enough segments through a well-designed hash function, so that collisions stay rare, or by keeping a threshold on how full the map is allowed to get and rehashing into more segments once that threshold is crossed, as sketched below.
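One hypothetical way to enforce such a threshold on the sketch above: once the entry count crosses a load factor, rehash everything into twice as many buckets so the chains stay short. The 0.75 factor mirrors the default in java.util.HashMap; everything else here is illustrative.

```java
// Hypothetical extension of SimpleChainedMap: resize once a load-factor
// threshold is crossed, keeping chains short.
class ResizingChainedMap extends SimpleChainedMap {

    static final double LOAD_FACTOR = 0.75;

    @Override
    void put(int key, String value) {
        if (size + 1 > buckets.length * LOAD_FACTOR) {
            resize(2 * buckets.length);        // double the bucket count first
        }
        super.put(key, value);
    }

    // Re-insert every entry under the larger bucket array; indexFor (key % length)
    // now spreads the keys over more buckets, shortening each chain.
    void resize(int newCapacity) {
        Node[] oldBuckets = buckets;
        buckets = new Node[newCapacity];
        size = 0;
        for (Node head : oldBuckets) {
            for (Node n = head; n != null; n = n.next) {
                super.put(n.key, n.value);
            }
        }
    }
}
```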
#hashmap #datastructures #timecomplexity