Hash Table Internals - Part 1 - Internal Structure

Arpit Bhayani

Published: August 16, 2022

Hash Tables are implemented through simple arrays, but how?

Hash Tables are so powerful that OOP-based languages use them internally to power classes and their members. Symbol tables, which map variables to memory locations, are also powered by hash tables.

They are designed to provide constant-time key-based insertion, update, and lookups while being space efficient at all times.

Core Ideas to construct Hash Tables

convert application keys to wide-ranged (INT32) hash keys
convert hash keys to a smaller range

Application Keys to Hash Keys

Hash Tables should support storing any object as a key. To make that possible, each key is first hashed, using a hash function provided by the user, to a wide integer range, typically INT32. This hash key is then used to decide how and where the KV pair is stored in the data structure.
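As a minimal sketch of this first step, the snippet below hashes a few application keys into a 32-bit range. The article does not name a hash function, so CRC32 from Python's zlib module is an assumption made purely for illustration, as are the helper name hash_key and the use of repr() to turn a key into bytes.

```python
import zlib

def hash_key(app_key) -> int:
    """Map an arbitrary application key to an unsigned 32-bit hash key."""
    # Assumption: repr() gives a stable byte representation of the key.
    data = repr(app_key).encode("utf-8")
    # zlib.crc32 returns an unsigned 32-bit integer in Python 3.
    return zlib.crc32(data)

# Keys of different types all land in the same 32-bit integer range.
for key in ("user:42", 42, ("x", 1.5)):
    print(key, "->", hash_key(key))
```

The same key always produces the same hash key, which is what lets a later lookup find the slot chosen at insertion time.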

Naive Implementation

A naive implementation of a Hash Table would be to create an array with one slot for every possible INT32 value. To store a KV pair, we pass the key through the hash function, which spits out an integer. We use this integer as the index and store the KV pair at that position in the array.

Although this would give us constant-time insertions, updates, and lookups, it is highly space-inefficient: at 4 bytes per slot, we would need to allocate at least 4 bytes * 2^32 slots = 16 GB of RAM just to hold this array, with most of the slots left empty.
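The arithmetic behind the 16 GB figure can be checked with a few lines. The 4 bytes per slot is taken from the estimate above; treating each slot as a 4-byte reference is an assumption, and the variable names are illustrative only.

```python
SLOTS = 2 ** 32        # one slot per possible 32-bit hash key
BYTES_PER_SLOT = 4     # assumption: each slot holds a 4-byte reference

total_bytes = SLOTS * BYTES_PER_SLOT
total_gib = total_bytes / 2 ** 30   # bytes -> GiB

# Fraction of slots in use if only 4 keys were ever inserted.
occupancy = 4 / SLOTS

print(total_gib)   # 16.0
print(occupancy)
```

With only a handful of keys stored, practically the entire allocation sits unused, which is the waste the next step removes.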

Hash Keys to Smaller Range

This step is designed and introduced to make our Hash Table space efficient. Instead of having a huge array of length INT32, we keep it proportional to the number of keys inserted. For example, if we inserted 4 keys, then our holding array could be around 8 slots big.

To achieve this, we map the hash key into a small range (same as the length of the array) and place our key at that very index. This allows us to remain space-efficient while sporting fast and efficient insertions, updates, and lookups.
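One common way to map a hash key into a small range is the modulo operation, sketched below. The article does not say which reduction it uses, so modulo is an assumption, as are the CRC32 hash function and the helper name slot_index. Collisions, where two keys land on the same index, are deliberately ignored here.

```python
import zlib

def slot_index(app_key, num_slots: int) -> int:
    """Reduce a key's 32-bit hash key to an index in [0, num_slots)."""
    # Assumption: CRC32 stands in for the user-provided hash function.
    hash_key = zlib.crc32(repr(app_key).encode("utf-8"))
    # Modulo folds the wide 32-bit range into the length of the array.
    return hash_key % num_slots

slots = [None] * 8   # holding array sized for a handful of keys
for key, value in [("apple", 1), ("banana", 2)]:
    idx = slot_index(key, len(slots))
    slots[idx] = (key, value)   # a colliding key would overwrite; ignored here
    print(key, "-> slot", idx)
```

A lookup repeats the same computation and reads the slot directly, so no scan over the array is needed.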

Adding more keys

A small, fixed-size array cannot hold an ever-growing number of keys, so at some point we need a larger array to hold the data. This is done by resizing the holding array, typically doubling its size every time it gets full enough.
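The sketch below puts the pieces together in a toy table that doubles its holding array as keys are added. Several details are assumptions not stated in the article: "full enough" is taken to mean more than 50% occupied, colliding keys are kept in a small per-slot list (collision handling is outside the scope of this part), and CRC32 plus modulo stand in for the two mapping steps. The class name TinyHashTable is hypothetical.

```python
import zlib

class TinyHashTable:
    """Illustrative table that doubles its holding array when it fills up."""

    def __init__(self, initial_slots: int = 8, max_load: float = 0.5):
        self.buckets = [[] for _ in range(initial_slots)]
        self.count = 0
        self.max_load = max_load  # assumption: resize above 50% occupancy

    def _index(self, key, num_slots: int) -> int:
        # Step 1: key -> 32-bit hash key; step 2: hash key -> small range.
        return zlib.crc32(repr(key).encode("utf-8")) % num_slots

    def put(self, key, value) -> None:
        bucket = self.buckets[self._index(key, len(self.buckets))]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # update an existing key in place
                return
        bucket.append((key, value))
        self.count += 1
        if self.count > self.max_load * len(self.buckets):
            self._resize(2 * len(self.buckets))  # grow the array 2x

    def get(self, key, default=None):
        bucket = self.buckets[self._index(key, len(self.buckets))]
        for existing_key, value in bucket:
            if existing_key == key:
                return value
        return default

    def _resize(self, new_size: int) -> None:
        # Indexes depend on the array length, so every pair is re-placed.
        old_buckets = self.buckets
        self.buckets = [[] for _ in range(new_size)]
        for bucket in old_buckets:
            for key, value in bucket:
                self.buckets[self._index(key, new_size)].append((key, value))

table = TinyHashTable()
for n in range(20):
    table.put(f"key-{n}", n)
print(len(table.buckets), table.get("key-7"))
```

Because the index depends on the array length, a resize has to re-place every stored pair. Doubling keeps such resizes rare, so their cost is spread thinly across many insertions.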

Thus, this two-step implementation allows for near-constant time insertions, updates, and lookups while remaining space efficient.

Here's the video of me explaining this in depth; do check it out.

Thank you so much for reading. If you found this helpful, do spread the word about it on social media; it would mean the world to me.

If you liked this short essay, you might also like my courses on

I teach an interactive course on System Design where you'll learn how to intuitively design scalable systems. The course will help you

become a better engineer
ace your technical discussions
get acquainted with a spectrum of topics ranging from storage engines and high-throughput systems to the super-clever algorithms behind them.

I have compressed my ~10 years of work experience into this course, and aim to accelerate your engineering growth 100x. To date, the course is trusted by 800+ engineers from 11 different countries and here you can find what they say about the course.

Together, we will dissect and build some amazing systems and understand the intricate details. You can find the week-by-week curriculum and topics, testimonials, and other information at https://arpitbhayani.me/masterclass.

Rafael Camara

Software Engineer | Open Source Contributor

2 years ago

Very nice article! Regarding the initial size of the array, what would you recommend it to be? What strategies to use to find this optimal initial size?

Arpit Bhayani

2 years ago

More about me: arpitbhayani.me Newsletter: arpitbhayani.me/newsletter Subscribe #AsliEngineering for such in-depth engineering concepts: https://www.youtube.com/c/ArpitBhayani System Design course: arpitbhayani.me/masterclass Microservices: https://courses.arpitbhayani.me/designing-microservices All GitHub Outages: https://courses.arpitbhayani.me/github-outage-dissections/
