How PostgreSQL stores data in files, called forks

Thank you so much for reading this edition of the newsletter. If you found it interesting, you will also love my courses:

  1. System Design Course for Beginners
  2. System Design Course for SDE-2, SDE-3, and above
  3. Redis Internals Course


How PostgreSQL stores data in files, called forks

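PostgreSQL stores every table and index as a set of physical files in the PGDATA directory. These files are grouped into forks, and PostgreSQL splits the data across multiple forks to manage and optimize different aspects of data storage and retrieval. The three forks you will see most often are listed below (unlogged tables and their indexes also get a fourth one, the init fork):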

  1. Main - primary fork where the actual table data is stored
  2. Free Space Map - keeps track of the free space within the main fork
  3. Visibility Map - records which pages in the main fork contain only tuples that are visible to all active transactions.

A fork's file grows over time. When it reaches 1 GB, PostgreSQL creates another file for the same fork, called a segment, and appends a sequence number to its filename. The 1 GB limit can be changed when building PostgreSQL.
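To make the naming concrete, here is a minimal sketch that maps a block number to the file it would live in. It assumes the default build settings (8 KB pages, 1 GB segments) and the usual `_fsm` and `_vm` filename suffixes; the filenode `16384` and the function name `segment_file` are made up for illustration.

```python
from typing import Tuple

BLOCK_SIZE = 8 * 1024                            # default page size in bytes
SEGMENT_SIZE = 1024 ** 3                         # default segment size (1 GB)
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE  # 131072 pages per segment

# Filename suffix per fork; the main fork has no suffix.
FORK_SUFFIX = {"main": "", "fsm": "_fsm", "vm": "_vm"}


def segment_file(filenode: int, fork: str, block_number: int) -> Tuple[str, int]:
    """Return (file name, byte offset inside that file) for a block of a fork."""
    segment = block_number // BLOCKS_PER_SEGMENT
    name = f"{filenode}{FORK_SUFFIX[fork]}"
    if segment > 0:
        # The first segment has no sequence number; later ones get .1, .2, ...
        name += f".{segment}"
    offset = (block_number % BLOCKS_PER_SEGMENT) * BLOCK_SIZE
    return name, offset


if __name__ == "__main__":
    print(segment_file(16384, "main", 0))        # first page of the main fork
    print(segment_file(16384, "main", 131072))   # first page of the second segment
    print(segment_file(16384, "fsm", 3))         # a page in the free space map
```

With the defaults, block 131072 is the first page that no longer fits in the first 1 GB file, so it lands in `16384.1`.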

Each row is stored in a data page (8 KB by default, configurable at build time), and these pages are laid out one after another in the main fork to form the complete table. When inserting new data, PostgreSQL first consults the FSM fork to find a page with enough free space. It then writes the new row into that page in the main fork, and the FSM is brought up to date as the free space changes.
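The insert path can be pictured with a toy model. This is only a sketch of the idea, not how PostgreSQL implements it: the real FSM is a compact tree that stores approximate free space per page, and real pages also carry a header and item pointers that I ignore here. `ToyTable` and its fields are hypothetical names.

```python
PAGE_SIZE = 8192  # default page size in bytes


class ToyTable:
    """A table as a list of pages plus a per-page free space list (toy FSM)."""

    def __init__(self):
        self.pages = []       # "main fork": each page is a list of row payloads
        self.free_space = []  # "FSM fork": free bytes remaining on each page

    def insert(self, row: bytes) -> int:
        """Insert a row and return the page number it landed on."""
        need = len(row)
        if need > PAGE_SIZE:
            raise ValueError("row does not fit in a single page")

        # Ask the toy FSM for the first page with enough room.
        for page_no, free in enumerate(self.free_space):
            if free >= need:
                break
        else:
            # No page has room, so extend the table with a fresh page.
            self.pages.append([])
            self.free_space.append(PAGE_SIZE)
            page_no = len(self.pages) - 1

        self.pages[page_no].append(row)
        self.free_space[page_no] -= need  # keep the toy FSM in sync
        return page_no


if __name__ == "__main__":
    table = ToyTable()
    print(table.insert(b"x" * 5000))  # lands on page 0
    print(table.insert(b"y" * 5000))  # page 0 is too full, a new page is added
    print(table.insert(b"z" * 3000))  # fits in the space left on page 0
```

The third insert goes back to page 0 because the free space lookup finds room there, which is exactly why the table does not grow on every insert.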

Note: In PostgreSQL, the physical order of rows on disk can differ from the logical order defined by the primary key. To physically rearrange rows according to the order of an index (such as the primary key), PostgreSQL offers the CLUSTER command. It is a one-time reordering; rows inserted or updated afterwards are not kept in that order.

PostgreSQL treats an update as a combination of a delete and an insert. It writes the new version of the row into the main fork and marks the old version as obsolete rather than overwriting it in place. The FSM and VM forks are updated to reflect these changes.
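Here is a minimal sketch of that update behaviour. It is a deliberate simplification: PostgreSQL decides which version a transaction can see using transaction IDs, while this toy uses a plain `dead` flag. `ToyHeap` and `RowVersion` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RowVersion:
    value: str
    dead: bool = False  # True once a newer version has replaced this one


@dataclass
class ToyHeap:
    versions: List[RowVersion] = field(default_factory=list)

    def insert(self, value: str) -> int:
        """Append a new row version and return its slot."""
        self.versions.append(RowVersion(value))
        return len(self.versions) - 1

    def update(self, slot: int, new_value: str) -> int:
        """Mark the old version obsolete, then insert the new version."""
        old = self.versions[slot]
        if old.dead:
            raise ValueError("row version is already obsolete")
        old.dead = True                 # the "delete" half of the update
        return self.insert(new_value)   # the "insert" half of the update

    def live_values(self) -> List[str]:
        """Values of the versions that have not been marked obsolete."""
        return [v.value for v in self.versions if not v.dead]


if __name__ == "__main__":
    heap = ToyHeap()
    slot = heap.insert("balance=100")
    heap.update(slot, "balance=80")
    print(len(heap.versions))   # both versions still occupy space
    print(heap.live_values())   # only the new version is live
```

The old version keeps taking up space until it is cleaned up, and that reclaimed space is what later shows up in the free space map.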

You can find this post on my LinkedIn and Twitter; do leave a like.


By the way,

Being hands-on is the best way for you to learn. Practice interesting programming challenges like building your own BitTorrent client, Redis, DNS server, and even SQLite from scratch on CodeCrafters.

Sign up, and become a better engineer.


Video I posted this week

This week I posted How LinkedIn improved their latency by 60%

LinkedIn recently published a blog explaining how they reduced the latency of their inter-service communication by 60%. I dissected it and compiled my learnings in a quick video.


Paper I read this week

This week I spent reading Serverless Runtime / Database Co-Design With Asynchronous I/O

It is a research paper that shows a 100x reduction in tail latencies by keeping database IO asynchronous.

The traditional approach, such as using SQLite, relies on synchronous IO, which blocks the runtime during database interactions and hurts concurrency and scalability. That is not ideal for serverless, given its multi-tenant nature.

The paper talks about rearchitecting SQLite to be asynchronous, i.e., database interactions no longer block the runtime, freeing it to handle other tasks. As per the paper, this improvement enables low-latency access for latency-sensitive workloads running serverless or on the edge.
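To get a feel for why non-blocking IO helps, here is a small sketch using Python's `asyncio`. It is not the paper's implementation and does not touch SQLite at all: `fake_query` is a hypothetical stand-in that simulates database IO with `asyncio.sleep`.

```python
import asyncio
import time


async def fake_query(name: str, delay: float) -> str:
    # Stand-in for asynchronous database IO: while this "query" waits,
    # the event loop is free to run the other requests.
    await asyncio.sleep(delay)
    return name


async def handle_requests():
    # Three requests whose IO waits overlap instead of running back to back.
    return await asyncio.gather(
        fake_query("a", 0.1),
        fake_query("b", 0.1),
        fake_query("c", 0.1),
    )


def run():
    start = time.perf_counter()
    results = asyncio.run(handle_requests())
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run()
    print(results)
    print(f"took {elapsed:.2f}s")
```

Run one after another with blocking calls, the three waits would add up to about 0.3 seconds; because they overlap here, the total stays close to a single wait.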

You can download this and other papers I recommend from my papershelf.


Redis is written in C, but its test cases are written in TCL

While going through Redis internals, I looked at the test cases to understand the flow. I was surprised to see that they were written not in C but in TCL, which makes the entire suite highly readable and extremely simple.

Digging deeper, I found out that TCL is pretty popular as a language for testing network applications; even SQLite uses it in its test suite. Pretty interesting use case for a language created way back in 1988 :)

You can find this post on my LinkedIn and Twitter; do leave a like.


Interesting articles I read this week

I read a few engineering blogs almost every single day, and here are the three articles I would recommend you to read.



I keep sharing no-fluff stuff across my socials, so if it resonates, do give me a follow on Twitter, LinkedIn, YouTube, and GitHub.
