WAL - Distributed Design Patterns
Pratik Pandey
Senior Software Engineer at Booking.com | AWS Serverless Community Builder | pratikpandey.substack.com
There are a lot of distributed system patterns, that are leveraged widely by different distributed systems. As part of this series, I'll go over some of the distributed design patterns, that will help you in developing your own distributed systems or understanding the ones you already use.
Strong durability is a default, that we assume out of all our data stores. But have you wondered, how this high durability is achieved with all the high performance/throughput requirements, even in the face of server crashes(power failures/os failures/hardware failures). Well, the answer is WAL.
WAL - Write Ahead Log
WAL isn't truly a distributed design pattern, but widely used pattern that is prevalent in most distributed systems guaranteeing durability.
WAL has been a common theme across traditional RDBMS systems, and is used to help with atomicity & durability(A & D of ACID) guarantees. All mutations to a table are written first to the WAL(Transaction Log/Bin Log) & then applied to the actual data files of the tables asynchronously.
Sample WAL and WALEntry structure:
type WAL struct {
dir string // directory under which WAL files are stored.
file *os.File // reference to the file
metadata []byte // metadata recorded at the head of each WAL
decoder *decoder // decoder to decode records
encoder *encoder // encoder to encode records
mutex sync.Mutex // To ensure single update per writer
lastIndex uint64 // index of the last entry saved to the WAL
}
type WALEntry struct {
lsn uint64 // unique identifier for each log entry
data []byte // actual WAL entry in bytes
crc uint32 // crc for data integrity validation
type uint32 // type of wal record
}
Life Without WAL
You must be wondering, why do we really need WAL? Why not flush the changes directly to the actual data files instead of WAL?
There are 2 aspects to it -
WAL to the Rescue
WAL is an append only log, which stores each state change on the data store, as a command. Now, instead of flushing the data mutations on disk for different tables/entities, we just flush the command/operation in the WAL(1 disk operation).
A separate async process can read the operations from the WAL, and then apply the data mutations to the actual data files on the disk following the normal flow through the different caches. This really helps in improving the write throughput for the data stores.
Also, in case of failures, there could be changes that might not have been applied to the data files. However, since we have the operations present in the WAL file, we could replay the operations from WAL and apply them to bring the data store back to a consistent state. Thus, WAL helps us in ensuring data integrity and reliability, while still allowing high write throughput for our data stores.
Real Life Implementation Considerations
1. Flushing WAL Operations to Disk -
As mentioned before, writes to disk might not be flushed directly. Most File Handling libraries, offer you ways to force the OS to flush the changes directly to the disk sectors. While flushing on each write will give you strong durability guarantees, it can potentially cause performance issues in a write heavy system. This is where the tradeoff needs to be made, where you can have either a flush frequency or micro batches or both to flush the changes to disk, to help with the performance. Note that there is a risk of data loss here.
2. Corruption Detection -
We need to ensure that any operation flushed to disk is not corrupted. For this, we make sure that each WAL Record also contains a CRC value, which can be used to validate when the record is read from WAL and ensure that there is no corruption.
3. Operation Duplication -
Since WAL is an append only file, in case the client retries because of a communication failure, you can have scenarios of duplicate operations being written on the WAL. Hence whenever we are reading the WAL, to apply the changes to the actual data store, we need to ensure that either the duplicates are ignored, or the application of operations to actual data store is Idempotent.
Real Life Usage
1) All databases, including NoSQL databases like Cassandra make use of WAL to guarantee durability.
2) Kafka makes use of a similar structure as WAL(Commit Log).
3) KV Stores like Rocks DB, Level DB & Distributed caches like Apache Ignite also use WAL.
Summary
To sum it up, WAL allows us -
1) Faster performance & Throughput as we avoid data flush/disk writes for all changes.
2) Recoverability in case of restarts as operations can be applied from WAL to actual data store.
3) Ability to restore to a point in time snapshot, since we have all operations present in the WAL.
Thank you for reading! I'll be posting more such content on distributed systems & patterns, so please like, share and subscribe to this newsletter for notifications of new posts.
Please comment on the post with your feedback, will really help me improve! :)
Until next time, keep asking questions & keep learning!
Magician Architect with Midas touch
1 年Once there's operation in the WAL updating record A=[1,2,3] updating it to A'=[2,4,6] which was not flushed, then if we read in another thread that record we will get A or A'? Is there some mechanism to check if there was operation on that record in WAL and then delay reading till WAL is flushed?
Senior Project Manager at Franklin Templeton
2 年Great article Pratik. Thank you. I can't remember the last time i did true programming but a couple of questions to help me understand the concept better please.. 1. Is this log(containing the actual WAL entry) just a file with strings (could be compressed/encoded)? 2. Is this log duplicated and stored on multiple nodes? What ensures the server this log sits on does not have the same issues that this log solves for? 3. In cases where BLOBs are stored on the Database, do they go through this as well? Thanks for sharing your knowledge with us
Technology and Delivery Leader @ S&P Global | Building technologically strategic solutions
2 年This is a great start with a useful topic. While I knew about Commit logs, I didn't know the term "WAL" until I read this article. Pretty informative! Are there any libraries (Java/C# for e.g.) that help in implementing this design pattern keeping performance and throughput into consideration (as opposed to developing this from scratch)?