CAP Theorem – Demystified

CAP Theorem – Demystified

My better half decides to have a serious conversation with me. She calls me on cell.

Cell reception at my end is sporadic.

Scenario 1:

I receive her call. 

Owing to bad reception though, I am unable to communicate clearly with her.

I move around to get better reception. I probably switch to VOIP calling. I communicate through SMS.

I choose not to respond until I have clearly understood the ask.

Else I let her know that I am unable to hear her.

I respond only once the communication is clear.

I engage. I understand the ask. My responses are active and current.

That’s consistency.


Scenario 2:

I receive her call.

Owing to bad reception though, I am unable to communicate clearly with her.

I think back on prior conversations. I go back to the last ask & assume context.

I give her occasional hmm, huh, etc. I progress the conversation but without current context.

I engage. My responses though are “stale”.

That’s availability.


If we would have been in the same room physically then above challenges would have been void. And that’s the important factor – we are in different locations. 

We are “distributed”.

We are dependent on the cell service. We are dependent on the network. We need to tolerate network problems.

That’s partition tolerance (network failure).

Bad cellular network being the constant, I can either be consistent or I can simply be available for the conversation. But not both.

And that’s CAP theorem for you. It states, courtesy Wikipedia: “In theoretical computer science, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees: [1] consistency, [2] availability, [3] partition tolerance.”.

Above definition is misleading though. 

CAP theorem is not about 2 out of 3 but rather 1 out of 2 wherein 3rd is the constant.


Let’s elaborate.

Assume we have a single data store with no replication. All reads and writes happen to that same data store. But then that single data store can potentially become a single point of failure. Also, it does not support scalability.

And so, to increase the fault tolerance, that single data store is now replicated. In addition to replication, sharding is implemented ie. the data is now truly distributed.

Major challenge of such distributed data stores is affecting a write onto all data stores. In case of network failure (partition), the latest write does not make its way to all the data stores. If a client reads from such data-stores that are missing the latest writes/updates then retrieval function has 2 options:

  1. Ensure that the data is in eventual-consistent state. If unable to ensure then return an error. Return value only if sure. CONSISTENCY.
  2. Don’t care about data’s latest state. Simply return the last known value. AVAILABILITY.

Network failure in above is the CONSTANT. 

This is what CAP theorem means: Distributed systems have to be partition tolerant ie. respect and accommodate network outages – whether such tolerance leads to consistency or availability is use-case dependent.

I used a term above called “eventual consistency”. 

Though “Eventual” and “Consistent” sound like an oxymoron, it’s a by-product of distributed systems. 

It’s a fact, like real-time or near-real-time. Eventual consistency means that the distributed data stores will eventually become consistent and that too in a sequential manner ie. the sequence of the incoming events will be respected – ordering!

Example:

  1. All data-stores have the fact “I love thin crust pizza”.
  2. I update my pizza preference as “I love deep dish pizza”
  3. Network partition occurs: Data-store-1 is now updated to convey “I love deep dish pizza”
  4. Network partition occurs: Data-store-2 still believes that I make healthy eating choices “I love thin crust pizza”
  5. Maybe this is on a social media site, people with access to data-store-2 consider me healthy while the reality is otherwise
  6. And this is where eventual-consistency plays its part and ensures that data-store-2 is also made aware at some point that my eating habits have gone for a toss

There are structures like conflict-free replicated data types (CRDTs) that facilitate ordering in the eventually consistent state requirement. Above example is an over-simplification. Assume that different data-stores somehow got different values for the same object but were unable to interact with each other at those points. Once eventual consistent state is gained, the ordering becomes important to ensure that the final value of the object is indeed the final intended value by the client.

Back to CAP Theorem. With data intensive and distributed applications on the rise, fueled by growing cloud provisioning, data persistence across multiple storage systems has become a norm.

Because the data stores are distributed, they are victims of network failure and hence need to plan for their partition tolerance strategy: Consistency OR Availability.

Preference is scenario/requirement based. For transactions of financial nature, consistency is key. For social media data sets, availability is key. Again, no clear-cut approach. It’s all dependent on the use-case.

For instance, let’s assume that banks favored availability over consistency.

  1. I transfer money from my account and log off. Amount debited from my account is persisted to data-store-1
  2. I log in again and probably hit a different data store this time, say data-store-2
  3. Owing to network issues, the debit that got affected to data-store-1 never reached data-store-2
  4. But hey, the bank favors availability and so who are we to argue!?!
  5. Since availability is the preference, I get to see “stale” account balance, in this case, I see my balance from before the debit on step 1
  6. I am happy – I run to the ATM to withdraw the tech-induced bonus
  7. Now whether the ATM connects to data-store-1 or to data-store-2 or data-store-n is up to you – feel free to move this story forward per your preference

Above scenario is just to highlight the difference between consistency and availability. Both are equally important, one more than the other for some…

From an implementation perspective, HBase, MongoDB, Redis support consistency while Cassandra is a highly available database.

Before concluding, let’s review whether CAP theorem highlights a limitation or not.

In a way, yes, it does. But this limitation is not about a choice between consistency or availability. It’s simply a choice between speed/performance or the lack of it. It’s all about improving the response times. Coz eventual-consistency as-it-is guarantees ordered consistency along with availability.

Ordered eventual consistency is not easy to implement – it’s a combination of multiple techniques:

  • distributed databases
  • replication models
  • distributed locking
  • consensus algorithms
  • uniqueness constraints

I am targeting to cover some of these techniques in my upcoming blogs.

Hope you find this useful. Open to review/feedback. 

Thank you!

要查看或添加评论,请登录

Taher Borsadwala的更多文章

  • Transfer Crypto from Exchange-based Wallet to Self-Owned Wallet

    Transfer Crypto from Exchange-based Wallet to Self-Owned Wallet

    Context: In recent light of Coinbase announcing that customers may lose their crypto in case the company goes bankrupt…

  • Max 21 million Bitcoins by 2140 Explained

    Max 21 million Bitcoins by 2140 Explained

    21 million bitcoins only – that’s the max limit. And that too by 2140.

  • Are Blockchains & Crypto Assets safe?

    Are Blockchains & Crypto Assets safe?

    Obvious & valid question coz we hear about cryptos getting stolen almost every day and that too in huge chunks! Simple…

  • Multi-Party Computation, simplified.

    Multi-Party Computation, simplified.

    Why is MPC (multi-party computation) needed? In CryptoLand, the KEY is KEY – meaning you lose your private key or your…

    5 条评论
  • Magic behind Bitcoin Transactions Explained

    Magic behind Bitcoin Transactions Explained

    Bitcoin ecosystem is all about transactions – creation, propagation, validation, confirmation & more. You might have…

  • Serialization frameworks, simplified.

    Serialization frameworks, simplified.

    Serialization Frameworks?!? Serialization frameworks are translators! They enable use of objects between and across…

  • Two-phase commit, simplified.

    Two-phase commit, simplified.

    Distributed computing / Distributed Data Systems face a bunch of challenges, consensus being a crucial one. Getting…

  • Cloud for Scalability!

    Cloud for Scalability!

    Designing and building systems that work sufficiently well in the present is ironically, insufficient. Reliability of…

  • NFTs for Dummies

    NFTs for Dummies

    …by a dummy. There now, this confession should preserve your sense of self-respect enough to read this blog.

  • One for all n all for one! Idempotency!

    One for all n all for one! Idempotency!

    Yes, I have tweaked the motto that our heroes from The Three Musketeers live by. Let me explain why through some…

社区洞察

其他会员也浏览了