CAP Theorem, aka the Fundamental Knowledge Your Team Needs to Learn!
Time to uncover the beast!

Going to open a bit strong: I am still baffled by how many times I have seen programmers struggle with simple issues because they never learned this theorem.

And if you have never heard of it, then this article will change the way you think about distributed computing. Please wait before You close the article thinking to yourself, "I never do distributed computing, so it's not for me", because there is no way that your work is not affected by it, and I can prove it!

Let's do a bit of time travel... Around 20 years ago, when I wrote my first webpage, PHP was the de facto language of choice for web development: easy to learn, with a rather lively community. And I am not alone. Even though PHP is slowly losing its popularity to modern languages, it is still the 8th most popular language today, and the undisputed king of web applications.

PHP popularity over the years

It is still one of the most productivity-oriented languages! Don't hate me pls :3

So, how is this related, You ask? Because programmers are human beings (allegedly), they are hardwired to defend their first choice retroactively. So, if they learn PHP / Python / Java etc. as their first language, they will spend a substantial part of their life arguing that their favorite language is the one and only best choice.

But those languages share a common problem: they make parallelization rather cumbersome. And we have arrived at our first stop, a clue to the problem. Many, if not most, programmers have spent their lives writing code without ever facing asynchronous problems.

So, what? How is this related to CAP and learning? ~ Thought by the Reader

We are getting there, I just wanted to state a decade old observation which will make sense in the following segments. Trust me, I am an engineer! Duh!

Let's get to it! The CAP theorem states that it is impossible for a single distributed system to provide all three of the following guarantees at the same time:

  • Consistency
  • Availability
  • Partition tolerance

You are reading this article on a system which has multiple computing cores and is already handling a distributed computing problem in its CPU!

It's everywhere! Just carefully hidden away from You!

We could sit here all day while I list every system that hides these limitations from you. But the hiding is the problem, and the struggle comes from not seeing them.

Now we are back on track, and it's time to show You how the problem can be beaten; once it's understood, it will save you or your team from the struggle.

Why should I care if it's hidden?

Because when you have to interact with those systems as an engineer, you will run into the limitations, and they can hinder your work or cause issues which you could have avoided.

I had a hard time picking the most blatant system where CAP causes issues, not because it's hard to find one, but because there are so many... So, we will take a look at the most popular database, MySQL, and the common web dev stack; there is a good chance that You either have worked or are currently working with it.

MySQL logo

Trivia: Did You know that MySQL has a blackhole engine? It's super useful!

We will start with a single node MySQL database. Let me use this real story as an example!

I was listening to a senior team leader at the daily standup as he explained to us how they couldn't solve the problem of saving an order and its entries without errors. When they insert into the order table, the order is saved; after that the entries are inserted, and those sometimes throw an error, leaving the order partially saved ~ fuh ~

As a smart programmer You already know that he should have used a transaction. Worry not, this was my first reaction too. But the response surprised me the most: he explained that it's not possible, because then they don't get a continuous, sequentially incrementing identifier!
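For reference, the transaction-based fix could look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the table names, columns, and the save_order helper are all made up for illustration and are not the team's real schema.

```python
import sqlite3

def save_order(conn, customer, items):
    """Insert an order and its entries atomically; items is a list of (sku, qty)."""
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute("INSERT INTO orders (customer) VALUES (?)", (customer,))
        order_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO order_entries (order_id, sku, qty) VALUES (?, ?, ?)",
            [(order_id, sku, qty) for sku, qty in items],
        )
    return order_id

# Hypothetical schema, in-memory so the sketch is self-contained
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE order_entries ("
    "order_id INTEGER NOT NULL, sku TEXT NOT NULL, qty INTEGER NOT NULL CHECK (qty > 0))"
)

save_order(conn, "alice", [("book", 1), ("pen", 2)])

try:
    save_order(conn, "bob", [("book", 1), ("pen", 0)])  # qty 0 violates the CHECK
except sqlite3.IntegrityError:
    pass  # the whole order was rolled back, not just the failing entry

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
entry_count = conn.execute("SELECT COUNT(*) FROM order_entries").fetchone()[0]
print(order_count, entry_count)
```

Only the first order and its two entries survive; the failed order leaves no partial rows behind.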

We know that there is no need for the identifier to be continuous, but he stumbled upon something. You may know that MySQL does not guarantee continuous sequential identifiers when you use auto increment on a table. But why? Isn't it just n+1?

No, it's not that simple. Our question is not what is n, but when is n?
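To make the "when is n?" question a bit more tangible, here is a toy model in Python. It is not how MySQL is implemented; the Table class and its commit flag are invented for illustration. It only mimics one documented behavior: an auto-increment value is reserved at insert time and is not handed back when the transaction rolls back.

```python
import itertools

class Table:
    """Toy table whose id counter lives outside any transaction."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.rows = {}

    def insert(self, payload, commit=True):
        row_id = next(self._next_id)  # the id is reserved immediately
        if commit:
            self.rows[row_id] = payload
        # on a (simulated) rollback the row is discarded, but the id stays used
        return row_id

orders = Table()
orders.insert("order A")
orders.insert("order B", commit=False)  # simulated rollback
orders.insert("order C")

committed_ids = sorted(orders.rows)
print(committed_ids)  # [1, 3]
```

The gap at id 2 is the price of not making every insert wait for all earlier transactions to finish.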

When you initiate a transaction you temporarily sacrifice the consistency for partition tolerance. The default transaction isolation level allows the database to serve and process requests asynchronously. Deep in the database's source code, you will find workers that process a queue, and they can operate on the same table or even on the same row independently. It would be disastrous if a database could execute only a single query at a time.

By design, MySQL already applies the CAP constraints, so if you understand which constraints apply in which phase, you can avoid locks, bottlenecks, data inconsistency, and other unintended behaviors.

But MySQL is ACID compliant. So, I don't have to care about this!

One would think. But let's take a look at some scenarios when you have to.

Async orders

Trivia: MySQL actually uses a WAL (write-ahead log), which hides its event sourcing nature from you, but that's a topic for another time.

Now, our situation is quite common: a high-traffic eCommerce platform receives two orders within the same millisecond. The web server dispatches both requests in parallel, and the language interpreter starts executing the sequential code for each. Note that our web server is actually sacrificing consistency so that requests do not have to wait in a queue to be processed. CAP is realized at the thread / process level.

Order flow

I know, terrible flow directions, but had to fit this screen ratio, sorry!!!!

We can reasonably assume that this is a bare-bones implementation of a checkout flow. If you have sharp eyes, you have already noticed that I colored "Insert order" yellow, because this is where your application can misbehave. We checked the stock for both order #113 and #114, both of them got the pass, and we moved on. But there is a catch: if both of them ordered the same last item, then you are going to have a bad time debugging how a simple "if" check can go wrong! You did nothing wrong; CAP simply got you, since everything ran in parallel even though your code is sequential.
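The same trap can be reproduced in a few lines. The sketch below is a simplified Python model of the flow, not the real checkout code: two threads play the two web requests, a dictionary plays the stock table, and a threading.Barrier (added purely for the demo) forces the unlucky timing on every run instead of once in a blue moon. The checkout function and the "last-item" key are made-up names.

```python
import threading

stock = {"last-item": 1}
accepted = []
both_checked = threading.Barrier(2)  # demo-only: forces the bad interleaving

def checkout(order_id, sku):
    in_stock = stock[sku] > 0   # step 1: "check stock"
    both_checked.wait()         # the other request runs its own check here
    if in_stock:                # step 2: act on an answer that is now stale
        stock[sku] -= 1
        accepted.append(order_id)

threads = [
    threading.Thread(target=checkout, args=(order_id, "last-item"))
    for order_id in (113, 114)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both orders were accepted although only one item was in stock
print(sorted(accepted))
```

Each function reads perfectly fine from top to bottom; the bug only exists between the two of them.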

And now it's time to return to our initial time travel: at the beginning of the article we took a small detour about first languages. I love and enjoy the languages mentioned above, but I have a reasonable assumption: by nature, those programming languages are sequential, and they describe perfectly, top to bottom and left to right, what they do and in which order. Meanwhile, languages with an async nature train the programmer not to trust even memory: if something can change, it will change.

But is there a way to solve this?

There are a myriad of ways, but first let's quickly take a look at what happened in our case:

Our customers are part of our distributed network; they just run the UI part of it. In their case the system is not consistent, because they always see stale data in their browser, but in exchange more than one visitor can use your webpage. (In the past, terminals solved this issue by applying locks on users.)

Our Nginx server also sacrifices consistency so that it can process multiple requests. No one wants to run a server where only one customer can connect at once :D

Our interpreted PHP code sacrifices partition tolerance, but in exchange you get a much easier way of programming. (For example, NodeJS is not consistent but partition tolerant.)

Our MySQL database sacrifices the consistency when executing a read operation, but it will exchange it for availability when you issue a write operation. (I could get into every possible situation, but I simplified here)

You can solve the above problem with eventual consistency, but to achieve that you have to design your whole architecture to support it. We are going to take a look at it in the near future, but we still need a lot of preparation before that.

---

I wanted to show you this rather common issue because it does not involve race conditions and time disagreements, and other even crazier problems. I didn't intend to solve the above problem in this article, but I hope it showed you, how you operate with the CAP's constraints and now that you know about them, you will see them everywhere and you will plan for them :)

You can read more about this topic at:

https://en.wikipedia.org/wiki/CAP_theorem

Have a nice day! ^.^




