The multi-threaded program that NEVER has a race condition
Suppose we were asked to design an email registration system. How would we go about it?
- The basic rule to follow is that no 2 users can have the same email
- The front end, which is a simple HTML page, is already implemented. We are responsible for the middle tier and back end (which is a relational database)
- The user enters an email and asks the page to register it. The front end does validation to see if the email is well formed and then sends it to the middle tier. A very simple algorithm for the middle tier would be as follows (sketched in code after the list):
  1. Query the DB to see if the email is already registered
  2. If not taken, store the email in the DB and return success to the user
  3. Else return an error
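Here is a minimal sketch of that naive check-then-insert flow, using Python with sqlite3 standing in for both the middle tier and the database (the table name, schema, and `register` function are illustrative, not part of any real system):

```python
import sqlite3

# Illustrative schema: no uniqueness is enforced by the database yet,
# so the check-then-insert logic below is the only guard.
conn = sqlite3.connect("users.db")
conn.execute("CREATE TABLE IF NOT EXISTS users (email TEXT)")

def register(email: str) -> str:
    # Step 1: query the DB to see if the email is already registered
    row = conn.execute(
        "SELECT 1 FROM users WHERE email = ?", (email,)
    ).fetchone()
    if row is None:
        # Step 2: not taken, so store the email and report success
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.commit()
        return "success"
    # Step 3: already taken
    return "error: email already registered"
```

The gap between step 1 and step 2 is exactly where two concurrent requests can interleave, which is the problem the next paragraph walks through.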
Suppose 2 users ask for the same email at the same time. In step 1, both requests query the DB and find the email available, so both proceed to step 2. If we allow the writes to happen without an ACID transaction, this results in a race condition where both users get the same email (which would be a bug). So we rely on the ACID properties of the database to ensure we don't store the same email for 2 users. In layman's terms, we cannot write multiple values to the database simultaneously; the writes have to happen in sequence. If there are 100 users interacting with the system and they all ask it to register an email, that means 100 requests to the back-end database that get executed in sequence. In other words, no matter how many threads we run on the front end, the back end is always single threaded.
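One common way to lean on the database for this guarantee (again a sketch, assuming SQLite; other relational databases behave similarly) is a `UNIQUE` constraint. Both requests may still race, but the database serializes the writes and atomically rejects the second one:

```python
import sqlite3

conn = sqlite3.connect("users.db")
# The UNIQUE constraint makes the database, not our application code,
# the arbiter of duplicates.
conn.execute("CREATE TABLE IF NOT EXISTS users (email TEXT UNIQUE)")

def register(email: str) -> str:
    try:
        # Write first: the database executes concurrent inserts in
        # sequence and rejects whichever duplicate arrives second.
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.commit()
        return "success"
    except sqlite3.IntegrityError:
        return "error: email already registered"
```

Note how the check-then-insert window disappears: the uniqueness check and the write are now a single atomic operation inside the database.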
Now assume our original system was designed for a few hundred users and the user base kept growing. Eventually we would reach a point where we have to upgrade our software and storage, which would require some downtime while data is migrated to the new system.
If we increase the number of unique data points in our system (say the email has to be unique and the name has to be unique), the complexity of our program increases by at least a factor of two, and its performance degrades as it manages increasingly complex data and the relationships within it.
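Continuing the earlier sketch, adding a second unique field means a second constraint to maintain and a second, independent way every write can fail, which the error handling now has to distinguish:

```python
import sqlite3

conn = sqlite3.connect("users.db")
# Two unique columns now: each insert can be rejected for two
# independent reasons, and the caller may need to know which one.
conn.execute(
    "CREATE TABLE IF NOT EXISTS users (email TEXT UNIQUE, name TEXT UNIQUE)"
)

def register(email: str, name: str) -> str:
    try:
        conn.execute(
            "INSERT INTO users (email, name) VALUES (?, ?)", (email, name)
        )
        conn.commit()
        return "success"
    except sqlite3.IntegrityError as e:
        # SQLite names the failed constraint in the error message,
        # e.g. "UNIQUE constraint failed: users.email"
        return f"error: {e}"
```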
Let's assume nature uses the same algorithm to assign a fingerprint (which is unique) to each human. The system started with a single human and has now scaled to 7+ billion concurrent users with no maintenance downtime and without ever assigning a duplicate to a human. Moreover, users never experience a wait during processing regardless of the load on the system, meaning the front end and the back end are both multi-threaded. And nature manages far more than a few unique data points within humans without any degradation in the performance of its program.
It truly is amazing how this natural system has been functioning for thousands of years (if not more), scaling to 7+ billion concurrent users, all without a single bug or race condition!