Distributing SQL Databases Globally
Kaivalya Apte
The GeekNarrator Podcast | Staff Engineer | Follow me for #distributedsystems #databases #interviewing #softwareengineering
There are several reasons why you would want to distribute your database.
Distributing a stateless application is easy, but distributing your state (the data layer) is challenging. There are several ways to approach distributing your database. Each has its own pros and cons, and understanding them is critical. In this article, let's look at some ways you can distribute your database across several geographical locations.
Let's take a pizza delivery service as an example.
Approach 1: Scaling and Distributing with read replicas
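As a rough illustration of what read-replica routing usually means, here is a minimal Python sketch. It assumes a single primary that accepts all writes and replicas that serve reads close to the customer. The class, the endpoint names and the choice of India as the primary region are hypothetical and only serve the pizza example.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicatedDatabase:
    """Toy model of one primary plus read replicas in other regions."""

    primary_region: str
    replica_regions: list[str] = field(default_factory=list)

    def endpoint_for(self, sql: str, client_region: str) -> str:
        """Return the (hypothetical) endpoint name that should serve this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and client_region in self.replica_regions:
            # Reads can be served by a replica close to the client.
            return f"replica-{client_region}"
        # Writes, and reads from regions without a replica, go to the primary.
        return f"primary-{self.primary_region}"


# Assumption for illustration: the primary lives in India, replicas elsewhere.
db = ReplicatedDatabase(primary_region="india", replica_regions=["germany", "uk"])
print(db.endpoint_for("SELECT * FROM orders WHERE id = 1", "germany"))
print(db.endpoint_for("INSERT INTO orders (id) VALUES (1)", "germany"))
```

In this sketch a customer in Germany reads from a nearby replica, but every order they place still travels to the primary.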
However, there are some limitations and challenges:
Advantages of this approach:
Approach 2: Deploying a separate database per region
This approach is pretty straightforward: you deploy a new database (shard) that serves a single region.
As you can see in the above diagram, each region (India, Germany and the UK) has its own database instance. Customers from each region are served by the database local to their country.
So when a customer creates an order, you route their request to a region-specific app server, which talks to a region-specific database, and everything is cleanly separated.
This is a simple model to understand, and it lets you scale each database (vertically, or with read replicas) separately as needed. For example, if your pizza is more popular in India, you can vertically scale the Indian database while the other databases need no changes.
It is a simple approach, but it comes with some limitations and problems:
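The routing described above can be sketched in a few lines of Python. This is only an illustration: the connection strings, host names and the `database_for` helper are hypothetical, and a real deployment would more likely do this routing at the load balancer or in app server configuration.

```python
# Hypothetical connection strings, one database per region.
REGION_DATABASES = {
    "india": "postgresql://orders-db.in.example.internal/pizza",
    "germany": "postgresql://orders-db.de.example.internal/pizza",
    "uk": "postgresql://orders-db.uk.example.internal/pizza",
}


def database_for(customer_region: str) -> str:
    """Pick the database that owns all data for the customer's region."""
    key = customer_region.strip().lower()
    try:
        return REGION_DATABASES[key]
    except KeyError:
        # No fallback on purpose: each region's data lives in exactly one database.
        raise ValueError(f"no database deployed for region {customer_region!r}") from None


print(database_for("India"))
```

Note that the lookup deliberately has no cross-region fallback, which mirrors the clean separation of this approach.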
Approach 3: Using a Natively Distributed Database (Cloud)
There are database solutions that support geo-distribution natively, so it makes a lot of sense to leverage their capabilities for a geo-distributed application. You just need to model your data with its distribution in mind. As you can see in the above diagram, you have your pizza orders table.
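To make "modeling your data with its distribution in mind" a little more concrete, here is a conceptual Python sketch. It assumes the orders table carries a `region` column that decides where each row is stored. The class and its methods are hypothetical and are not the API of any real database. A natively distributed database handles this placement internally.

```python
class GeoPartitionedOrders:
    """Toy in-memory model: one logical table, rows stored in per-region partitions."""

    def __init__(self, regions):
        self._partitions = {region: [] for region in regions}

    def insert(self, order: dict) -> None:
        """Place the row in the partition named by its 'region' column."""
        region = order["region"]
        if region not in self._partitions:
            raise ValueError(f"unknown region {region!r}")
        self._partitions[region].append(order)

    def select(self, region=None):
        """Return (rows, partitions_scanned) for an optional region filter."""
        if region is None:
            targets = list(self._partitions)  # no filter: every region is consulted
        else:
            targets = [region]  # filter on region: a single local partition
        rows = [row for r in targets for row in self._partitions.get(r, [])]
        return rows, targets


orders = GeoPartitionedOrders(["india", "germany", "uk"])
orders.insert({"id": 1, "region": "india", "pizza": "margherita"})
orders.insert({"id": 2, "region": "uk", "pizza": "pepperoni"})

local_rows, scanned = orders.select(region="india")
print(len(local_rows), scanned)
all_rows, scanned_all = orders.select()
print(len(all_rows), scanned_all)
```

The application sees a single table, yet a query that filters on the region only touches that region's partition, while a query without the filter has to consult all of them.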
How is it different?
There are many distributed SQL databases available on the market; some of the best-known are Google Spanner, YugabyteDB and CockroachDB.
Here is a sneak peek of what your distributed application would look like:
To know more about designing geo-distributed applications, you can head over to the podcast I did with Denis Magda from Yugabyte, where we discussed the various strategies in detail.
I hope you enjoyed reading the article. Stay tuned! Subscribe to the newsletter and The GeekNarrator YouTube channel.
Cheers,
The GeekNarrator
Ph.D | PostgreSQL Contributor | Software Developer | Tech Blogger
1 year ago: Good for reading. Keeping silent on distributed processing and load balancing, the author just tells us that this profit can’t be achieved, doesn’t he?
Founder & Software Engineer | AI & Web App Development
2 years ago: Terrific post! For some reason, I had a silly image come to mind about distributed pizza gone wrong: pepperoni being sent to one region, cheese to another, sauce to another, etc.
Software Engineering and Developer Relations, MLP Launchpad
2 years ago: Kaivalya Apte, you nailed it! I couldn’t put it better. Exceptional summary of our conversation. Btw, if anyone is interested in learning more about high availability of geo-apps, then welcome to a SpringOne stream next week: https://tanzu-dev-portal.netlify.app/developer/tv/golden-path/5/ And I remember that I still owe you an episode about indexes! On my to-do list.
Data enthusiast
2 years ago: Excellent, Kaivalya Apte. The post reminds me of the traditional techniques such as mirroring, snapshotting and log shipping that were used for the same purposes: high availability, serving faster reads for reporting applications, etc. All of these involved a lot of manual work, which modern cloud databases have done away with. In Approach #2, where we have a separate database for each region, the other downside is making changes. Keeping the structure and schema of all the databases in sync is a challenge. Keeping them different to accommodate localization is yet another challenge.