Understanding Cluster Replication Scalability

CLUSTER REPLICATION AND LOGARITHMIC SCALABILITY

If you have been using cluster replication with an open source operational database, you might have noticed that it does not scale out well. If you are interested in knowing why, this is the post to read.

Cluster replication was introduced in the mid-1990s as a way to scale out databases. The basic idea, called full replication (most commonly known as cluster replication), is to have a cluster of server nodes, each of them running a database engine with a full copy of the database. But how do we keep all replicas consistent and up to date? The strategy typically used to update the replicas is ROWAA (Read One Write All Available), where each read operation is executed on any one replica while a write operation is executed on all available replicas.

So, what is the scalability of cluster replication? On one extreme of the scalability spectrum, if the workload contains only writes, we have null scalability: all replicas do the same work and the cluster throughput equals that of a single node, i.e., it does not scale. On the other extreme, if we only have reads, and assuming a uniform load across replicas, we have linear scalability, i.e., a cluster with n replicas has a global throughput equal to n times that of a single node. In between, we have logarithmic scalability, that is, the cluster throughput grows only logarithmically as the number of nodes increases. The reason is that the bigger the cluster, the higher the wasted capacity per node.

Figure 1 depicts graphically what happens. The lower part shows how many servers we have for a particular cluster size. The orange line indicates how much of the servers' capacity is wasted, i.e., the area between the x axis and the orange line is the wasted capacity.


Figure 1: Logarithmic Scalability
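
To make the effect concrete, here is a minimal Python sketch (not from the original post; the function name and the 100-operations-per-node budget are illustrative assumptions) that counts useful versus wasted capacity under ROWAA for a 50% write workload, anticipating the per-node accounting worked out in the sections below:

```python
def rowaa_capacity(n_nodes, write_ratio=0.5, ops_per_node=100):
    """Count useful vs. wasted capacity under ROWAA full replication.

    Each node has a fixed budget of operations. A read runs on a
    single replica, but every write must also run on the other
    n_nodes - 1 replicas, so those remote writes consume capacity
    without adding cluster throughput.
    """
    # For every useful operation a node executes, it also absorbs
    # (n_nodes - 1) * write_ratio remote writes from its peers.
    cost_per_useful_op = 1 + (n_nodes - 1) * write_ratio
    throughput = n_nodes * ops_per_node / cost_per_useful_op
    wasted = n_nodes * ops_per_node - throughput
    return throughput, wasted

for n in (1, 2, 4, 8, 16):
    t, w = rowaa_capacity(n)
    print(f"{n:2d} nodes: throughput {t:6.1f}, wasted {w:6.1f}")
```

Each doubling of the cluster buys less and less additional throughput, while the wasted capacity keeps growing, which is exactly the shape of the curve in Figure 1.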


SCALABILITY FACTOR

But how can we actually quantify scalability? We devote the rest of the post to it. Let's develop our analytical model: first intuitively, and then formalized mathematically. A database with cluster replication can process a certain number of read and write operations, that is, it can deliver a certain maximum throughput. We can express that throughput relative to the throughput of a single node; this is what is called the scale-out factor [Jiménez-Peris et al. 2003]. To get the scale-out factor, f, we simply divide the useful work, which is the actual throughput, by the total amount of work, and multiply by the number of nodes (see Figure 2). The optimal scale-out factor is the size of the cluster. That is, for a cluster of n nodes, the optimal scale-out is n.


Figure 2: Scale Out Factor
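
In code, the definition is just a ratio scaled by the cluster size. Below is a hypothetical helper (the name scale_out_factor is mine, not from the post) that the worked examples in the next sections reuse:

```python
def scale_out_factor(useful_ops, total_ops, n_nodes):
    """Scale-out factor f: cluster throughput in units of one node.

    useful_ops / total_ops is the fraction of each node's capacity
    spent on real work; multiplied by the node count, it gives the
    cluster throughput relative to a single node. The optimum is
    f == n_nodes (no capacity wasted on remote writes).
    """
    return n_nodes * useful_ops / total_ops
```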


EFFICIENCY OF 1-NODE CLUSTER

Let’s consider a workload with 50% reads and 50% writes. For simplicity, assume reads and writes have the same cost. The single node will devote half of its capacity to executing writes and the other half to executing reads. If we execute one read and one write, the throughput is 2 operations (the read and the write) and the work done is also 2 operations, so f = 1 × (2/2) = 1. This is easy (see Figure 3).


Figure 3: Single Node Cluster. Split of capacity between reads and writes for a workload of 50% writes.
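
Plugging the single-node numbers into the scale_out_factor helper sketched above:

```python
# One node, one read + one write: every operation is useful work.
f = scale_out_factor(useful_ops=2, total_ops=2, n_nodes=1)
print(f)  # 1.0 -- a 1-node "cluster" runs at exactly one node's throughput
```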


EFFICIENCY OF 2-NODE CLUSTER

Let’s now look at a cluster of two nodes. Each node wants to execute one read and one write. However, each write executed at the other node must also be executed locally; we call it a remote write. Thus, each node does its local read, its local write, plus a remote write, meaning that 2/3 of each node's capacity is employed for useful work. Note that this means we are wasting 1/3 of the capacity of each node on remote writes. This is the price of full replication: executing writes everywhere. So each of the two nodes does three operations, one read and one write plus the remote write, thus: f = 2 × (2/(1+1+1)) = 4/3 ≈ 1.33. In other words, the two nodes deliver the same throughput as one node plus one third of a node.


Figure 4: Two Node Cluster. Split of capacity between reads and writes for a workload of 50% writes.
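
The same helper confirms the two-node arithmetic:

```python
# Two nodes: each executes 1 read + 1 local write + 1 remote write,
# so only 2 of its 3 operations are useful work.
f = scale_out_factor(useful_ops=2, total_ops=3, n_nodes=2)
print(f)  # 1.333... -- two nodes deliver about 1⅓ nodes of throughput
```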


EFFICIENCY OF 3-NODE CLUSTER

Let’s take a look at a three-node cluster; from there we can easily generalize the formula for an arbitrary cluster size. If we have 3 replicas, each replica processes 1 read and 1 write, but also has to execute two remote writes corresponding to the writes from the other two replicas. Therefore, each replica executes four operations (the read, the write, and two remote writes), but only two of them are useful work: f = 3 × (2/(1+1+2)) = 6/4 = 1.5. With 3 replicas we attain a throughput 1.5 times that of a single node; that is, half of the 3-node cluster's capacity is wasted.


Figure 5: Three Node Cluster. Split of capacity between reads and writes for a workload of 50% writes.
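
The helper also reproduces the three-node result and, extrapolating the pattern of the three examples, suggests the shape of the formula for an arbitrary cluster size (the closed form below is my extrapolation for the 50% write case, not quoted from the post; the full analytical model follows in the continuation):

```python
# Three nodes: 1 read + 1 local write + 2 remote writes per node.
print(scale_out_factor(useful_ops=2, total_ops=4, n_nodes=3))  # 1.5

# Extrapolated pattern for a 50% write workload: an n-node cluster
# executes n + 1 operations per node for every 2 useful ones,
# so f = n * 2 / (n + 1), which flattens out as n grows.
def f_half_writes(n_nodes):
    return scale_out_factor(useful_ops=2, total_ops=n_nodes + 1,
                            n_nodes=n_nodes)

for n in (1, 2, 3, 10, 100):
    print(n, round(f_half_writes(n), 3))
```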


SCALABILITY ANALYTICAL MODEL

Continue reading on the LeanXcale blog.








