Partitioning: Process of dividing data into independent segments.
- Without partitioning, every request has to traverse the whole data set, which is resource- and time-consuming.
- Each partition carries its own load independently, so a failure affects only the partitions involved; in a non-partitioned system, a failure affects the entire load.
- Partitioning is usually done in the form of indexes that are based on the most common access patterns.
- Provides increased fault tolerance and reliability.
- Used by NoSQL databases like Cassandra, DynamoDB, etc.
These databases do not acknowledge a write operation immediately because a single write generates multiple requests: one to the main table and one to each criteria-based partition.
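The write fan-out described above can be sketched with a toy in-memory store. This is only an illustration under stated assumptions, not how Cassandra or DynamoDB are implemented: the class `PartitionedStore`, its methods, and the fields `country` and `status` are all hypothetical names chosen for the example.

```python
from collections import defaultdict


class PartitionedStore:
    """Toy store (hypothetical): one main table plus one partition per field."""

    def __init__(self, partition_fields):
        self.main = {}  # record_id -> record
        # field name -> field value -> set of record ids
        self.partitions = {f: defaultdict(set) for f in partition_fields}

    def write(self, record_id, record):
        """Update the main table and every partition before acknowledging.

        Assumes every record contains all partition fields.
        Returns the number of internal writes behind this one logical write.
        """
        old = self.main.get(record_id)
        if old is not None:
            # Remove stale entries left by the previous version of the record.
            for field, index in self.partitions.items():
                index[old[field]].discard(record_id)
        self.main[record_id] = record
        writes = 1  # the main-table write
        for field, index in self.partitions.items():
            index[record[field]].add(record_id)
            writes += 1  # one extra write per partition
        return writes

    def find(self, field, value):
        """Look up one partition instead of traversing the main table."""
        ids = self.partitions[field].get(value, set())
        return sorted(ids)


store = PartitionedStore(["country", "status"])
internal_writes = store.write("u1", {"country": "IN", "status": "active"})
store.write("u2", {"country": "US", "status": "active"})
active_users = store.find("status", "active")
```

With two partition fields, each logical write turns into three internal writes, which is why the acknowledgment cannot be sent right after the main-table write. Reads pay nothing for this: `find` touches only the partition that matches the query.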
Replication: creating copies of data on multiple machines.
1. Primary Replication Protocols:
- All write requests are handled by the primary replica, and read requests are handled by backup replicas. The primary makes sure that each write is also acknowledged by the backup replicas so that no inconsistencies arise.
- This is useful in cases such as banking, where every operation is critical and a single mistake cannot be afforded.
2. Consensus Replication Protocols:
- When a request comes in, more than half of the nodes must acknowledge it before it is considered successful, hence the name consensus protocol. Examples: Raft and Paxos.
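The difference between the two acknowledgment rules can be sketched as follows. This is a minimal illustration and not an implementation of Raft or Paxos: it models only who must acknowledge a write, with no leader election, logs, retries, or rollback. The names `Replica`, `primary_backup_write`, and `majority_write` are hypothetical.

```python
class Replica:
    """Toy replica (hypothetical) that can be marked down to simulate failure."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def apply(self, key, value):
        """Store the value and acknowledge with True, or return False if down."""
        if not self.up:
            return False
        self.data[key] = value
        return True


def primary_backup_write(primary, backups, key, value):
    """Primary rule: the primary and every backup must acknowledge."""
    if not primary.apply(key, value):
        return False
    acks = [backup.apply(key, value) for backup in backups]
    return all(acks)


def majority_write(nodes, key, value):
    """Consensus rule: more than half of the nodes must acknowledge."""
    acks = sum(node.apply(key, value) for node in nodes)
    return acks > len(nodes) // 2


# One backup is down, so the primary rule rejects the write.
strict_ok = primary_backup_write(
    Replica("p"), [Replica("b1"), Replica("b2", up=False)], "x", 1
)

# Two of five nodes are down, but three acknowledgments form a majority.
nodes = [Replica(f"n{i}", up=(i < 3)) for i in range(5)]
quorum_ok = majority_write(nodes, "x", 1)
```

The primary rule trades availability for strictness, since a single unreachable backup blocks the write. The majority rule keeps accepting writes as long as fewer than half of the nodes are down.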