Introduction: The Shifting Sands of Application Architecture
For decades, Relational Database Systems (RDS) have been the bedrock of application development. Renowned for their strong consistency and adherence to the ACID properties (Atomicity, Consistency, Isolation, Durability), RDS offer a robust and predictable foundation for managing data. The development paradigm has often been RDS-centric, with applications tightly coupled to a single database instance or a closely synchronized cluster, leveraging transactional operations (e.g., Spring's @Transactional) to ensure data integrity.
However, the demands of modern applications – characterized by massive scale, global reach, and the need for high availability – are pushing the boundaries of traditional RDS. The limitations of single-system architectures become apparent when facing:
- Scalability Bottlenecks: Scaling RDS vertically can be expensive and eventually hits limits. Horizontal scaling with traditional synchronous replication introduces complexity and performance overhead.
- Single Points of Failure: A single RDS instance can become a single point of failure, impacting the entire application's availability. In a monolithic deployment, components, services, and modules share the same compute resources, memory, and database connections, so a failure (e.g., a memory leak) in one module (e.g., a reporting service) can crash the entire application.
- Challenges with High Concurrency and Distribution: Managing concurrent access and distributing data across geographically dispersed users becomes increasingly complex and costly with a traditional RDS (e.g., the trade-off between multi-master locking and latency).
- Monolithic Systems Introduce Rigid Dependencies: Tight coupling between components forces developers to rebuild and redeploy the entire application for minor changes (e.g., updating a payment module requires recompiling unrelated features), slowing iteration cycles. Language and framework lock-in limits flexibility: teams cannot adopt specialized tools (e.g., Python for machine learning or Go for high-concurrency tasks) without disrupting the unified codebase. Cross-team coordination overhead arises as developers must navigate shared code, conflicting logic, and synchronized releases, stifling agility and innovation.
To address these challenges, application architectures are evolving towards distributed systems, embracing concepts like eventual consistency. While distributed systems offer scalability and resilience, they introduce significant complexities in development and operations. This is where managed services step in, playing a crucial role in abstracting away the underlying IT complexities, allowing developers to focus on building business logic and functionality.
The Evolutionary Path: From RDS-Centric to Event-Driven Architectures
The journey from RDS-centric to distributed systems is often an incremental evolution, driven by the need to overcome limitations and enhance application capabilities. Here's a step-by-step look at this evolution:
Step 1: RDS-Centric Development and Client-Side Retries
- Characteristics: Applications are tightly integrated with an RDS. Multi-step operations are often encapsulated within database transactions. In case of failures during these operations, the responsibility for handling errors and retries rests entirely on the client application.
- Strengths: Simple development model for applications where strong consistency and transactional integrity are paramount and scale is not the primary concern.
- Limitations: Poor scalability, limited fault tolerance, inefficient resource utilization due to full operation retries, and a negative impact on user experience due to visible errors and delays. Client-side retry logic adds complexity to client applications and can be unreliable.
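To make this stage concrete, here is a minimal Spring-style sketch; the service, its step methods, and the three-attempt retry policy are illustrative assumptions, not a prescribed design. The whole operation lives in one database transaction, and the client re-runs it wholesale on failure:

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
class OrderService {

    // Every step shares one database transaction: if any step throws,
    // the RDS rolls the entire operation back (classic ACID behavior).
    @Transactional
    public void placeOrder(long orderId) {
        reserveInventory(orderId); // step 1: UPDATE inventory ...
        chargePayment(orderId);    // step 2: INSERT payment row ...
        createShipment(orderId);   // step 3: INSERT shipment row ...
    }

    private void reserveInventory(long orderId) { /* JDBC/JPA work */ }
    private void chargePayment(long orderId)    { /* JDBC/JPA work */ }
    private void createShipment(long orderId)   { /* JDBC/JPA work */ }
}

class OrderClient {

    private final OrderService orderService;

    OrderClient(OrderService orderService) {
        this.orderService = orderService;
    }

    // The client owns failure handling: on any error it re-runs the *entire*
    // multi-step operation, and the user waits through every attempt.
    void placeOrderWithRetries(long orderId) throws InterruptedException {
        int attempts = 0;
        while (true) {
            try {
                orderService.placeOrder(orderId);
                return;
            } catch (RuntimeException e) {
                if (++attempts == 3) throw e;    // give up; the error is visible to the user
                Thread.sleep(1_000L * attempts); // crude linear backoff
            }
        }
    }
}
```

Note how every retry repeats all three steps, even ones that succeeded before the rollback; this is the inefficiency the following steps progressively remove.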
Step 2: Introducing Queues for Client Requests
- Improvement: To reduce reliance on client-side retries and enhance reliability, client requests are placed in a queue. A dedicated worker process then consumes requests from the queue and executes the multi-step operation.
- Decoupling: Clients are decoupled from immediate processing, improving responsiveness. The queue itself provides persistence and at-least-once delivery, enhancing reliability.
- Limitations: While client-side retries are mitigated, the entire multi-step operation is still retried by the worker if any step fails. This remains inefficient. The worker process can become a single point of failure. The underlying steps within the worker are still likely RDS-centric and transactional.
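A minimal sketch of this stage, reusing the hypothetical OrderService from the previous example. An in-memory BlockingQueue stands in for the durable broker (SQS, RabbitMQ, etc.) that would, in production, provide the persistence and at-least-once delivery described above:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueuedOrders {

    // Stand-in for a durable broker queue; in production the broker itself
    // supplies persistence and at-least-once delivery.
    private static final BlockingQueue<Long> ORDER_QUEUE = new LinkedBlockingQueue<>();

    // Client side: enqueue and return immediately; no client-side retry loop,
    // and the client is decoupled from processing time.
    public static void submitOrder(long orderId) throws InterruptedException {
        ORDER_QUEUE.put(orderId);
    }

    // Worker side: consume requests and run the multi-step operation.
    // The Step 2 limitation is visible here: on failure, the *whole*
    // operation is re-queued and re-executed from the beginning.
    public static void workerLoop(OrderService orderService) throws InterruptedException {
        while (true) {
            long orderId = ORDER_QUEUE.take();
            try {
                orderService.placeOrder(orderId); // still one big transactional unit
            } catch (RuntimeException e) {
                ORDER_QUEUE.put(orderId);         // full re-execution on retry
            }
        }
    }
}
```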
Step 3: Step-Level Queues and Intermediate Results
- Description: Queues are introduced between each step of the multi-step process, holding intermediate results. Dedicated worker processes consume from input queues and produce to output queues for each step, significantly improving retry efficiency and fault isolation at the step level.
- Key Improvement: Step-level workers provide better resource utilization and prevent redundant re-execution of successful steps.
- Still an Implicit Monolith (The Limitation): Even with step-level queues and workers, Step 3 often implicitly assumes that all the worker processes, queue consumers, and producers are still deployed and managed as part of a single, albeit decomposed, application or system. This implies: 1) Deployment Coupling: while the steps are logically separated, they may be deployed as one unit, limiting independent scaling; 2) Shared Resources: worker processes may still share resources and dependencies within the same application environment; 3) Implicit Technology Homogeneity: the steps are often implemented on the same technology stack within a single codebase.
- The Question Arises: "If these steps are already communicating asynchronously via queues, and each step has its own dedicated worker, why are we still packaging them together as a single application? Why not make each step a truly independent service?"
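A sketch of one step-level worker under the same hypothetical order flow (queue and method names are assumptions). Each worker owns exactly one step, so a payment failure retries only the payment, never the already-completed inventory reservation:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class StepPipeline {

    // One queue per step boundary, holding intermediate results.
    static final BlockingQueue<Long> RESERVED = new LinkedBlockingQueue<>(); // output of the inventory step
    static final BlockingQueue<Long> CHARGED  = new LinkedBlockingQueue<>(); // input of the shipment step

    // Worker for the payment step only: consumes the inventory step's output
    // and produces the shipment step's input.
    static void paymentWorker() throws InterruptedException {
        while (true) {
            long orderId = RESERVED.take();
            try {
                chargePayment(orderId);
                CHARGED.put(orderId);  // hand the intermediate result to the next step
            } catch (RuntimeException e) {
                RESERVED.put(orderId); // retry this step alone; earlier steps are untouched
            }
        }
    }

    static void chargePayment(long orderId) { /* single-step work */ }
}
```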
Step 4: SOA and Messaging Services – Embracing Service Independence
- The "Aha!" Moment: Service Decomposition and Independence: Step 4 is driven by the realization that the queues have already provided the necessary decoupling to enable full service independence. The logical next step is to break free from the monolithic application mindset and treat each step as a self-contained, independently deployable service.
- True Service-Oriented Architecture (SOA): Each step becomes a distinct service, responsible for a specific function and communicating with other services solely through messages via a shared messaging service (event bus).
- Messaging Service as the Decoupling Enabler: The messaging service (e.g., Kafka, RabbitMQ, cloud-managed event bus) becomes the central nervous system, facilitating communication and event propagation between completely independent services.
- Independent Deployment and Scaling: Each service can now be deployed, scaled, and updated independently, based on its specific needs and load. This provides true elasticity and scalability.
- Enhanced Fault Isolation: Failures are truly isolated to individual services, improving overall system resilience.
- Technology Diversity and Agility: Teams can choose the best technology stack for each service.
- Team Autonomy and Ownership: Different teams can own and develop individual services.
- Eventual Consistency as the Natural Outcome: With services fully decoupled and communicating asynchronously, eventual consistency becomes the natural and inherent consistency model for the system as a whole.
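As a sketch of what an independent service looks like at this stage, here is a hypothetical payment service consuming events from Kafka; the broker address, topic name, and consumer group are illustrative assumptions. Its only coupling to the rest of the system is the event contract:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// A self-contained service: deployed, scaled, and updated independently.
public class PaymentService {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumption: local broker
        props.put("group.id", "payment-service");          // own consumer group => independent scaling
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("inventory.reserved")); // hypothetical upstream topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Business logic goes here; a "payment.charged" event would then
                    // be published for downstream services (shipment, notification, ...).
                    System.out.printf("charging order %s%n", record.value());
                }
            }
        }
    }
}
```

Because the service has its own process and consumer group, adding instances scales it horizontally without touching any other service.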
Eventual Consistency: Embracing the Inevitable in Distributed Systems
Eventual consistency is a consistency model designed for distributed systems, acknowledging the inherent difficulty of maintaining immediate, strong consistency across multiple nodes. Unlike a traditional Relational Database System (RDS), which relies on ACID transactions and rollback mechanisms to ensure data integrity, distributed systems often operate in environments where a simple rollback across multiple services or nodes is impossible or prohibitively expensive. Call it the "there is no Ethernet" moment: in a truly distributed system, you cannot assume reliable, instantaneous communication and coordination to achieve atomic, all-or-nothing operations across all components (to be detailed in a future article).
Key Concepts:
- Distributed System: Components located on networked computers communicating via messages, where network failures and delays are inherent possibilities.
- Data Consistency: All nodes having the same view of data at the same time. In a strongly consistent system (like RDS transactions), this is enforced immediately.
- Eventual Consistency: Guarantees that if no new updates are made, all data replicas will eventually become consistent. Crucially, during the window of inconsistency, different parts of the system may hold different views of the data. Rolling a distributed operation back across all services to a perfectly consistent prior state is generally not feasible; instead, distributed systems must be designed to tolerate and manage these temporary inconsistencies and to converge to a consistent state (to be detailed in a future article).
Benefits of Eventual Consistency:
- High Availability: Prioritizes keeping the system operational and accepting updates during node failures or network partitions, even if that means temporary inconsistency.
- Scalability: Well-suited for highly scalable systems with many nodes and high operation volumes, where strong consistency would be a performance bottleneck.
- Fault Tolerance: Designed to be resilient to node failures during update propagation, understanding that perfect, immediate consistency is not always achievable.
- Lower Latency for Writes: Write operations can be faster as they don't require immediate synchronization across all nodes.
- Simpler Implementation (Compared to Strong Consistency): Avoids complex distributed transaction protocols that attempt to mimic ACID properties in a distributed environment.
Challenges of Eventual Consistency:
- Read-After-Write Inconsistency: Immediate reads after writes might not reflect the update (one mitigation is sketched after this list).
- Non-Monotonic Reads: Without additional guarantees, a later read may occasionally return an older version of the data than an earlier read did.
- Application Complexity: Applications must be designed to tolerate temporary inconsistencies and implement business logic that goes beyond simple rollback to handle failures and partial operations.
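One common application-level tactic for the read-after-write gap is to poll until the replica converges or a deadline expires. The sketch below is generic; the polling interval and deadline are arbitrary assumptions, and the empty result models a replica that has not yet seen the write:

```java
import java.util.Optional;
import java.util.function.Supplier;

public final class EventualReads {

    private EventualReads() {}

    // Poll a possibly-stale replica until the expected data appears or a
    // deadline passes, instead of assuming the first read reflects the write.
    public static <T> Optional<T> readWithConvergenceWait(
            Supplier<Optional<T>> read, long deadlineMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + deadlineMillis;
        while (System.currentTimeMillis() < deadline) {
            Optional<T> value = read.get();
            if (value.isPresent()) {
                return value;      // the replica has caught up
            }
            Thread.sleep(100);     // back off and let replication proceed
        }
        return Optional.empty();   // still inconsistent; the caller chooses a fallback
    }
}
```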
Real-World Examples: DNS, CDNs, Social Media Platforms, E-commerce Shopping Carts, Cloud Storage.
The Consistency Spectrum: Eventual consistency is not the only option in distributed systems. A spectrum of consistency models exists, offering different trade-offs (e.g., causal consistency, read-your-writes consistency, session consistency), allowing for more nuanced control over consistency guarantees depending on application needs.
Managed Services: Simplifying Distributed System Development
Managed services offered by cloud providers are revolutionizing distributed system development by abstracting away significant IT complexities. Services like DynamoDB Streams and Lambda on AWS exemplify this trend. This is particularly impactful when dealing with the inherent complexities of eventual consistency and distributed operations.
Abstraction of IT Complexity:
- Reliable Messaging Infrastructure: Managed services provide robust, scalable, and fault-tolerant messaging infrastructure (queues, event buses) as a service. Developers don't need to build or manage this critical component, which is essential for implementing eventually consistent patterns.
- Simplified Event Propagation: Services like DynamoDB Streams automatically capture database changes and propagate them as events. Lambda triggers simplify event consumption, automatically invoking functions upon new events and streamlining event-driven architectures (a handler sketch follows this list).
- Built-in Monitoring and Observability: Cloud platforms offer integrated monitoring and logging tools, providing visibility into event flows, latency, and system health, simplifying debugging and operations in distributed environments where tracing operations across services is crucial.
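As a sketch of how little plumbing remains with this model, here is a hypothetical Lambda handler (Java, using the aws-lambda-java-events types) reacting to inserts arriving from a DynamoDB stream; the table contents and what the handler does with them are illustrative assumptions:

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.DynamodbEvent;
import com.amazonaws.services.lambda.runtime.events.DynamodbEvent.DynamodbStreamRecord;

// Triggered by a DynamoDB stream: the managed platform captures table
// changes, batches them, and invokes this handler; there is no queue,
// poller, or broker for the developer to operate.
public class OrderStreamHandler implements RequestHandler<DynamodbEvent, Void> {

    @Override
    public Void handleRequest(DynamodbEvent event, Context context) {
        for (DynamodbStreamRecord record : event.getRecords()) {
            if ("INSERT".equals(record.getEventName())) {
                // React to the new item: e.g., publish a domain event or
                // update a read model. Business logic only, no plumbing.
                context.getLogger().log("new order: "
                        + record.getDynamodb().getNewImage());
            }
        }
        return null;
    }
}
```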
Shift to Business Logic Focus:
By abstracting away the complexities of building and managing distributed infrastructure, managed services empower developers to:
- Focus on Business Logic: Concentrate on implementing core business functionality and solving business problems instead of dealing with low-level IT plumbing of distributed systems. This includes designing business logic that gracefully handles eventual consistency and potential failures, often requiring approaches that go beyond simple rollback.
- Increase Development Velocity: Leverage pre-built, reliable components to accelerate development cycles and time-to-market for distributed applications.
- Reduce Operational Burden: Offload operational responsibilities for distributed infrastructure to the cloud provider, reducing overhead and enabling leaner teams to manage complex systems.
- Easily Adopt Event-Driven Architectures: Managed services democratize access to event-driven patterns and eventual consistency, making them more accessible and practical for a wider range of applications that benefit from distributed architectures.
- Optimize Costs and Resources: Benefit from pay-as-you-go pricing and automatic scaling of distributed infrastructure, optimizing resource utilization and cost efficiency.
Developer Responsibilities Remain:
While managed services simplify development, developers still need to:
- Understand Eventual Consistency: Design applications that are aware of potential temporary inconsistencies and of the limitations of simple rollback in distributed environments. Patterns like Sagas emerge as one way to manage complex distributed operations and to compensate for failures in eventually consistent systems, reflecting the need for business logic that goes beyond simple rollback (a compensation sketch follows this list).
- Implement Idempotent Event Handlers: Ensure event processing functions are idempotent so they can safely handle the at-least-once delivery of distributed messaging systems (see the idempotency sketch after this list).
- Address Conflict Resolution: Define application-specific logic for handling concurrent updates and potential data conflicts that are inherent in eventually consistent systems.
- Monitor Application-Level Consistency: Track application-specific metrics to ensure acceptable consistency levels and latency in the overall distributed system.
- Test in an Eventually Consistent Context: Employ testing strategies beyond unit tests (integration tests with time, chaos engineering, property-based testing) to validate the behavior and resilience of the distributed system.
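Two of these responsibilities lend themselves to short sketches. First, the compensation idea behind Sagas: each completed step registers an undo action, and on failure the completed steps are compensated in reverse order instead of relying on a database rollback. The orchestrated style shown here, with hypothetical step names, is just one variant:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OrderSaga {

    // On failure, run the compensations for already-completed steps,
    // newest first; there is no cross-service rollback to fall back on.
    public void run(long orderId) {
        Deque<Runnable> compensations = new ArrayDeque<>();
        try {
            reserveInventory(orderId);
            compensations.push(() -> releaseInventory(orderId));

            chargePayment(orderId);
            compensations.push(() -> refundPayment(orderId));

            createShipment(orderId); // final step: nothing downstream to undo
        } catch (RuntimeException e) {
            while (!compensations.isEmpty()) {
                compensations.pop().run(); // compensate in reverse order
            }
            throw e;
        }
    }

    private void reserveInventory(long id) { /* call inventory service */ }
    private void releaseInventory(long id) { /* compensating action */ }
    private void chargePayment(long id)    { /* call payment service */ }
    private void refundPayment(long id)    { /* compensating action */ }
    private void createShipment(long id)   { /* call shipment service */ }
}
```

Second, a minimal sketch of an idempotent event handler: record each event ID before acting, so a duplicate delivery becomes a no-op. The in-memory set is a stand-in; a real system would use a durable store (e.g., a unique-keyed table) and make the dedup record and the side effect atomic:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentHandler {

    // Stand-in for a durable record of processed event IDs.
    private final Set<String> processedEventIds = ConcurrentHashMap.newKeySet();

    // At-least-once delivery means duplicates will arrive; recording the
    // event ID first makes reprocessing harmless.
    public void handle(String eventId, Runnable businessLogic) {
        if (!processedEventIds.add(eventId)) {
            return; // duplicate delivery: already handled, skip
        }
        businessLogic.run(); // side effect executes once per event ID
    }
}
```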
A combination of these strategies provides a more comprehensive approach to building confidence in the reliability and consistency of eventually consistent systems.
Conclusion: Embracing the Future of Distributed Systems
The evolution from RDS-centric to distributed systems is driven by the ever-increasing demands for scalability, availability, and resilience in modern applications. Eventual consistency emerges as a pragmatic and powerful consistency model for these distributed environments. Managed services are playing a transformative role by abstracting away the complexities of building and managing distributed infrastructure, democratizing access to these powerful architectures. This shift empowers developers to focus on business logic, accelerate innovation, and build highly scalable and resilient applications, marking a significant step forward in the evolution of software development.