Exploring Caching Patterns for Microservices Architecture

Introduction

"Treat caching primarily as a performance optimization. Cache in as few places as possible to make it easier to reason about the fresh‐ness of data" -Newman, S.

In a microservices architecture, caching can be an effective technique to improve the performance and scalability of services. Caching involves storing frequently accessed data in memory so that it can be quickly retrieved and reused without repeatedly querying the underlying data source. Let's start by talking about what sorts of problems caches can help with.

  • For performance: microservices raise concerns about network latency and the multiple interactions needed to retrieve data. Caching can address these issues by reducing network calls and avoiding recomputing data for each request, for example by caching the results of an expensive query for the most popular items by genre.
  • For scale: caching reads can reduce contention on the origin and help the system scale. Database read replicas are one example of this. A cache placed between clients and the origin can also reduce load on the origin, allowing it to scale further.
  • For robustness: a cache can keep a service operating when the origin is unavailable. To make this work, the cache must be allowed to keep stale data until it can be refreshed, which means clients may read stale values; this favors availability over consistency. One example of this technique is periodically crawling the live site to generate a static version that can be served during an outage.

Where to Cache

In a microservice architecture there are several places a cache can live, and each location comes with trade-offs. The choice of cache location depends on what you are optimizing for.

Client-side

With client-side caching, the data is cached outside the scope of the origin.



Client-side caching can improve latency and robustness but has downsides such as limited invalidation options and potential inconsistency between clients.

Shared client-side caching can mitigate the inconsistency problem and be more efficient but requires a round trip to the cache.


Consider who owns and implements the shared cache, as it can blur the line between client-side and server-side caching.

Server-side



With server-side caching, the origin microservice owns and manages the cache, which makes it easier to implement more advanced cache invalidation mechanisms because the cache sits next to the logic that changes the data. It also avoids the problem of different consumers seeing different cached values, which can occur with client-side caching.

Request cache


With a request cache, the response to a request is stored the first time it is computed; subsequent identical requests are served straight from the cache. No lookups in the sales database are needed and no round trips to the album service are required, which makes this far and away the most effective cache in terms of optimizing for speed.

The benefits here are obvious: this is highly efficient. However, this form of caching is also highly specific. Only the result of this exact request is cached, so other operations will not hit the cache and will not benefit in any way from this form of optimization.

Where is my cache?

Embedded distributed and non-distributed cache


The simplest possible caching pattern is Embedded Cache. In the diagram above, the flow is as follows:

  1. Request comes in to the Load Balancer
  2. Load Balancer forwards the request to one of the Application services
  3. Application receives the request and checks if the same request was already executed (and its result stored in the cache). If yes, the cached value is returned; if not, the application performs the long-lasting business operation, stores the result in the cache, and returns the result

In the case of Spring, adding a caching layer requires nothing more than adding the @Cacheable annotation to the method.

import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class BookService {
    @Cacheable("books") // repeated calls with the same ISBN are served from the "books" cache
    public String getBookNameByIsbn(String isbn) {
        return findBookInSlowSource(isbn);
    }

    private String findBookInSlowSource(String isbn) {
        return "Some Book Title"; // placeholder for the long-lasting lookup
    }
}
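
For the @Cacheable annotation to take effect, caching also has to be enabled and backed by a cache manager. Below is a minimal sketch of such a configuration, using Spring's simple in-memory ConcurrentMapCacheManager (the configuration class and cache name are illustrative, not part of the original example):

import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.concurrent.ConcurrentMapCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching // turn on Spring's annotation-driven caching
public class CacheConfig {

    // Simple in-memory cache held inside each application instance
    @Bean
    public CacheManager cacheManager() {
        return new ConcurrentMapCacheManager("books");
    }
}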


Embedded Distributed Cache follows the same pattern as Embedded Cache, but the cached data is distributed across multiple application nodes, which improves fault tolerance and scalability.

Examples of embedded distributed caches include Hazelcast, Ehcache, and Apache Ignite.
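
As a rough illustration of the embedded distributed variant, the sketch below starts a Hazelcast member inside the application and uses a distributed map (the map name, key, and value are placeholders). Every application instance started the same way joins the cluster and shares the cached entries:

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

import java.util.Map;

public class EmbeddedDistributedCacheExample {
    public static void main(String[] args) {
        // Start an embedded Hazelcast member; members discover each other and form a cluster
        HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance();

        // A distributed map whose entries are partitioned across all cluster members
        Map<String, String> books = hazelcast.getMap("books");
        books.put("some-isbn", "Some Book Title");
        System.out.println(books.get("some-isbn"));
    }
}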

Pros

  • Simple configuration and deployment
  • Low-latency data access
  • No separate Ops effort needed

Cons

  • Inflexible management (scaling, backup)
  • Limited to JVM-based applications
  • Data collocated with applications

Client-Server/Cloud Cache


This time, the flow presented on the diagram is as follows:

  1. Request comes in to the Load Balancer and is forwarded to one of the Application services
  2. Application uses a cache client to connect to the Cache Server; if no value is found, it performs the usual business logic, caches the value, and returns the response

This architecture looks similar to the classic database architecture: we have a central server (or, more precisely, a cluster of servers), and applications connect to that server. If we compare the Client-Server pattern with Embedded Cache, there are two main differences:

  • The first is that the Cache Server is a separate unit in our architecture, which means we can manage it separately (scale up/down, backups, security). However, it also means it usually requires a separate Ops effort (or even a separate Ops team).
  • The second is that the application uses a cache client library to communicate with the cache, which means we are no longer limited to JVM-based languages. There is a well-defined protocol, and the programming language of the server part can be different from that of the client. That is actually one of the reasons why many caching solutions, such as Redis or Memcached, offer only this pattern for their deployments.


In terms of the architecture, Cloud is like Client-Server, with the difference being that the server part is moved outside of your organization and is managed by your cloud provider, so you don’t have to worry about all of the organizational matters.
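
To make the Client-Server/Cloud flow concrete, here is a minimal sketch using Redis and the Jedis client (the host name, key prefix, and 10-minute TTL are assumptions for illustration); the application only falls back to the slow source on a cache miss:

import redis.clients.jedis.Jedis;

public class BookCacheClient {

    public String getBookNameByIsbn(String isbn) {
        // Connect to the cache server (self-hosted or a managed/cloud Redis endpoint)
        try (Jedis jedis = new Jedis("cache.example.internal", 6379)) {
            String cached = jedis.get("book:" + isbn);
            if (cached != null) {
                return cached; // cache hit: no call to the slow source
            }
            // Cache miss: perform the expensive lookup, then store the result with a TTL
            String bookName = findBookInSlowSource(isbn);
            jedis.setex("book:" + isbn, 600, bookName); // keep for 10 minutes
            return bookName;
        }
    }

    private String findBookInSlowSource(String isbn) {
        return "Some Book Title"; // placeholder for the long-lasting business operation
    }
}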

Pros

  • Data separate from applications
  • Separate management (scaling, backup)
  • Programming-language agnostic

Cons

  • Separate Ops effort
  • Higher latency
  • Network placement needs care (e.g., cache servers should run in the same region as the applications to keep latency down)

Side-car cache


The diagram above is Kubernetes-specific, because the Sidecar pattern is mostly seen in (but not limited to) Kubernetes environments. In Kubernetes, a deployment unit is called a POD. This POD contains one or more containers which are always deployed on the same physical machine. Usually, a POD contains only one container with the application itself. However, in some cases, you can include not only the application container but some additional containers which provide additional functionalities. These containers are called sidecar containers.

This time, the flow looks as follows:

  1. Request comes to the Kubernetes Service (Load Balancer) and is forwarded to one of the PODs
  2. Request comes to the Application Container, and the Application uses the cache client to connect to the Cache Container (technically, the Cache Server is always available at localhost), as in the sketch below
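
A minimal sketch of the application-side code follows (Redis and Jedis are used purely as an example); the only difference from the Client-Server pattern is that the cache address is hard-wired to localhost, so no cluster discovery or external endpoint configuration is needed:

import redis.clients.jedis.Jedis;

public class SidecarCacheClient {

    public String get(String key) {
        // The cache container runs in the same POD, so it is always reachable on localhost
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            return jedis.get(key);
        }
    }

    public void put(String key, String value) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.setex(key, 600, value); // cache for 10 minutes
        }
    }
}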

This solution is a mixture of the Embedded and Client-Server patterns. It’s similar to Embedded Cache, because:

  1. Cache is always on the same machine as the application (low latency)
  2. Resource pool and management activities are shared between cache and application
  3. Cache cluster discovery is not an issue (it’s always available at localhost)

It’s also similar to the Client-Server pattern, because:

  1. Application can be written in any programming language (it uses the cache client library for communication)
  2. There is some isolation of cache and application

Pros

  • Simple configuration
  • Programming-language agnostic
  • Low latency
  • Some isolation of data and applications

Cons

  • Limited to container-based environments
  • Inflexible management (scaling, backup)
  • Data collocated with application PODs

Reverse proxy cache



So far, in each scenario, the application was aware that it uses a cache. This time, however, we put the caching part in front of the application, so the flow looks as follows:

  1. Request comes in to the Load Balancer
  2. Load Balancer checks if such a request is already cached
  3. If yes, then the response is sent back and the Request is not forwarded to the Application

Such a caching solution is based on the protocol level, so in most cases, it’s based on HTTP, which has some good and bad implications:

  • The good thing is that you can specify the caching layer as a configuration, so you don’t need to change any code in your application.
  • The bad thing is that you cannot use any application code to invalidate the cache, so invalidation must be based on timeouts and standard HTTP mechanisms (TTLs via Cache-Control, ETag, etc.).

NGINX provides a mature reverse proxy caching solution; however, data kept in its cache is not distributed, not highly available, and the data is stored on disk.
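
As a rough illustration of configuration-based caching, the snippet below sketches an NGINX reverse proxy cache (the paths, zone name, upstream name, and TTLs are placeholders):

# Where cached responses are stored, plus a shared-memory zone for cache keys
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api_cache:10m max_size=1g inactive=60m;

server {
    listen 80;

    location / {
        proxy_cache api_cache;                  # use the zone defined above
        proxy_cache_valid 200 302 10m;          # cache successful responses for 10 minutes
        proxy_cache_valid 404 1m;               # cache "not found" responses briefly
        proxy_pass http://application_backend;  # forward cache misses to the application
    }
}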

Pros

  • Configuration-based (no need to change applications)
  • Programming-language agnostic
  • Consistent with containers and the microservice world

Cons

  • Difficult cache invalidation
  • Few mature solutions (and existing ones, such as NGINX's cache, are not distributed or highly available)
  • Protocol-based (e.g., works only with HTTP)


Invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton

Invalidation is the process of removing or refreshing data in the cache when it is no longer valid. Although the concept is simple, in a microservice architecture there are several invalidation options to consider.

Time to live (TTL)

In this technique, each item in the cache is assigned a time-to-live (TTL) value, which specifies how long the item can remain in the cache before it is considered stale. When an item is requested, the cache checks its TTL to determine whether the item is still fresh or has expired. If it has expired, the cache invalidates the item and fetches a fresh copy from the origin before returning the data to the client.
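
A small TTL sketch using the Caffeine library is shown below (the library choice and the 10-minute TTL are illustrative); entries older than the configured time-to-live are treated as expired and reloaded from the origin on the next access:

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

import java.time.Duration;

public class TtlCacheExample {

    // Entries expire 10 minutes after being written; expired entries are reloaded on access
    private final LoadingCache<String, String> books = Caffeine.newBuilder()
            .expireAfterWrite(Duration.ofMinutes(10))
            .build(this::findBookInSlowSource);

    public String getBookNameByIsbn(String isbn) {
        return books.get(isbn); // fresh values come from the cache, expired ones trigger a reload
    }

    private String findBookInSlowSource(String isbn) {
        return "Some Book Title"; // placeholder for the call to the origin
    }
}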

Pros

  • Simplicity: the cache automatically invalidates items based on their time-to-live value.
  • No explicit invalidation logic or communication between the cache and the origin server is needed to maintain cache consistency.
  • Stale data is automatically invalidated after a predetermined amount of time, so clients are eventually served fresh data.

Cons

  • TTL cache invalidation can introduce the risk of stale data being served to clients if the time to live value is set too high.
  • If the time to live value is set too low, it can result in a high volume of cache misses and increased load on the origin server.

Conditional GETs

In this technique, the client sends a conditional GET request to the server, including an ETag or Last-Modified header in the request. The server then checks whether the requested resource has been modified since the client's last request by comparing the ETag or Last-Modified header with the current state of the resource. If the resource has not been modified, the server returns a 304 Not Modified response, indicating that the cached copy of the resource is still valid. If the resource has been modified, the server returns a 200 OK response along with the updated resource, which the client can then use to update its cache.
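
A rough client-side sketch using the JDK's built-in HTTP client follows (the URL and field names are placeholders): the client replays the ETag it received earlier and only refreshes its cached copy when the server reports a change:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConditionalGetExample {

    private final HttpClient client = HttpClient.newHttpClient();
    private String cachedBody;
    private String cachedEtag;

    public String fetch(String url) throws Exception {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url));
        if (cachedEtag != null) {
            builder.header("If-None-Match", cachedEtag); // ask: has it changed since this version?
        }

        HttpResponse<String> response = client.send(builder.build(), HttpResponse.BodyHandlers.ofString());

        if (response.statusCode() == 304) {
            return cachedBody; // 304 Not Modified: the cached copy is still valid
        }

        // 200 OK: refresh the cached copy and remember the new ETag, if the server sent one
        cachedBody = response.body();
        cachedEtag = response.headers().firstValue("ETag").orElse(null);
        return cachedBody;
    }
}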

Pros

  • It reduces the amount of network traffic and server load required to maintain cache consistency, since the server only needs to send the full resource when it has been modified.

Cons

  • Requires support for ETags or Last-Modified headers in both the client and server, which can add complexity to the implementation.

Notification-based

In this mechanism, the origin server sends notifications to cache subscribers when a change occurs that might impact their cached data. Cache subscribers receive these notifications and invalidate their local cache entries, forcing them to fetch the latest data from the origin server.
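
Below is a simplified sketch of the subscriber side (the listener wiring and method names are assumptions; in practice the notifications could arrive via Redis pub/sub, Kafka, or any other broker): when an invalidation message arrives, the matching local entry is evicted so the next read goes back to the origin:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NotificationInvalidatedCache {

    private final Map<String, String> localCache = new ConcurrentHashMap<>();

    // Called by the messaging layer when the origin publishes a change notification
    public void onInvalidationMessage(String changedKey) {
        localCache.remove(changedKey); // evict: the next read fetches a fresh copy
    }

    public String get(String key) {
        return localCache.computeIfAbsent(key, this::loadFromOrigin);
    }

    private String loadFromOrigin(String key) {
        return "fresh value for " + key; // placeholder for the call to the origin service
    }
}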

Pros

  • It reduces the potential window in which outdated data can be served by the cache; this window is limited to the time taken for the notification to be sent and processed.

Cons

  • Complexity: it requires the origin server to emit notifications and cache subscribers to respond to them (pub/sub style).

Write-through

When a client requests data from the cache, if the data is not present, the cache retrieves the data from the origin server and stores it in the cache before returning the data to the client.

If the data is already present in the cache and an update is made, both the cache and the origin server are updated synchronously before the response is returned to the client.

This ensures that the cache always has the latest version of the data and reduces the likelihood of stale data being served.
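
A minimal write-through sketch follows (the repository interface and types are illustrative): every write updates the origin and the cache in the same synchronous call, so reads never observe a value older than the last successful write:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WriteThroughCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final BookRepository repository; // the origin (e.g., a database)

    public WriteThroughCache(BookRepository repository) {
        this.repository = repository;
    }

    public void put(String isbn, String bookName) {
        repository.save(isbn, bookName); // 1. write to the origin synchronously
        cache.put(isbn, bookName);       // 2. then update the cache in the same operation
    }

    public String get(String isbn) {
        return cache.computeIfAbsent(isbn, repository::find); // a miss falls back to the origin
    }

    public interface BookRepository {
        void save(String isbn, String bookName);
        String find(String isbn);
    }
}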

Pros

  • Simplicity and ease of implementation: since updates are performed synchronously, there is no need for complex invalidation mechanisms or notification systems.
  • There is no risk of serving stale data to clients.

Cons

  • Write-through caching can introduce performance overhead on writes, since every data update is performed synchronously.

Write-behind

In this technique, updates are applied to the cache first and written back to the origin server asynchronously at a later time, usually in batches. When a client requests data that is not present in the cache, the cache retrieves it from the origin server and stores it before returning it to the client.

If data that is already cached is updated, the cache updates its local copy immediately and schedules the change to be written back to the origin server later. This delay in updating the origin means that consumers reading directly from the origin (or through another cache) can observe stale data until the write-back eventually completes.
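
A simplified write-behind sketch is shown below (the flush interval and repository interface are assumptions): writes land in the cache immediately, are queued, and are flushed back to the origin in batches by a background task:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WriteBehindCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> pendingWrites = new ConcurrentHashMap<>();
    private final BookRepository repository; // the origin (e.g., a database)
    private final ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();

    public WriteBehindCache(BookRepository repository) {
        this.repository = repository;
        // Periodically write accumulated changes back to the origin in one batch
        flusher.scheduleAtFixedRate(this::flush, 5, 5, TimeUnit.SECONDS);
    }

    public void put(String isbn, String bookName) {
        cache.put(isbn, bookName);         // the cache is updated immediately
        pendingWrites.put(isbn, bookName); // the origin update is deferred
    }

    public String get(String isbn) {
        return cache.get(isbn);
    }

    private void flush() {
        List<String> keys = new ArrayList<>(pendingWrites.keySet());
        for (String isbn : keys) {
            String value = pendingWrites.remove(isbn);
            if (value != null) {
                repository.save(isbn, value); // asynchronous, batched write-back
            }
        }
    }

    public interface BookRepository {
        void save(String isbn, String bookName);
    }
}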

Pros

  • Improved write performance, since updates to the origin are performed asynchronously.
  • Batching updates can further improve performance by reducing the number of individual updates that must be sent to the origin server.

Cons

  • It introduces the risk of stale data being observed until the write-back to the origin eventually completes. This risk can be mitigated through careful tuning and by combining it with techniques such as TTL-based invalidation.

