Exploring Caching Patterns for Microservices Architecture

Introduction

"Treat caching primarily as a performance optimization. Cache in as few places as possible to make it easier to reason about the fresh‐ness of data" -Newman, S.

In a microservices architecture, caching can be an effective technique to improve the performance and scalability of services. Caching involves storing frequently accessed data in memory so that it can be quickly retrieved and reused without repeatedly querying the underlying data source. Let's start by talking about what sorts of problems caches can help with.

  • For performance: microservices raise concerns about network latency and the multiple interactions needed to retrieve data. Caching can address these issues by reducing network calls and avoiding recomputing data for each request, for example by caching the results of an expensive query for the most popular items by genre.
  • For scale: caching reads can reduce contention on the origin and help the system scale. Database read replicas are one example of this. A cache placed between clients and the origin can also reduce load on the origin, allowing it to scale further.
  • For robustness: a cache can keep a service operating when the origin is unavailable. To make this work, the cache must be allowed to keep stale data until it can be refreshed, which means clients may read stale values; this favors availability over consistency. One example of this technique is periodically crawling the live site to generate a static version that can be served during an outage.

Where to Cache

In a microservice architecture there are several places a cache can live, and each location comes with trade-offs. The choice of cache location depends on what you are optimizing for.

Client-side

With client-side caching, the data is cached outside the scope of the origin.



Client-side caching can improve latency and robustness but has downsides such as limited invalidation options and potential inconsistency between clients.

Shared client-side caching can mitigate the inconsistency problem and be more efficient but requires a round trip to the cache.


Consider who owns and implements the shared cache, as it can blur the line between client-side and server-side caching.

Server-side



With server-side caching, the origin microservice owns and manages the cache, which makes it easier to implement more advanced cache invalidation mechanisms because the cache sits next to the logic that changes the data. It also avoids the problem of different consumers seeing different cached values, which can occur with client-side caching.

Request cache


With a request cache, the response to a request is stored the first time it is computed; subsequent identical requests are served straight from the cache. No lookups in the sales database are needed and no round trips to the album service are required, which makes this far and away the most effective cache in terms of optimizing for speed.

The benefits here are obvious: this is highly efficient. However, this form of caching is also highly specific. Only the result of this exact request is cached, so other operations will not hit the cache and will not benefit in any way from this form of optimization.

Where is my cache?

Embedded distributed and non-distributed cache


The simplest possible caching pattern is Embedded Cache. In the diagram above, the flow is as follows:

  1. Request comes in to the Load Balancer
  2. Load Balancer forwards the request to one of the Application services
  3. Application receives the request and checks if the same request was already executed (and its result stored in the cache). If yes, the cached value is returned; if not, the application performs the long-lasting business operation, stores the result in the cache, and returns the result

In the case of Spring, adding a caching layer requires nothing more than adding the @Cacheable annotation to the method.

import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class BookService {
    @Cacheable("books") // repeated calls with the same ISBN are served from the "books" cache
    public String getBookNameByIsbn(String isbn) {
        return findBookInSlowSource(isbn);
    }

    private String findBookInSlowSource(String isbn) {
        return "Some Book Title"; // placeholder for the long-lasting lookup
    }
}
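
For the @Cacheable annotation to take effect, caching also has to be enabled and backed by a cache manager. Below is a minimal sketch of such a configuration, using Spring's simple in-memory ConcurrentMapCacheManager (the configuration class and cache name are illustrative, not part of the original example):

import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.concurrent.ConcurrentMapCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching // turn on Spring's annotation-driven caching
public class CacheConfig {

    // Simple in-memory cache held inside each application instance
    @Bean
    public CacheManager cacheManager() {
        return new ConcurrentMapCacheManager("books");
    }
}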


Embedded Distributed Cache follows the same pattern as Embedded Cache, but the cached data is distributed across multiple application nodes, which improves fault tolerance and scalability.

Examples of embedded distributed caches include Hazelcast, Ehcache, and Apache Ignite.
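
As a rough illustration of the embedded distributed variant, the sketch below starts a Hazelcast member inside the application and uses a distributed map (the map name, key, and value are placeholders). Every application instance started the same way joins the cluster and shares the cached entries:

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

import java.util.Map;

public class EmbeddedDistributedCacheExample {
    public static void main(String[] args) {
        // Start an embedded Hazelcast member; members discover each other and form a cluster
        HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance();

        // A distributed map whose entries are partitioned across all cluster members
        Map<String, String> books = hazelcast.getMap("books");
        books.put("some-isbn", "Some Book Title");
        System.out.println(books.get("some-isbn"));
    }
}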

Pros

  • Simple configuration and deployment
  • Low-latency data access
  • No separate Ops effort needed

Cons

  • Inflexible management (scaling, backup)
  • Limited to JVM-based applications
  • Data collocated with applications

Client-Server/Cloud Cache


This time, the flow presented on the diagram is as follows:

  1. Request comes in to the Load Balancer and is forwarded to one of the Application services
  2. Application uses a cache client to connect to the Cache Server; if no value is found, it performs the usual business logic, caches the value, and returns the response

This architecture looks similar to the classic database architecture: we have a central server (or, more precisely, a cluster of servers), and applications connect to that server. If we compare the Client-Server pattern with Embedded Cache, there are two main differences:

  • The first is that the Cache Server is a separate unit in our architecture, which means we can manage it separately (scale up/down, backups, security). However, it also means it usually requires a separate Ops effort (or even a separate Ops team).
  • The second is that the application uses a cache client library to communicate with the cache, which means we are no longer limited to JVM-based languages. There is a well-defined protocol, and the programming language of the server part can be different from that of the client. That is actually one of the reasons why many caching solutions, such as Redis or Memcached, offer only this pattern for their deployments.


In terms of the architecture, Cloud is like Client-Server, with the difference being that the server part is moved outside of your organization and is managed by your cloud provider, so you don’t have to worry about all of the organizational matters.
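
To make the Client-Server/Cloud flow concrete, here is a minimal sketch using Redis and the Jedis client (the host name, key prefix, and 10-minute TTL are assumptions for illustration); the application only falls back to the slow source on a cache miss:

import redis.clients.jedis.Jedis;

public class BookCacheClient {

    public String getBookNameByIsbn(String isbn) {
        // Connect to the cache server (self-hosted or a managed/cloud Redis endpoint)
        try (Jedis jedis = new Jedis("cache.example.internal", 6379)) {
            String cached = jedis.get("book:" + isbn);
            if (cached != null) {
                return cached; // cache hit: no call to the slow source
            }
            // Cache miss: perform the expensive lookup, then store the result with a TTL
            String bookName = findBookInSlowSource(isbn);
            jedis.setex("book:" + isbn, 600, bookName); // keep for 10 minutes
            return bookName;
        }
    }

    private String findBookInSlowSource(String isbn) {
        return "Some Book Title"; // placeholder for the long-lasting business operation
    }
}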

Pros

  • Data separate from applications
  • Separate management (scaling, backup)
  • Programming-language agnostic

Cons

  • Separate Ops effort
  • Higher latency
  • Network placement needs care (e.g., cache servers should run in the same region as the applications to keep latency down)

Side-car cache


The diagram above is Kubernetes-specific, because the Sidecar pattern is mostly seen in (but not limited to) Kubernetes environments. In Kubernetes, a deployment unit is called a POD. This POD contains one or more containers which are always deployed on the same physical machine. Usually, a POD contains only one container with the application itself. However, in some cases, you can include not only the application container but some additional containers which provide additional functionalities. These containers are called sidecar containers.

This time, the flow looks as follows:

  1. Request comes to the Kubernetes Service (Load Balancer) and is forwarded to one of the PODs
  2. Request comes to the Application Container, and the Application uses the cache client to connect to the Cache Container (technically, the Cache Server is always available at localhost), as in the sketch below
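
A minimal sketch of the application-side code follows (Redis and Jedis are used purely as an example); the only difference from the Client-Server pattern is that the cache address is hard-wired to localhost, so no cluster discovery or external endpoint configuration is needed:

import redis.clients.jedis.Jedis;

public class SidecarCacheClient {

    public String get(String key) {
        // The cache container runs in the same POD, so it is always reachable on localhost
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            return jedis.get(key);
        }
    }

    public void put(String key, String value) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.setex(key, 600, value); // cache for 10 minutes
        }
    }
}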

This solution is a mixture of the Embedded and Client-Server patterns. It’s similar to Embedded Cache, because:

  1. Cache is always on the same machine as the application (low latency)
  2. Resource pool and management activities are shared between cache and application
  3. Cache cluster discovery is not an issue (it’s always available at localhost)

It’s also similar to the Client-Server pattern, because:

  1. Application can be written in any programming language (it uses the cache client library for communication)
  2. There is some isolation of cache and application

Pros

  • Simple configuration
  • Programming-language agnostic
  • Low latency
  • Some isolation of data and applications

Cons

  • Limited to container-based environments
  • Inflexible management (scaling, backup)
  • Data collocated with application PODs

Reverse proxy cache



So far, in each scenario, the application was aware that it uses a cache. This time, however, we put the caching part in front of the application, so the flow looks as follows:

  1. Request comes in to the Load Balancer
  2. Load Balancer checks if such a request is already cached
  3. If yes, then the response is sent back and the Request is not forwarded to the Application

Such a caching solution is based on the protocol level, so in most cases, it’s based on HTTP, which has some good and bad implications:

  • The good thing is that you can specify the caching layer as a configuration, so you don’t need to change any code in your application.
  • The bad thing is that you cannot use any application code to invalidate the cache, so invalidation must be based on timeouts and standard HTTP mechanisms (TTLs via Cache-Control, ETag, etc.).

NGINX provides a mature reverse proxy caching solution; however, data kept in its cache is not distributed, not highly available, and the data is stored on disk.
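
As a rough illustration of configuration-based caching, the snippet below sketches an NGINX reverse proxy cache (the paths, zone name, upstream name, and TTLs are placeholders):

# Where cached responses are stored, plus a shared-memory zone for cache keys
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api_cache:10m max_size=1g inactive=60m;

server {
    listen 80;

    location / {
        proxy_cache api_cache;                  # use the zone defined above
        proxy_cache_valid 200 302 10m;          # cache successful responses for 10 minutes
        proxy_cache_valid 404 1m;               # cache "not found" responses briefly
        proxy_pass http://application_backend;  # forward cache misses to the application
    }
}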

Pros

  • Configuration-based (no need to change applications)
  • Programming-language agnostic
  • Consistent with containers and the microservice world

Cons

  • Difficult cache invalidation
  • Few mature solutions (and existing ones, such as NGINX's cache, are not distributed or highly available)
  • Protocol-based (e.g., works only with HTTP)


Invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton

Invalidation is the process of removing or refreshing data in the cache when it is no longer valid. Although the concept is simple, in a microservice architecture there are several invalidation options to consider.

Time to live (TTL)

In this technique, each item in the cache is assigned a time-to-live (TTL) value, which specifies how long the item can remain in the cache before it is considered stale. When an item is requested, the cache checks its TTL to determine whether the item is still fresh or has expired. If it has expired, the cache invalidates the item and fetches a fresh copy from the origin before returning the data to the client.
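
A small TTL sketch using the Caffeine library is shown below (the library choice and the 10-minute TTL are illustrative); entries older than the configured time-to-live are treated as expired and reloaded from the origin on the next access:

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

import java.time.Duration;

public class TtlCacheExample {

    // Entries expire 10 minutes after being written; expired entries are reloaded on access
    private final LoadingCache<String, String> books = Caffeine.newBuilder()
            .expireAfterWrite(Duration.ofMinutes(10))
            .build(this::findBookInSlowSource);

    public String getBookNameByIsbn(String isbn) {
        return books.get(isbn); // fresh values come from the cache, expired ones trigger a reload
    }

    private String findBookInSlowSource(String isbn) {
        return "Some Book Title"; // placeholder for the call to the origin
    }
}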

Pros

  • Simplicity: the cache automatically invalidates items based on their time-to-live value.
  • No explicit invalidation logic or communication between the cache and the origin server is needed to maintain cache consistency.
  • Stale data is automatically invalidated after a predetermined amount of time, so clients are eventually served fresh data.

Cons

  • TTL cache invalidation can introduce the risk of stale data being served to clients if the time to live value is set too high.
  • If the time to live value is set too low, it can result in a high volume of cache misses and increased load on the origin server.

Conditional GETs

In this technique, the client sends a conditional GET request to the server, including an ETag or Last-Modified header in the request. The server then checks whether the requested resource has been modified since the client's last request by comparing the ETag or Last-Modified header with the current state of the resource. If the resource has not been modified, the server returns a 304 Not Modified response, indicating that the cached copy of the resource is still valid. If the resource has been modified, the server returns a 200 OK response along with the updated resource, which the client can then use to update its cache.
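
A rough client-side sketch using the JDK's built-in HTTP client follows (the URL and field names are placeholders): the client replays the ETag it received earlier and only refreshes its cached copy when the server reports a change:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConditionalGetExample {

    private final HttpClient client = HttpClient.newHttpClient();
    private String cachedBody;
    private String cachedEtag;

    public String fetch(String url) throws Exception {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url));
        if (cachedEtag != null) {
            builder.header("If-None-Match", cachedEtag); // ask: has it changed since this version?
        }

        HttpResponse<String> response = client.send(builder.build(), HttpResponse.BodyHandlers.ofString());

        if (response.statusCode() == 304) {
            return cachedBody; // 304 Not Modified: the cached copy is still valid
        }

        // 200 OK: refresh the cached copy and remember the new ETag, if the server sent one
        cachedBody = response.body();
        cachedEtag = response.headers().firstValue("ETag").orElse(null);
        return cachedBody;
    }
}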

Pros

  • It reduces the amount of network traffic and server load required to maintain cache consistency, since the server only needs to send the full resource when it has been modified.

Cons

  • Requires support for ETags or Last-Modified headers in both the client and server, which can add complexity to the implementation.

Notification-based

In this mechanism, the origin server sends notifications to cache subscribers when a change occurs that might impact their cached data. Cache subscribers receive these notifications and invalidate their local cache entries, forcing them to fetch the latest data from the origin server.
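
Below is a simplified sketch of the subscriber side (the listener wiring and method names are assumptions; in practice the notifications could arrive via Redis pub/sub, Kafka, or any other broker): when an invalidation message arrives, the matching local entry is evicted so the next read goes back to the origin:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NotificationInvalidatedCache {

    private final Map<String, String> localCache = new ConcurrentHashMap<>();

    // Called by the messaging layer when the origin publishes a change notification
    public void onInvalidationMessage(String changedKey) {
        localCache.remove(changedKey); // evict: the next read fetches a fresh copy
    }

    public String get(String key) {
        return localCache.computeIfAbsent(key, this::loadFromOrigin);
    }

    private String loadFromOrigin(String key) {
        return "fresh value for " + key; // placeholder for the call to the origin service
    }
}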

Pros

  • It reduces the potential window in which outdated data can be served by the cache; this window is limited to the time taken for the notification to be sent and processed.

Cons

  • Complexity: it requires the origin server to emit notifications and cache subscribers to respond to them (pub/sub style).

Write-through

When a client requests data from the cache, if the data is not present, the cache retrieves the data from the origin server and stores it in the cache before returning the data to the client.

If the data is already present in the cache and an update is made, both the cache and the origin server are updated synchronously before the response is returned to the client.

This ensures that the cache always has the latest version of the data and reduces the likelihood of stale data being served.
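
A minimal write-through sketch follows (the repository interface and types are illustrative): every write updates the origin and the cache in the same synchronous call, so reads never observe a value older than the last successful write:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WriteThroughCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final BookRepository repository; // the origin (e.g., a database)

    public WriteThroughCache(BookRepository repository) {
        this.repository = repository;
    }

    public void put(String isbn, String bookName) {
        repository.save(isbn, bookName); // 1. write to the origin synchronously
        cache.put(isbn, bookName);       // 2. then update the cache in the same operation
    }

    public String get(String isbn) {
        return cache.computeIfAbsent(isbn, repository::find); // a miss falls back to the origin
    }

    public interface BookRepository {
        void save(String isbn, String bookName);
        String find(String isbn);
    }
}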

Pros

  • Simplicity and ease of implementation: since updates are performed synchronously, there is no need for complex invalidation mechanisms or notification systems.
  • There is no risk of serving stale data to clients.

Cons

  • Write-through caching can introduce performance overhead on writes, since every data update is performed synchronously.

Write-behind

In this technique, updates are applied to the cache first and written back to the origin server asynchronously at a later time, usually in batches. When a client requests data that is not present in the cache, the cache retrieves it from the origin server and stores it before returning it to the client.

If data that is already cached is updated, the cache updates its local copy immediately and schedules the change to be written back to the origin server later. This delay in updating the origin means that consumers reading directly from the origin (or through another cache) can observe stale data until the write-back eventually completes.
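
A simplified write-behind sketch is shown below (the flush interval and repository interface are assumptions): writes land in the cache immediately, are queued, and are flushed back to the origin in batches by a background task:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WriteBehindCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> pendingWrites = new ConcurrentHashMap<>();
    private final BookRepository repository; // the origin (e.g., a database)
    private final ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();

    public WriteBehindCache(BookRepository repository) {
        this.repository = repository;
        // Periodically write accumulated changes back to the origin in one batch
        flusher.scheduleAtFixedRate(this::flush, 5, 5, TimeUnit.SECONDS);
    }

    public void put(String isbn, String bookName) {
        cache.put(isbn, bookName);         // the cache is updated immediately
        pendingWrites.put(isbn, bookName); // the origin update is deferred
    }

    public String get(String isbn) {
        return cache.get(isbn);
    }

    private void flush() {
        List<String> keys = new ArrayList<>(pendingWrites.keySet());
        for (String isbn : keys) {
            String value = pendingWrites.remove(isbn);
            if (value != null) {
                repository.save(isbn, value); // asynchronous, batched write-back
            }
        }
    }

    public interface BookRepository {
        void save(String isbn, String bookName);
    }
}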

Pros

  • Improved write performance, since updates to the origin are performed asynchronously.
  • Batching updates can further improve performance by reducing the number of individual updates that must be sent to the origin server.

Cons

  • It introduces the risk of stale data being observed until the write-back to the origin eventually completes. This risk can be mitigated through careful tuning and by combining it with techniques such as TTL-based invalidation.

