Caching

Load balancing helps you scale horizontally across an ever-increasing number of servers, but caching will enable you to make vastly better use of the resources you already have.

Caching is a technique that involves temporarily storing data to avoid loading it more than once. Cached data is meant to be available in an instant to provide lightning-fast performance.

By caching data, you can improve your application’s performance while reducing network calls, database strain, and bandwidth usage. These benefits make it a great pattern to implement.

Caching consists of:

  • pre-calculating results (e.g. the number of visits from each referring domain for the previous day)
  • pre-generating expensive indexes (e.g. suggested stories based on a user’s click history), and
  • storing copies of frequently accessed data in a faster backend (e.g., Memcached instead of PostgreSQL).

Caching is everywhere. Server-side, client-side, browser caching, and proxy/CDN caching are all opportunities for you to take advantage of the concept.

Caches in different layers

1. Client-side

  • Use case: Accelerate retrieval of web content from websites (browser or device)
  • Tech: HTTP Cache Headers, Browsers (see the example below)
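
As a small illustration (an assumption for this article, not taken from its sources), a server can instruct browsers and proxies to cache a response by sending a Cache-Control header. A sketch using Python's standard library:

```python
# Minimal sketch: serve a static asset with an HTTP cache header so that
# browsers may reuse the response for one hour. The asset and port are
# placeholders for illustration.
from http.server import BaseHTTPRequestHandler, HTTPServer

class CachingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"body { color: #333; }"   # pretend static asset
        self.send_response(200)
        self.send_header("Content-Type", "text/css")
        self.send_header("Content-Length", str(len(body)))
        # Tell browsers/proxies they may reuse this response for 3600 seconds.
        self.send_header("Cache-Control", "public, max-age=3600")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), CachingHandler).serve_forever()
```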

2. DNS

  • Use case: Domain to IP Resolution
  • Tech: DNS Servers
  • Solutions: Amazon Route 53

3. Web Server

  • Use case: Accelerate retrieval of web content from web/app servers. Manage Web Sessions (server-side)
  • Tech: HTTP Cache Headers, CDNs, Reverse Proxies, Web Accelerators, Key/Value Stores
  • Solutions: Amazon CloudFront, ElastiCache for Redis, ElastiCache for Memcached, Partner Solutions

4. Application

  • Use case: Accelerate application performance and data access
  • Tech: Key/Value data stores, Local caches
  • Solutions: Redis, Memcached
  • Note: This keeps a cache directly on the application server. Each time a request reaches the service, the node quickly returns local, cached data if it exists; if not, it queries network storage such as a database (see the sketch below).
  • When the application server is scaled out to many nodes, the following issues can arise: 1. the load balancer distributes requests randomly across the nodes; 2. the same request can land on different nodes, increasing cache misses; 3. extra storage is used because the same data is stored on two or more nodes. These issues can be addressed with: 1. global caches, 2. distributed caches.
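
A minimal sketch of this local-cache idea (an illustration, not from the article's sources; fetch_user_from_db and the TTL are assumptions):

```python
import time

# Hypothetical in-process cache for one application node (a sketch, not a
# production implementation). Real deployments would add eviction, size
# limits, and thread safety.
_local_cache = {}          # key -> (value, expires_at)
CACHE_TTL_SECONDS = 60

def fetch_user_from_db(user_id):
    # Placeholder for a real database query (assumption for illustration).
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    entry = _local_cache.get(user_id)
    if entry is not None and entry[1] > time.time():
        return entry[0]                      # cache hit: served locally
    value = fetch_user_from_db(user_id)      # cache miss: go to the database
    _local_cache[user_id] = (value, time.time() + CACHE_TTL_SECONDS)
    return value
```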

5. Database

  • Use case: Reduce latency associated with database query requests
  • Tech: Database buffers, Key/Value data stores
  • Solutions: Databases usually include some level of caching in their default configuration, optimised for a generic use case. Tweaking these settings for specific usage patterns can further boost performance; Redis or Memcached can also be used as an external cache.

6. Content Distribution Network (CDN)

  • Use case: Take the burden of serving static media off of your application servers and provide a geographic distribution.
  • Solutions: Amazon CloudFront, Akamai

Types of Caching Strategies

Your caching strategy depends on how your application reads and writes data. Is your application write-heavy, or is data written once and read frequently? Is the data that's returned always unique? Different data access patterns will influence how you configure a cache. Common caching types include cache-aside, read-through/write-through, and write-behind/write-back.

1. Write-Through cache

In write-through caching, the application writes data to the cache, and the cache then writes it to the database. The cache sits inline with the database, and the application treats the cache as its main data store; the cache is responsible for persisting the data to the database.

  • Application adds/updates entry in cache
  • Cache synchronously writes entry to data store
  • Return

Write-through makes writes slower overall because every write must also reach the database, but subsequent reads of just-written data are fast. Users are generally more tolerant of latency when updating data than when reading it. Data in the cache is not stale.
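
A minimal write-through sketch of the steps above; FakeDatabase and the key names are hypothetical stand-ins, and a real implementation would sit in front of an actual data store:

```python
# Write-through sketch: the application only talks to WriteThroughCache; the
# cache synchronously persists every write to the database before returning.
class WriteThroughCache:
    def __init__(self, db):
        self._db = db
        self._store = {}

    def put(self, key, value):
        self._store[key] = value     # 1. add/update the entry in the cache
        self._db.write(key, value)   # 2. synchronously write it to the data store
        return value                 # 3. return once both succeed

    def get(self, key):
        return self._store.get(key)  # reads of just-written data are fast


class FakeDatabase:
    """Stand-in for a real data store (assumption for illustration)."""
    def __init__(self):
        self.rows = {}
    def write(self, key, value):
        self.rows[key] = value


cache = WriteThroughCache(FakeDatabase())
cache.put("user:1", {"name": "Ada"})
print(cache.get("user:1"))
```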

Pros: Complete data consistency, robust to system disruptions.

Cons:

  • When a new node is created due to failure or scaling, it will not contain cached entries until each entry is next written to the database. Using cache-aside in conjunction with write-through can mitigate this issue.
  • Most data written might never be read, which can be minimized with a TTL.

2. Read-Through cache

Read-through caching is a strategy where data is read from the cache, and if the data is not found, it is automatically loaded from the data source and added to the cache.

  1. Read data from the cache
  2. Read data from the database on a cache miss
  3. Write data to the cache
  4. And return data

The application doesn't interact with the database; the cache does. In other words, the cache is responsible for reading the data from the database.

And this is what makes it different from the cache-aside pattern.
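
A minimal read-through sketch; the load_product_from_db loader is a hypothetical stand-in for a real database query:

```python
# Read-through sketch: the application calls only cache.get(); the cache
# itself loads missing entries from the database and stores them, which is
# what distinguishes read-through from cache-aside.
class ReadThroughCache:
    def __init__(self, loader):
        self._loader = loader      # e.g. a function that queries the database
        self._store = {}

    def get(self, key):
        if key in self._store:
            return self._store[key]          # 1. cache hit
        value = self._loader(key)            # 2. cache miss: the cache reads the DB
        self._store[key] = value             # 3. populate the cache
        return value                         # 4. return the data


def load_product_from_db(product_id):
    # Placeholder database query (assumption for illustration).
    return {"id": product_id, "price": 9.99}


cache = ReadThroughCache(load_product_from_db)
print(cache.get("sku-42"))   # miss: loaded from the "database"
print(cache.get("sku-42"))   # hit: served from the cache
```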

3. Write-Behind / Write-Back cache

In write-behind, the application does the following:

  • Add/update entry in cache
  • Asynchronously write entry to the data store, improving write performance

Write-behind caches, sometimes known as write-back caches, are best for write-heavy workloads, and they improve write performance because the application doesn't need to wait for the write to complete before moving to the next task.
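
A minimal write-behind sketch using a background thread and a queue as the asynchronous path; the db_write function is an assumption for illustration, and a production cache would batch writes and handle failures:

```python
import queue
import threading
import time

# Write-behind (write-back) sketch, not a production implementation. Writes
# are acknowledged as soon as the cache is updated; a background worker drains
# a queue and persists entries to a (hypothetical) data store later. If the
# process dies before the queue is drained, those writes are lost.
class WriteBehindCache:
    def __init__(self, db_write):
        self._store = {}
        self._pending = queue.Queue()
        self._db_write = db_write
        threading.Thread(target=self._flush_loop, daemon=True).start()

    def put(self, key, value):
        self._store[key] = value         # fast: update only the cache
        self._pending.put((key, value))  # queue the write for later
        # return immediately, without waiting for the database

    def get(self, key):
        return self._store.get(key)

    def _flush_loop(self):
        while True:
            key, value = self._pending.get()
            self._db_write(key, value)   # the slow database write happens here


def db_write(key, value):
    print(f"persisted {key}={value}")    # stand-in for a real database write


cache = WriteBehindCache(db_write)
cache.put("counter", 1)
time.sleep(0.1)  # demo only: give the background worker a moment to flush
```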

Cons:

  • There could be data loss if the cache goes down prior to its contents hitting the data store.
  • It is more complex to implement write-behind than it is to implement cache-aside or write-through.

4. Cache-Aside Pattern

(Source: Microsoft)

  1. Read data from the cache
  2. Read data from the database in case of cache miss
  3. Write data to the cache
  4. And return data

The application is responsible for reading from and writing to the database; the cache doesn't interact with the database at all. The cache is "kept aside" as a faster, more scalable in-memory data store.

Memcached is generally used in this manner.

Subsequent reads of data added to the cache are fast. Cache-aside is also referred to as lazy loading: only requested data is cached, which avoids filling the cache with data that isn't requested.
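
A minimal cache-aside sketch; a plain dictionary stands in for Memcached or Redis, and query_orders_from_db and the key/TTL values are hypothetical:

```python
import json
import time

# Cache-aside sketch: the *application* checks the cache, falls back to the
# database on a miss, and writes the result back with a TTL. The cache never
# talks to the database.
cache = {}                      # key -> (serialized value, expires_at)
TTL_SECONDS = 300

def query_orders_from_db(user_id):
    # Placeholder database query (assumption for illustration).
    return [{"order_id": 1, "user_id": user_id}]

def get_orders(user_id):
    key = f"orders:{user_id}"
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return json.loads(entry[0])                               # 1. cache hit
    orders = query_orders_from_db(user_id)                        # 2. miss: read the DB
    cache[key] = (json.dumps(orders), time.time() + TTL_SECONDS)  # 3. populate the cache
    return orders                                                 # 4. return the data

print(get_orders(7))   # first call loads from the "database"
print(get_orders(7))   # second call is served from the cache until the TTL expires
```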

Cons:

  • Each cache miss results in extra trips, which can cause a noticeable delay.
  • Data can become stale if it is updated in the database. This issue is mitigated by setting a time-to-live (TTL) which forces an update of the cache entry, or by using write-through.

References:

https://lethain.com/introduction-to-architecting-systems-for-scale/

https://aws.amazon.com/caching/best-practices/

https://medium.com/must-know-computer-science/system-design-caching-acbd1b02ca01

https://azure.microsoft.com/en-gb/resources/cloud-computing-dictionary/what-is-caching

https://newsletter.systemdesign.one/p/caching-patterns


