The Joys of Caching

Caching data can improve system performance. Let's take a look.

We have a data store that associates data records with immutable data keys:

k1: {h: 183}        

We call GET on datastore/data/k1, getting k1: {h: 183} in response.
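As a concrete sketch of that call, here is what a Java client might look like. The datastore host name and path are illustrative; the real endpoint depends on your deployment.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DataStoreRead {
    public static void main(String[] args) throws Exception {
        // GET datastore/data/k1 -- the host and path are illustrative
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://datastore/data/k1"))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // e.g. k1: {h: 183}
    }
}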

The data record represents an object's height of 183 cm. The record's values (such as 183) can be updated, but its keys (k1 and h) cannot. For now, we won't be concerned with updates.

We have a single data store and two clients, Client 1 and Client 2.

Clients 1 and 2 both read k1, and we have this:

Clients 1 and 2, after reading k1

At this stage, it is essential to note that Clients 1 and 2 each hold their own copy of the data from the data store.

Performance

Crucially, for performance, if Client 1 or 2 subsequently requires access to the k1 data record, there is no need to contact the data store and incur the overhead of the call to GET datastore/data/k1. k1 is in the client's memory and can be accessed locally without a remote invocation.

The overhead of the client retrieving the local copy is 10s of milliseconds. A call to the remote Data Store will likely take 100s of milliseconds. Using a local copy, i.e., a cache, in this scenario improves performance by an order of magnitude.

The failure profile is also different. The cache call is purely local. The Data Store retrieval is a remote invocation, and the remote store may fail independently of the client.

Computation load has also been shared. Successfully retrieving from the cache means the Data Store is not contacted and has no work to do.

Some Numbers

Let's assume a local cache invocation takes ten milliseconds, a remote one takes 200, and we access k1 1,000 times. The total cost to access via the cache will be:

200 + (999 × 10) = 10,190 milliseconds

There will be a single read costing 200 milliseconds that stores k1 in the client cache, and the subsequent 999 reads will come from the cache, each taking ten milliseconds.

Without the cache, the total cost is simply the 200-millisecond cost, 1000 times:

1,000 × 200 = 200,000 milliseconds

The cache reduces the data access overhead by an order of magnitude.

Each cache hit saves 190 milliseconds, and that saving is multiplied by 999, for a total of 189,810 milliseconds.
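The same sums expressed in code, using the assumed timings above:

public class CacheSavings {
    public static void main(String[] args) {
        long localMs = 10, remoteMs = 200, reads = 1_000;
        long withCache = remoteMs + (reads - 1) * localMs; // one miss, 999 hits
        long withoutCache = reads * remoteMs;              // every read is remote
        System.out.println(withCache);                     // 10190
        System.out.println(withoutCache);                  // 200000
        System.out.println((reads - 1) * (remoteMs - localMs)); // 189810 saved
    }
}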

What we have learned is that you need to keep your data close but your cache closer [3].

Calling via a RESTful Web Service

Calling the remote Data Store involves the request passing down through the client's call stack, across the network, and up the Data Store's call stack. k1 is retrieved, and the Data Store passes the result back down its call stack; the data then crosses the network and travels up the client's stack.

Data processing and checks must be performed within each layer of both software stacks to ensure that both the GET request and response are valid and secure. Request and response data must also be translated (serialised and deserialised) into a format suitable for network transfer. If the amount of data being passed in either direction is significant, this translation can represent a major proportion of the remote call overhead.

Processing within these layers is the primary reason such an invocation takes 100s of milliseconds. If the client and Data Store reside within the same server cabinet, the time the data spends on the network will be negligible compared to the software stack overhead. If the client and Data Store are separated by a continent, network delay will be more significant and errors more likely, but the time to process the data through the client and remote software stacks will be much the same in both cases.

The software stack overhead will be constant, and your architecture will dictate the network overhead.

A cache ensures the client's overhead is only the small data retrieval cost, not the expensive software stack processing and any network delay.

The Joys of Caching

From the client's perspective, a call to GET datastore/data/k1 returns k1: {h: 183}.

As there are no updates, it is not important where the data comes from. Whether k1 comes from the local in-memory copy or the data store makes no difference to the client.

The local in-memory copy in the client is cached data

A call to GET datastore/data/k1 might retrieve the local copy or initiate a call to the data store, subsequently retaining a client-resident copy.

The Cache

In the client, the call to GET datastore/data/k1 is a call to its in-memory cache. The cache then supplies the data either directly from its own memory or by retrieving it from the data store, retaining a copy for subsequent use.

The client is unaware that the data is being returned locally. The data store will be invoked once to load (or prime) the cache, but from then on, in our scenario, the data is hidden in the cache [1].

If the cache's functionality were switched off and all GETs were satisfied by the data store, the system would behave the same. Using a cache must not affect a system's behaviour, only its performance.

The GET is called on technology that exports a RESTful web-based interface, even though that interface is implemented with a cache. The GET abstracts away the presence of the cache. If some other cache implementation were substituted, the client code invoking the GET would not have to change. This is possible because the caching is hidden from the client code, which in turn is possible because the cache affects only the system's performance, not its behaviour.
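As an illustration of this transparency, here is a minimal Java sketch with all names invented for the example: the client codes against a DataStore interface, so a caching implementation can be substituted for an uncached one without the client changing.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// The client depends only on this interface.
interface DataStore {
    String get(String key);
}

// Uncached implementation: every get is a remote invocation.
class RemoteDataStore implements DataStore {
    public String get(String key) {
        return remoteGet(key); // stands in for GET datastore/data/{key}
    }
    private String remoteGet(String key) {
        return "{h: 183}"; // stub for the real remote call
    }
}

// Caching decorator: same interface, same behaviour, better performance.
class CachingDataStore implements DataStore {
    private final DataStore delegate;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachingDataStore(DataStore delegate) { this.delegate = delegate; }

    public String get(String key) {
        // A miss primes the cache; subsequent reads are purely local.
        return cache.computeIfAbsent(key, delegate::get);
    }
}

public class CacheDemo {
    public static void main(String[] args) {
        DataStore store = new CachingDataStore(new RemoteDataStore());
        System.out.println(store.get("k1")); // remote call, primes the cache
        System.out.println(store.get("k1")); // served from the cache
    }
}

Swapping new CachingDataStore(new RemoteDataStore()) back to a bare new RemoteDataStore() changes the system's performance but not its behaviour, which is exactly the property described above.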

Data Writes

Let's look at what happens when k1 in the data store is updated.

Before the update, we have this:

Before the Data Store is Updated

After the Data Store is updated:

After the Data Store is updated

The Data Store has the updated height (175), but the two caches retain the old value (183).

We now have a problem.

When client 1 or 2 reads k1, the answer (k1: {h: 183}) will come from the cache.

But this data is outdated; h is now 175, not 183.

This situation occurred because the cached data was an unsynchronized copy.

Possible Solutions

Given the above distributed system, three possible solutions present themselves:

  1. The cache checks with the Data Store before it returns anything
  2. The Data Store updates each cache when it receives an update
  3. You design your distributed application to tolerate inconsistency [2]

Cache Coherence

Options 1 and 2 seek to achieve cache coherence, that is, to synchronize the data in the caches with that in the Data Store; however, the two options are not equivalent.
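Sketched below (with illustrative class and method names of my own), the two options place the synchronisation cost in different places. The cheap version check in solution 1, similar to an HTTP ETag, is an assumption made for this sketch.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Solution 1: validate on read. Every get pays a remote round trip
// to check freshness before returning.
class ValidatingCache {
    private final Map<String, String> values = new ConcurrentHashMap<>();
    private final Map<String, String> versions = new ConcurrentHashMap<>();

    String get(String key) {
        String remoteVersion = remoteVersionOf(key); // remote call on every read
        if (!remoteVersion.equals(versions.get(key))) {
            values.put(key, remoteGet(key));         // refresh the stale entry
            versions.put(key, remoteVersion);
        }
        return values.get(key);
    }

    private String remoteVersionOf(String key) { return "v2"; }  // stub
    private String remoteGet(String key) { return "{h: 175}"; }  // stub
}

// Solution 2: push on write. Reads are purely local; the Data Store
// calls onUpdate on every registered cache whenever a key changes.
class PushUpdatedCache {
    private final Map<String, String> values = new ConcurrentHashMap<>();

    String get(String key) {
        return values.get(key); // local only; stale until the next push arrives
    }

    void onUpdate(String key, String newValue) { // driven by the Data Store
        values.put(key, newValue);
    }
}

public class CoherenceDemo {
    public static void main(String[] args) {
        ValidatingCache v = new ValidatingCache();
        System.out.println(v.get("k1"));   // checks the Data Store first

        PushUpdatedCache p = new PushUpdatedCache();
        p.onUpdate("k1", "{h: 175}");      // Data Store pushes the new value
        System.out.println(p.get("k1"));   // purely local read
    }
}

Solution 1 charges every read; solution 2 charges every write. Which is cheaper depends on the read/write ratio.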

Let's assume your application is mostly reads, i.e., 90% of the time, GETs are performed on the Data Store, and only 10% of the time is data updated.

In this scenario, the cache checking with the Data Store before returning anything will be pointless 90% of the time.

Let's now assume that your application is mostly writes, so that 90% of the time, the Data Store is being updated, and only 10% of the time are caches read.

In this scenario, the cache is updated with correct data 90% of the time, but that data is only read from the cache 10% of the time.

We can see that choosing 1 or 2 as the "best" (performance enhancing) solution is highly dependent on your application's behaviour, which will change over time. Parts of your system will be characterised by mostly reads, other parts mostly writes, and the ratio of one to the other will fluctuate.

If your system is stable in its characteristics, e.g., it is mostly reads the vast majority of the time, then solution 2 (the Data Store pushing updates to the caches) will be the solution for you, as reads stay local and only the few writes bear the synchronisation cost. However, you must monitor this to know that this vital engineering assumption remains true in the long term. If your system were to flip over to being mostly writes, the performance benefit derived from the mostly-reads assumption would disappear. You may see such a flip after scaling your system, so that assumptions that held with 100 customers are false with 10,000.

Tolerating the Inconsistency

As the cache is an unsynchronized copy of data in the Data Store, a period of inconsistency will occur after the Data Store is updated, as illustrated below.

Timeline of Data Inconsistency at the Caches when the Data Store is Updated

The Data Store initially has k1: {h: 183}, which is copied to the two caches; k1 at the Data Store is subsequently changed to k1: {h: 175}. From this point on, we have 183 in the two caches and 175 in the Data Store.

From this point until the caches are updated with the changed data, the old value of k1: {h: 183} is served from the caches. This is the period of inconsistency (see the red arrows above).

Some applications can tolerate inconsistencies such as this. For example, when paying money into a UK bank account, the bank's mobile phone app may not show this transaction for some time, causing the bank balance to be incorrect.

However, some applications cannot tolerate such inconsistencies. If your medical record is being updated at a hospital, you do not want your family doctor to access an incomplete record and possibly make a decision without all of the information.
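One common way to bound, rather than remove, the period of inconsistency is to give each cached entry a time-to-live (TTL), after which the cache re-reads the Data Store. The sketch below illustrates that idea with invented names; it is one option, not a prescription.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A time-to-live (TTL) cache: stale data is served for at most ttlMillis.
class TtlCache {
    private record Entry(String value, long loadedAtMillis) {}

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    TtlCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    String get(String key) {
        Entry e = cache.get(key);
        long now = System.currentTimeMillis();
        if (e == null || now - e.loadedAtMillis() > ttlMillis) {
            // Missing or older than the TTL: re-read the Data Store.
            e = new Entry(remoteGet(key), now);
            cache.put(key, e);
        }
        return e.value(); // stale by at most ttlMillis
    }

    private String remoteGet(String key) { return "{h: 175}"; } // stub
}

public class TtlDemo {
    public static void main(String[] args) {
        TtlCache cache = new TtlCache(5_000); // accept up to 5 seconds of staleness
        System.out.println(cache.get("k1"));  // loads from the Data Store
        System.out.println(cache.get("k1"));  // within the TTL: purely local
    }
}

A TTL caps the inconsistency rather than eliminating it, so it suits cases like the bank balance above, but not the medical record.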

Summary

The benefits of a cache are maximised when you place your data as close to the consumer as possible. In our example, removing the costly remote call saves 190 milliseconds per read, which builds to a saving of almost 190 seconds over 1,000 reads.

Hiding a cache's presence is important so that the invoking client is unaware of its existence. This allows for more system design flexibility.

Resources

[1] https://dictionary.cambridge.org/dictionary/french-english/cacher

[2] https://www.dhirubhai.net/pulse/understanding-inconsistency-suds-huw-evans-v2cre/

[3] https://www.youtube.com/watch?v=TqoMFLXsyAs&t=210s

要查看或添加评论,请登录

Huw Evans的更多文章

  • Debugging and the Scientific Method

    Debugging and the Scientific Method

    The scientific method helps you gain knowledge [1]. You make an observation and test it with an experiment that shows…

  • Debugging

    Debugging

    This is typically how I go about debugging a piece of code: What is wrong? Reproducing the error Finding the source of…

  • Understanding Inconsistency with SUDs

    Understanding Inconsistency with SUDs

    This article shows why inconsistency and latency are fundamental when building distributed systems and how PACELC and…

  • Software Engineering builds two Things

    Software Engineering builds two Things

    When we write software, we build two things. The software that provides the business solution.

  • Smaller teams are more reactive

    Smaller teams are more reactive

    On November 2 2022, I wrote an article on how I had recruited 12 new employees. This article covers what happened next.

  • Lazily filtering out non-Cats

    Lazily filtering out non-Cats

    In a previous article, I discussed how to safely generate a list of subtypes from an original list defined on a…

  • Cats are not Dogs

    Cats are not Dogs

    Who in life has not tried to do this? Trying to treat a list of Animal as a list of a subtype. This does not compile in…

  • Failure is a subtype of Success

    Failure is a subtype of Success

    This article considers how to cleanly handle both the failure and success paths in code, taking a look at how Java's…

  • Teaching Agile gives student a fish

    Teaching Agile gives student a fish

    Teaching a student or colleague agile software development or more generally agile project management gives them a…

  • Agile Manifesto #5

    Agile Manifesto #5

    The Agile Manifesto [1] states that the left-hand side of the following are preferred over the right-hand side:…

社区洞察

其他会员也浏览了