The Joys of Caching
Caching data can improve system performance. Let's take a look.
We have a data store that associates data records with immutable data keys:
k1: {h: 183}
We call GET on datastore/data/k1, getting k1: {h: 183} in response.
The data record represents an object's height: 183 cm. The values (such as 183) can be updated, but the keys (k1 and h) cannot. For now, we won't be concerned with updates.
We have a single data store and two clients, Client 1 and Client 2.
Clients 1 and 2 both read k1.
At this stage, the important point is that each client now holds its own copy of the data held in the data store.
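To make this concrete, here is a minimal sketch of such a store; the DataStore class and its in-memory dict are illustrative assumptions, not the article's actual service:

# A minimal sketch of the data store, assuming an in-memory dict.
class DataStore:
    def __init__(self):
        self._records = {"k1": {"h": 183}}

    def get(self, key):
        # Corresponds to GET datastore/data/<key>.
        return self._records[key]

    def update(self, key, value):
        # Values can be replaced; keys are immutable.
        self._records[key] = value

store = DataStore()
print(store.get("k1"))   # {'h': 183}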
Performance
Crucially, for performance, if Client 1 or 2 subsequently requires access to the k1 data record, there is no need to contact the data store and incur the overhead of the call to GET datastore/data/k1. k1 is in the client's memory and can be accessed locally without a remote invocation.
The overhead of the client retrieving the local copy is 10s of milliseconds. A call to the remote Data Store will likely take 100s of milliseconds. In this scenario, using a local copy (which is what a cache is) improves performance by an order of magnitude.
The failure profile is also different. The cache call is purely local. The Data Store retrieval is a remote invocation, and the remote store may fail independently of the client.
The computational load is also reduced: a successful cache retrieval means the Data Store is not contacted and has no work to do.
Some Numbers
Let's assume a local cache invocation takes ten milliseconds, a remote one takes 200, and we access k1 1,000 times. The total cost to access via the cache will be:
200 + (999 × 10) = 10,190 milliseconds
There will be a single 200-millisecond read that loads k1 into the client cache; the subsequent 999 reads will be served from the cache, each taking ten milliseconds.
Without the cache, the total cost is simply the 200-millisecond cost, 1,000 times:
1,000 × 200 = 200,000 milliseconds
The cache reduces the data access overhead by an order of magnitude.
Each cache access saves 190 milliseconds, which is multiplied by 999.
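As a quick check on that arithmetic, a short sketch using the illustrative figures above (these are the article's assumed timings, not measurements):

LOCAL_MS = 10      # assumed local cache read
REMOTE_MS = 200    # assumed remote Data Store read
READS = 1_000

with_cache = REMOTE_MS + (READS - 1) * LOCAL_MS   # one miss, then 999 hits
without_cache = READS * REMOTE_MS

print(with_cache)                   # 10190
print(without_cache)                # 200000
print(without_cache - with_cache)   # 189810 = 999 hits x 190 ms saved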
What we have learned is that you need to keep your data close but your cache closer [3].
Calling via a RESTful Web Service
Calling the remote Data Store involves the client calling down through its call stack, the data being passed across the network, and then up the Data Store's call stack. k1 is retrieved, and the Data Store passes the result back down its call stack; the data is passed across the network and back up the client's stack.
Data processing and checks must be performed within each layer of both software stacks to ensure that both the GET request and response are valid and secure. Request and response data must be translated into a format compatible with network transfer. If the amount of data being passed in either direction is significant, this translation can represent a major proportion of the remote call overhead.
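To illustrate that translation step, here is a sketch of the JSON encode and decode a client and store might each perform per call; the record is the one above, and the round trip shown here is in-process only:

import json

record = {"k1": {"h": 183}}

# The response must be serialised for the network...
wire_bytes = json.dumps(record).encode("utf-8")

# ...and deserialised again on the receiving side.
received = json.loads(wire_bytes.decode("utf-8"))

assert received == record   # same data; the translation is paid on every call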
Processing within these software stack layers is the primary reason such an invocation will take 100s of milliseconds. If the client and Data Store are resident within the same server cabinet, the time the data spends on the network will be negligible in comparison to the software stack overhead. If the client and Data Store are separated by a continent, network delay will be more significant, and errors will be more likely, but the time to process the data through the client and remote software stacks will be much the same in both cases.
The software stack overhead will be constant, and your architecture will dictate the network overhead.
A cache ensures the client's overhead is only the small data retrieval cost, not the expensive software stack processing and any network delay.
The Joys of Caching
From the client's perspective, a call to GET datastore/data/k1 returns k1: {h: 183}.
As there are no updates, it is not important where the data comes from. Whether k1 comes from the local in-memory copy or the data store makes no difference to the client.
A call to GET datastore/data/k1 might retrieve the local copy or initiate a call to the data store, subsequently retaining a client-resident copy.
The Cache
In the client, the call to GET datastore/data/k1 is a call to its in-memory cache. The cache then supplies the correct data, either directly or by retrieving it from the data store and retaining the copy for subsequent use.
The client is unaware that the data is being returned locally. The data store will be invoked once to load (or prime) the cache, but from then on, in our scenario, the data is served from the cache [1].
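A minimal sketch of such a read-through cache, building on the DataStore sketch above (the CachingClient name is illustrative):

# A read-through cache in front of the store. A miss loads (primes)
# the entry from the store; a hit is answered locally.
class CachingClient:
    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._store.get(key)   # miss: prime the cache
        return self._cache[key]                       # hit: purely local

client = CachingClient(store)
client.get("k1")   # first read contacts the store
client.get("k1")   # subsequent reads never leave the client

Because the caller only ever sees get, this implementation could be swapped for another, or removed entirely, without the calling code changing.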
If the cache's functionality were switched off and all GETs were satisfied by the data store, the system would behave the same. Using a cache must not affect a system's behaviour, only its performance.
The GET is called on technology that exports a RESTful web interface; behind that interface sits the cache. The GET abstracts the presence of the cache: if some other cache implementation were substituted, the client code invoking the GET would not have to change. This is possible because the caching is hidden from the client code, which in turn is possible because the cache does not affect system behaviour, only its performance.
Data Writes
Let's look at what happens when k1 in the data store is updated.
Before the update, the Data Store and both client caches hold k1: {h: 183}.
After the Data Store is updated to k1: {h: 175}:
The Data Store has the updated height (175), but the two caches retain the old value (183).
We now have a problem.
When client 1 or 2 reads k1, the answer (k1: {h: 183}) will come from the cache.
But this data is outdated; h is now 175, not 183.
This situation occurred because the cached data was an unsynchronized copy.
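Continuing the sketches above, the stale read is easy to reproduce:

store.update("k1", {"h": 175})   # the Data Store now holds 175

print(store.get("k1"))    # {'h': 175} - the new value
print(client.get("k1"))   # {'h': 183} - the stale, unsynchronized copy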
Possible Solutions
Given the above distributed system, three possible solutions present themselves:
1. When the Data Store is updated, push the update out to the caches so their copies are refreshed.
2. Before returning any data, have the cache check with the Data Store that its copy is still current.
3. Tolerate the inconsistency.
Cache Coherence
Solutions 1 and 2 seek to achieve cache coherence, which is to synchronize the data in the cache with that in the Data Store; but 1 and 2 are not equivalent.
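A sketch of the two coherence strategies, extending the illustrative classes above; a real system would use cheaper currency checks (such as version numbers or ETags) than the full re-reads shown here:

# Solution 1: push on write. The store refreshes every registered
# cache whenever a value changes.
class PushingStore(DataStore):
    def __init__(self):
        super().__init__()
        self._clients = []

    def register(self, client):
        self._clients.append(client)

    def update(self, key, value):
        super().update(key, value)
        for client in self._clients:
            client.refresh(key, value)   # remote push on every write

class PushClient(CachingClient):
    def refresh(self, key, value):
        self._cache[key] = value

# Solution 2: check on read. The cache verifies its copy against the
# store before returning anything.
class CheckingClient(CachingClient):
    def get(self, key):
        self._cache[key] = self._store.get(key)   # remote check on every read
        return self._cache[key]

pushing_store = PushingStore()
push_client = PushClient(pushing_store)
pushing_store.register(push_client)

Solution 1 makes every write more expensive; solution 2 makes every read more expensive. Which is the better trade is exactly what the read/write ratios below decide.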
Let's assume your application is mostly reads, i.e., 90% of the time the data is being read, and only 10% of the time is it updated.
In this scenario, the cache checking with the Data Store before returning anything (solution 2) will be pointless 90% of the time.
Let's now assume that your application is mostly writes, so that 90% of the time, the Data Store is being updated, and only 10% of the time are caches read.
In this scenario, the cache is pushed the correct data (solution 1) 90% of the time, but that data is only read from the cache 10% of the time; most of the pushes are wasted.
We can see that choosing 1 or 2 as the "best" (performance-enhancing) solution is highly dependent on your application's behaviour, which will change over time. Parts of your system will be characterised by mostly reads, others by mostly writes, and the ratio of one to the other will fluctuate.
If your system is stable in its characteristics, e.g., it is mostly reads the vast majority of the time, then point 1 will be the solution for you. However, you must monitor this to know that this vital engineering assumption remains true in the long term. If your system were to flip over to being mostly writes, the performance benefit you derive from the mostly-reads assumption would disappear. You may get such a flip after scaling your system, so that assumptions that held with 100 customers are false with 10,000.
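A rough cost model makes the flip visible, reusing the illustrative timings from earlier (10 ms local, 200 ms remote); the doubled write cost under point 1 assumes a single cache to push to:

def cost_push_on_write(reads, writes):
    # Point 1: every read is a local hit; every write also pushes
    # the new value to the cache remotely.
    return reads * 10 + writes * (200 + 200)

def cost_check_on_read(reads, writes):
    # Point 2: every read pays a remote check; writes touch only the store.
    return reads * (200 + 10) + writes * 200

print(cost_push_on_write(900, 100), cost_check_on_read(900, 100))
# mostly reads:  49000 vs 209000 ms - point 1 wins
print(cost_push_on_write(100, 900), cost_check_on_read(100, 900))
# mostly writes: 361000 vs 201000 ms - point 2 wins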
Tolerating the Inconsistency
As the cache is an unsynchronized copy of data in the Data Store, a period of inconsistency will occur after the Data Store is updated.
The Data Store initially has k1: {h: 183}, which is copied to the two caches; k1 at the Data Store is subsequently changed to {h: 175}. From this point on, we have 183 in the two caches and 175 in the Data Store.
From this point until the caches are updated with the changed data, the old value of k1: {h: 183} is what the caches return. This is the period of inconsistency.
Some applications can tolerate inconsistencies such as this. For example, when money is paid into a UK bank account, the bank's mobile phone app may not show the transaction for some time, so the displayed balance is temporarily out of date.
However, some applications cannot tolerate such inconsistencies. If your medical record is being updated at a hospital, you do not want your family doctor to access an incomplete record and possibly make a decision without all of the information.
Summary
The benefits of a cache are maximised when you place your data as close to the consumer as possible. Removing the costly remote call saves about 190 milliseconds per call, which builds to a significant saving over 1,000 calls.
Hiding a cache's presence is important so that the invoking client is unaware of its existence. This allows for more system design flexibility.