Rate-Limiting Simplified With Redis

Introduction

For engineers, witnessing the application they’ve worked on scale from serving hundreds to hundreds of thousands (or even millions) of users is a truly significant achievement. However, this kind of growth brings another challenge: the system, with its finite resources, must handle the increased load while maintaining optimal performance and reliability. It therefore becomes essential to regulate the flow of incoming requests so that the system doesn’t throw its hands up when faced with sudden spikes in traffic. This is where rate limiting comes into play.

Simply put, rate limiting is a technique used to control the rate of incoming requests to a server. It restricts the number of requests a user can make within a specific period. This ultimately reduces the chances of server overload and protects it against abuse (such as bot attacks).

Redis has proven to be an excellent choice for custom rate-limiting functionality. Before we dive into Redis, however, let’s take a look at some common rate-limiting algorithms to better understand the problem we’re solving.

Types of Rate Limiting Algorithms

Fixed Window

This is arguably the most primitive rate-limiting algorithm. Requests are tracked within discrete time windows (such as one second or one minute). If the number of requests exceeds the predefined limit within the window, all subsequent requests are rejected until the window resets.

Although this algorithm is simple to implement and predictable in behavior, it does not handle bursts of traffic efficiently. For example, with a limit of four requests per minute, I could make four requests at the same time at the end of the first window, and four more (again, at the same time) at the beginning of the second window. Although I haven’t exceeded the limit in either window, I have sent eight requests within a very short period, potentially overloading the server.

In practice, very few use cases can utilize these kinds of discrete windows (and such strict rate-limiting rules altogether). Ideally, we want something more adaptable to incoming requests.

Additionally, the behavior of this algorithm at the window boundaries can be quite abrupt, which can result in inconsistent rate limiting.
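To make the mechanics concrete, here is a minimal in-memory sketch of a fixed window counter (a single-process toy with illustrative names and limits, not a production design):

```javascript
// A toy fixed window counter: one shared window for all requests.
const WINDOW_MS = 60 * 1000; // one-minute windows
const LIMIT = 4;             // max requests allowed per window

let windowStart = Date.now();
let count = 0;

function allowRequest() {
  const now = Date.now();
  // Crossing into a new window resets the counter abruptly.
  if (now - windowStart >= WINDOW_MS) {
    windowStart = now;
    count = 0;
  }
  if (count < LIMIT) {
    count += 1;
    return true; // accept the request
  }
  return false; // reject until the window resets
}
```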

Sliding Window

This algorithm can be thought of as a refinement of the fixed window algorithm. Instead of discrete windows of fixed intervals, we now use a window that slides or moves as time progresses. New requests are accepted or rejected based on the number of previous requests that lie within the sliding window.

The sliding window algorithm provides much more granular control, and the abruptness that we may see with the fixed window algorithm is also addressed. Although this algorithm is better at handling bursts of traffic, it may still allow for bursty behavior depending on the configuration. Not to mention the additional complexity and overhead that are introduced, since we need to track requests continuously.
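One common realization is a sliding window log: keep the timestamps of recent requests and count only those that fall inside the window. A minimal in-memory sketch (again a single-process toy with illustrative limits):

```javascript
// A toy sliding window log: track timestamps of accepted requests.
const WINDOW_MS = 60 * 1000; // the window slides over the last minute
const LIMIT = 4;             // max requests within any window

const timestamps = [];

function allowRequest() {
  const now = Date.now();
  // Evict timestamps that have slid out of the window.
  while (timestamps.length > 0 && now - timestamps[0] > WINDOW_MS) {
    timestamps.shift();
  }
  if (timestamps.length < LIMIT) {
    timestamps.push(now);
    return true;
  }
  return false;
}
```

The overhead mentioned above is visible here: we store one timestamp per request instead of a single counter.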

Token Bucket

The token bucket algorithm uses a theoretical “bucket” which is refilled with “tokens” on a fixed interval. This interval typically corresponds to the rate limit itself. For example, if the rate limit is four requests per minute, then tokens will be refilled at a rate of four tokens per minute too. Once the “capacity” of the bucket is exceeded, any additional tokens “overflow” and are discarded.

Each request uses or consumes one token. If no tokens are available in the bucket, then the request is either rejected or throttled until tokens become available.

The capacity of the bucket itself represents the burst limit. For example, if the bucket capacity is ten, then a user can “accumulate” ten tokens (during periods of low activity) and use them in bursts (during periods of high activity).

This algorithm excels at handling bursts of traffic effectively (and it provides a lot of granular control when doing so) and is quite predictable in its behavior. This makes it a good choice for use cases where bursty behavior can be tolerated, such as video streaming.

However, implementing it correctly and efficiently can be quite challenging — and failure to do so can often result in unreasonable request delays or rejections. Additionally, the token refill logic is tricky to optimize (and may have a lot of overhead), which means that preventing token wastage (which would negatively affect efficiency) is another problem that has to be solved.

Leaky Bucket

The leaky bucket uses the “bucket” metaphor once again, but it flips things around a bit. Each request is now considered a “token” which fills up the bucket. The catch here is that tokens “leak” (i.e., are processed) from the bucket at a constant rate. This helps smooth out incoming requests and essentially blocks bursty requests — incoming requests can be bursty, but the responses are always smoothed out. This makes leaky bucket rate limiting good for use cases where we need a continuous flow of data. Plus, it’s relatively simple to implement too (compared to something like the token bucket algorithm).

The drawbacks of using this algorithm include the fact that it does not handle bursts of traffic well. Additionally, if the bucket’s capacity is too small or the “leak rate” is too low, then we may run into a case of request starvation. Generally speaking, the leaky bucket algorithm lacks the flexibility to adapt to changing traffic patterns.
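For illustration, here is a minimal in-memory sketch of the idea, with an arbitrary capacity and leak rate (a single-process toy, not a production design):

```javascript
// A toy leaky bucket: requests queue up and "leak" out at a constant rate.
const BUCKET_CAPACITY = 10;    // max requests the bucket can hold
const LEAK_INTERVAL_MS = 1000; // one request leaks (is processed) per second

const queue = [];

// Try to add a request to the bucket; reject it if the bucket is full.
function submitRequest(handler) {
  if (queue.length >= BUCKET_CAPACITY) {
    return false; // overflow: the request is discarded
  }
  queue.push(handler);
  return true;
}

// Drain the bucket at a fixed rate, smoothing out bursty arrivals.
setInterval(() => {
  const handler = queue.shift();
  if (handler) handler();
}, LEAK_INTERVAL_MS);
```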

Why is Redis a Good Solution?

One word: in-memory.

For those unacquainted, Redis (which stands for Remote Dictionary Server) is an in-memory data structure store. In-memory simply means that data is stored in the system’s main memory (i.e., the RAM) rather than on secondary storage devices (which is what your regular databases use).

Redis is a good solution for rate limiting because:

  1. It’s fast: The in-memory nature of Redis enables extremely fast read and write operations. This means that it can handle the high throughput that is typically required for rate limiting (since incoming requests need to be processed swiftly).
  2. It’s flexible: Redis follows the key-value data model (just like any JSON object). This makes it incredibly flexible — which ultimately means that we’re not limited to a particular type of rate-limiting algorithm. Additionally, Redis provides several data structures (such as sets, sorted sets, and hashes) out of the box and is generally quite easy to work with.
  3. It supports atomic operations: Redis can execute a group of commands as one atomic unit (via MULTI/EXEC transactions or Lua scripts), so no other client’s commands can be interleaved in between. This is crucial for enforcing rate limits consistently and reliably when many requests arrive concurrently.
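As a quick illustration of the third point, here is a sketch using node-redis’s MULTI/EXEC, which queues commands and executes them as one atomic unit (the key name is hypothetical, and `client` is assumed to be a connected node-redis v4 client):

```javascript
// Sketch: increment a counter and reset its TTL in a single atomic step.
async function touchCounter(client) {
  const [count] = await client
    .multi()
    .incr('requests:user123')       // bump the request counter
    .expire('requests:user123', 60) // refresh its 60-second TTL
    .exec();
  return count; // requests seen in the current window
}
```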

Getting Our Hands Dirty

Let’s transport ourselves to JavaScript land and try to build a simple rate limiter with Redis, using the token bucket algorithm. To start, let’s create a simple API route using Node.js and Express that fetches some dummy data.
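A minimal version of such a route might look like this (the payload and port are placeholders):

```javascript
// server.js: a minimal Express app with a dummy data route.
const express = require('express');

const app = express();

app.get('/api/data', (req, res) => {
  // Placeholder payload standing in for real data.
  res.json({ message: 'Here is your data!', timestamp: Date.now() });
});

app.listen(3000, () => console.log('Server listening on port 3000'));
```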

Next, let’s create a simple script that simulates GET requests to the /api/data route. We’ll add a delay between requests and error handling as well.
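A simple version of that script, using the fetch API built into Node 18+ (the request count and delay are arbitrary):

```javascript
// simulate.js: fire a series of GET requests at /api/data with a short delay.
const URL = 'http://localhost:3000/api/data';
const NUM_REQUESTS = 15;
const DELAY_MS = 500;

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function run() {
  for (let i = 1; i <= NUM_REQUESTS; i++) {
    try {
      const res = await fetch(URL);
      console.log(`Request ${i}: ${res.status} ${res.statusText}`);
    } catch (err) {
      console.error(`Request ${i} failed:`, err.message);
    }
    await sleep(DELAY_MS);
  }
}

run();
```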

If we run this script, the result (as expected) is predictable. Each request is simply processed one after the other.

Let’s now introduce Redis into the picture. A couple of prerequisites first, however:

  • You can follow the official Redis installation guide to install Redis on your machine and run the redis-server executable.
  • The next step is to install node-redis, a Redis client for Node.js.

Let’s now create the Redis client and connect to the redis-server that is running locally.
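With node-redis (v4), that might look like the following; with no arguments, createClient defaults to redis://localhost:6379:

```javascript
// redisClient.js: create and connect a client to the local redis-server.
const { createClient } = require('redis');

const redisClient = createClient();

redisClient.on('error', (err) => console.error('Redis Client Error', err));

// node-redis v4 clients must be connected explicitly before use.
redisClient.connect().catch(console.error);

module.exports = redisClient;
```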

Next, let’s add some middleware that allows or restricts access to our /api/data API route based on our rate-limiting parameters. We’re also going to define these rate-limiting parameters within this middleware (a sketch putting it all together follows the list below). To be specific, we need to:

  • Create a unique key (bucketKey) for each client that sends a request. We can use the incoming request’s IP address for now to keep things simple. In practice, a unique API key or access token is a better idea.
  • Define the rate at which tokens are refilled. For this example, let’s stick with a refill rate of ten tokens (numTokens) per minute (interval). We’re going to convert this interval to milliseconds (since the JavaScript Date object measures time in milliseconds too). Our refill rate (per millisecond) is, therefore, numTokens / interval.
  • Define the total capacity (bucketCapacity) of each bucket. Clients will not be able to accumulate more than bucketCapacity number of tokens.
  • Define the initial number of tokens (initialNumTokens) for new clients that connect for the first time and do not have a bucket yet.

With our rate limiting parameters defined, let’s retrieve our bucket from Redis using the unique bucketKey.

Note: Since Redis follows the key-value data model, our bucket is simply going to be a (stringified) JSON object stored against the bucketKey.

If the bucket does not exist, we’re simply going to create one and add initialNumTokens number of tokens to it. We’ll also maintain a property (lastUpdated) that lets us determine when the bucket was last updated.
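In code, that get-or-create step (continuing inside the middleware) might look like this:

```javascript
// Look up this client's bucket; create one on first contact.
let bucket;
const stored = await redisClient.get(bucketKey);

if (stored === null) {
  // New client: start them off with the initial token allowance.
  bucket = { tokens: initialNumTokens, lastUpdated: Date.now() };
  await redisClient.set(bucketKey, JSON.stringify(bucket));
} else {
  bucket = JSON.parse(stored);
}
```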

If the bucket already exists, we need to apply our token refill logic first. To calculate the number of tokens that need to be added to the bucket, we need to:

  • Calculate the time that has elapsed (elapsedTime) since the last request (and subsequently the last time that we updated the bucket).
  • Calculate the number of tokens that need to be added to the bucket. We already have the refill rate, and we can multiply this by the elapsedTime to work this number out. We can then simply add this number to our bucket.
  • Add our token overflow logic. We’ll use Math.min to ensure that any excess tokens (above the predefined bucketCapacity) are discarded.

We’re also going to update the time that the bucket was last updated.
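Those steps translate into a few lines, still inside the middleware:

```javascript
// Top up the bucket based on the time elapsed since the last update.
const now = Date.now();
const elapsedTime = now - bucket.lastUpdated;

// Tokens earned in the meantime; Math.min discards any overflow
// above the bucket's capacity.
bucket.tokens = Math.min(bucket.tokens + elapsedTime * refillRate, bucketCapacity);

// Record when the bucket was last refreshed.
bucket.lastUpdated = now;
```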

Finally, we need to determine whether the current request needs to be rate-limited. This part is simple: if the bucket is not empty, we’ll process the request (and update the values of the bucket on Redis too). Otherwise, we’ll send an HTTP 429 response back.
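A sketch of that final check:

```javascript
// Spend a token if one is available; otherwise reject with HTTP 429.
if (bucket.tokens >= 1) {
  bucket.tokens -= 1;
  await redisClient.set(bucketKey, JSON.stringify(bucket));
  return next();
}

// Persist the refreshed bucket state, then tell the client to back off.
await redisClient.set(bucketKey, JSON.stringify(bucket));
res.status(429).json({ error: 'Too many requests. Please try again later.' });
```

One caveat worth noting: a separate GET followed by a SET is not atomic under concurrent requests, so a production version would fold this read-modify-write into a Lua script or a MULTI/EXEC block, in line with the atomicity point made earlier.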

Before we can test our rate limiter with our script, we need to make sure that our /api/data API route is configured to use this middleware that we just created.
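With Express, that’s just a matter of passing the middleware in before the route handler:

```javascript
// server.js: run requests through the rate limiter before the handler.
const rateLimiter = require('./rateLimiter');

app.get('/api/data', rateLimiter, (req, res) => {
  res.json({ message: 'Here is your data!', timestamp: Date.now() });
});
```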

If we run our script now, we can see our rate limiter in action. Our first ten requests succeed because our bucket initially holds ten tokens. As we consume these tokens, we’re eventually rate-limited by the time we hit our eleventh or twelfth request, and our requests start getting rejected. Our burst limit is capped at ten requests (represented by bucketCapacity), which is exactly the behavior we observe.

And with that, we can call it a day. Simple, right?

Concluding Thoughts

Rate limiting plays a pivotal role in managing the flow of incoming traffic to our applications, preventing server overload and protecting them against abuse. By understanding how different rate-limiting algorithms work and leveraging something like Redis to develop a simple yet effective rate limiter, engineers can ensure that their applications scale without compromising performance or reliability.
