Token Bucket Algorithm in a Distributed AWS Lambda Environment
Gabriel L.
Software Engineer | Solution Architecture | Data Integration Solutions | AWS Cloud-Native Solutions with Python | Microservices Architecture | Clean Code | REST APIs | API Gateway Governance | SOLID | TDD | YAGNI | DRY
The token bucket algorithm is crucial for network traffic management, as it controls the rate of packet transmission over a network based on token availability. Each token represents the right to transmit a certain volume of data. Tokens are added to a bucket at a consistent rate, never surpassing the bucket's limit. If the bucket is empty, incoming packets must either wait for new tokens to become available or be dropped.
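The core mechanics can be sketched in a few lines of Python (a minimal single-process illustration of the classic algorithm, not the distributed version discussed below):

```python
import time

class TokenBucket:
    """Minimal single-process token bucket (illustrative sketch)."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # bucket limit; tokens never exceed this
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def _refill(self) -> None:
        now = time.monotonic()
        # Add tokens at a constant rate, capped at the bucket's capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now

    def try_consume(self, tokens: int = 1) -> bool:
        """Return True if the request may proceed; False means wait or drop."""
        self._refill()
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False
```

A caller checks `try_consume()` before each transmission; a `False` result maps to the "wait or drop" decision described above.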
Now, applying this concept to a distributed environment with AWS Lambda, we can design a solution to integrate with our vendors while keeping the throughput below the stipulated TPS (Transactions Per Second), offering a low-cost and scalable solution.
Distributed Token Bucket Model
In this model, an AWS Lambda function named "vendor-outbound-gateway-lambda-service" is responsible for making requests to the vendor's REST APIs.
Infrastructure Components: an SQS queue (inbound requests), an SNS topic (responses), an S3 bucket (large payloads), and the Lambda function itself.
Application Components: the "vendor-outbound-gateway-lambda-service", which consumes SQS batches and calls the vendor's REST APIs.
All communication between the system and the vendor occurs through this Lambda service. Messages are passed through the SQS queue, and responses are handled via the SNS topic. Large payloads are stored in S3, and only the object keys are passed through the channels.
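That flow can be sketched as an SQS-triggered handler with the S3 fetch, vendor call, and SNS publish injected as callables. All names here (`s3_key`, the callable signatures) are illustrative assumptions, not the actual service's code:

```python
import json

def make_handler(fetch_payload, call_vendor, publish_response):
    """Build an SQS-triggered Lambda handler (illustrative sketch).

    fetch_payload(key)      -> load a large payload from S3 by object key
    call_vendor(payload)    -> POST the payload to the vendor's REST API
    publish_response(resp)  -> publish the vendor's response to the SNS topic
    """
    def handler(event, context=None):
        results = []
        # Each SQS record carries only an S3 object key, not the full payload.
        for record in event["Records"]:
            body = json.loads(record["body"])
            payload = fetch_payload(body["s3_key"])
            response = call_vendor(payload)
            publish_response(response)
            results.append(response)
        return results
    return handler
```

Injecting the integrations as callables keeps the batch-processing logic testable without AWS credentials; in production they would wrap boto3 calls and an HTTP client.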
The vendor imposes a rate limit of 10 TPS. To stay within this limit, the token bucket algorithm can be employed where each Lambda instance represents a token. Each instance has a fixed runtime (AWS Lambda timeout), and tokens (Lambda instances) are never generated beyond the system's capacity (concurrency). As instances finish processing, new tokens (Lambda instances) are made available.
Configuring the Token Bucket
Each token can process up to 10 requests, which corresponds to an SQS batch size of 10. For example, with an AWS Lambda concurrency setting of 1 (i.e., 1 token), each token can handle 10 requests at a time.
Increasing the concurrency to 2 tokens allows for 20 requests to be processed, and so on. However, this model assumes that each Lambda instance completes execution in exactly 1 second, enabling the processing of 10 requests per second (per instance).
But real-world factors like latency complicate this. For example, if each request takes 5 seconds plus additional overhead for Lambda cold starts, deserialization, and response processing, the total processing time could reach 10 seconds. In this case, 10 requests every 10 seconds means that the throughput is effectively 1 TPS per token. To achieve 10 TPS, you would need 10 tokens.
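The sizing arithmetic above can be captured in a small helper (a sketch; the parameter names are assumptions):

```python
import math

def tokens_needed(target_tps: float, batch_size: int,
                  seconds_per_batch: float) -> int:
    """Lambda concurrency (tokens) needed to sustain target_tps.

    Each token (Lambda instance) processes `batch_size` requests
    every `seconds_per_batch` seconds.
    """
    per_token_tps = batch_size / seconds_per_batch
    return math.ceil(target_tps / per_token_tps)
```

With the numbers from the example (10 TPS target, batches of 10, 10 seconds per batch), this yields the 10 tokens stated above; in the idealized 1-second case, a single token suffices.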
Calculating Visibility Timeout and Token Availability
In this scenario, the ideal visibility timeout for the SQS queue would be at least 15 seconds. This ensures that if a batch of messages is polled but no token is available, the messages will return to the queue after 15 seconds, allowing time for a new token to be released. Messages that are in flight will be processed within this window, ensuring that no duplicate messages are sent before their current processing is complete.
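One way to derive that figure is processing time plus a safety buffer (the 5-second buffer below is an assumption chosen to match the 15-second example):

```python
import math

def visibility_timeout(seconds_per_batch: float,
                       buffer_seconds: float = 5.0) -> int:
    """Suggested SQS visibility timeout in seconds.

    Processing time plus a safety buffer, so in-flight messages are
    not redelivered to another consumer mid-processing.
    """
    return math.ceil(seconds_per_batch + buffer_seconds)
```

For the 10-second batch from the example this returns 15, matching the recommendation above.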
Throughput Measurement
Throughput is calculated as:
Throughput = (requests per token × number of tokens available) / (time until the next token is available)
For example, with 5 tokens, each processing 10 requests, and 10 seconds until the next token becomes available: (10 requests × 5 tokens) / 10 s = 5 TPS.
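The formula translates directly into code (a sketch; the function name is an assumption):

```python
def throughput_tps(requests_per_token: int, tokens: int,
                   seconds_until_next_token: float) -> float:
    """Effective throughput in transactions per second."""
    return requests_per_token * tokens / seconds_until_next_token
```

Plugging in the example's values (10 requests, 5 tokens, 10 seconds) reproduces the 5 TPS result.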
Problems with this Solution
By refining this approach with dynamic rate limiting, backoff strategies, and appropriate timeout configurations, you can ensure smooth and efficient integration with the vendor while staying within their rate limits.