Full Power of Vector Unleashed | In-Depth Look at Adaptive Request Concurrency Mechanism
Introduction of Vector
Vector, a high-performance end-to-end observability data pipeline written in Rust, frequently encounters scenarios where upstream and downstream processing rates don't match. Often, Vector's actual throughput exceeds the processing capacity of downstream databases like Elasticsearch or ClickHouse.
To prevent overloading downstream services, Vector typically employs a rate-limiting mechanism. For example, in the HTTP Sink, users can configure request.rate_limit_duration_secs and request.rate_limit_num to control the rate. Below is an example where Vector limits the downstream rate to 10 ops(operations per second):
sinks:
my-sink:
request:
rate_limit_duration_secs: 1
rate_limit_num: 10
However, static rate limiting isn't a perfect solution, as it struggles to adapt to dynamic conditions.
Why Adaptive Request Concurrency is Needed
There are two major challenges with static rate limiting:
The optimal rate is always constrained by factors like the number of Vector instances, the capacity of downstream services, and the volume of data being sent. Static rate limiting can't dynamically adjust to these ever-changing factors.
Limitations of Static Rate Limiting
The following diagram illustrates the limitations of static rate limiting:
Adaptive Request Concurrency: Overcoming the Static Rate Limiting Bottleneck
To address the limitations of static rate limiting, the Vector team designed an Adaptive Request Concurrency mechanism (ARC), inspired by the experiences shared in Netflix's blog post, Performance Under Load. ARC dynamically adjusts the data flow rate based on current system statistics, maximizing system throughput and preventing the resource wastage and overload issues associated with static rate limiting.
领英推荐
How ARC Works
When ARC is enabled, Vector monitors two key metrics to dynamically adjust the flow control strategy:
Using these metrics, Vector employs the AIMD (Additive Increase Multiplicative Decrease) algorithm to adjust the traffic:
The following diagram illustrates how Vector dynamically adjusts traffic in ARC mode based on system load:
AIMD Algorithm and TCP Congestion Control
The AIMD algorithm is derived from TCP congestion control and is widely used in network protocols to effectively balance bandwidth utilization and network stability. In Vector, AIMD helps maintain high throughput while preventing system crashes caused by overload, ensuring system stability.
ARC is implemented as a Tower Layer in Vector. For those interested, the implementation details can be explored in this file.
Configuring ARC
ARC is currently supported by all HTTP data-receiving sinks. Taking ClickHouse Sink as an example, users can enable Adaptive Request Concurrency with the following configuration:
sinks:
clickhouse_internal:
type: clickhouse
inputs:
- log_stream_1
- log_stream_2
host: https://clickhouse-prod:8123
table: prod-log-data
request:
concurrency: adaptive
With this configuration, Vector will use Adaptive Request Concurrency to write data to ClickHouse. After enabling ARC, Vector will automatically adjust the data transmission rate based on real-time system conditions, ensuring that the system remains stable and efficient, even under varying loads.
Adaptive Request Concurrency (ARC) is an effective solution
Adaptive Request Concurrency (ARC) is an effective solution for handling mismatched processing rates between upstream and downstream services in modern complex systems. By monitoring RTT and HTTP response codes in real-time, Vector can dynamically adjust the request rate, maximizing resource utilization and ensuring the health of downstream databases. Compared to static rate limiting, ARC provides a more flexible, intelligent approach that improves the overall efficiency and reliability of the system while maintaining stability.