From Hops to Monoliths: Crafting High-Performance Architecture in AdTech
Shreeniwas V Iyer
Harnessing Talent, Delivering Impact | Engineering Leader | Ex-CTO | Startup Founder
In my previous posts, Inside the World of Trillions: The Real-Time Ad Auctions Powering the Internet and Optimizing Networks for Billions: Scaling Efficiency and Speed in AdTech, I shared how, at Quantcast, we use a series of network optimizations to achieve scalability at low cost and discussed some of the trade-offs we make along the way. Here’s a recap: we process approximately 250 billion transactions daily, with responses required within 40-50 milliseconds. Of these, 220-230 billion are bidding endpoint requests, where we either bid or opt out. In this architecture, we prioritize scale, low latency, and low cost over absolute completeness.
Simplifying the Bidding Stack
To streamline our bidding stack, we reduce communication hops to the bare minimum. Our entire bidding infrastructure consists of only three components: a Layer 7 Load Balancer, a component called Mux, and another component called Bidder. Additionally, we use Aerospike as a distributed data store for fast access to critical information at bid time.
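For a flavor of what a critical-path lookup involves, here is a minimal sketch using the Aerospike Go client. The namespace, set, key, and the 5 ms timeout are assumptions for illustration, not details of Quantcast's actual setup; the point is that the lookup budget must be a small slice of the 40-50 ms response window.

```go
package main

import (
	"fmt"
	"log"
	"time"

	aero "github.com/aerospike/aerospike-client-go/v7"
)

func main() {
	// Connect to an Aerospike node (host and port are placeholders).
	client, err := aero.NewClient("127.0.0.1", 3000)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// A tight total timeout keeps the lookup well inside the bid deadline.
	policy := aero.NewPolicy()
	policy.TotalTimeout = 5 * time.Millisecond

	// Key layout (namespace/set/user ID) is illustrative only.
	key, err := aero.NewKey("bidding", "profiles", "user-123")
	if err != nil {
		log.Fatal(err)
	}

	rec, err := client.Get(policy, key)
	if err != nil {
		// On a timeout or miss, a bidder would typically opt out rather than wait.
		log.Printf("lookup failed, opting out: %v", err)
		return
	}
	fmt.Println("profile bins:", rec.Bins)
}
```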
Given these trade-offs, we intentionally avoid a microservices architecture. We’ve learned the hard way that multiple service hops for discrete business-logic tasks aren’t worth the cost in our low-latency environment. A few years ago, we tried adding a third hop to the stack’s straight-line path, and it caused costly issues during trials. Maintaining a small number of monoliths has proven far more effective for us.
Why We Use Two Compute Systems: Mux and Bidder
So, why do we have two compute systems—Mux and Bidder? This setup is no accident and isn’t driven by legacy reasons. As an AI-powered company, every bid we make is driven by an AI model. These models are trained in the background but are tested live. By separating Mux and Bidder, we can run parallel bidding processes—one with production code and one with experimental code—to directly compare outcomes in real-world conditions.
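As a rough sketch of that fan-out pattern (the Bidder interface, field names, and deadline below are all hypothetical, not Quantcast's code), Mux can call both bidders concurrently under a single deadline and keep whichever answers arrive in time:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Bidder abstracts a bidding engine; the interface is hypothetical.
type Bidder interface {
	Bid(ctx context.Context, req string) (float64, error)
}

// stubBidder stands in for a real model-backed bidder in this demo.
type stubBidder struct{ price float64 }

func (s stubBidder) Bid(ctx context.Context, req string) (float64, error) {
	return s.price, nil
}

type result struct {
	name  string
	price float64
	err   error
}

// fanOut sends the same request to both bidders concurrently under one
// shared deadline and returns whatever answers arrive in time.
func fanOut(ctx context.Context, req string, prod, exp Bidder) []result {
	ctx, cancel := context.WithTimeout(ctx, 40*time.Millisecond)
	defer cancel()

	ch := make(chan result, 2)
	go func() { p, err := prod.Bid(ctx, req); ch <- result{"production", p, err} }()
	go func() { p, err := exp.Bid(ctx, req); ch <- result{"experimental", p, err} }()

	var out []result
	for i := 0; i < 2; i++ {
		select {
		case r := <-ch:
			out = append(out, r)
		case <-ctx.Done():
			return out // deadline hit: keep whatever already arrived
		}
	}
	return out
}

func main() {
	results := fanOut(context.Background(), "bid-request-1",
		stubBidder{price: 1.25}, stubBidder{price: 1.40})
	for _, r := range results {
		fmt.Printf("%s bid: %.2f (err: %v)\n", r.name, r.price, r.err)
	}
}
```

Because both bidders see the exact same live request, any difference in outcome is attributable to the model, not to traffic skew.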
Dividing Responsibilities: Mux and Bidder
Mux handles everything that doesn’t require a data lookup, filtering out the category of requests we would never bid on anyway. Once it has gathered all preliminary data, it sends requests to the Bidder (both production and experimental versions), collects the results, and consolidates them into a final decision.
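A sketch of what that lookup-free pre-filter might look like follows; the request fields and rules are invented for illustration. Each check needs only the request itself, so Mux can opt out cheaply before any store lookup or Bidder round trip:

```go
package main

import "fmt"

// BidRequest holds the handful of fields the pre-filter inspects;
// the field names are illustrative, not an actual schema.
type BidRequest struct {
	AdFormat string
	Country  string
	Domain   string
}

var (
	supportedFormats = map[string]bool{"banner": true, "video": true}
	blockedDomains   = map[string]bool{"example-blocked.com": true}
)

// preFilter returns false for requests we would never bid on, so they
// can be rejected without a data lookup or a Bidder round trip.
func preFilter(r BidRequest) bool {
	if !supportedFormats[r.AdFormat] {
		return false
	}
	if blockedDomains[r.Domain] {
		return false
	}
	if r.Country == "" { // can't target without a geography
		return false
	}
	return true
}

func main() {
	fmt.Println(preFilter(BidRequest{AdFormat: "banner", Country: "US", Domain: "news.example"})) // true
	fmt.Println(preFilter(BidRequest{AdFormat: "audio", Country: "US", Domain: "news.example"}))  // false
}
```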
Data Storage Choices: Minimalism in the Critical Path
You may have noticed that our architecture barely mentions traditional databases, files, or other data stores. That’s because we don’t use them in the critical path. Instead, any necessary configuration is pre-loaded into memory and refreshed on a schedule, every few minutes or hours. We run machines with substantial memory and use purpose-built in-memory data structures tailored to each requirement, optimizing for both speed and cost.
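One common way to implement that pattern in Go is an atomically swapped, immutable snapshot refreshed by a background goroutine. This is a sketch under assumptions: loadConfig is a hypothetical stand-in for the real configuration source, and the fields are invented.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// Config is an immutable snapshot; readers never take a lock, they
// just dereference the current pointer. Fields are illustrative.
type Config struct {
	CampaignBudgets map[string]float64
}

var current atomic.Pointer[Config]

// loadConfig is a hypothetical stand-in for fetching fresh configuration
// from wherever it lives (object store, config service, etc.).
func loadConfig() *Config {
	return &Config{CampaignBudgets: map[string]float64{"campaign-1": 1000}}
}

// refreshEvery swaps in a new snapshot on a schedule; the hot path
// never sees a partially updated config.
func refreshEvery(interval time.Duration) {
	current.Store(loadConfig())
	go func() {
		for range time.Tick(interval) {
			current.Store(loadConfig())
		}
	}()
}

func main() {
	refreshEvery(5 * time.Minute)
	cfg := current.Load() // lock-free read on the critical path
	fmt.Println(cfg.CampaignBudgets["campaign-1"])
}
```

The trade-off is staleness: readers may work from a snapshot that is minutes old, which is exactly the completeness-for-latency trade described above.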
Logging and Batch Processing
We also avoid writing data in the critical path. Each bid generates a log entry, but even that is handled in the background: entries are batched into large Parquet files before being centralized. Any updates or insights derived from this data typically land 1-4 hours later.
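A minimal sketch of that kind of background batcher is below. Here writeParquetBatch is a placeholder for real Parquet serialization (e.g., via a library such as parquet-go), and the batch size and flush interval are assumptions:

```go
package main

import (
	"log"
	"time"
)

// BidLog is one bid's log entry; the fields are illustrative.
type BidLog struct {
	RequestID string
	Price     float64
}

// writeParquetBatch is a placeholder for serializing a batch to a
// Parquet file with a real library and shipping it off-box.
func writeParquetBatch(batch []BidLog) {
	log.Printf("flushed %d entries", len(batch))
}

// runLogger drains a channel in the background, flushing whenever the
// batch fills or the timer fires, so the bidding path never blocks on I/O.
func runLogger(in <-chan BidLog, maxBatch int, flushEvery time.Duration) {
	batch := make([]BidLog, 0, maxBatch)
	ticker := time.NewTicker(flushEvery)
	defer ticker.Stop()
	for {
		select {
		case e, ok := <-in:
			if !ok {
				writeParquetBatch(batch) // final flush on shutdown
				return
			}
			batch = append(batch, e)
			if len(batch) == maxBatch {
				writeParquetBatch(batch)
				batch = batch[:0]
			}
		case <-ticker.C:
			if len(batch) > 0 {
				writeParquetBatch(batch)
				batch = batch[:0]
			}
		}
	}
}

func main() {
	ch := make(chan BidLog, 1024) // buffered so the hot path never waits
	go runLogger(ch, 500, 10*time.Second)
	ch <- BidLog{RequestID: "req-1", Price: 1.25}
	close(ch)
	time.Sleep(100 * time.Millisecond) // let the final flush run in this demo
}
```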
Real-Time Input for Control Systems
The one system where real-time input is essential is our control system, which needs immediate data on supply volumes for each campaign. To balance memory and processing demands, we use reservoir sampling, which lets us push out updates in 30-second batches.
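Reservoir sampling keeps a uniform random sample of a stream in fixed memory, which is what makes those 30-second windows cheap. Here is a minimal sketch of the classic Algorithm R with a per-window flush; the sample size and item type are assumptions:

```go
package main

import (
	"fmt"
	"math/rand"
)

// Reservoir keeps a uniform random sample of at most k items from a
// stream of unknown length, using constant memory (Algorithm R).
type Reservoir struct {
	k     int
	seen  int
	items []string
}

func NewReservoir(k int) *Reservoir {
	return &Reservoir{k: k, items: make([]string, 0, k)}
}

// Add records one stream element. After n elements, each has
// probability k/n of being in the sample.
func (r *Reservoir) Add(item string) {
	r.seen++
	if len(r.items) < r.k {
		r.items = append(r.items, item)
		return
	}
	if j := rand.Intn(r.seen); j < r.k {
		r.items[j] = item
	}
}

// Flush returns the current sample plus the total count, then resets
// for the next 30-second window.
func (r *Reservoir) Flush() (sample []string, seen int) {
	sample, seen = r.items, r.seen
	r.items, r.seen = make([]string, 0, r.k), 0
	return sample, seen
}

func main() {
	res := NewReservoir(100)
	for i := 0; i < 10000; i++ {
		res.Add(fmt.Sprintf("impression-%d", i))
	}
	// In production a 30-second ticker would drive this flush.
	sample, seen := res.Flush()
	fmt.Printf("kept %d of %d impressions\n", len(sample), seen)
}
```

The sample plus the total count is enough for the control system to estimate supply volume per campaign without storing every impression.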
Observability: Real-Time Monitoring and Optimization
Observability is crucial to our architecture. We log extensive time-series data into observability systems like Datadog or Prometheus, allowing us to track system health, detect errors, and identify optimization opportunities—all in the background.
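On the Prometheus side, instrumenting a bidder with the official Go client looks roughly like this; the metric names, labels, and bucket boundaries are my own assumptions, chosen around the 40-50 ms deadline:

```go
package main

import (
	"log"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	// Counter of bid outcomes, labeled by decision; names are illustrative.
	bidDecisions = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "bidder_decisions_total",
			Help: "Bid requests by outcome.",
		},
		[]string{"decision"}, // e.g. "bid" or "opt_out"
	)

	// Latency histogram with buckets clustered around the bid deadline.
	bidLatency = prometheus.NewHistogram(
		prometheus.HistogramOpts{
			Name:    "bidder_latency_seconds",
			Help:    "End-to-end bid handling latency.",
			Buckets: []float64{.005, .01, .02, .03, .04, .05, .1},
		},
	)
)

func main() {
	prometheus.MustRegister(bidDecisions, bidLatency)

	// Record one fake request so the metrics have data.
	start := time.Now()
	bidDecisions.WithLabelValues("bid").Inc()
	bidLatency.Observe(time.Since(start).Seconds())

	// Expose /metrics for Prometheus to scrape in the background.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9090", nil))
}
```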
Our architecture is purpose-built from the ground up to optimize for latency, scalability, and cost. What are some techniques you use to achieve similar results?