How Disney+ Scaled to 150 Million Subscribers - Tech Edition
Disney+ Architecture
Here is a simplified walkthrough of the Disney+ architecture, organized by what the user does.
1. Let's Watch Star Wars
Disney+ runs its infrastructure in multiple regions. Each user is routed to the nearest region, and if a region fails, traffic fails over to another one.
Content is served through a content delivery network (CDN), which lowers latency by caching content closer to users. The video files themselves are kept in object storage to lower costs.
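The routing idea can be sketched in a few lines. This is a minimal illustration, not Disney+'s actual routing logic: the region names, coordinates, and the `healthy` flag are hypothetical, and geographic distance stands in for the latency measurements a real DNS or load-balancing layer would use.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Region:
    name: str
    lat: float
    lon: float
    healthy: bool = True


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def pick_region(regions, user_lat, user_lon):
    """Return the nearest healthy region, skipping regions marked as down."""
    candidates = [r for r in regions if r.healthy]
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=lambda r: distance_km(user_lat, user_lon, r.lat, r.lon))


# Hypothetical regions with approximate coordinates.
REGIONS = [
    Region("us-east", 38.9, -77.0),
    Region("eu-west", 53.3, -6.3),
    Region("ap-southeast", 1.35, 103.8),
]
```

Calling `pick_region(REGIONS, 48.85, 2.35)` for a user near Paris selects the hypothetical `eu-west` region. If that region is marked unhealthy, the same call falls back to the next closest one.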
Kinesis Data Streams
Disney+ uses Amazon Kinesis Data Streams (KDS) to stream data. KDS is a real-time data streaming service from Amazon Web Services: it captures large volumes of continuously arriving data, makes it available for processing in real time, and prepares it for further analysis.
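To give a feel for how a stream spreads records across shards: Kinesis hashes each record's partition key with MD5 into a 128-bit value, and each shard owns a range of that hash space. The sketch below assumes the ranges are split evenly, which a real stream does not guarantee. `shard_for_key` is a hypothetical helper for illustration, not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # Clamp so the few values left over by integer division land in the last shard.
    return min(hash_key // range_size, shard_count - 1)
```

Because the mapping is deterministic, all records with the same partition key (for example, a user ID) go to the same shard, which preserves their order within that shard.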
Here is how it fits into playback:
As the user watches a movie, the platform streams the current video timestamp along with the video data. This timestamp is stored in a DynamoDB table. DynamoDB is a key-value database that suits frequently accessed data, and its global tables feature replicates the data automatically across multiple regions, which keeps it highly available and accessible worldwide.
So, when the user resumes playback, the platform retrieves the stored timestamp, allowing them to pick up right where they left off.
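The save-and-resume flow can be sketched as follows. This is an in-memory stand-in rather than real DynamoDB code: the `(user_id, title_id)` key, the `position_s` and `updated_at` fields, and the rule that ignores late-arriving updates are all assumptions made for illustration.

```python
class PlaybackPositions:
    """In-memory stand-in for a key-value table keyed by (user_id, title_id)."""

    def __init__(self):
        self._items = {}

    def save(self, user_id, title_id, position_s, updated_at):
        """Store the position unless a newer update is already recorded."""
        key = (user_id, title_id)
        current = self._items.get(key)
        if current is None or updated_at >= current["updated_at"]:
            self._items[key] = {"position_s": position_s, "updated_at": updated_at}

    def resume_from(self, user_id, title_id):
        """Return the saved position in seconds, or 0 for a title never started."""
        item = self._items.get((user_id, title_id))
        return item["position_s"] if item else 0
```

Comparing `updated_at` before overwriting matters with streamed updates, since events can arrive out of order and an older timestamp should not move the viewer backwards.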
2. Explore Similar Titles
If the user seeks a more specific movie, the system leverages a document data store. This document data store is chosen for its flexible schema, accommodating both movie metadata and user reviews.
To improve response times for popular searches, the system caches queries in a key-value database. While DynamoDB isn't ideal for caching, it was chosen for its existing presence in the architecture and ease of use during launch. Scaling a dedicated in-memory cache would have added complexity at launch.
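A query cache on top of a key-value store can be sketched like this. The dictionary stands in for the key-value table, and the class name, the time-to-live value, and the query normalization are assumptions for illustration, not details from Disney+.

```python
import time


class QueryCache:
    """Caches search results by normalized query text for a fixed time-to-live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # normalized query -> (expires_at, result)

    def get_or_compute(self, query, compute):
        """Return a cached result if still fresh, otherwise compute and store it."""
        key = " ".join(query.lower().split())  # normalize case and whitespace
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        result = compute(query)
        self._entries[key] = (now + self._ttl, result)
        return result
```

With a 60-second time-to-live, repeated searches for a popular title within a minute reach the search backend only once. A shorter value trades hit rate for fresher results.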
3. Any Other Recommendations?
The user lands on the homepage and notices the recommended titles.
These recommendations come from machine learning models that learn from patterns in the data and use signals such as location and viewing habits to predict what the user is likely to be interested in.
The recommendations are delivered as a continuous stream of data, which undergoes additional processing to ensure accuracy. Finally, the results are stored in a specialized database that allows for quick retrieval.
4. Time For Bed - Add Titles To Watchlist
The watchlist is stored in a key-value database for efficient retrieval.
To keep the watchlist consistent across regions, Disney+ uses DynamoDB global tables. Writes are replicated to the other regions automatically, which keeps users from seeing an outdated watchlist after a failover.
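When the same item is written in more than one region, global tables resolve the conflict with a last-writer-wins rule. The sketch below models that rule with plain dictionaries as regional replicas. The `updated_at` field and both function names are hypothetical, and real replication is asynchronous rather than applied in a single loop.

```python
def reconcile(local, incoming):
    """Return the version of an item that wins under last-writer-wins."""
    if local is None:
        return incoming
    return incoming if incoming["updated_at"] > local["updated_at"] else local


def replicate(write, replicas):
    """Apply one write to every regional replica (each a dict keyed by user_id)."""
    for table in replicas.values():
        user_id = write["user_id"]
        table[user_id] = reconcile(table.get(user_id), write)
```

A stale write that arrives late is discarded in every region, so the replicas converge on the same watchlist regardless of delivery order.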
DynamoDB Partitioning
To handle potential traffic spikes, Disney+ pre-partitioned its DynamoDB tables before launch. This avoids the throttling that can occur while DynamoDB splits partitions automatically under load. In addition, autoscaling lets the database adapt to growing user demand.
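As a rough sizing aid, the number of partitions a table needs can be estimated from its expected throughput and size. The sketch assumes the commonly documented per-partition limits of about 3,000 read capacity units, 1,000 write capacity units, and 10 GB of data. DynamoDB manages partitions internally, so treat the result as an approximation; the function name is hypothetical.

```python
from math import ceil

# Assumed per-partition limits; check current AWS documentation before relying on them.
MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000
MAX_GB_PER_PARTITION = 10


def estimate_partitions(read_units, write_units, size_gb):
    """Estimate the partition count from throughput and from size; the larger wins."""
    by_throughput = ceil(
        read_units / MAX_RCU_PER_PARTITION + write_units / MAX_WCU_PER_PARTITION
    )
    by_size = ceil(size_gb / MAX_GB_PER_PARTITION)
    return max(by_throughput, by_size, 1)
```

For example, a table expected to serve 30,000 read units and 10,000 write units with 50 GB of data works out to roughly 20 partitions, driven by throughput rather than size.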
Recognizing that popular movies will see higher read traffic, Disney+ implemented data replication across multiple database partitions. This distributes the load and prevents "hot shard" issues, where a single partition becomes overloaded with read requests.
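One common way to spread reads for a hot item is to store several copies under suffixed keys and read a random copy. The sketch below shows that pattern with a dictionary standing in for the table. The `#` suffix scheme and the function names are assumptions for illustration, not confirmed details of the Disney+ design.

```python
import random


def replica_keys(item_key, copies):
    """Keys under which the copies of one item are stored."""
    return [f"{item_key}#{i}" for i in range(copies)]


def write_replicated(table, item_key, value, copies):
    """Write the same value under every replica key."""
    for key in replica_keys(item_key, copies):
        table[key] = value


def read_replicated(table, item_key, copies, rng=random):
    """Read one randomly chosen copy so reads spread across the replica keys."""
    return table[rng.choice(replica_keys(item_key, copies))]
```

Because the suffixed keys hash to different partitions, reads for a popular movie are spread out instead of concentrating on one partition. The cost is extra writes and storage for each copy.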
Disney+ has grown to around 150 million users and remains one of the biggest video streaming services in the market.