A Look at the Netflix Live Issues from the Love is Blind Reunion

Tom Ricardo

Cloud Leader and Evangelist

Published: April 17, 2023

“What is wrong with TV?”

Normally when I get this question from my wife, my stomach goes into knots. However, after taking a second and looking at the TV, I realized there were other people having a way worse night than me. Netflix’s second live event, The Love Is Blind – Live Reunion, failed to stream live.

Image: The message that dashed my wife's plans for the entire evening (or at least until the recorded event was able to play)

Within half an hour, at least 2.6 million people had logged into Twitter to find out how they were being disappointed. My wife sat on the couch, upset at first, then started riffling through memes to cope with the loss of the #loveisblind reunion. Was this a problem with Netflix’s capacity, a Content Delivery Network (#cdn) issue, or an issue broadcasting a live event from #AWS?

Let’s look at it.

Was it an issue broadcasting a live event from AWS?

No. AWS is the infrastructure behind a lot of live event broadcasting today. One “prime” example (I don’t think that can even count as a pun) is the NFL on Amazon Prime. Plenty of other live TV is broadcast on AWS, but this example comes with a clear architecture from Amazon themselves.

Figure: AWS Elemental Architecture (architectural design for AWS Elemental)

AWS has a tool called Elemental, based on a software company it bought, for live broadcast over the internet. You can host other solutions on AWS; however, Elemental has built-in features for AWS accounts. While Elemental encodes the video, the video is distributed by Amazon CloudFront, AWS’ Content Delivery Network (CDN). CloudFront creates localized edge points for users to send requests to, whether they are watching from a mobile device, a TV, or a PC. These requests are then directed to a datastore where the video is being stored, in this case an S3 bucket. Simply put, users connect to Amazon #cloudfront, which leads the session to the closest datastore where the end product of the Elemental broadcast is placed as it is being processed live. An average of 11.3 million people remember that there is an NFL game on Thursday and tune in. AWS, much like the other major cloud solution providers, has more than enough scale to handle events like this.
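To make the edge-and-datastore idea above concrete, here is a minimal sketch in Python. It is not CloudFront's or Elemental's API; the `Origin`, `Edge`, and `simulate` names and the segment identifier are hypothetical, invented only for illustration. It assumes each edge keeps a simple in-memory cache and that every viewer at an edge asks for the same live segment.

```python
class Origin:
    """Hypothetical stand-in for the bucket where encoded segments land."""

    def __init__(self):
        self.fetches = 0  # how many times an edge had to come back to the origin

    def get(self, segment_id):
        self.fetches += 1
        return f"video-bytes-for-{segment_id}"


class Edge:
    """Hypothetical edge location that caches segments close to viewers."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, segment_id):
        # Only the first request for a segment goes back to the origin.
        if segment_id not in self.cache:
            self.cache[segment_id] = self.origin.get(segment_id)
        return self.cache[segment_id]


def simulate(viewers_per_edge, edge_count, segment_id="segment-0001"):
    """Return (requests served to viewers, fetches that reached the origin)."""
    origin = Origin()
    edges = [Edge(origin) for _ in range(edge_count)]
    served = 0
    for edge in edges:
        for _ in range(viewers_per_edge):
            edge.get(segment_id)
            served += 1
    return served, origin.fetches


if __name__ == "__main__":
    served, origin_fetches = simulate(viewers_per_edge=1000, edge_count=5)
    print(served, origin_fetches)
```

In this toy model, 5,000 viewer requests spread over five edges reach the origin only five times, which is the basic reason a CDN lets a single datastore sit behind a very large audience.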

So if it wasn’t AWS, could it have been the CDN?

Netflix, while being one of AWS’ biggest users, decided in 2012 to build its own CDN called Open Connect, about four years after #cloudfront debuted. In 2014, Netflix started paying Comcast to stop throttling its traffic, which not only led to the later Net Neutrality debates but also locked in Netflix’s investment. That said, Netflix has built one of the most impressive in-house CDNs in the world. Netflix accounts for a little under 10% of the world’s global app traffic. Netflix and the five other largest brands (#microsoft, #google, #amazon, #facebook, and #apple) each have extensive CDNs that could easily host an event like Love Is Blind. Also, my TV app was able to establish a session on #netflix; what I saw looked more like a caching issue.

So if it wasn’t AWS or the CDN, was it something with Netflix itself?

Netflix handles over 40 million concurrent users at any given moment, so why would a live event cause them any problems? The issue might be in what makes Netflix work so well at scale: its micro-segmentation. Netflix uses its own API gateway, built on AWS tools, called #Zuul. It allows Netflix to deal with millions of sessions on multiple types of devices, like TVs, mobile phones, and web sessions. It is a reliable open-source gateway whose main weakness is a scaling event with overloaded instances. When presented with a scaling event, Zuul could start throttling and refusing connections, with preference given to older sessions. Normally this makes sense: why ruin a session where someone is in the middle of a movie or show in favor of new requests? A burst of new sessions could point to a bug or an invalid configuration property, for example loading a show that isn’t configured correctly, causing multiple bad sessions all at once (foreshadowing).

Of course, this is an extreme simplification; however, it starts painting a picture of what could have happened if, say, you tried to use this system in front of a change that moves an extremely large number of sessions from one datastore (a holding screen) to another (a live broadcast) at the same time.
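The "prefer older sessions" behavior described above can be sketched in a few lines of Python. This is not Zuul's code or API; the `Gateway` class, its `capacity` parameter, and the return strings are hypothetical and only illustrate the general idea of admission control that protects established sessions when the gateway is full.

```python
class Gateway:
    """Hypothetical gateway that protects established sessions under load."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed maximum number of tracked sessions
        self.active = set()

    def handle(self, session_id):
        # Established sessions keep being served, even when the gateway is full.
        if session_id in self.active:
            return "served"
        # New sessions are admitted only while there is spare capacity.
        if len(self.active) < self.capacity:
            self.active.add(session_id)
            return "admitted"
        # Otherwise the newcomer is throttled.
        return "throttled"


if __name__ == "__main__":
    gateway = Gateway(capacity=3)
    for viewer in ["a", "b", "c", "d", "a"]:
        print(viewer, gateway.handle(viewer))
```

With a capacity of three, viewers a, b, and c are admitted, d is throttled, and a's follow-up request is still served. On a normal night that is the right trade-off; at the start of a live event, almost everyone is viewer d.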

If you had an issue like that, the new sessions would be throttled as the internal service tried to recover; in fact, until the service recovers, internal retries would be disabled. This would lead to people either being stuck on a loading screen or bounced back to the holding screen if Zuul was trying to move the datastore in a canary release. People would get frustrated and start a new session, sending even more requests to be throttled or denied. While Zuul is built to handle millions of sessions for thousands of different assets, it is not built to deal with millions of requests to a single asset at the same time if there is an error with that asset. The issue could cause a cascading event. This is just an educated guess at what could have happened; however, these are the kinds of issues you can see when deploying a new application like #netflixlive in the wild.
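The feedback loop described above, where rejected viewers come straight back with new requests, can be sketched as a small Python model. The `retry_storm` function and all of its parameters are hypothetical, and the numbers are made up for illustration; this is not a measurement of what happened at Netflix.

```python
def retry_storm(new_viewers_per_round, capacity_per_round, rounds,
                retries_per_rejection=1):
    """Return the number of requests offered to the gateway in each round.

    Every rejected request comes back next round as `retries_per_rejection`
    new requests, on top of the fresh viewers who keep arriving.
    """
    pending_retries = 0
    offered_history = []
    for _ in range(rounds):
        offered = pending_retries + new_viewers_per_round
        served = min(offered, capacity_per_round)
        rejected = offered - served
        pending_retries = rejected * retries_per_rejection
        offered_history.append(offered)
    return offered_history


if __name__ == "__main__":
    # Healthy asset: capacity exceeds demand, so load stays flat.
    print(retry_storm(100, 150, rounds=3))
    # Broken asset: nothing can be served, so every round adds to the pile.
    print(retry_storm(100, 0, rounds=4))
    # Frustrated viewers who open two new sessions per failure.
    print(retry_storm(100, 0, rounds=3, retries_per_rejection=2))
```

When the asset is healthy, the offered load stays at 100 requests per round. When nothing can be served, the same audience produces 100, 200, 300, then 400 requests, and faster still if each failure triggers more than one retry, which is the cascading pattern guessed at above.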

But Tom, how do you test this? Well, Netflix did, with a smaller event; however, they may not have had a simplified alternative if something went wrong. Or perhaps they did, but they were always just a little bit away from solving the problem as it kept growing slightly out of reach of an extremely talented team. Everyone in IT has had this happen. The best designs fail at the worst moment, and major outages affect the customer experience. This is why it is always helpful to have outside experience to draw on, whether that be trusted colleagues (or a good message board) you can reach out to while planning. Or you can work with solution architects from AWS or look to leverage outside consultants. It is important to build a team of resources that you trust to execute your mission. If you are looking for that kind of support, please feel free to reach out to us at Oxford Global Resources.
