Kubernetes at Big Companies!
Pavan Belagatti
In this week's issue, we will see how some successful companies are using Kubernetes.
Kubernetes at GOT
Looks like Kubernetes did its job better than the GOT writers!!! Lol
HBO's engineers started panicking because they knew the traffic for the much-anticipated Game of Thrones season seven premiere was going to be HUGE and unpredictable.
One of the challenges they found was under-utilization of the deployed resources. Node.js code tends to use only a single CPU core, while the AWS EC2 instances that had excellent networking capabilities tended to be based on dual-core CPUs.
As such, HBO was using only 50 percent of the deployed CPU capacity across its deployment. Spinning up new instances on EC2 also wasn't as fast as HBO needed.
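The 50 percent figure follows from simple arithmetic. Here is a rough sketch in Python, assuming one single-threaded Node.js process per instance (the function name is made up for illustration and is not HBO's tooling):

```python
def cpu_utilization_ceiling(cores: int, single_threaded_processes: int) -> float:
    """Upper bound on the share of CPU capacity that can be kept busy.

    Assumption: each process can saturate at most one core.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    if single_threaded_processes < 0:
        raise ValueError("process count cannot be negative")
    busy_cores = min(cores, single_threaded_processes)
    return busy_cores / cores


if __name__ == "__main__":
    # One Node.js process on a dual-core instance: half the CPU sits idle.
    print(cpu_utilization_ceiling(cores=2, single_threaded_processes=1))
    # Packing two such processes (for example, two containers) onto the
    # same dual-core machine removes that ceiling.
    print(cpu_utilization_ceiling(cores=2, single_threaded_processes=2))
```

Packing several small containers onto one machine is one way to get past that ceiling, which is part of what makes a container scheduler attractive here.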
HBO also found that in times of peak demand for Game of Thrones, it was running out of available IP addresses to help deliver the content to viewers.
"We went from not running a single service inside of a container to hosting all of the Game of Thrones season 7 with Kubernetes," Illya Chekrygin, Senior Staff Engineer at HBO, told the KubeCon audience.
In the end, HBO chose Kubernetes over the alternatives, largely because of its vibrant and active community.
The keynote can be found here: https://lnkd.in/gCAHyd9m
Credits: KubeCon 2017 & eWEEK
---------------------------------------------------------------------------------------------------------------
Kubernetes at Lyft
Lyft uses Spark for its machine learning use cases. Since late 2018, Lyft has been investing in its in-house Spark and Kubernetes infrastructure.
Why did they pick Kubernetes?
The biggest thing about Kubernetes is that it is container-native, which helps them handle a lot of the dependencies very well. It is multi-tenant, which fits their idea of running heterogeneous workloads on a single cluster. It has great support for operators, which makes it really extensible. Its declarative and immutable nature helps them a lot too.
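To make "declarative" a bit more concrete: you describe the state you want, and a controller works out the steps to get there. Below is a toy sketch in Python. It is not Lyft's code and not the Kubernetes API; the workload names and the reconcile function are invented purely for illustration.

```python
from typing import Dict, List


def reconcile(desired: Dict[str, int], actual: Dict[str, int]) -> List[str]:
    """Return the actions needed to move `actual` toward `desired`.

    Both arguments map a (hypothetical) workload name to a replica count.
    """
    actions: List[str] = []
    # Look at every workload mentioned on either side, in a stable order.
    for name in sorted(desired.keys() | actual.keys()):
        want = desired.get(name, 0)
        have = actual.get(name, 0)
        if want > have:
            actions.append(f"start {want - have} x {name}")
        elif want < have:
            actions.append(f"stop {have - want} x {name}")
    return actions


if __name__ == "__main__":
    desired_state = {"spark-driver": 1, "spark-executor": 4}
    actual_state = {"spark-executor": 2, "old-job": 1}
    for action in reconcile(desired_state, actual_state):
        print(action)
```

The user only ever edits the desired state. Running the same reconcile step again after the actions are applied returns an empty list, which is the property that makes this style easy to automate.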
Kubernetes can help solve the dependency and multi-version requirements using its containerized approach.
Spark on Kubernetes can scale significantly by using a multi-cluster compute mesh approach with proper resource isolation and scheduling techniques.
At Lyft, Kubernetes has emerged as the next generation of cloud native infrastructure to support a wide variety of distributed workloads.
Understand a little more by watching this talk: https://lnkd.in/enB_r9X
---------------------------------------------------------------------------------------------------------------
Kubernetes at Wise
At Wise, they chose to migrate their Apache Kafka clusters, previously running on Amazon Web Services (AWS) EC2 instances, into a multi-cluster Kubernetes setup.
A pivotal component of their platform is Apache Kafka. They currently maintain 6 Kafka clusters, comprising 30+ brokers, processing billions of messages every day.
To improve the reliability and scalability of their clusters, they decided to move their Kafka clusters into their Kubernetes infrastructure.
Before they migrated, they managed their Kafka and backing Zookeeper clusters on AWS EC2 instances.
Why run Kafka in Kubernetes?
Their major motivations for moving Kafka to Kubernetes were:
- Leveraging Kubernetes to improve the scalability of their Kafka clusters. This will allow them to easily satisfy the growing usage of the platform by their product teams.
- Increasing the availability and reliability of the cluster using Kubernetes' native self-healing functionality.
- Setting up better (and cloud-agnostic) automation around the administration of their Kafka clusters, through the use of liveness/readiness probes and operators.
- Reducing the time the Realtime Data Platform team spends maintaining their Kafka infrastructure, since Kubernetes is managed by their central infrastructure team.
This gives them more time to focus on building a self-service platform and tools to improve the productivity of their teams and enable them to ship features to market faster.
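For a feel of what liveness/readiness probes look like, here is a hedged sketch that builds a container spec as a Python dictionary and prints it as JSON (Kubernetes accepts JSON manifests as well as YAML). The probe field names are standard Kubernetes ones and 9092 is Kafka's default listener port. The image name, the timings, and the helper function are assumptions for illustration, not Wise's actual configuration.

```python
import json

KAFKA_PORT = 9092  # Kafka's default listener port


def kafka_broker_container(image: str) -> dict:
    """Build a container spec with TCP liveness and readiness probes.

    `image` is whatever broker image you use; the value passed below is
    a placeholder, not a real published image.
    """
    tcp_check = {"tcpSocket": {"port": KAFKA_PORT}}
    return {
        "name": "kafka",
        "image": image,
        "ports": [{"containerPort": KAFKA_PORT}],
        # Liveness: if this keeps failing, the kubelet restarts the container.
        "livenessProbe": {
            **tcp_check,
            "initialDelaySeconds": 60,  # illustrative value
            "periodSeconds": 10,
            "failureThreshold": 3,
        },
        # Readiness: while this fails, the pod receives no Service traffic.
        "readinessProbe": {
            **tcp_check,
            "initialDelaySeconds": 30,  # illustrative value
            "periodSeconds": 5,
        },
    }


if __name__ == "__main__":
    spec = kafka_broker_container("example.registry/kafka:placeholder")
    print(json.dumps(spec, indent=2))
```

A plain TCP check only shows that the port is open. A production setup would likely use something that understands broker health, and that is the kind of logic operators tend to take over.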
Migrating a production Kafka cluster to Kubernetes is not without its challenges. Kubernetes (and a multi-cluster Kubernetes at that!) provides a very dynamic environment. Hosting a stateful service like Kafka required a lot of design and planning with co-operation from other infrastructure teams within Wise to achieve success.
Some of these challenges are discussed in the article.
Read the article here: https://lnkd.in/drWugpW4
---------------------------------------------------------------------------------------------------------------
Kubernetes at Tinder
Tinder now runs exclusively on Kubernetes!
They solved some interesting challenges. Read further.
Tinder's legacy architecture consisted of EC2 autoscaling groups, fronted by a load balancer per service, and scaling based on CPU usage. They used Puppet for bootstrapping node configuration and for setting up Prometheus nodes to monitor each of the services.
They decided to move to Kubernetes and started their migration in January 2018.
Kubernetes allowed them to drive Tinder Engineering toward containerization and low-touch operation through immutable deployment. Application build, deployment, & infrastructure would be defined as code. They were also looking to address the challenges of scale and stability.
When scaling became critical, they often had to wait several minutes for new EC2 instances to come online. The idea of containers being scheduled and serving traffic within seconds, as opposed to minutes, was appealing to them. During their migration in early 2019, they reached critical mass within their Kubernetes cluster & began encountering various challenges due to traffic volume, cluster size, and DNS.
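The passage above describes EC2 autoscaling groups, but the same CPU-based idea exists in Kubernetes as the Horizontal Pod Autoscaler. Its documentation gives the core rule as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). Below is a simplified sketch of that rule in Python. The real autoscaler also applies a tolerance, stabilization windows and more, and the function name and min/max defaults here are illustrative assumptions, not Tinder's settings.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_percent: float,
    target_cpu_percent: float,
    min_replicas: int = 1,    # illustrative default
    max_replicas: int = 100,  # illustrative default
) -> int:
    """Core HPA scaling rule, clamped to a [min, max] replica range."""
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
    # 3 pods averaging 30% CPU against a 60% target.
    print(desired_replicas(3, 30, 60))
```

The formula is the same whether the thing being added is a VM or a pod. What changes is how long the new capacity takes to start serving traffic.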
They solved interesting challenges to migrate 200 services & run a Kubernetes cluster at scale. See how - https://bit.ly/3nzz9e0
---------------------------------------------------------------------------------------------------------------
Kubernetes at Pinterest
Pinterest has over 250 million monthly active users and serves over 10 billion recommendations every single day. That is huge. (The numbers might have changed now.) As they knew these numbers would keep growing day by day, they began to feel the pain of scalability and performance issues.
Their initial strategy was to move their workload from EC2 instances to Docker containers; hence they first moved their services to Docker to free up engineering time spent on Puppet and to have an immutable infrastructure.
And then the next strategy was to move to Kubernetes :) Now they can take ideas from ideation to production in a matter of minutes, whereas earlier it used to take hours or even days. They have cut a lot of overhead cost by using Kubernetes and have removed a lot of manual work, so engineers no longer have to worry about the underlying infrastructure.
Read the Pinterest Kubernetes story on their website ‘Pinterest Case Study’
---------------------------------------------------------------------------------------------------------------
Kubernetes at Pokemon Go
How was 'Pokemon Go' able to scale so efficiently & become so successful? The answer is Kubernetes. Pokemon Go was developed and published by Niantic Inc. It has 500+ million downloads and 20+ million daily active users.
Pokemon Go's engineers never thought their user base would grow exponentially and surpass expectations within such a short time. They were not ready for it, and the servers couldn't handle that much traffic.
The challenge
Horizontal scaling was one side of the problem, but Pokemon Go also faced a severe challenge when it came to vertical scaling because of the real-time activity of millions of users worldwide. Niantic was not prepared for this.
The solution
The magic of containers. The application logic for the game ran on Google Container Engine (GKE) powered by the open source Kubernetes project. Niantic chose GKE for its ability to orchestrate their container cluster at planetary-scale, freeing its team to focus on deploying live changes for their players. In this way, Niantic used Google Cloud to turn Pokémon GO into a service for millions of players, continuously adapting and improving. This gave them more time to concentrate on building the game's application logic and new features rather than worrying about the scaling part.
Impressive, isn't it? Read the complete case study shared on Google Cloud.
---------------------------------------------------------------------------------------------------------------
Speed up your Kubernetes deployments with GitOps.
GitOps has quickly become a popular method to deploy software.
Would you like to try GitOps?
Get started with Harness GitOps - https://lnkd.in/g46BXzUK
Harness GitOps is the first complete CD solution with enterprise control, governance, visibility, and 100% interoperability with the open-source Argo CD GitOps approach.
Thanks:)
As you might expect, there was a bit more to our decisions around and challenges with Kubernetes at HBO for Game of Thrones. It wasn't just our streaming microservices we had scaling challenges with. When fully scaled for a GOT Sunday night, our observability stack, the DNS used for service discovery, and the challenges of early adoption of Kubernetes itself presented us with numerous technical hurdles. Yet in the end we could pull it off, thanks not so much to vigilance but to testing and preparation.