Cloud Native and AI: Better Together
I enjoyed keynoting the first-ever KubeCon in India. Having grown up in Delhi, I felt like I was coming home. I enjoyed engaging with passionate attendees who showed a strong eagerness to learn and contribute. And even though I moved away from Delhi 25+ years ago, I certainly enjoyed sharing travel tips with attendees visiting from outside the city.
My keynote topic was “Cloud Native and AI: Better Together.” This article summarizes the overall message.
Over the last decade, Cloud Native has adapted to support stateless, stateful, and serverless workloads. Several traits of the platform, such as scalability, resilience, portability, and extensibility, make it suitable for running a wide variety of workloads.
These same traits make Cloud Native an ideal platform for running AI workloads as well. Over the last ten years, the community has built a large knowledge base on operating Cloud Native platforms: running mission-critical systems at large scale, design patterns and anti-patterns, managed services from hyperscalers as well as data center deployments, a pool of skilled practitioners, and much more. Leveraging that knowledge to run AI workloads is a logical and evolutionary step.
Let's examine cloud-native AI (CNAI) from three perspectives: the Kubernetes platform, the ML engineer, and the app developer.
Kubernetes
What has been done in Kubernetes to make it AI-friendly?
Up until 1.26, Kubernetes could only handle integer-countable resources such as RAM and CPU. That release introduced a new API called Dynamic Resource Allocation (DRA). This API provides a much richer interface for requesting and configuring generic resources, such as GPUs. DRA is a generalization of the persistent volumes API to generic resources. It allows hardware vendors to extend Kubernetes by writing DRA drivers, which are responsible for managing the hardware and the user-facing interface.
Existing device plugins limit users to assigning a device to a single container. DRA enables GPU devices to be shared across containers and pods, so you can flexibly choose how they are used. DRA also defines how device resources are presented from the node to the runtime. This makes Kubernetes GPU-friendly and thus easy to use for your training and inference workloads.
The API was introduced as alpha in the 1.26 release and graduated to beta in 1.32.
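To make this concrete, here is a minimal sketch of what requesting a GPU through DRA looks like with the v1beta1 API. The device class name and container image below are placeholders; the actual class name comes from whatever DRA driver your hardware vendor installs in the cluster.

```yaml
# A ResourceClaimTemplate describes the device each pod needs.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: single-gpu
spec:
  spec:
    devices:
      requests:
      - name: gpu
        deviceClassName: gpu.example.com  # placeholder; registered by the vendor's DRA driver
---
# The pod references the claim instead of counting devices in resources.limits.
apiVersion: v1
kind: Pod
metadata:
  name: training-pod
spec:
  resourceClaims:
  - name: gpu
    resourceClaimTemplateName: single-gpu
  containers:
  - name: trainer
    image: registry.example.com/trainer:latest  # placeholder image
    resources:
      claims:
      - name: gpu
```

The scheduler then finds a node whose driver can satisfy the claim, and the driver prepares the device before the container starts.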
ML Engineer
You are an ML engineer and have heard good things about Cloud Native. How do you get started?
Kubeflow is an ecosystem of Kubernetes-based components for each stage of the AI/ML lifecycle, with support for best-in-class open-source tools and frameworks. It makes AI/ML simple, portable, and scalable.
Kubeflow provides tools for data preparation (Spark Operator), model training (Training Operator), hyperparameter optimization (Katib), model serving (KServe), model metadata (Model Registry), workflows (Pipelines), and much more. It uses Kubernetes as the base compute layer, which allows it to run on any hyperscaler, in a data center, or even on your laptop.
If you are an ML engineer and want to leverage the benefits of the Cloud Native platform, Kubeflow is your framework of choice.
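As an illustration, here is a minimal sketch of a distributed training job using the Training Operator's PyTorchJob resource; the image name and replica counts are placeholders you would adapt to your own training code.

```yaml
apiVersion: kubeflow.org/v1
kind: PyTorchJob
metadata:
  name: mnist-train
spec:
  pytorchReplicaSpecs:
    Master:
      replicas: 1
      restartPolicy: OnFailure
      template:
        spec:
          containers:
          - name: pytorch  # the Training Operator expects this container name
            image: registry.example.com/mnist-train:latest  # placeholder training image
            resources:
              limits:
                nvidia.com/gpu: 1
    Worker:
      replicas: 2
      restartPolicy: OnFailure
      template:
        spec:
          containers:
          - name: pytorch
            image: registry.example.com/mnist-train:latest
            resources:
              limits:
                nvidia.com/gpu: 1
```

The operator wires up the distributed environment (master address, world size, ranks) for you, so the same manifest scales from a single-node experiment to a multi-node run.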
You should also look at Kueue, which provides a job queuing system for HPC and AI/ML workloads.
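Here is a minimal sketch of how a job opts into Kueue, assuming the platform team has already set up a ClusterQueue named cluster-queue with GPU quota (all names and the namespace are placeholders):

```yaml
# A LocalQueue in the team's namespace points at the cluster-wide queue.
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: team-queue
  namespace: ml-team
spec:
  clusterQueue: cluster-queue
---
# Jobs opt in via a label; Kueue holds them until quota is available.
apiVersion: batch/v1
kind: Job
metadata:
  name: train-job
  namespace: ml-team
  labels:
    kueue.x-k8s.io/queue-name: team-queue
spec:
  suspend: true  # Kueue unsuspends the job once it is admitted
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: trainer
        image: registry.example.com/trainer:latest  # placeholder
        resources:
          requests:
            nvidia.com/gpu: 1
```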
App Developer
If you are an application developer, then you're looking for an opinionated stack that allows you to integrate GenAI into your applications. You need a blueprint with a pre-configured set of LLMs/SLMs, a vector database, and other necessary components such as an embedding model, a retriever, and a re-ranker. Often, you don't even know which components are required. You need a Helm chart, with all the required components, that deploys into your existing Kubernetes cluster with a single click. This is where the Open Platform for Enterprise AI (OPEA) fits in.
OPEA provides 30+ component-level microservices, such as an LLM and a vector database, along with all the other pieces needed for GenAI. It composes these microservices into GenAI blueprints, or megaservices, that can be deployed in any Kubernetes cluster. For example, ChatQnA provides a chatbot that integrates with your enterprise data using Retrieval Augmented Generation (RAG). It comes with TGI as the text-generation LLM serving engine, a Redis vector database, TEI for embeddings, and other components. The project has 20+ GenAI blueprints, including AudioQnA, VideoQnA, agentic workflows, and a code generator.
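As a sketch of how lightweight this can be, assuming the Helm charts published in OPEA's GenAIInfra repository (the repo URL, chart name, and value key below are from my reading of the project docs and may differ by release), installing ChatQnA looks roughly like:

```shell
# Add the OPEA chart repository (URL assumed from the GenAIInfra project)
helm repo add opea https://opea-project.github.io/GenAIInfra
helm repo update

# Install the ChatQnA blueprint; the Hugging Face token is needed to pull gated models
helm install chatqna opea/chatqna \
  --set global.HUGGINGFACEHUB_API_TOKEN=<your-token>
```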
The ChatQnA example is also available on the AWS Marketplace. It runs on top of Amazon EKS and uses OpenSearch as the vector database. It is also integrated with Amazon Bedrock, which gives you access to a wide range of LLMs.
Summary
If you are passionate about how Cloud Native is going to support AI workloads, I recommend joining the CNCF AI Working Group. We have already released the Cloud Native AI Whitepaper. Now, we are working on three new whitepapers – Cloud Native AI Scheduling, Cloud Native AI Security, and Cloud Native AI Sustainability. We are also working on validating OPEA samples on ARM architecture. We can provide infrastructure and cloud credits for you to get started; we just need people who are willing to do the work.
Join the CNCF Slack and say hello in the #wg-artificial-intelligence channel.
Let’s make Cloud Native the best platform for AI, together!
PS: The graphic is inspired by Cassandra Chin's Phippy and AI book.