- Managing AI/ML Costs on GKE: GKE is one of the best ways to train and serve ML models on Google Cloud, and the flexibility and scalability it offers are hard to match. But at scale, cost management becomes essential. In this blog, you will learn the techniques and features you can use to manage costs on GKE.
- Connect from GKE to AlloyDB: Learn how to connect a workload running in GKE to AlloyDB for PostgreSQL.
- Why We Disabled GKE Image Streaming: Just because a feature is available doesn't mean you have to enable it. This post is an excellent example of a case where Image Streaming doesn't work well.
- Mitigate IPv4 exhaustion with Class E IPv4: IPv4 exhaustion is a prevalent issue for GKE users, especially in large organizations. Luckily, there are ways to mitigate it, such as using the Class E IPv4 range (240.0.0.0/4). This tutorial goes into the details of how this works and the caveats to consider.
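To put the Class E range in perspective, here is a quick, illustrative check with Python's standard `ipaddress` module (the /17 pod-range size is an assumption for the example, not a GKE requirement):

```python
import ipaddress

# The Class E range mentioned above: 240.0.0.0/4,
# reserved for future use but usable for private pod ranges.
class_e = ipaddress.ip_network("240.0.0.0/4")

print(class_e.num_addresses)  # -> 268435456 (2**28 addresses)

# Illustrative only: how many /17 pod ranges (~32k pod IPs each)
# could be carved out of Class E?
print(sum(1 for _ in class_e.subnets(new_prefix=17)))  # -> 8192
```

Roughly 268 million addresses, enough for thousands of large pod ranges, which is why Class E is attractive for pod CIDRs that never need to leave the VPC.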
- Run Nvidia NIM on GKE: Nvidia NIM microservices are containerized services that let you run foundation models in various environments, including Kubernetes, and they are now available on GKE. Supported models are meta/llama-3.1-70b-instruct, mistralai/mixtral-8x7b-instruct-v0.1, and nvidia/nv-embedqa-e5-v5.
- Cloud Run GPU (Preview) is available in asia-southeast1.
- Cloud Run multi-region deployment available from the CLI (Preview).
- Cloud Run Base Image Updates (Preview): You can enable this feature and let Cloud Run automatically update your base image. You don't need to rebuild and redeploy the application: just turn the feature on, and a new revision will be created with the patches applied to the base image.
- Make your LLMs serverless: GPU support in Cloud Run was made public a few weeks ago, and we're now starting to see tutorials on how to use it. This article from one of our champions is a good resource.
- Workload Identity Federation for GitHub Actions: To authenticate to resources in Google Cloud from the outside world (e.g., GitHub Actions), you no longer need long-lived Service Account keys. Instead, you should use Workload Identity Federation. This tutorial will teach you how to set it up with a concrete example.
- Best practices for securing SSH to VMs: Although SSH sounds basic, it continues to cause many security issues. We share a set of best practices for approaching this basic configuration, which everyone needs.
- Accessing private resources on GCP with Tailscale: The best way to secure cloud resources is to keep them private. However, keeping them private introduces a new productivity challenge for Devs and Ops. Tailscale is an excellent tool for balancing security and productivity. This tutorial explains how Tailscale can be used to access private resources on GCP.
- Three ways to run Apache Airflow on GCP: Apache Airflow is a popular open-source workflow orchestrator, commonly used for ETL pipelines. This blog covers various options for running Airflow on Google Cloud, including Compute Engine, GKE Autopilot, and Cloud Composer.
- CMEK Best Practices: With Customer-Managed Encryption Keys (CMEK), you (the customer) manage your encryption keys in Cloud KMS (Key Management Service) instead of relying on keys managed by Google Cloud. In other words, you manage the keys' lifecycle (create, rotate, delete). Read these best practices to learn how to do that properly.
- Migrating Images from AWS ECR to Google Artifact Registry: Do you need to move your images from AWS ECR to Google Artifact Registry? Because, you know, Google Cloud is kind of better at running container workloads! This tutorial has scripts and configuration steps to make that easy.
- Istio for People Who Have Stuff to Do: This short read is an excellent introduction to Istio Service Mesh if you don’t have time because you are reading this fabulous newsletter ;)
- Connecting external VMs to Istio: Sometimes, even if your workload is cloud native and container-ready, you might still have legacy applications running on VMs. If you need to secure communication between these VMs and your Istio-enabled cluster, you can add the VMs to the mesh. This article explains how to achieve that.