- Managing AI/ML Costs on GKE: GKE is one of the best ways to train and serve ML models on Google Cloud, and the flexibility and scalability it offers are hard to match. But at scale, cost management becomes essential. In this blog, you will learn the techniques and features you can use to manage costs on GKE.
- Connect from GKE to AlloyDB: Learn how to connect a workload running in GKE to AlloyDB for PostgreSQL.
- Why We Disabled GKE Image Streaming: Just because a feature is available doesn't mean you have to enable it. This post is an excellent example of a case where Image Streaming doesn't work well.
- Mitigate IPv4 exhaustion with Class E IPv4: IPv4 exhaustion is a prevalent issue for GKE users, especially in large organizations. Luckily, there are ways to mitigate it, such as using the Class E IPv4 range (240.0.0.0/4). This tutorial goes into the details of how this works and the caveats to consider.
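To put the Class E range in perspective, here is a quick, illustrative check with Python's standard `ipaddress` module (the /17 pod-range size is an assumption for the example, not a GKE requirement):

```python
import ipaddress

# The Class E range mentioned above: 240.0.0.0/4,
# reserved for future use but usable for private pod ranges.
class_e = ipaddress.ip_network("240.0.0.0/4")

print(class_e.num_addresses)  # -> 268435456 (2**28 addresses)

# Illustrative only: how many /17 pod ranges (~32k pod IPs each)
# could be carved out of Class E?
print(sum(1 for _ in class_e.subnets(new_prefix=17)))  # -> 8192
```

Roughly 268 million addresses, enough for thousands of large pod ranges, which is why Class E is attractive for pod CIDRs that never need to leave the VPC.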
- Run Nvidia NIM on GKE: Nvidia NIM microservices are containerized services that let you run foundation models in various environments, including Kubernetes, and they are now available on GKE. Supported models are meta/llama-3.1-70b-instruct, mistralai/mixtral-8x7b-instruct-v0.1, and nvidia/nv-embedqa-e5-v5.
- Cloud Run GPU (Preview) is available in asia-southeast1.
- Cloud Run multi-region deployment available from the CLI (Preview).
- Cloud Run Base Image Updates (Preview): You can enable this feature and let Cloud Run automatically update your base image. You don't need to rebuild and redeploy the application: just turn the feature on, and a new revision will be created with the patches applied to the base image.
- Make your LLMs serverless: GPU support in Cloud Run was made public a few weeks ago, and we're now starting to see tutorials on how to use it. This article from one of our champions is a good resource.
- Workload Identity Federation for GitHub Actions: To authenticate to resources in Google Cloud from the outside world (e.g., GitHub Actions), you no longer need long-lived Service Account keys. Instead, you should use Workload Identity Federation. This tutorial will teach you how to set it up with a concrete example.
- Best practices for securing SSH to VMs: Although SSH sounds basic, it continues to cause many security issues. We share a set of best practices for approaching this basic configuration, which everyone needs.
- Accessing private resources on GCP with Tailscale: The best way to secure cloud resources is to keep them private. However, keeping them private introduces a new productivity challenge for Devs and Ops. Tailscale is an excellent tool for balancing security and productivity. This tutorial explains how Tailscale can be used to access private resources on GCP.
- Three ways to run Apache Airflow on GCP: Apache Airflow is a popular open-source workflow orchestrator, commonly used for ETL pipelines. This blog covers various options for running Airflow on Google Cloud, including Compute Engine, GKE Autopilot, and Cloud Composer.
- CMEK Best Practices: With Customer-Managed Encryption Keys (CMEK), you (the customer) manage your encryption keys in Cloud KMS (Key Management Service) instead of relying on keys managed by Google Cloud. In other words, you manage the keys' lifecycle (create, rotate, delete). Read these best practices to learn how to do that properly.
- Migrating Images from AWS ECR to Google Artifact Registry: Do you need to move your images from AWS ECR to Google Artifact Registry? Because, you know, Google Cloud is kind of better at running container workloads! This tutorial has scripts and configuration steps to make that easy.
- Istio for People Who Have Stuff to Do: This short read is an excellent introduction to Istio Service Mesh if you don’t have time because you are reading this fabulous newsletter ;)
- Connecting external VMs to Istio: Sometimes, even if your workload is cloud native and container-ready, you might still have legacy applications running on VMs. If you need to secure communication between these VMs and your Istio-enabled cluster, you can add the VMs to the mesh. This article explains how to achieve that.