Subject: DATA Pill #098 - Deploy LLM in your Private Kubernetes Cluster, The Real Cost of Self-Hosting MLflow

Hi,

Another week, another serving of data meat, ready to eat.

The Skill Lake section strikes again! Join Data Learning Week.

Also, enjoy the tutorial on deploying an LLM in your private Kubernetes cluster in 5 steps, plus more finds from this week.

ARTICLES

Data Quality Error Detection powered by LLMs | 17 min | LLM | Simon Grah | Towards Data Science Blog

Start by reviewing the introductory article on the Data Dirtiness Score, which explains the key assumptions and demonstrates how to calculate this score. This article is the second in a series about cleaning data using Large Language Models (LLMs), with a focus on identifying errors in tabular data sets.

Unlocking Kafka's Potential: Tackling Tail Latency with eBPF | 7 min | Data Engineering | Maciej Mościcki, Piotr Rżysko | Allegro Tech Blog

This blog post describes the Allegro team's journey: how they used Kafka protocol sniffing and eBPF to identify and remove a performance bottleneck.


Evaluating Large Language Model (LLM) systems: Metrics, challenges, and best practices | 11 min | LLM | Jane Huang, Kirk Li, Daniel Yehdego | Data Science at Microsoft

This article thoroughly examines LLM system evaluation, distinguishing between model and system evaluation and scrutinizing online and offline strategies. It focuses on AI assessing AI and Responsible AI metrics. The article highlights the relevance of diverse evaluation tools and frameworks across application scenarios, urging readers to stay informed about evolving metrics and frameworks for a comprehensive understanding.

In MORE LINKS you will read about: How we expose data in BigQuery, The Real Cost of Self-Hosting MLflow

{ MORE LINKS }

SKILL LAKE

Data Learning Week | Online | 8-11th April

Would you like to test one of our courses before investing money in it? Then come to our Data Learning Week, a series of 4 free hands-on workshops. Each session is a free first-trial lesson for the full training. We will also have a special bonus from the Academy for all workshop participants.

Choose your topic, check the agenda and sign up:

TUTORIALS

Deploy a custom Docker image on Azure ML using a blue-green deployment with Python | 13 min | ML | Timo Uelen | Xebia Blog

This tutorial dives into such a custom solution:

  • Deploy our ML model using a custom Docker image.
  • Use a blue-green deployment strategy to ensure there is no downtime when deploying our model.
  • Run smoke tests to see if our deployment is working as expected, before we replace our previous model.
  • Use the Azure ML Python SDK to configure and manage deployment to Azure ML.
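
To make the blue-green idea above concrete, here is a minimal sketch in plain Python. It is not the tutorial's code and it does not call the Azure ML SDK: `Endpoint`, `smoke_test` and `blue_green_rollout` are hypothetical names invented for illustration, and the "deployments" are just in-memory functions. It only shows the order of operations: deploy the new version with no live traffic, smoke-test it directly, and switch traffic only if the tests pass.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a model served by one deployment.
Scorer = Callable[[dict], dict]
# A smoke-test case: an input payload plus a check on the response.
Case = Tuple[dict, Callable[[dict], bool]]


@dataclass
class Endpoint:
    """Toy endpoint: named deployments plus a traffic split in percent."""
    deployments: Dict[str, Scorer] = field(default_factory=dict)
    traffic: Dict[str, int] = field(default_factory=dict)

    def deploy(self, name: str, scorer: Scorer) -> None:
        # A freshly created deployment receives 0% of live traffic.
        self.deployments[name] = scorer
        self.traffic.setdefault(name, 0)

    def remove(self, name: str) -> None:
        del self.deployments[name]
        del self.traffic[name]

    def invoke(self, payload: dict, deployment: str) -> dict:
        # Call one deployment directly, bypassing the traffic split.
        return self.deployments[deployment](payload)


def smoke_test(endpoint: Endpoint, name: str, cases: List[Case]) -> bool:
    """Return True only if every case succeeds against the named deployment."""
    for payload, check in cases:
        try:
            if not check(endpoint.invoke(payload, name)):
                return False
        except Exception:
            return False
    return True


def blue_green_rollout(endpoint: Endpoint, new_name: str,
                       scorer: Scorer, cases: List[Case]) -> bool:
    """Deploy new_name next to the live deployment and switch traffic if it passes."""
    if new_name in endpoint.deployments:
        raise ValueError(f"deployment {new_name!r} already exists")
    live = [n for n, pct in endpoint.traffic.items() if pct > 0]
    endpoint.deploy(new_name, scorer)
    if not smoke_test(endpoint, new_name, cases):
        # Failed smoke test: drop the new deployment, live traffic is untouched.
        endpoint.remove(new_name)
        return False
    # Passed: send all traffic to the new deployment, then retire the old ones.
    endpoint.traffic = {n: 0 for n in endpoint.traffic}
    endpoint.traffic[new_name] = 100
    for name in live:
        endpoint.remove(name)
    return True


# Usage: "blue" is live, "green" is the candidate.
endpoint = Endpoint()
endpoint.deploy("blue", lambda payload: {"prediction": 0})
endpoint.traffic["blue"] = 100
cases = [({"x": 1}, lambda out: "prediction" in out)]
switched = blue_green_rollout(endpoint, "green",
                              lambda payload: {"prediction": 1}, cases)
print(switched, endpoint.traffic)
```

In a real setup the same three steps would be carried out against a managed endpoint rather than a Python dictionary, and the old deployment is often kept around for a while after the switch so that a rollback stays cheap.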

DATA TUBE

How to Deploy LLM in your Private Kubernetes Cluster in 5 STEPS | 17 min | LLM | Marcin Zabłocki | GetInData | Part of Xebia

In this tutorial, Marcin Zabłocki shows how to deploy an LLM in your private Kubernetes cluster in 5 simple steps, using Mistral as the example.

In MORE LINKS you will read about: Streams Forever: Kafka Summit London 2024 Keynote

{ MORE LINKS }

PODCAST

ML for Finance and Storytelling through Data | 1 h 7 min | ML | Daniel Bashir, Ben Wellington

On challenges for ML in quantitative trading and investing, and telling stories through data.

CONFS, EVENTS AND MEETUPS

Big Data Technology Warsaw Summit | Warsaw and Online | 10th and 11th April

Join the independent conference whose presentations are arranged into nine categories, and find the topics you care about most! They include, for example:

  • Data Engineering
  • Streaming and real-time analytics
  • ML & Data Science
  • Gen AI

And more! Learn from speakers from companies like Dropbox, IKEA, Cloudera, Allegro, Ververica, and Freenow.

Shhh… Use the DataPill200 code to get the 200 PLN discount!

________________________

Have any interesting content to share in the DATA Pill newsletter?

Join us on GitHub

Dig into previous editions of DataPill

Adam from GetInData | Part of Xebia
