DATA Pill #095 - Real-Time RAG, pick between Kimball, One Big Table, and Relational Modeling

Hi,


Monday evening, it’s time to take your DATA Pill.

As always, great content and a little surprise - a promo code for the conference.

Enjoy!

ARTICLES

Apache Kafka is NOT real real-time data streaming! | 4 min | Data Streaming | Kai Waehner | Personal Blog

This blog post explores the architecture of NASDAQ that combines critical stock exchange trading with low-latency streaming analytics.

Hive Metastore – Did We Replace It With A Vendor Lock? | 7 min | Data Engineering | Oz Katz, Einat Orr | lakeFS blog

This blog considers in what sense Hive’s Metastore is “open” and why we believe the leading candidates to replace it are closed, in a way that is meant to limit us to using a specific vendor’s data ecosystem.

News Recommendation: the challenging area in building recommendation systems | 8 min | Recommendation Systems | Adam Cierlik | GetInData | Part of Xebia Blog

Exploring the ever-changing world of news recommendation systems? This blog dives deep into how to blend user preferences with real-time news context for a genuinely personalized reading experience.

In MORE LINKS you will read about: Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data Platform and A Deep Dive into the Latest Performance Improvements of Stateful Pipelines in Apache Spark Structured Streaming

{ MORE LINKS }




TUTORIALS

Evaluate LLMs with Hugging Face Lighteval on Amazon SageMaker | 8 min | LLM | Philipp Schmid | Personal Blog

Learn how to evaluate LLMs using Hugging Face Lighteval, which supports the evaluation suite used in the Hugging Face Open LLM Leaderboard.

In MORE LINKS you will read about: Easy Introduction to Real-Time RAG

{ MORE LINKS }



DATA TUBE

How to pick between Kimball, One Big Table, and Relational Modeling as a data engineer | 42 min | Data Engineering | Data with Zach

We'll be covering:

- When to use One Big Table modeling vs Kimball

- How to use Struct, Array, and Array of Struct to get what you want




PODCAST

Optimizing both hardware and software for GenAI | 26 min | Gen AI | Ryan Donovan, Raymond Lo | The Stack Overflow Podcast

Ryan and Ben chat with Raymond Lo, AI software evangelist at Intel, about the AI PC, the software that powers AI breakthroughs, and optimizing hardware and software in unison to improve generative AI performance. Bonus: what’s the difference between a GPU optimized for graphics and a VPU or NPU optimized for AI?




CONFS, EVENTS AND MEETUPS

Big Data Technology Warsaw Summit | Warsaw and Online | 10th and 11th April

Join the independent conference whose agenda is arranged into nine categories, so you can find the topics you care about most. For example:

  • Data Engineering
  • Streaming and real-time analytics
  • ML & Data Science
  • Gen AI

And more! Learn from speakers from companies like Dropbox, IKEA, Cloudera, Allegro, Ververica, and Freenow.

Shhh… Use the DataPill200 code to get a 200 PLN discount!

Journey to the Cloud | Zurich | 20th March

Gain expert insights into migrating sensitive workloads securely and optimizing costs. Dive into detailed case studies, including the migration and modernization journeys of Just Eat Takeaway.com and Truecaller, to see these principles in action.

Don't miss out on this invaluable opportunity to learn from industry leaders and propel your business forward with confidence!

________________________

Have any interesting content to share in the DATA Pill newsletter?

Join us on GitHub

Dig into previous editions of DATA Pill

Adam from GetInData | Part of Xebia
