Subject: DATA Pill #124 - SQL Has Problems, RAG API, QueryGPT

Hi,

Welcome to this week's DATA Pill! I've packed this edition with cutting-edge insights on scaling data pipelines, smarter SQL, and AI-driven tools. Dive in and level up your skills!

ARTICLES

Industrial IoT Middleware for Edge and Cloud OT/IT Bridge powered by Apache Kafka and Flink | 11 min | IoT / Data Streaming | Kai Waehner | Personal Blog

Explore how Apache Kafka and Flink bridge real-time operations and business systems, enabling predictive maintenance and smart decision-making through seamless data flow in industrial IoT environments.

I spent 5 hours learning how ClickHouse built their internal data warehouse | 8 min | Data Warehouse | Vu Trinh | Personal Blog

A deep dive into how ClickHouse engineers developed and optimized their internal data warehouse to process 50 TB of data daily. Discover their strategies for performance enhancement and scaling.


TUTORIALS

QueryGPT – Natural Language to SQL Using Generative AI | 5 min | AI | Jeffrey Johnson, Callie Busch, Abhi Khune, Pradeep Chakka | Uber Engineering Blog

Discover QueryGPT, Uber’s innovative tool that transforms natural language into SQL queries using generative AI, drastically reducing the time required to generate complex queries.

In MORE LINKS you will read:

  • Generate a preference dataset
  • ETL for Beginners: Data Ingestion at Scale with S3 and Snowflake

{ MORE LINKS }

DATA LIBRARY

SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL | Database Technologies | Jeff Shute, Shannon Bales, Matthew Brown, Jean-Daniel Browne, Brandon Dolphin, Romit Kudtarkar, Andrey Litvinov, Jingchi Ma, John Morcos, Michael Shen, David Wilhite, Xi Wu, Lulan Yu | Google Research

GoogleSQL introduces piped data flow syntax to address usability and extensibility challenges in SQL, making it more flexible and user-friendly without significant system changes.

In MORE LINKS you will read:

  • Learnings from a Large-Scale Deployment of an LLM-Powered Expert-in-the-Loop Healthcare Chatbot

{ MORE LINKS }

PODCAST

Process mining with LLMs | 26 min | LLM | Kyle Polich, David Obembe | Data Skeptic Podcast

David Obembe discusses how LLMs can enhance process mining tools, sharing insights from his research on conversational interfaces and future advancements using RAG.

DATA TUBE

RAG API - 30 lines of code is all you need for RAG | 23 min | ML | Sascha Heyer | ML Engineer

Learn how to implement RAG with minimal code using Google Cloud's RAG API, providing an efficient way to retrieve and integrate relevant documents for smarter query responses.

In MORE LINKS you will watch:

  • Looking under the hood at the tech stack that powers multimodal AI

{ MORE LINKS }

CONFS, EVENTS AND MEETUPS

LLMOps: from Demo to Production-Ready GenAI Systems | Webinar | 22nd October

Dive into the world of LLMOps to learn how to transition from demo applications to production-grade systems, tackling challenges like prompt sensitivity, cost control, and model tuning.

Infoshare DEV | Gdynia | 16th October

2 stages dedicated entirely to the latest technologies await you at this conference.

What topics does it cover?

Architecture | AI/ML | Data Science | DevOps & Cloud | People & Culture | Java | Tests | UX | Front-end | CyberSecurity | Programming

And since we are a community partner of Infoshare DEV, we have a discount code! Use the code DEV24-DP10 to get a 10% discount.

_______________________

Have any interesting content to share in the DATA Pill newsletter?

Join us on GitHub

Dig into previous editions of DataPill

Adam from GetInData | Part of Xebia
