Subject: DATA Pill #124 - SQL Has Problems, RAG API, QueryGPT
Hi,
Welcome to this week's DATA Pill! I've packed this edition with cutting-edge insights on scaling data pipelines, smarter SQL, and AI-driven tools. Dive in and level up your skills!
ARTICLES
Industrial IoT Middleware for Edge and Cloud OT/IT Bridge powered by Apache Kafka and Flink | 11 min | IoT / Data Streaming | Kai Waehner | Personal Blog
Explore how Apache Kafka and Flink bridge real-time operations and business systems, enabling predictive maintenance and smart decision-making through seamless data flow in industrial IoT environments.
I spent 5 hours learning how ClickHouse built their internal data warehouse | 8 min | Data Warehouse | Vu Trinh | Personal Blog
A deep dive into how ClickHouse engineers developed and optimized their internal data warehouse to process 50 TB of data daily. Discover their strategies for performance enhancement and scaling.
TUTORIALS
QueryGPT – Natural Language to SQL Using Generative AI | 5 min | AI | Jeffrey Johnson, Callie Busch, Abhi Khune, Pradeep Chakka | Uber Engineering Blog
Discover QueryGPT, Uber’s innovative tool that transforms natural language into SQL queries using generative AI, drastically reducing the time required to generate complex queries.
In MORE LINKS you will read:
DATA LIBRARY
SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL | Database Technologies | Jeff Shute, Shannon Bales, Matthew Brown, Jean-Daniel Browne, Brandon Dolphin, Romit Kudtarkar, Andrey Litvinov, Jingchi Ma, John Morcos, Michael Shen, David Wilhite, Xi Wu, Lulan Yu | Google Research
GoogleSQL introduces piped data flow syntax to address usability and extensibility challenges in SQL, making it more flexible and user-friendly without significant system changes.
In MORE LINKS you will read:
PODCAST
Process mining with LLMs | 26 min | LLM | Kyle Polich, David Obembe | Data Skeptic Podcast
David Obembe discusses how LLMs can enhance process mining tools, sharing insights from his research on conversational interfaces and future advancements using RAG.
DATA TUBE
RAG API - 30 lines of code is all you need for RAG | 23 min | ML | Sascha Heyer | ML Engineer
Learn how to implement RAG with minimal code using Google Cloud's RAG API, providing an efficient way to retrieve and integrate relevant documents for smarter query responses.
In MORE LINKS you will watch:
CONFS, EVENTS AND MEETUPS
LLMOps: from Demo to Production-Ready GenAI Systems | Webinar | 22nd October
Dive into the world of LLMOps to learn how to transition from demo applications to production-grade systems, tackling challenges like prompt sensitivity, cost control, and model tuning.
Infoshare DEV | Gdynia | 16th October
2 stages dedicated entirely to the latest technologies await you at this conference.
What topics does it cover?
Architecture | AI/ML | Data Science | DevOps & Cloud | People & Culture | Java | Tests | UX | Front-end | CyberSecurity | Programming
And since we are a community partner of Infoshare DEV, we have a discount code! Use the code DEV24-DP10 to get a 10% discount.
_______________________
Have any interesting content to share in the DATA Pill newsletter?
Join us on GitHub
Dig into previous editions of DATA Pill
Adam from GetInData | Part of Xebia