DATA Pill #102 - 50 Years of SQL, dbt + Airflow = ?

Hi,

This week was full of tutorials, so naturally we are sharing them with you.

Enjoy the newest DATA Pill and get your dose of knowledge.

ARTICLES

DREAM: Distributed RAG Experimentation Framework | 4 min | RAG | Aishwarya Prabhat | Personal Blog

DREAM is a Distributed RAG Experimentation Framework that uses a Kubernetes-native architecture to streamline the testing and evaluation of RAG techniques. It relies on Ray, LlamaIndex, and MLflow for distributed computing and detailed experiment tracking. The framework makes it faster to find the best RAG configuration for a specific use case.

Building data abstractions with streaming at Yelp | 5 min | Data Streaming | Hakampreet Singh Pandher | Yelp Blog

This blog post explores how Yelp uses its extensive streaming infrastructure to build robust data abstractions for both offline and streaming data consumers. It illustrates the approach with Yelp’s Business Properties ecosystem.

The Evolution of Real-Time Data Streaming in Business | 7 min | Data Streaming | Klaudia Wachnio | GetInData | Part of Xebia Blog

Dive into a blog post based on the webinar "Real-Time Data to Drive Business Growth and Innovation in 2024" and explore the transformative impact of real-time data streaming. Discover how leveraging instant data analytics is not just for tech giants but a game changer for businesses across all sectors aiming to drive growth and outpace the competition.

In MORE LINKS you will read about: Unity Catalog Governance in Action: Monitoring, Reporting, and Lineage

{ MORE LINKS }

SKILL LAKE

Introduction to Stream Processing and Apache Flink | 35 min | Data Engineering | Ververica

This foundational course blends theory, practical examples, quizzes, and a final assignment to give students a comprehensive understanding of data processing with Apache Flink. It covers modules on Flink's basics, architecture, SQL API, time handling, fault tolerance, and state backends.

BTW, we are looking for a Data Engineer with Flink. Check out the offer here.

TUTORIALS

dbt + Airflow = ? | 12 min | Data Engineering | Giorgos Myrianthous | Making Plum Blog

To overcome limitations with dbt Cloud, the team built their own integration platform, customizing it to schedule projects, ensure task granularity, and maintain essential dbt dependencies. This strategic move significantly enhanced their control and flexibility in data operations.
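To make "task granularity" and "dbt dependencies" a bit more concrete, here is a minimal sketch of the general idea, not the Plum team's actual implementation. It assumes a manifest shaped like dbt's `manifest.json`, where each entry under `nodes` lists its upstream nodes in `depends_on.nodes`. The project name `shop` and the model names are hypothetical, and plain Python stands in for the orchestrator: in Airflow, each model in the resulting order would typically become its own task.

```python
from graphlib import TopologicalSorter

# Hypothetical, trimmed-down stand-in for a dbt manifest.json.
manifest = {
    "nodes": {
        "model.shop.stg_orders": {"depends_on": {"nodes": []}},
        "model.shop.stg_customers": {"depends_on": {"nodes": []}},
        "model.shop.orders": {
            "depends_on": {
                "nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]
            }
        },
        "model.shop.revenue": {"depends_on": {"nodes": ["model.shop.orders"]}},
    }
}


def task_dependencies(manifest):
    """Map each node to the set of upstream nodes it must wait for."""
    nodes = manifest["nodes"]
    return {
        node_id: {
            # Ignore dependencies that are not in "nodes" (e.g. sources),
            # since they would not be scheduled as tasks in this sketch.
            dep for dep in node["depends_on"]["nodes"] if dep in nodes
        }
        for node_id, node in nodes.items()
    }


def execution_order(manifest):
    """Return one valid run order: every model comes after its upstreams."""
    return list(TopologicalSorter(task_dependencies(manifest)).static_order())


deps = task_dependencies(manifest)
order = execution_order(manifest)
for node_id in order:
    print(node_id, "<-", sorted(deps[node_id]))
```

Running one task per model, rather than one task for the whole project, is what lets a scheduler retry or inspect a single failing model without rerunning everything upstream of it.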

In MORE LINKS you will read about:

  • Building a Smarter Shopping Experience: The Technology Behind Conversational Search in E-Commerce
  • Harnessing the Power of Grafana and CockroachDB: Visualizing and Marking Dataset Insights
  • Efficiently fine-tune Llama 3 with PyTorch FSDP and Q-Lora

{ MORE LINKS }

DATA TUBE

50 Years of SQL | 39 min | Data Analytics | Don Chamberlin | DataCamp

Don Chamberlin is renowned as the co-inventor of SQL. In this episode, Richie and Don explore his early career at IBM and the development of his interest in databases alongside Ray Boyce, the database task group (DBTG), the transition to relational databases and the early development of SQL, its commercialization and adoption, how it became standardized, how it evolved and spread via open source, the future of SQL through NoSQL and SQL++, and much more.

CONFS, EVENTS AND MEETUPS

Effortless Data Orchestration: A Practical Guide to Azure Data Factory and Snowflake | Online | 29th May

Get hands-on with Azure Data Factory and Snowflake. You will demystify ETL/ELT and DataOps to streamline your data pipelines and analytics workflows. You will learn how to seamlessly integrate, transform, and optimize data processing with intuitive, powerful tools.

In this lab, you'll:

  • Set up and run data pipelines in ADF, connecting to various sources and using intuitive tools for efficient ETL.
  • Integrate ADF and Snowflake smoothly, translating data into actionable analytics.
  • Utilize ADF data flows for smart data shaping from Azure SQL to Snowflake, readying your data for insight generation.
  • Leverage Snowflake's push-down computing for better data processing performance.

________________________

Have any interesting content to share in the DATA Pill newsletter?

Join us on GitHub

Dig into previous editions of DATA Pill

Adam from GetInData | Part of Xebia
