DATA Pill #073 - Building ETL pipelines with Generative AI, Elementary for dbt

DATA Pill #073 - Building ETL pipelines with Generative AI, Elementary for dbt


Hi,

Data folks, are you ready for this week's pill?

Let’s dig into LLMs, FastAPI for ML Model Serving, ETL pipelines with Generative AI and way more.

Enjoy!



ARTICLES

LLM Apps Are Mostly Data Pipelines | 7 min | LLM | Pat Nadolny | Meltano Blog

While tools like LangChain and LlamaIndex help create LLM apps, their data loading capabilities could be more suitable for production. If you're building a robust data pipeline for LLM apps, using a dedicated EL tool is advisable.


How to Optimize FastAPI for ML Model Serving | 6 min | ML | Luis Sena | Personal Blog

This one touches on FastAPI for serving machine learning models, focusing on the shift from synchronous to asynchronous programming and the performance implications of combining I/O and CPU-bound tasks. It shows practical solutions, such as leveraging ProcessPoolExecutor, to ensure efficient I/O task handling while effectively managing concurrent requests.


How Whatnot Utilizes Generative AI to Enhance Trust and Safety | 6 min | LLM | Charudatta (CD) Wad | Whatnot Engineering

This blog discusses how Whatnot utilizes Large Language Models (LLMs) to enhance trust and safety areas, including a multimodal content moderation, fulfillment, bidding irregularities and general fraud protection.


Are You Using Elementary for DBT? | 4 min | Data Engineering | Leo Godin | Personal Blog

This article introduces Elementary, a versatile dbt package and CLI tool designed to simplify the management of data pipelines. It automatically maintains a comprehensive data mart containing essential project metadata, offers an intuitive dashboard for monitoring, and enables Slack alerts for effective communication within data teams, making it an invaluable tool for data and analytics engineers.


In MORE LINKS you will find are you Using Elementary for DBT??

{ MORE LINKS }



TUTORIALS

Chat with your data using OpenAI, Pinecone, Airbyte and Langchain | 14 min | Data Science | Joe Reuter | Airbyte Blog

This tutorial walks you through a real-world use case of how to leverage vector databases and LLMs to make sense out of your unstructured data. By the end of this, you will know how to:?

  • extract unstructured data from a variety of sources using Airbyte?
  • use Airbyte to efficiently load data into a vector database, preparing the data for LLM usage along the way
  • integrate a vector database into your LLM to ask questions about your proprietary data

How to use LLMs for data enrichment in BigQuery? | 16 min | LLM | Piotr Pilis | GetInData | Part of Xebia Blog

In this blog post, Piotr aims to provide a comprehensive guide on how individuals can leverage LLMs and BigQuery to bring additional value to their data and boost their analytical capabilities. They cover use-cases, architectural considerations, implementation steps and evaluation methods to ensure that readers are fully equipped to use this powerful combination in their own data processes.



NEWS

Coming Soon: Confidence — An Experimentation Platform from Spotify | 5 min | Data Engineering | Tyson Singer | Spotify Blog

Spotify introduces Confidence, a commercial product tailored for software development teams, offering an experimentation platform designed to facilitate user tests, including A/B testing and complex scenarios. With a focus on flexibility and scalability, Confidence aims to seamlessly integrate experimentation into organizations, drawing from Spotify's decade-plus experience in enabling large-scale experiments.

In MORE LINKS you will find deploy private LLMs using Databricks Model Serving?

{ MORE LINKS }


DATA TUBE

Making Better Decisions with Cassie Kozyrkov, Google's First Chief Decision Scientist | 1 h 13 min | Data Science | Cassie Kozyrkov | DataCamp

In this episode, Cassie and Richie explore the misconceptions around data science, stereotypes associated with being a data scientist, what the reality of working in data science is, advice for those starting their career in data science, and the challenges of being a data ‘jack-of-all-trades’.



PODCASTS

Building ETL pipelines with Generative AI | 51 min | AI | host: Tobias Macey; guest: Jay Mishra | Data Engineering Podcast

In this episode, Jay Mishra delves into his experiences and knowledge gained from crafting ETL pipelines with the assistance of cutting-edge generative AI technology.

Addressed topics:

  • What are the different aspects/types of ETL that are seeing generative AI applied?
  • What are the most interesting, innovative or unexpected ways generative AI is used in ETL workflows?
  • What are the most interesting, unexpected or challenging lessons that Jay has learned while working on ETL and generative AI?

And more.


Transformative role of AI in education | 1 h 40 min | AI | hosts: Anders Arpteg & Henrik G?thberg; guest: Anders Enstr?m | AIAW Podcast

This episode offers a comprehensive look at AI's role in learning and is a must-listen for those curious about the future of education.




CONFS, EVENTS, AND MEETUPS

MOPS - Meetup ?| Onsite | 10th October 2023

This meetup includes three 25-minute practical talks and a 5-minute question-and-answer session.

The presentations that will be given:

  • Portable and cloud-agnostic MLOps Platform with Kedro, MLflow and Terraform
  • Architecting ML for Huge Impact and Scale
  • Building machine learning platforms for a living - lessons learned

Build conversational AI experiences powered by LLMs with Vertex AI Conversation and Dialogflow CX | Webinar | 24th October 2023

In this session, join Google Cloud Champion Innovator and Google Developer Expert in Machine Learning, Xavier Portilla Edo, and Developer Advocate, Alessia Sacchi, to learn:?

  • The latest Generative AI features in Vertex AI Conversation and Dialogflow CX
  • How to combine traditional agent design techniques and best practices with Google's latest generative large language models (LLMs) to create complex conversational applications that are prepared to handle the many different ways that users might interact with it
  • Examples of tasks that Conversational AI, Search and LLMs can solve, including demos?

________________________

Have any interesting content to share in the DATA Pill newsletter?

? Join us on GitHub

? Dig previous editions of DataPill ?


Adam from the GetInData | Part of Xebia

SAJJAD HAIDER

Turn Your Idea into a Product and Achieve 1000% Growth in 6 Months.

1 年

It looks like you've got a packed week ahead with a fantastic lineup of data insights and events! The range of topics, from data pipelines and ML model serving to generative AI and decision science, is incredibly intriguing. Wishing you a productive and insightful week, and I'm sure these discussions will lead to valuable takeaways

回复

Thanks for sharing!

回复

要查看或添加评论,请登录

社区洞察

其他会员也浏览了