DATA Pill #073 - Building ETL pipelines with Generative AI, Elementary for dbt
Hi,
Data folks, are you ready for this week's pill?
Let’s dig into LLMs, FastAPI for ML Model Serving, ETL pipelines with Generative AI and way more.
Enjoy!
ARTICLES
LLM Apps Are Mostly Data Pipelines | 7 min | LLM | Pat Nadolny | Meltano Blog
While tools like LangChain and LlamaIndex help create LLM apps, their data loading capabilities are not always suitable for production. If you're building a robust data pipeline for LLM apps, using a dedicated EL tool is advisable.
How to Optimize FastAPI for ML Model Serving | 6 min | ML | Luis Sena | Personal Blog
This one touches on FastAPI for serving machine learning models, focusing on the shift from synchronous to asynchronous programming and the performance implications of combining I/O and CPU-bound tasks. It shows practical solutions, such as leveraging ProcessPoolExecutor, to ensure efficient I/O task handling while effectively managing concurrent requests.
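To make the idea concrete, here is a minimal sketch of the pattern the article describes: handing CPU-bound inference to a process pool so the async event loop stays free for I/O. It uses only the Python standard library rather than FastAPI itself, and `predict`, `handle_request` and `serve_many` are hypothetical names chosen for illustration, not code from the article.

```python
import asyncio
from concurrent.futures import Executor, ProcessPoolExecutor


def predict(features: list[float]) -> float:
    """Hypothetical stand-in for CPU-bound model inference."""
    return sum(x * x for x in features)


async def handle_request(pool: Executor, features: list[float]) -> float:
    """Run the CPU-bound call in the pool so the event loop is not blocked."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, predict, features)


async def serve_many(pool: Executor, batches: list[list[float]]) -> list[float]:
    """Handle several requests concurrently, as an async server would."""
    return await asyncio.gather(*(handle_request(pool, b) for b in batches))


if __name__ == "__main__":
    # Assumption: two worker processes; tune this to the available CPU cores.
    with ProcessPoolExecutor(max_workers=2) as pool:
        print(asyncio.run(serve_many(pool, [[1.0, 2.0], [3.0]])))
```

In a FastAPI app, the same `run_in_executor` call would typically sit inside an `async def` endpoint, with the pool created once at startup and reused across requests.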
How Whatnot Utilizes Generative AI to Enhance Trust and Safety | 6 min | LLM | Charudatta (CD) Wad | Whatnot Engineering
This blog post discusses how Whatnot uses Large Language Models (LLMs) to strengthen trust and safety in areas including multimodal content moderation, fulfillment, bidding irregularities and general fraud protection.
Are You Using Elementary for DBT? | 4 min | Data Engineering | Leo Godin | Personal Blog
This article introduces Elementary, a versatile dbt package and CLI tool designed to simplify the management of data pipelines. It automatically maintains a comprehensive data mart containing essential project metadata, offers an intuitive dashboard for monitoring, and enables Slack alerts for effective communication within data teams, making it an invaluable tool for data and analytics engineers.
In MORE LINKS you will find: Are You Using Elementary for DBT?
TUTORIALS
Chat with your data using OpenAI, Pinecone, Airbyte and Langchain | 14 min | Data Science | Joe Reuter | Airbyte Blog
This tutorial walks you through a real-world use case of how to leverage vector databases and LLMs to make sense of your unstructured data. By the end of this, you will know how to:
How to use LLMs for data enrichment in BigQuery? | 16 min | LLM | Piotr Pilis | GetInData | Part of Xebia Blog
In this blog post, Piotr provides a comprehensive guide to using LLMs and BigQuery to add value to your data and boost your analytical capabilities. The post covers use cases, architectural considerations, implementation steps and evaluation methods, so that readers are equipped to use this combination in their own data processes.
NEWS
Coming Soon: Confidence — An Experimentation Platform from Spotify | 5 min | Data Engineering | Tyson Singer | Spotify Blog
Spotify introduces Confidence, a commercial product tailored for software development teams, offering an experimentation platform designed to facilitate user tests, including A/B testing and complex scenarios. With a focus on flexibility and scalability, Confidence aims to seamlessly integrate experimentation into organizations, drawing from Spotify's decade-plus experience in enabling large-scale experiments.
In MORE LINKS you will find: Deploy private LLMs using Databricks Model Serving
DATA TUBE
Making Better Decisions with Cassie Kozyrkov, Google's First Chief Decision Scientist | 1 h 13 min | Data Science | Cassie Kozyrkov | DataCamp
In this episode, Cassie and Richie explore the misconceptions around data science, stereotypes associated with being a data scientist, what the reality of working in data science is, advice for those starting their career in data science, and the challenges of being a data ‘jack-of-all-trades’.
PODCASTS
Building ETL pipelines with Generative AI | 51 min | AI | host: Tobias Macey; guest: Jay Mishra | Data Engineering Podcast
In this episode, Jay Mishra delves into his experiences and knowledge gained from crafting ETL pipelines with the assistance of cutting-edge generative AI technology.
Addressed topics:
And more.
Transformative role of AI in education | 1 h 40 min | AI | hosts: Anders Arpteg & Henrik Göthberg; guest: Anders Enström | AIAW Podcast
This episode offers a comprehensive look at AI's role in learning and is a must-listen for those curious about the future of education.
CONFS, EVENTS, AND MEETUPS
MOPS - Meetup | Onsite | 10th October 2023
This meetup includes three 25-minute practical talks and a 5-minute question-and-answer session.
The presentations that will be given:
Build conversational AI experiences powered by LLMs with Vertex AI Conversation and Dialogflow CX | Webinar | 24th October 2023
In this session, join Google Cloud Champion Innovator and Google Developer Expert in Machine Learning, Xavier Portilla Edo, and Developer Advocate, Alessia Sacchi, to learn:
________________________
Have any interesting content to share in the DATA Pill newsletter?
Join us on GitHub
Dig into previous editions of DATA Pill
Adam from GetInData | Part of Xebia