Week of April 22nd
Stefan Krawczyk
CEO @ DAGWorks Inc. | Co-creator of Hamilton & Burr | Pipelines & Agents: Data, Data Science, Machine Learning, & LLMs
Here's the TL;DR:
Have a great week - dive in below for more details!
Heavybit Blog
The nice folks at Heavybit interviewed me for my thoughts on why MLOps is hard and why companies struggle with it, and then wrote a blog post about it. I'm not the only one interviewed, so there are varying perspectives; it's a good read.
Data Council Recording
Elijah ben Izzy, the other co-creator of Hamilton & Burr, gave a talk on what platform initiatives should be doing, and showcased that with what we've been building with Hamilton. It's well worth watching for anyone doing MLOps/LLMOps or thinking about centralization & standardization, i.e. "building a platform".
Maven Lightning Session
I'll be doing a free 30-minute session titled "Build a Document Processing Pipeline for RAG Systems".
What I'll cover:
Why am I doing this?
Retrieval Augmented Generation, or RAG, is a hot topic. But to use RAG you need to have data to retrieve. Most commonly in organizations this data is in some form of document. Understanding the "what" and "how" of creating a document processing pipeline will enable you to move faster and make better decisions as you build out your RAG system.
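To make "document processing pipeline" a little more concrete, here is a minimal sketch of one common step: splitting documents into overlapping chunks that can later be embedded and indexed for retrieval. This is not the material from the session. The `Chunk` class, the function names, and the chunk sizes are all hypothetical choices for illustration, and real pipelines usually split on sentences or tokens rather than raw characters.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One retrievable piece of a document (hypothetical structure)."""
    doc_id: str
    position: int
    text: str


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


def process_documents(docs: dict[str, str], chunk_size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Turn a mapping of doc_id -> raw text into a flat list of chunks."""
    return [
        Chunk(doc_id=doc_id, position=i, text=piece)
        for doc_id, text in docs.items()
        for i, piece in enumerate(chunk_text(text, chunk_size, overlap))
    ]
```

The overlap is there so that a sentence cut at a chunk boundary still appears whole in at least one neighboring chunk; how much overlap is appropriate depends on your documents.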
Hamilton Meet-up Recording
Last week we had our Hamilton meet-up. In it we covered:
1.59.0 Hamilton Release
New Features:
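import pandas as pd
from hamilton.function_modifiers import parameterize, resolve_from_config, source

# The lambda's argument name is the config key that gets resolved.
@resolve_from_config(
    decorate_with=lambda columns_to_sum_map: parameterize(
        **{
            key: {"col_1": source(value[0]), "col_2": source(value[1])}
            for key, value in columns_to_sum_map.items()
        }
    ),
)
def generic_summation(col_1: pd.Series, col_2: pd.Series) -> pd.Series:
    ...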
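To see what the dictionary comprehension in that snippet produces, here is a small plain-Python sketch that does not import Hamilton. It assumes `columns_to_sum_map` maps each output name to a pair of input column names (an assumption inferred from `value[0]` and `value[1]` above), and it uses a hypothetical `fake_source` stand-in that simply tags a column name instead of Hamilton's real `source`.

```python
def fake_source(name: str) -> tuple[str, str]:
    """Hypothetical stand-in for Hamilton's `source`: just tags the column name."""
    return ("source", name)


def build_parameterization(columns_to_sum_map: dict[str, list[str]]) -> dict[str, dict]:
    """Mirror the comprehension from the snippet above using plain dicts."""
    return {
        key: {"col_1": fake_source(value[0]), "col_2": fake_source(value[1])}
        for key, value in columns_to_sum_map.items()
    }


# Assumed config shape: output name -> the two columns to sum.
columns_to_sum_map = {"a_plus_b": ["a", "b"], "c_plus_d": ["c", "d"]}
expanded = build_parameterization(columns_to_sum_map)

for output_name, bindings in expanded.items():
    print(output_name, bindings)
```

Each top-level key is the name of one output to generate from `generic_summation`, with `col_1` and `col_2` bound to the named upstream columns. The point of resolving this from config is that the set of outputs is not fixed in code.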
This is Jan Hurst’s first contribution to Hamilton!
To use it:
from hamilton.plugins import h_pyarrow
result_builder = h_pyarrow.PyarrowTableResult()
# pass to Builder().adapters(), or to a DataSaver (i.e. materializer)
Documentation / Examples:
Hamilton Blog
New blog post & tutorial this week, courtesy of Thierry Jean!
We cover how to use #Hamilton for ad-hoc analyses in a notebook, and show that it doesn't require a big change to your workflow. The end result is that it helps you structure your analyses, which in turn makes it easy to reuse, extend, or even productionize your work!
Thanks to People Data Labs for the data that we used in this post to make it more realistic -- you can download the data and play with it too!
Links:
Burr Blog
We also published a Burr blog post this past week. In it we describe how to build an interactive agent with Burr. We believe most agent workflows should be designed to have humans-in-the-loop. This is what we're designing Burr for and why we think it's different -- it should be easy to build an agent application and inject human oversight into it.
In the post we use the example of building a simple Email Assistant agent that can help you write a response to an email. The post describes how to build the application in #Burr, and how to run it on FastAPI. We don't dive into the details of it, but there's also an example UI that you can use to play around with it:
> pip install "burr[start]"
> burr # to start the burr server
# navigate to demos and use
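For readers who want a feel for the human-in-the-loop idea before opening the demo, here is a minimal plain-Python sketch of the control flow: draft a reply, pause for human review, then either finish or revise. It deliberately does not use Burr's API. `draft_reply` is a stub standing in for an LLM call, and `ask_human`, `run_email_assistant`, and `max_rounds` are hypothetical names chosen for illustration.

```python
from typing import Callable, Optional


def draft_reply(email: str, feedback: Optional[str] = None) -> str:
    """Stub standing in for an LLM call that drafts (or redrafts) a reply."""
    base = f"Thanks for your email about: {email}"
    return base if feedback is None else f"{base} (revised: {feedback})"


def run_email_assistant(
    email: str,
    ask_human: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> str:
    """Draft a reply and loop until a human approves it or rounds run out."""
    draft = draft_reply(email)
    for _ in range(max_rounds):
        # Pause point: a human reviews the current draft.
        feedback = ask_human(draft)
        if feedback is None:  # None means "approved as-is"
            return draft
        draft = draft_reply(email, feedback)
    return draft


# Simulated reviewer: asks for one revision, then approves.
responses = iter(["be shorter", None])
final = run_email_assistant("lunch", lambda draft: next(responses))
print(final)
```

In a real application the pause would be a request to a UI or API rather than a function call, which is the kind of interaction the FastAPI example in the post is about.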
Thanks, that's all for this week!