Week of August 5th
Stefan Krawczyk
CEO @ DAGWorks Inc. | Co-creator of Hamilton & Burr | Pipelines & Agents: Data, Data Science, Machine Learning, & LLMs
TL;DR:
Congrats to British Cycling for winning gold in the women's team sprint!
For those who don't know, the British Cycling team is very technology-forward. They were an early adopter of Hamilton and use it in their telemetry processing pipeline to extract insights and improve athlete outcomes (in the velodrome events, IIRC). We couldn't be more excited that Hamilton played a part in the team's preparations for this year's Olympic Games.
Hamilton Release Notes:
Hamilton Framework == 1.73.1
Fixes:
Documentation / Examples:
Burr Updates
Burr crossed 1,000 stars on GitHub this week! We couldn't be more excited about the progress and pace of adoption.
Burr == 0.26.0
New features/improvements:
We're really excited about attribute logging. It works out of the box with the Burr tracking client and can be extended to work with other LLM visibility systems. Take this, for example:
from burr.core import State, action
from burr.visibility import TracingFactory

@action(reads=['prompt'], writes=['response'])
def my_action(state: State, __tracer: TracingFactory) -> State:
    # log attributes at the action level
    __tracer.log_attribute(
        prompt_length=len(state["prompt"]),
        prompt=state["prompt"]
    )
    # create a span and log within it
    with __tracer('create_prompt') as t:
        modified_prompt = _modify_prompt(state["prompt"])
        t.log_attribute(modified_prompt=modified_prompt)
    # create a second span that declares a dependency on the first
    with __tracer('call_llm', dependencies=['create_prompt']) as t:
        response = _query(prompt=modified_prompt)
        t.log_attribute(response=response.message,
                        tokens=response.tokens)
    return state.update({'response': response.message})
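To make the span-and-attribute idea concrete, here is a toy tracer in plain Python. This is a sketch only and is not how Burr implements tracing: `ToyTracer`, `ToySpan`, and the "sink" callback are hypothetical names made up for illustration. The point it tries to show is that if every logged attribute is handed to a pluggable sink, the same action code can feed whichever visibility backend you like.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical sink signature: (span name or None for action level, attributes).
AttributeSink = Callable[[Optional[str], Dict[str, Any]], None]


class ToySpan:
    """A named span that forwards logged attributes to every sink."""

    def __init__(self, name: str, sinks: List[AttributeSink]) -> None:
        self.name = name
        self._sinks = sinks

    def log_attribute(self, **attributes: Any) -> None:
        for sink in self._sinks:
            sink(self.name, attributes)


class ToyTracer:
    """Toy stand-in for a tracer object; NOT Burr's TracingFactory."""

    def __init__(self, sinks: List[AttributeSink]) -> None:
        self._sinks = sinks

    def log_attribute(self, **attributes: Any) -> None:
        # Attributes logged outside any span are tagged with None.
        for sink in self._sinks:
            sink(None, attributes)

    @contextmanager
    def _span(self, name: str) -> Iterator[ToySpan]:
        yield ToySpan(name, self._sinks)

    def __call__(self, name: str):
        # Calling the tracer opens a span, mirroring `with tracer('name') as t:`.
        return self._span(name)


def run_action(prompt: str, tracer: ToyTracer) -> str:
    """Mimics the shape of the action above, with a trivial prompt tweak."""
    tracer.log_attribute(prompt_length=len(prompt))
    with tracer("create_prompt") as t:
        modified_prompt = prompt.strip().lower()
        t.log_attribute(modified_prompt=modified_prompt)
    return modified_prompt


# One sink that just collects records in memory; a real one might forward
# them to a logging or observability system instead.
records: List[tuple] = []
tracer = ToyTracer(sinks=[lambda span, attrs: records.append((span, attrs))])
```

Swapping the in-memory sink for one that writes elsewhere changes nothing in `run_action`, which is the property that makes this style of logging easy to extend.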
Blog Posts
Data Quality with Hamilton + Pandera
This post is long overdue. It covers some of the functionality that Hamilton provides through its check_output and check_output_custom decorators, and shows Hamilton's Pandera integration. If you're looking for a lightweight alternative to heavyweight systems like Great Expectations, this post is for you!
For example, to validate a dataframe with Pandera, you can keep the whole schema in one place:
import pandas as pd
import pandera as pa
from hamilton.function_modifiers import check_output

@check_output(
    schema=pa.DataFrameSchema(
        {
            'column1': pa.Column(int),
            'column2': pa.Column(float, pa.Check(lambda s: s < -1.2)),
            # you can provide a list of validators
            'column3': pa.Column(str, [
                pa.Check(lambda s: s.str.startswith('value')),
                pa.Check(lambda s: s.str.split('_', expand=True).shape[1] == 2)
            ]),
        },
        index=pa.Index(int),
        strict=True,
    ),
    importance="fail"
)
def dataframe_transform(...) -> pd.DataFrame:
    ...  # your logic
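For intuition about what an output-checking decorator does, here is a toy version in plain Python. It is a sketch under loud assumptions, not Hamilton's implementation: `toy_check_output`, `OutputValidationError`, and the dict-of-lists "table" are all hypothetical stand-ins, and the only thing borrowed from the example above is the idea that importance="fail" stops execution while a weaker setting only warns.

```python
import functools
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)


class OutputValidationError(ValueError):
    """Raised when a check fails and importance is 'fail' (hypothetical)."""


def toy_check_output(checks: List[Callable[[Any], bool]], importance: str = "warn"):
    """Toy decorator: run each check on the function's return value."""
    if importance not in ("warn", "fail"):
        raise ValueError(f"importance must be 'warn' or 'fail', got {importance!r}")

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            result = fn(*args, **kwargs)
            failed = [check.__name__ for check in checks if not check(result)]
            if failed:
                message = f"{fn.__name__} failed checks: {failed}"
                if importance == "fail":
                    raise OutputValidationError(message)
                logger.warning(message)  # 'warn': report and carry on
            return result

        return wrapper

    return decorator


# Two checks loosely modelled on the schema above, applied to plain lists.
def column2_below_threshold(table: Dict[str, list]) -> bool:
    return all(value < -1.2 for value in table["column2"])


def column3_has_value_prefix(table: Dict[str, list]) -> bool:
    return all(value.startswith("value") for value in table["column3"])


@toy_check_output([column2_below_threshold, column3_has_value_prefix], importance="fail")
def table_transform(raw: Dict[str, list]) -> Dict[str, list]:
    """Shift column2 down by 2 and prefix column3 with 'value_'."""
    return {
        "column2": [value - 2.0 for value in raw["column2"]],
        "column3": [f"value_{suffix}" for suffix in raw["column3"]],
    }
```

Keeping the checks next to the function they guard is the design choice on display here: the validation rules and the transform are read, reviewed, and changed together.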
Modeling Pregnancy Due Dates with Hamilton
This is a fun post by Elijah ben Izzy that shows what he did with Hamilton in anticipation of becoming a father. The baby has arrived and everyone is healthy. Congrats, Elijah ben Izzy!
Hamilton OS Meetup Group
Reminder: there's no meetup in July, but we have August scheduled. Join/sign up here. We're excited to have Gilad Rubin speak about some of the work he's been doing on Hamilton.
In the wild
Burr mentioned on LinkedIn
There was another post that talks about pains with LangChain. I'll spare you the details -- you can read them here. What excites us is the common thread we're seeing across these posts: Burr keeps being mentioned as a great candidate for building GenAI applications. We think this is a testament to our philosophy of making Burr easy to customize, yet straightforward to take to production. Haven't tried Burr yet? Watch my latest video, which walks through a notebook showing some of Burr's features.