Week of October 14th
Stefan Krawczyk
CEO @ DAGWorks Inc. | Co-creator of Hamilton & Burr | Pipelines & Agents: Data, Data Science, Machine Learning, & LLMs
TL;DR:
Announcing Shreya Shankar as an advisor
We're super excited that Shreya Shankar has joined us as an advisor at DAGWorks Inc. For those who don't know, Shreya is a PhD student at UC Berkeley whose research area overlaps a lot with what we're trying to achieve with Hamilton & Burr: developing & productionizing data, ML, and AI. She runs a great blog and is someone you should follow on twitter/x.
We're excited that she's excited about what we're building, and we're looking forward to bringing knowledge and insights from her research into Hamilton & Burr.
Hamilton Release Highlights:
Hamilton Framework == 1.81.0
Changes
A few docs updates/notebooks!
@pipe_input improvements + refresher
@pipe_input allows you to process input data prior to running a function. The change is that we now have the on_input parameter, which lets you specify which function parameter the pipe_input result gets applied to. If you leave it out, it will attempt to inject into the first parameter.
Let me walk you through a quick example:
from hamilton.function_modifiers import pipe_input, step, value
import pandas as pd

def raw_features() -> pd.DataFrame:
    return load(...)  # placeholder: load your raw data here

def _normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    # min-max scale each column
    for column in df.columns:
        df[column] = (df[column] - df[column].min()) / (df[column].max() - df[column].min())
    return df

def _remove_outliers(df: pd.DataFrame, outlier_threshold: float) -> pd.DataFrame:
    # values at or above the threshold are masked out (become NaN)
    return df[df < outlier_threshold]

@pipe_input(
    step(_normalize_columns),
    step(_remove_outliers, outlier_threshold=value(10)),
    on_input="raw_features",  # inject post-processing into the raw_features parameter
)
def process_features(index_subset: pd.Index, raw_features: pd.DataFrame) -> pd.DataFrame:
    return raw_features[index_subset]  # raw_features is injected after being processed
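To make the on_input behaviour concrete, here is a minimal pure-Python sketch of the idea. It is a toy model, not how Hamilton implements @pipe_input: the decorator name pipe_input_sketch and the helpers _double and _drop_large are hypothetical, and it works on plain lists rather than DataFrames. It only shows the two rules described above: the steps run in order on the parameter named by on_input, and the first parameter is used when on_input is left out.

```python
import functools
import inspect
from typing import Any, Callable, Optional


def pipe_input_sketch(*steps: Callable[[Any], Any], on_input: Optional[str] = None):
    """Toy stand-in for @pipe_input (hypothetical, not part of Hamilton)."""

    def decorator(fn):
        signature = inspect.signature(fn)
        params = list(signature.parameters)
        # Default to the first parameter when on_input is not given.
        target = on_input if on_input is not None else params[0]
        if target not in params:
            raise ValueError(f"{fn.__name__} has no parameter named {target!r}")

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            bound = signature.bind(*args, **kwargs)
            piped = bound.arguments[target]
            # Apply each step in order to the targeted argument only.
            for step_fn in steps:
                piped = step_fn(piped)
            bound.arguments[target] = piped
            return fn(*bound.args, **bound.kwargs)

        return wrapper

    return decorator


def _double(xs):
    return [x * 2 for x in xs]


def _drop_large(xs, threshold=10):
    return [x for x in xs if x < threshold]


@pipe_input_sketch(_double, _drop_large, on_input="raw_features")
def process_features(index_subset, raw_features):
    # raw_features arrives here already doubled and filtered
    return [raw_features[i] for i in index_subset]


@pipe_input_sketch(_double)
def first_param_default(values, offset):
    # no on_input given, so the steps are applied to `values`
    return [v + offset for v in values]
```

In this sketch only the targeted argument is transformed; index_subset and offset pass through untouched, which is the point of being able to name the target explicitly.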
Burr Release Highlights
Burr == 0.31.1
UI Fix:
Docker Files:
Office Hours & Meetup
Hamilton Meet-up: Our next meet-up will be in December. Want to present? Reach out. Otherwise join/sign-up here.
Hamilton Office Hours: They happen most Tuesdays 9:30am PT - 10:30am PT.
Join our slack for the link.
Burr Office Hours: They happen most Wednesdays 9:30am PT - 10:30am PT.
Join our discord for the weekly link.
MLOps World & Generative AI World Summit 2024
This November is the annual MLOps & Generative AI World summit. It's in Austin, Texas. I went last year and had a great series of conversations with practitioners. If you can make it, I'd recommend attending.
For those who don't know, the goal of the summit/conference, organized by the Toronto Machine Learning Society (TMLS), is to help companies put more machine learning and AI into production environments, effectively, responsibly, and efficiently.
Whether you're working towards a live production deployment or already running in production, this conference gathers like-minded individuals to share practical knowledge that will help you on your journey.
Some of the talk tracks this year:
GenAI for SWEs Workshop
Together with Hugo Bowne-Anderson I will be hosting a workshop for software engineers on some first principles for delivering GenAI applications. More details to follow.
I'll also be running a community table on Hamilton & Burr, and "reliable AI" best practices.
Discount for Passes
If you'd like to attend, you can use the code DAGWORKS150 to get $150 off all passes. If you're going, send me a note. I'd love to meet up.
Conference Details
When: 9AM ET on Thursday, November 7th to 5PM ET on Friday, November 8th 2024
Where: Renaissance Austin Hotel, 9721 Arboretum Boulevard, Austin, TX.
Need more convincing? Watch this video.
In the Wild:
Hamilton at PyConZA
Sholto Armstrong presented their additions on top of Hamilton at PyConZA. To our knowledge, this is the first user presentation on Hamilton at a conference that wasn't given by myself or Elijah ben Izzy!
Hamilton Meet-up Recording
We had the meet-up this past week. Excellent engagement and a lot of questions and discussion around the main presentation, as well as the new features that shipped -- in particular caching! More details in the notes section.
Burr at DataForAI meetup in SF this month
I'm excited to present Burr at this month's DataForAI meetup. If you're in SF you should swing by! More details and a link to the meet-up are in Lindsey Robertson's post below: