Week of October 21st
Stefan Krawczyk
CEO @ DAGWorks Inc. | Co-creator of Hamilton & Burr | Pipelines & Agents: Data, Data Science, Machine Learning, & LLMs
TL;DR:
Hamilton Release Highlights:
Hamilton Framework == 1.81.1
A smaller release this week, but all contributions were from open source users! Thank you!
Fixes:
Documentation & Examples:
Hamilton Caching Video Overview
Super excited to share an introductory walkthrough video of how Hamilton's caching works, by Thierry Jean. We're already getting interest from folks wanting to use this in production at large companies. Do reach out if you want to try it out to help optimize and cut some development & production costs!
Burr Release Highlights
Burr == 0.32.0
HaystackActions
Haystack is an off-the-shelf "giga-library" with many implementations of the various retrieval augmented generation (RAG) components. It's a nice place to start for projects. However, your mileage may vary as you come to productionize and customize for your needs. Since neither Hamilton nor Burr comes with pre-built "components", and we have the best frameworks for productionization & customization, we thought we'd provide a bridge experience to help people get started and then easily transition to customization and a great developer UX with the Burr UI.
So in this release we're adding a HaystackAction. This is a way to easily turn any "Haystack component" into a Burr Action. That way you can benefit from:
To give a taste of the code, here are two equivalent ways to make use of Haystack:
from burr.core import State, action
from burr.integrations.haystack import HaystackAction
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

# the hand-written way: call the Haystack component inside a Burr action
@action(reads=["query_embedding"], writes=["documents"])
def retrieve_documents(state: State) -> State:
    query_embedding = state["query_embedding"]
    document_store = InMemoryDocumentStore()
    retriever = InMemoryEmbeddingRetriever(document_store)
    results = retriever.run(query_embedding=query_embedding)
    return state.update(documents=results["documents"])

# this is the new feature:
haystack_retrieve_documents = HaystackAction(
    component=InMemoryEmbeddingRetriever(InMemoryDocumentStore()),
    name="retrieve_documents",
    reads=["query_embedding"],
    writes=["documents"],
)
The bottom half shows how you can define a Burr Action that takes a Haystack component directly. Or, if you already have a Haystack pipeline, you can convert it to a Burr Graph via haystack_pipeline_to_burr_graph, which you can then use in a nested manner or directly:
from burr.core import ApplicationBuilder
from burr.integrations.haystack import haystack_pipeline_to_burr_graph

# basic_rag_pipeline is an existing Haystack Pipeline, defined elsewhere
haystack_graph = haystack_pipeline_to_burr_graph(basic_rag_pipeline)
app = (
    ApplicationBuilder()
    .with_graph(haystack_graph)
    .with_entrypoint("prompt_builder")
    .build()
)
app.visualize(include_state=True)
This setup then enables you to change & swap the implementation of each action without buying into any framework. For example, we commonly see people who started with something simpler move to more complex logic with Hamilton.
To see a tutorial of how this works, see this notebook.
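If you're curious what "wrap a component as an action, then swap it out later" looks like in the abstract, here is a minimal, library-free sketch. To be clear, this is not how Burr implements HaystackAction: ComponentAction and KeywordRetriever are hypothetical names made up for illustration, plain dicts stand in for Burr's State, and the only assumption taken from Haystack is that a component exposes a run(...) method returning a dict of outputs.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Protocol


class Component(Protocol):
    """Anything with a Haystack-style run(**inputs) -> dict of outputs."""

    def run(self, **inputs: Any) -> Dict[str, Any]: ...


@dataclass
class ComponentAction:
    """Hypothetical adapter: read inputs from state, write outputs back."""

    component: Component
    name: str
    reads: List[str]
    writes: List[str]

    def __call__(self, state: Dict[str, Any]) -> Dict[str, Any]:
        inputs = {key: state[key] for key in self.reads}
        outputs = self.component.run(**inputs)
        # Return a new dict instead of mutating the one passed in.
        return {**state, **{key: outputs[key] for key in self.writes}}


class KeywordRetriever:
    """Toy stand-in for a retriever component (not a Haystack class)."""

    def __init__(self, documents: List[str]) -> None:
        self.documents = documents

    def run(self, query: str) -> Dict[str, Any]:
        hits = [doc for doc in self.documents if query.lower() in doc.lower()]
        return {"documents": hits}


CORPUS = ["Burr tracks state", "Hamilton builds DAGs"]


def retrieve_documents(state: Dict[str, Any]) -> Dict[str, Any]:
    """Hand-written replacement with the same reads/writes contract."""
    hits = [doc for doc in CORPUS if state["query"].lower() in doc.lower()]
    return {**state, "documents": hits}


# Option 1: wrap the off-the-shelf component.
wrapped = ComponentAction(
    component=KeywordRetriever(CORPUS),
    name="retrieve_documents",
    reads=["query"],
    writes=["documents"],
)

state = {"query": "burr"}
wrapped_result = wrapped(state)
# Option 2: swap in the custom function; callers see the same state keys.
custom_result = retrieve_documents(state)

print(wrapped_result["documents"])
print(custom_result["documents"])
```

Because both versions read "query" and write "documents", whatever runs next doesn't need to know which one produced the result. That shared contract is what makes it cheap to start with a pre-built component and replace it once you need custom logic.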
Office Hours & Meetup
Hamilton Meet-up: Our next meet-up will be in December. Want to present? Reach out. Otherwise join/sign-up here.
Hamilton Office Hours: They happen most Tuesdays 9:30am PT - 10:30am PT.
Join our slack for the link.
Burr Office Hours: They happen most Wednesdays 9:30am PT - 10:30am PT.
Join our discord for the weekly link.
MLOps World & Generative AI World Summit 2024
This November is the annual MLOps & Generative AI World summit. It's in Austin, Texas. I went last year and had a great series of conversations with practitioners. If you can make it, I'd recommend attending.
For those who don't know, the goal of the summit/conference, organized by the Toronto Machine Learning Society (TMLS), is to help companies put more machine learning and AI into production environments effectively, responsibly, and efficiently.
Whether you're working towards a live production deployment or already running in production, this conference is geared towards gathering like-minded individuals to share practical knowledge that helps you on your journey.
Some of the talk tracks this year:
GenAI for SWEs Workshop
Together with Hugo Bowne-Anderson I will be hosting a workshop for software engineers on some first principles for delivering GenAI applications. More details to follow.
I'll also be running a community table on Hamilton & Burr, and "reliable AI" best practices.
Discount for Passes
If you'd like to attend, you can use the code DAGWORKS150 to get $150 off all passes. If you're going, send me a note; I'd love to meet up.
Conference Details
When: 9AM ET on Thursday, November 7th to 5PM ET on Friday, November 8th 2024
Where: Renaissance Austin Hotel, 9721 Arboretum Boulevard, Austin, TX. MAP.
Need more convincing? Watch this video.
Blog Post:
Building Reliable AI: Annotating Data using Burr
We're excited to continue showcasing a better software development lifecycle (SDLC) with Burr for building anything agent-like. In this post we give an overview of the features and tactics you can use to improve your agents more quickly; after all, iteration speed matters when building on top of "non-determinism" (i.e. LLM calls). To get a handle on that "non-determinism", you need observability and then a workflow to do something with it. This is what we're building, and we showcase some of it in this post:
In the Wild:
Burr + Instructor post
We're excited to get a post on using Instructor & Burr for generating flashcards. It showcases how you can use both libraries to quickly create an agentic human-in-the-loop application:
Thanks Thierry Jean and Jason Liu!
Burr at DataForAI meetup in SF this month
I'm excited to present Burr at this month's DataForAI meetup. If you're in SF you should swing by! To sign up: