Week of June 9th
Image by author.

Week of June 9th

TL;DR:


Hamilton Release 1.66.0

Highlights:

Hot on the heels of Databricks data & AI summit, we're pleased to announce some great support for MLflow .

MLFlow: MLflow is a project that is popular for storing things like model metrics and even model artifacts. With this new release you have:

  1. access to data savers & loaders that will log to and load from MLFlow.
  2. a tracker that will auto populate data for an MLFlow run given a Hamilton DAG run.

What this means is that you don’t have to include / integrate MLFlow directly into your Hamilton code, instead you couple it together at the Driver level. This enables a clean separation between logic and runtime concerns.

from hamilton import driver
from hamilton.io.materialization import to
from hamilton.plugins.h_mlflow import MLFlowTracker

dr = (
    driver.Builder()
    .with_modules(model_training_2)
    .with_adapters(MLFlowTracker()) # <- add this, it'll autolog to MLFlow
    .with_materializers(
        to.mlflow(
            id="trained_model__mlflow",
            dependencies=["trained_model"],
            register_as="my_new_model",
        ),
    )
    .build()
)        

For more details see this video overview and this tutorial notebook.

Hamilton UI Update:

Before you needed to have Docker installed to run the UI. Now you don’t!

  • pip install "sf-hamilton[ui]"

then

  • hamilton ui

to start it.

This should enable you to quickly and easily explore your Hamilton DAGs — just add the adapter to your driver (follow the instructions in the UI) and then it’ll log to it; you don’t need to execute it to be able to see it in the UI.

Fixes:

  • SDK: now how better guards around JSON serializable inputs.
  • Hamilton: fix for parallelizable . Thanks to Volker Lorrmann for raising.
  • Hamilton: Inputs can now be outputs, without them being defined in the DAG. Thanks to DS team at RTV EURO AGD for raising. E.g. this is useful if you want to pass in extra columns that you want to add to the output in the case of creating a pandas dataframe for example.

Examples / Documentation Updates:


WrenAI

We're excited that Hamilton is being picked up by another open source library. This time from Wren AI . They are building a RAG system and using Hamilton to help orchestrate it!

https://x.com/getwrenai/status/1798753120803340599

Blog: Lean Data Automation: A Principal Components Approach

Principal Components

This blog post was written in collaboration with Runhouse . In it we discuss that by unbundling the principle components of (macro) orchestrators, we can take advantage of a lean, cost-effective, and flexible stack. This can be done in such a way, i.e. by choosing the right tools, to preserve all the visibility, collaboration, and scale we need.

In the post we have a code example of this stack. It uses Github Actions, which is a free and widely available scheduler, and then combines using Hamilton with Runhouse, i.e. two open-source dedicated asset and infrastructure layers, to create this nimble and lean approach.

My main take away is that you can get pretty far before you have to reach for something like Airflow, Dagster, or Prefect, for data & ML work.


Blog: Traveling back in time with Burr

One of the best features of Burr is the ability to "fork state", i.e. given some application run and a point in time, copy that state into another application for you to debug/iterate with.

We wrote the hows and whys of this up in a post. That also comes along with a user contributed video on how they use this approach to develop with Burr! Thanks Ashis Ghosh !

The TL;DR: to enable it, is that you just need to pass in the write "IDs" to know where to take state from when building your application:

.initialize_from(
    state_persister,
    resume_at_next_action=True,
    default_state={"count" : 0},
    default_entrypoint="count",
    fork_from_app_id=PARENT_APP_ID,                                # <--
    fork_from_sequence_id=PARENT_APP_SEQUENCE_ID # <--
)        

Burr in Python Weekly

Burr made it onto the Python Weekly Newsletter - https://mailchi.mp/pythonweekly/python-weekly-issue-654

Always fun to see our projects get picked up onto various lists.


New Burr example: using it to power an OpenAI compatible endpoint.

Thierry Jean came up with a cute idea. There are many UIs that allow you to interface with an OpenAI compatible endpoint easily. Wouldn't it be nice to use one of them to interact with your Burr application?

Well we now have an example that precisely shows this.

The idea is simple - expose a FastAPI endpoint that mirrors the OpenAI endpoint. Then underneath, it delegates logic to Burr, which in turn can do whatever you want!

Here's a video walkthrough of it.


Hamilton OS Meetup Group

June meet-up is this coming week! Sign up here

New functionality:

  • Kedro Adapter
  • MLFlow Tracker
  • Locally running the Hamilton UI
  • The deep dive, will be an introduction on “How to use Hamilton in a RAG context”, e.g. for document ingestion.

Sign up here




Elijah ben Izzy

Co-creator of Hamilton/Burr OS libraries, Co-founder @ DAGWorks (YC W23, StartX S23)

9 个月

Really can't believe we got all this done this week

要查看或添加评论,请登录

Stefan Krawczyk的更多文章

  • February Updates

    February Updates

    TL;DR: #Hamilton highlights: crossed 2000 github stars, released multithreading based DAG parallelism, RichProgressBar…

    3 条评论
  • Last week of 2024 / first week of 2025

    Last week of 2024 / first week of 2025

    TL;DR: #Hamilton + #Burr 2024 stats: 35M+ telemetry events (10x), 100K+ unique IPs (10x) from 1000+ companies, 1M+…

    3 条评论
  • Week of December 9th

    Week of December 9th

    TL;DR: #Hamilton release highlights: Better TypedDict support and modular subdag example Office Hours & Meet ups for…

  • Week of December 2nd

    Week of December 2nd

    TL;DR: #Hamilton release highlights: Async Datadog Integration, Polars & Pandas with_columns support. #Burr release…

  • Week of November 18th

    Week of November 18th

    TL;DR: #Hamilton release highlights: SDK configurability #Burr release highlights: parallelism UI modifications, video…

  • Week of November 11th

    Week of November 11th

    TL;DR: #Hamilton release highlights: async support for @pipe + various small fixes #Burr release highlights:…

  • Week of November 4th

    Week of November 4th

    TL;DR: #Hamilton release highlights: @with_columns decorator for Pandas by Jernej Frank & module overrides for async…

  • Week of October 28th

    Week of October 28th

    TL;DR: #Hamilton release highlights: in-memory cache store. #Burr release highlights: release candidate for a first…

  • Week of October 21st

    Week of October 21st

    TL;DR: #Hamilton release highlights: some minor fixes and docs updates from five different OS contributors! Also…

  • Week of October 14th

    Week of October 14th

    TL;DR: Announcing Shreya Shankar as an advisor. #Hamilton release highlights: tweaks to pipe_input, new…

    3 条评论

社区洞察

其他会员也浏览了