Week of July 15th
Image by author.


TL;DR:

  • #Hamilton release highlights: user-contributed Graceful Failure Adapter improvements, Spark Connect support, async updates & upgrades, new UI schema view.
  • Hamilton OS Meetup Group: August scheduled.
  • #Burr release highlights: new GraphBuilder API, SERDE handling for test case creation, a new corrective RAG (CRAG) example, and a new FastAPI async streaming example with server-sent events + React.
  • In the wild: A Hamilton user blog


Hamilton Release Notes:

Hamilton Framework == 1.71.0, 1.70.0

User Contributed Graceful Failure Adapter Improvements

A few weeks ago we added a new feature: the ability to run a Hamilton DAG all the way through even when an error occurs. Last week, an open source user (thanks James Arruda!) added some cool new capabilities to this adapter.

At a high level, you can use it to keep the DAG from failing when an upstream node fails, and instead bypass all downstream nodes. Here's a simple example: you define an error to catch (so you don't catch everything), as well as a sentinel value that gets cascaded through. Execution continues as normal, but any node that detects an upstream failure is skipped and returns the sentinel value instead.

# my_module.py

class DoNotProceed(Exception):
    pass # custom exception

def wont_proceed() -> int:
    raise DoNotProceed()

def will_proceed() -> int:
    return 1

def never_reached(wont_proceed: int) -> int:
    return 1  # this should not be reached

# your driver code:
from hamilton import driver
from hamilton.lifecycle import default

import my_module

dr = (
    driver.Builder()
    .with_modules(my_module)
    .with_adapters(
        default.GracefulErrorAdapter(
            error_to_catch=DoNotProceed,
            sentinel_value=None
        )
    )
    .build()
)
# will return {'will_proceed': 1, 'never_reached': None}
dr.execute(["will_proceed", "never_reached"])

The new features enable it to work with the `Parallel[]/Collect[…]` constructs and add a few more toggles; see the documentation for details. For example, a new decorator, `@accept_error_sentinels`, lets a function receive the sentinel "error value" as an input and handle the error in its own way. Thanks, James Arruda!
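To make the cascading behavior concrete, here is a toy, pure-Python model of the idea. It is a sketch under stated assumptions, not Hamilton's implementation: the `run` helper and the `accept_sentinels` marker are hypothetical names invented for illustration (the latter only mimics the spirit of `@accept_error_sentinels`).

```python
import inspect


class DoNotProceed(Exception):
    """Custom exception that we choose to catch."""


SENTINEL = None  # value cascaded to nodes downstream of a failure


def accept_sentinels(fn):
    """Hypothetical opt-in marker: this function wants to see sentinel inputs."""
    fn.accepts_sentinels = True
    return fn


def wont_proceed() -> int:
    raise DoNotProceed()


def will_proceed() -> int:
    return 1


def never_reached(wont_proceed: int) -> int:
    return 1  # skipped, because its input failed


@accept_sentinels
def handles_failure(wont_proceed: int, will_proceed: int) -> int:
    # Receives the sentinel and decides for itself how to recover.
    return will_proceed if wont_proceed is SENTINEL else wont_proceed + will_proceed


def run(nodes, error_to_catch, sentinel_value):
    """Toy executor: resolve dependencies by parameter name, cascade the sentinel."""
    by_name = {fn.__name__: fn for fn in nodes}
    results = {}

    def compute(name):
        if name in results:
            return results[name]
        fn = by_name[name]
        kwargs = {dep: compute(dep) for dep in inspect.signature(fn).parameters}
        upstream_failed = any(value is sentinel_value for value in kwargs.values())
        if upstream_failed and not getattr(fn, "accepts_sentinels", False):
            results[name] = sentinel_value  # bypass the node entirely
        else:
            try:
                results[name] = fn(**kwargs)
            except error_to_catch:
                results[name] = sentinel_value  # only the chosen error is swallowed
        return results[name]

    for name in by_name:
        compute(name)
    return results


results = run(
    [wont_proceed, will_proceed, never_reached, handles_failure],
    error_to_catch=DoNotProceed,
    sentinel_value=SENTINEL,
)
print(results)
```

In this toy, `never_reached` is bypassed while `handles_failure` still runs because it opted in to receiving the sentinel. Using `None` as the sentinel keeps the example short, but it cannot be told apart from a legitimate `None` result, so a dedicated sentinel object is probably the safer choice in real code.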

Spark Connect Support

Databricks recently pushed out some changes where the "SparkSession" class used in a "Spark Connect" context is a different class. This meant that Hamilton's type checking would fail and complain. Databricks plans to unify the classes, but that won't happen for a while, so in the meantime we've added an adapter that can help you out. To use it you'd just do:

from hamilton import driver
from hamilton.plugins import h_spark

dr = (
    driver.Builder()
    .with_modules(...)
    # add the adapter if you're using Hamilton with Spark Connect.
    .with_adapters(h_spark.SPARK_INPUT_CHECK)
    .build()
)
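If it helps to picture the kind of mismatch described above, here is a toy sketch. Both session classes are hypothetical stand-ins and the check functions are invented for illustration; this is not how `h_spark` implements its adapter.

```python
class ClassicSparkSession:
    """Hypothetical stand-in for the classic session class."""


class ConnectSparkSession:
    """Hypothetical stand-in for the session class used with Spark Connect."""


def strict_check(value, expected_type) -> bool:
    # A strict input check accepts only the annotated class (or a subclass).
    return isinstance(value, expected_type)


def relaxed_check(value, accepted_types) -> bool:
    # A relaxed check accepts any class from an explicit allow-list.
    return isinstance(value, tuple(accepted_types))


session = ConnectSparkSession()
strict_ok = strict_check(session, ClassicSparkSession)
relaxed_ok = relaxed_check(session, [ClassicSparkSession, ConnectSparkSession])
print(strict_ok, relaxed_ok)
```

The two classes share a purpose but no inheritance relationship, so the strict check rejects a perfectly usable session while the relaxed one lets it through.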

Async Upgrades & Updates

  1. Thanks to Ryan Whitten for finding some bugs. We've upgraded the async Builder and Driver.
  2. The AsyncBuilder can now construct an AsyncDriver in a synchronous fashion, i.e. no await needed. Just use the build_without_init() function:

def build_without_init(self) -> AsyncDriver:        
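As a rough picture of the pattern, here is a toy sketch of a builder that offers both an awaited build and a synchronous one. `ToyAsyncBuilder`, `ToyAsyncDriver`, and `ainit` are hypothetical names for illustration only; check the Hamilton docs for the real classes and for when initialization actually happens.

```python
import asyncio


class ToyAsyncDriver:
    """Hypothetical driver whose setup needs an event loop."""

    def __init__(self):
        self.initialized = False

    async def ainit(self):
        await asyncio.sleep(0)  # stand-in for async setup work
        self.initialized = True
        return self


class ToyAsyncBuilder:
    """Hypothetical builder offering both construction styles."""

    def build_without_init(self) -> ToyAsyncDriver:
        # Synchronous: no await needed, async setup is deferred.
        return ToyAsyncDriver()

    async def build(self) -> ToyAsyncDriver:
        # Asynchronous: construct, then run the async setup right away.
        return await self.build_without_init().ainit()


driver_sync = ToyAsyncBuilder().build_without_init()
driver_async = asyncio.run(ToyAsyncBuilder().build())
print(driver_sync.initialized, driver_async.initialized)
```

The synchronous path is handy when the driver has to be created at import time or inside code that has no running event loop yet.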

Hamilton SDK & UI

We've improved the capture of schema metadata, along with the extra metadata that can be captured. This required some SDK and UI work. So now, for example, when you run a PySpark job with the HamiltonTracker, you'll get a nice schema view and a way to explore it.

Examples / Documentation Updates:


Hamilton OS Meetup Group

Reminder: there's no meetup in July, but we have August scheduled. Join/sign up here. We're excited to have Gilad Rubin speak about some of the work he's been doing on Hamilton.


Burr Release Updates

Burr == 0.23.0

GraphBuilder API

In an effort to streamline the API, we've made it possible to separate the graph definition from the application definition by adding a GraphBuilder API. This lets you construct the graph once and then reuse it wherever you need it.

from burr.core import ApplicationBuilder, graph

base_graph = (
    graph.GraphBuilder()
    .with_actions(
        # your actions go here
    )
    .with_transitions(
        # transitions go here
    )
    .build()
)
# then you can build an application like this
app = (
    ApplicationBuilder()
    .with_graph(base_graph)  # <--- this is where you add the graph
    .with_tracker(tracker)
    .with_identifiers(app_id=app_id)
    .build()
)

For a full example, see it in action here.

SERDE Handling for Test Case Creation

Thanks to Rinat Gareev for finding the bug: we pushed a fix to how Burr's test case creation capability serializes and deserializes state. It now properly handles the custom serialization/deserialization that Burr enables.

More Burr Examples

We've added two new examples:

  • A Corrective RAG Example - thanks to Hamza Farhan for adding it!
  • New examples with server-sent events, FastAPI, and React to build a streaming chat app

Corrective RAG

Corrective-RAG (CRAG) is a strategy for RAG that incorporates self-reflection / self-grading on retrieved documents. In this example we show how you can build an application with Burr, using LanceDB as the vector store, Exa as the search engine, Instructor by Jason Liu, and Google's Gemini.

the application built in the example
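For readers new to CRAG, the control flow can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the code from the example: the `corrective_rag` function and all of the stub helpers are hypothetical, and in the real application the retrieval, grading, search, and generation steps would be backed by the vector store, LLM, and search engine mentioned above.

```python
from typing import Callable, List, Tuple


def corrective_rag(
    question: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, str], bool],
    web_search: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    min_relevant: int = 1,
) -> Tuple[str, bool]:
    """Retrieve, self-grade the documents, and fall back to search if needed."""
    docs = retrieve(question)
    relevant = [doc for doc in docs if grade(question, doc)]
    used_fallback = len(relevant) < min_relevant
    if used_fallback:
        # The corrective step: retrieval was judged insufficient.
        relevant = relevant + web_search(question)
    return generate(question, relevant), used_fallback


# Stub components (all hypothetical) so the sketch runs on its own.
CORPUS = ["Burr builds stateful applications.", "Hamilton builds dataflows."]


def stub_retrieve(question: str) -> List[str]:
    return CORPUS


def stub_grade(question: str, doc: str) -> bool:
    # Crude keyword overlap standing in for an LLM-based grader.
    return any(term in doc.lower() for term in question.lower().split())


def stub_web_search(question: str) -> List[str]:
    return [f"web result for: {question}"]


def stub_generate(question: str, docs: List[str]) -> str:
    return f"answer from {len(docs)} document(s)"


answer_hit, fallback_hit = corrective_rag(
    "burr state", stub_retrieve, stub_grade, stub_web_search, stub_generate
)
answer_miss, fallback_miss = corrective_rag(
    "lancedb indexing", stub_retrieve, stub_grade, stub_web_search, stub_generate
)
print(answer_hit, fallback_hit)
print(answer_miss, fallback_miss)
```

The first question is answered from the retrieved document alone, while the second finds nothing relevant and triggers the search fallback. Each of these steps maps naturally onto an action in a Burr graph, with the grading result deciding which transition to take.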


Streaming Chatbot with Burr, FastAPI, and React

We're excited by this example and accompanying blog post, as it's a great overview and introduction to a few things, for example async, streaming, and server-sent-events.

Example code snippet in the blog that explains how to create a streaming endpoint

We've seen a hunger for this type of content, so we're working on adding more.
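As a small taste of what is involved, here is a minimal, standard-library-only sketch of turning an async token stream into server-sent events. The function names and payload shape are assumptions made for illustration, not the code from the blog post; in a FastAPI app, a generator like `sse_events` would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.

```python
import asyncio
import json
from typing import AsyncIterator, List


async def token_stream(text: str) -> AsyncIterator[str]:
    """Stand-in for a model that yields tokens as they become available."""
    for token in text.split():
        await asyncio.sleep(0)  # yield control, as a real network call would
        yield token


def format_sse(payload: dict) -> str:
    # An SSE message is one or more "data:" lines ended by a blank line.
    return f"data: {json.dumps(payload)}\n\n"


async def sse_events(text: str) -> AsyncIterator[str]:
    async for token in token_stream(text):
        yield format_sse({"type": "delta", "value": token})
    # A final event lets the client know the stream is complete.
    yield format_sse({"type": "done"})


async def collect(text: str) -> List[str]:
    return [event async for event in sse_events(text)]


events = asyncio.run(collect("hello streaming world"))
print("".join(events))
```

On the React side, the client reads these chunks as they arrive and appends each `delta` to the message being displayed.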


Seen in the wild: a Hamilton User Blog

It's always fun to receive word when someone writes about Hamilton. This time a user, Carl Trachte, who stopped by our booth at PyCon, wrote about his first experience picking up Hamilton to do some processing; writing about tools is one way he internalizes them.

It's a short read, and what I like the most is how straightforward it is to read and understand his code. Thanks Carl!

Carl's Hamilton DAG.


