Week of September 16th

TL;DR:

  • Social Proof: Don't take my word for it, take theirs.
  • #Hamilton release highlights: OSS-contributed Pydantic data validator & new function overrides capability.
  • #Burr release highlights: Pydantic schema for state
  • Blog on OpenLineage integration with Hamilton
  • Office Hours & Meet ups for Hamilton & Burr.


Social Proof: Don't take my word for it, take theirs.

It's been a fun past week asking users for quotes about their experience with Hamilton and Burr. If you're a lurker and haven't yet tried what we're building, there's never been a better time to start than now... Don't take my word for it - take theirs! Here are some teasers:

[...] I felt trapped in LangChain's ecosystem [...] Moving from LangChain to Burr was a game-changer.
It took me just a few hours to get started with Burr, compared to the days and weeks I spent trying to navigate LangChain.
With Burr, I could finally have a cleaner, more sophisticated, and stable implementation. No more wrestling with complex codebases.
I pitched Burr to my teammates, and we pivoted our entire codebase to it. It's been a smooth ride ever since.
Hamilton is simplicity. Its declarative approach to defining pipelines (as well as the UI to visualize them) makes testing and modifying the code easy, and onboarding is quick and painless. Since using Hamilton, we have improved our efficiency of both developing new functionality and onboarding new developers to work on the code. We deliver solutions more quickly than before.
We're active users of Hamilton. We have found it very useful in standardizing feature engineering code in our production code bases. It's particularly useful in diagnosing data contamination issues, and it's used by all of our MLEs
Of course, you can use it [LangChain], but whether it's really production-ready and improves the time from "code-to-prod" [...], we've been doing LLM apps for two years, and the answer is no [...] All these "all-in-one" libs suffer from this [...]? Honestly, take a look at Burr. Thank me later.
How (with good software practices) do you orchestrate a system of asynchronous LLM calls, but where some of them depend on others? How do you build such a system so that it’s modular and testable? At [REDACTED] we’ve selected Hamilton to help us solve these problems and others. And today our product, [REDACTED], an AI legal assistant that extracts information from estate planning documents, is running in production with Hamilton under the hood.

Hamilton Release Highlights:

Hamilton Framework == 1.77.0

Pydantic data validators

Pydantic is a common library used to describe data records, especially in web contexts. With this latest release you can now validate a function's output, whether a Pydantic model or a dict of model contents, against a schema using the check_output decorator:

from pydantic import BaseModel

from hamilton.function_modifiers import check_output

class MyModel(BaseModel):
    name: str

# Option 1: validate a dict of model contents against the model
@check_output(model=MyModel)
def foo() -> dict:
    return {"name": "hamilton"}

# Option 2 (or): annotate with the model directly and use the pydantic plugin
from hamilton.plugins import h_pydantic

@h_pydantic.check_output()
def foo() -> MyModel:
    return MyModel(name="hamilton")

For those unfamiliar with the check_output decorator, it is a lightweight way to incorporate data quality with Hamilton, and this Pydantic integration complements our Pandera integration. Thanks to Charles Schwartz for their second contribution to the project!
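As a quick illustration, here's a minimal sketch of the validator firing at run time. It assumes the decorated foo() above lives in a hypothetical module called my_functions; validation happens as part of normal Hamilton execution:

from hamilton import driver

import my_functions  # hypothetical module containing the decorated foo() above

dr = driver.Builder().with_modules(my_functions).build()
print(dr.execute(["foo"]))  # check_output validates foo's output while executing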

New function overrides

When you combine multiple Hamilton modules, using the @config.when decorator to swap logic based on which modules are loaded can get a little verbose. Thanks to Jernej Frank, you now have another option (shout out to Yijun Tang for the idea).

Here's how it works - say you have two modules, where one function is redefined in the other:

# module_a.py
def foo() -> int:
    return 1

def bar() -> int:
    return 2

# module_b.py
def bar() -> int:
    return 3

Rather than using @config.when to choose the right one, we can just tell Hamilton (via .allow_module_overrides()) to take the last definition we come across:

from hamilton import driver

import module_a, module_b

dr = (
    driver
    .Builder()
    .with_modules(module_a, module_b)  # order matters!
    .allow_module_overrides()  # <--- this is required for it to work
    .build()
)

print(dr.execute(['foo', 'bar']))
# {'foo': 1, 'bar': 3}

Hamilton SDK == 0.7.2

Fix: We pushed a few Polars fixes so summary statistics work appropriately when logging to the Hamilton UI.

Hamilton UI == 0.0.15

Fix: if quantile values are None or empty, the UI now correctly handles them.


Burr Release Highlights

Burr == 0.30.1

Typed State with Pydantic (docs reference here, example here)

Burr now has two approaches to specifying a schema for state. These can work together as long as they don't specify clashing state:

  • Application-level typing
  • Action-level typing

These enable a host of other extensions/capabilities.

While the current implementation only supports Pydantic, the typing system is intended to be pluggable, and we plan to add further integrations (dataclasses, typed dicts, etc…).

The TL;DR is that you can now do something like this at the application level.

First, define a Pydantic model for your application:

from typing import List, Literal, Optional

import pydantic

class ApplicationState(pydantic.BaseModel):
    chat_history: List[dict[str, str]] = pydantic.Field(default_factory=list)
    prompt: Optional[str] = None
    mode: Optional[Literal["text", "image"]] = None
    response: Optional[dict[str, str]] = None

Then, we can use this model to type our application:

from burr.core import ApplicationBuilder
from burr.core.typing import PydanticTypingSystem

app = (
    ApplicationBuilder()
    .with_actions(...)
    .with_entrypoint(...)
    .with_transitions(...)
    .with_typing(PydanticTypingSystem(ApplicationState))
    .with_state(ApplicationState())
    .build()
)
        

Your application is now typed with that Pydantic model. If you're using an appropriate typing integration in your IDE (e.g. Pylance), it will know that the state of your application is of type ApplicationState.

When you have this you’ll be able to run:

action_ran, result, state = app.run(inputs=...)
state.data # of type ApplicationState -- do what you want with this!        
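
As a small follow-on sketch (not from the release notes, just illustrating the typing), tools and type checkers can now follow the fields on state.data:

print(state.data.prompt)  # typed as Optional[str]
print(len(state.data.chat_history))  # typed as a list of dicts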

For action-level state:

You can also define typed computations at the action level:

from burr.core import action

@action.pydantic(reads=["prompt", "chat_history"], writes=["response"])
def image_response(state: ApplicationState, model: str = "dall-e-2") -> ApplicationState:
    # _get_openai_client() and MODES come from the full example linked above
    client = _get_openai_client()
    result = client.images.generate(
        model=model, prompt=state.prompt,
        size="1024x1024", quality="standard", n=1
    )
    response = result.data[0].url
    state.response = {"content": response,
                      "type": MODES[state.mode],
                      "role": "assistant"}
    return state

Note three interesting choices here:

  1. The state is typed as a Pydantic model
  2. The return type is the same Pydantic model
  3. We mutate the state in place, rather than returning a new state

This is a different action API – it effectively subsets the (global) state on input, gives you that Pydantic object, then subsets the state on output, and merges it back.

Thus if you try to refer to a state variable that you didn’t specify in the reads/writes, it will give an error.

Mutating in place is OK as this produces a new object for each execution run. For now, you will want to be careful about lists/list pointers – we are working on that.
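
To make that concrete, here's a hypothetical sketch (bad_action is illustrative, not from the release): the Pydantic object the action receives only contains the declared reads/writes, so touching anything else should error:

@action.pydantic(reads=["prompt"], writes=["response"])
def bad_action(state: ApplicationState) -> ApplicationState:
    _ = state.chat_history  # not declared in reads/writes -> expect an error here
    state.response = {"content": "...", "type": "text", "role": "assistant"}
    return state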

For more details see documentation here, and the example here.


New Blog


Hamilton & OpenLineage

Title: Hamilton supports OpenLineage

This post follows last week's OpenLineage meetup, where we presented Hamilton and its new integration with OpenLineage. The blog post goes over what OpenLineage is, why you might want to use it, and how Hamilton now emits OpenLineage events. For example, to get more visibility into your Airflow jobs that run Python, consider using Hamilton to organize that code and then use OpenLineage to provide data lineage for those tasks!


Office Hours & Meetup

Hamilton Meet up: The next meet up will be in October; it's currently scheduled for October 15th. We're excited to have Sholto Armstrong talk about their use of Hamilton and the new library they built on top of it at Capitec. Join/sign-up here.

Hamilton Office Hours: They happen most Tuesdays, 9:30am - 10:30am PT. Join our Slack for the link.

Burr Office Hours: They happen most Wednesdays, 9:30am - 10:30am PT. Join our Discord for the weekly link.



