Week of May 6th

TL;DR:

  • Hamilton 1.61.0: a new incremental cell-to-module IPython magic, no more dot files by default, and two new examples.
  • Burr 0.14.1: clearer validation for halting conditions, plus notes on a "tool using" agent design.
  • Recording, slides, and code from my lightning session on building a document processing pipeline for RAG systems.
  • Notes from the LLM evals session I helped drive at HeavyBit's DevGuild: AI Summit II.
  • We'll be on Start-Up Row at US PyCon 2024 next week.
Hamilton Release 1.61.0

New IPython Magic

There's a new IPython Jupyter magic (see our blog on building one) that allows you to incrementally create a module over multiple cells. To see it in action, take a look at this notebook. It is very similar to the previous magic in functionality.

%%incr_cell_to_module doc_pipeline -i 1 --display
# module name, identifier to index this cell by, arguments for display        

For example, the screenshot below shows us adding a few functions to an existing "module" and then viewing the graph produced over all of them. Notice that "raw_document" is in the graph, but it is not defined in this cell; it was defined in a prior cell.

Example showing the new magic in action.
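
If you haven't tried it, here is a rough sketch of how two cells might build up the same doc_pipeline module incrementally. The function names and bodies below are illustrative assumptions, not taken from the linked notebook:

%%incr_cell_to_module doc_pipeline -i 1 --display
# cell indexed as 1: the first node of the module
def raw_document(url: str) -> str:
    """Downloads the raw text of a document."""
    import requests
    return requests.get(url).text

and then, in a later cell:

%%incr_cell_to_module doc_pipeline -i 2 --display
# cell indexed as 2: the displayed graph now also includes raw_document from cell 1
def chunked_document(raw_document: str) -> list:
    """Splits the document into paragraph-sized chunks."""
    return [c for c in raw_document.split("\n\n") if c.strip()]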

No more dot files by default

When creating visualizations with Hamilton programmatically, it used to create "dot files" as a byproduct. Now, with 1.61.0, they are optional:

  • When creating static images with Hamilton, dot files are no longer created. If you'd like to keep them, set keep_dot=True (see the sketch after this list).
  • If you were one of the few using the dot files, apologies, but the feedback was overwhelmingly that nobody wanted them by default.
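
As a rough sketch of what that looks like (assuming a hypothetical my_module, and that keep_dot is accepted by the visualization methods such as display_all_functions):

from hamilton import driver

import my_module  # hypothetical module containing your Hamilton functions

dr = driver.Builder().with_modules(my_module).build()

# New default: writes only the rendered image, no intermediate .dot file
dr.display_all_functions("./all_functions.png")

# Opt back in if you relied on the dot files
dr.display_all_functions("./all_functions.png", keep_dot=True)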

New Examples

We added two new examples.

  1. One that showcases an ML integration with the new Hamilton UI. You can watch my video walkthrough, where I use it to explain some features of the new Hamilton UI.
  2. To go with my lightning session, a notebook that walks you through building a simple document processing pipeline to parse Hamilton's Sphinx documentation.


Lightning Session on Building a Document Processing Pipeline for RAG Systems

I had a lot of fun presenting and then fielding questions from the audience at this session hosted by Maven.

What I talked through.

You can watch the recording here (it's free, they just want an email for it).

You can also catch my slides here.

You can find the code we went over here.


Burr Release 0.14.1

A small release this past week. We pushed some extra validation when constructing your application:

  • Burr will now error more clearly if you typo'd a function name when specifying halting conditions (a sketch follows below).
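
As a minimal sketch of where this bites, here is a counter-style application with illustrative action names:

from typing import Tuple

from burr.core import ApplicationBuilder, State, action, default, expr

@action(reads=["count"], writes=["count"])
def counter(state: State) -> Tuple[dict, State]:
    result = {"count": state["count"] + 1}
    return result, state.update(**result)

@action(reads=["count"], writes=[])
def done(state: State) -> Tuple[dict, State]:
    return {"count": state["count"]}, state

app = (
    ApplicationBuilder()
    .with_actions(counter=counter, done=done)
    .with_transitions(
        ("counter", "counter", expr("count < 10")),
        ("counter", "done", default),
    )
    .with_state(count=0)
    .with_entrypoint("counter")
    .build()
)

# "doen" is a typo for the "done" action; this now fails with a clearer error
app.run(halt_after=["doen"])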

Burr Agent Design Notes

Otherwise, an interesting note from conversations with users: it seems like the following "multi-modal agent design" is becoming a good way to model more free-flowing conversational agents:


Multi-modal or "tool using" agent design.

For example, if you have some goals to accomplish, you would track them via "state" that is passed between nodes. Then, on each input cycle from the user, your application would determine the next best action to take and call the right "action" to do so.
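
A minimal sketch of what this shape could look like in Burr follows; the action names, state fields, and decision logic are illustrative assumptions, not a reference implementation:

from typing import Tuple

from burr.core import ApplicationBuilder, State, action, default, expr

@action(reads=[], writes=["chat_history"])
def get_user_input(state: State, user_message: str) -> Tuple[dict, State]:
    # Record the user's message; goals also live in state.
    new_state = state.append(chat_history={"role": "user", "content": user_message})
    return {"message": user_message}, new_state

@action(reads=["chat_history", "goals"], writes=["next_step"])
def decide_next_step(state: State) -> Tuple[dict, State]:
    # In practice an LLM call would pick the next tool based on the goals and history in state.
    step = "answer_question"
    return {"next_step": step}, state.update(next_step=step)

@action(reads=["chat_history"], writes=["chat_history"])
def answer_question(state: State) -> Tuple[dict, State]:
    reply = "..."  # the actual LLM / tool call goes here
    return {"reply": reply}, state.append(chat_history={"role": "assistant", "content": reply})

app = (
    ApplicationBuilder()
    .with_actions(
        get_user_input=get_user_input,
        decide_next_step=decide_next_step,
        answer_question=answer_question,  # the "action layer" widens with more tools
    )
    .with_transitions(
        ("get_user_input", "decide_next_step", default),
        ("decide_next_step", "answer_question", expr("next_step == 'answer_question'")),
        ("answer_question", "get_user_input", default),
    )
    .with_state(chat_history=[], goals=["answer questions about the docs"])
    .with_entrypoint("get_user_input")
    .build()
)

Each user turn would then be one app.run(halt_after=["answer_question"], inputs={"user_message": ...}) cycle, and widening the "action layer" just means adding more tool actions plus expr(...) transitions out of decide_next_step.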

Implications:

  • The control structure for this application is simple: given the right state, it knows what to do next.
  • This "action layer" of nodes could get pretty wide, so it'll be interesting to design ways to better manage and add to it.
  • Actions could, for example, even be dynamically added and removed...

Does this sound interesting to you? If so, reach out to chat, or join Burr's Discord server.



HeavyBit DevGuild: AI Summit II

LLM Evals & SDLC session I helped drive

I helped lead a session on "LLM Application Testing & Evaluation: Process, Tools, SDLC". It was a well-attended session.

Key points:

  • It's still early days for people getting to production, so there are lots of challenges and no consensus yet on the best way to do things.
  • Nobody likes their tools.

Things that are hard and people are dealing with:

  • Curation of evaluation data is hard.
  • Things can take a long time and can be costly, e.g. evaluating agents thoroughly.
  • Choosing the right metrics can be confusing. What do you focus on: model metrics or business metrics?
  • No consensus on tooling. Lots of vibe checks, etc.

Things to try if you're looking at LLM Evaluations:

  • Use domain experts, e.g. PMs, for data labeling.
  • Supplement with synthetic data / open-source data sets if you don't have enough data.
  • Caveat: your testing & evaluation approach will likely differ if you're using foundation model APIs versus hosting and fine-tuning your own models.


US PyCon 2024

Just a reminder that we'll be part of Start-Up Row at PyCon next week. We're excited to be in good company with dltHub, Exaloop, Pixee, and Martian, to name a few.

Start-ups at Start-Up Row

Please drop by our booth. We'll have stickers and would love to give you a demo.
