Week of May 6th

TL;DR:

  • Hamilton 1.61.0: a new incremental cell-to-module IPython magic, no more dot files by default, and two new examples.
  • Burr 0.14.1: clearer validation for halting conditions, plus notes on a "tool using" agent design.
  • Recording, slides, and code from my lightning session on building a document processing pipeline for RAG systems.
  • Notes from the LLM evals session I helped drive at HeavyBit's DevGuild: AI Summit II.
  • We'll be on Start-Up Row at US PyCon 2024 next week.
Hamilton Release 1.61.0

New IPython Magic

There's a new IPython Jupyter magic (see our blog on building one) that allows you to incrementally create a module over multiple cells. To see it in action, take a look at this notebook. It is very similar to the previous magic in functionality.

%%incr_cell_to_module doc_pipeline -i 1 --display
# module name, identifier to index this cell by, arguments for display        

For example, the screenshot below shows us adding a few functions to an existing "module" and then viewing the graph produced over all of them. Notice that "raw_document" is in the graph, but it is not defined in this cell; it was defined in a prior cell.

Example showing the new magic in action.
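
If you haven't tried it, here is a rough sketch of how two cells might build up the same doc_pipeline module incrementally. The function names and bodies below are illustrative assumptions, not taken from the linked notebook:

%%incr_cell_to_module doc_pipeline -i 1 --display
# cell indexed as 1: the first node of the module
def raw_document(url: str) -> str:
    """Downloads the raw text of a document."""
    import requests
    return requests.get(url).text

and then, in a later cell:

%%incr_cell_to_module doc_pipeline -i 2 --display
# cell indexed as 2: the displayed graph now also includes raw_document from cell 1
def chunked_document(raw_document: str) -> list:
    """Splits the document into paragraph-sized chunks."""
    return [c for c in raw_document.split("\n\n") if c.strip()]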

No more dot files by default

When creating visualizations with Hamilton programmatically, it used to create "dot files" as a byproduct. Now, with 1.61.0, they are optional:

  • When creating static images with Hamilton, dot files are no longer created. If you'd like to keep them, set keep_dot=True (see the sketch after this list).
  • If you were one of the few using the dot files, apologies, but the feedback was overwhelmingly that nobody wanted them by default.
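
As a rough sketch of what that looks like (assuming a hypothetical my_module, and that keep_dot is accepted by the visualization methods such as display_all_functions):

from hamilton import driver

import my_module  # hypothetical module containing your Hamilton functions

dr = driver.Builder().with_modules(my_module).build()

# New default: writes only the rendered image, no intermediate .dot file
dr.display_all_functions("./all_functions.png")

# Opt back in if you relied on the dot files
dr.display_all_functions("./all_functions.png", keep_dot=True)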

New Examples

We added two new examples.

  1. One that showcases an ML integration with the new Hamilton UI. You can watch my video walkthrough, where I use it to explain some features of the new Hamilton UI.
  2. To go with my lightning session, a notebook that walks you through building a simple document processing pipeline to parse Hamilton's Sphinx documentation.


Lightning Session on Building a Document Processing Pipeline for RAG Systems

I had a lot of fun presenting and then fielding questions from the audience at this session hosted by Maven.

What I talked through.

You can watch the recording here (it's free, they just want an email for it).

You can also catch my slides here.

You can find the code we went over here.


Burr Release 0.14.1

A small release this past week. We pushed some extra validation when constructing your application:

  • Burr will now error more clearly if you typo'd a function name when specifying halting conditions (a sketch follows below).
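
As a minimal sketch of where this bites, here is a counter-style application with illustrative action names:

from typing import Tuple

from burr.core import ApplicationBuilder, State, action, default, expr

@action(reads=["count"], writes=["count"])
def counter(state: State) -> Tuple[dict, State]:
    result = {"count": state["count"] + 1}
    return result, state.update(**result)

@action(reads=["count"], writes=[])
def done(state: State) -> Tuple[dict, State]:
    return {"count": state["count"]}, state

app = (
    ApplicationBuilder()
    .with_actions(counter=counter, done=done)
    .with_transitions(
        ("counter", "counter", expr("count < 10")),
        ("counter", "done", default),
    )
    .with_state(count=0)
    .with_entrypoint("counter")
    .build()
)

# "doen" is a typo for the "done" action; this now fails with a clearer error
app.run(halt_after=["doen"])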

Burr Agent Design Notes

Otherwise, an interesting note from conversations with users: it seems like the following "multi-modal agent design" is becoming a good way to model more free-flowing conversational agents:


Multi-modal or "tool using" agent design.

For example, if you have some goals to accomplish, you would track them via "state" that is passed between nodes. Then, on each input cycle from the user, your application would determine the next best action to take and call the right "action" to do so.
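
A minimal sketch of what this shape could look like in Burr follows; the action names, state fields, and decision logic are illustrative assumptions, not a reference implementation:

from typing import Tuple

from burr.core import ApplicationBuilder, State, action, default, expr

@action(reads=[], writes=["chat_history"])
def get_user_input(state: State, user_message: str) -> Tuple[dict, State]:
    # Record the user's message; goals also live in state.
    new_state = state.append(chat_history={"role": "user", "content": user_message})
    return {"message": user_message}, new_state

@action(reads=["chat_history", "goals"], writes=["next_step"])
def decide_next_step(state: State) -> Tuple[dict, State]:
    # In practice an LLM call would pick the next tool based on the goals and history in state.
    step = "answer_question"
    return {"next_step": step}, state.update(next_step=step)

@action(reads=["chat_history"], writes=["chat_history"])
def answer_question(state: State) -> Tuple[dict, State]:
    reply = "..."  # the actual LLM / tool call goes here
    return {"reply": reply}, state.append(chat_history={"role": "assistant", "content": reply})

app = (
    ApplicationBuilder()
    .with_actions(
        get_user_input=get_user_input,
        decide_next_step=decide_next_step,
        answer_question=answer_question,  # the "action layer" widens with more tools
    )
    .with_transitions(
        ("get_user_input", "decide_next_step", default),
        ("decide_next_step", "answer_question", expr("next_step == 'answer_question'")),
        ("answer_question", "get_user_input", default),
    )
    .with_state(chat_history=[], goals=["answer questions about the docs"])
    .with_entrypoint("get_user_input")
    .build()
)

Each user turn would then be one app.run(halt_after=["answer_question"], inputs={"user_message": ...}) cycle, and widening the "action layer" just means adding more tool actions plus expr(...) transitions out of decide_next_step.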

Implications:

  • The control structure for this application is simple: given the right state, it knows what to do next.
  • This "action layer" of nodes could get pretty wide, so it'll be interesting to design ways to better manage and add to it.
  • Actions could, for example, even be dynamically added and removed...

Does this sound interesting to you? If so, reach out to chat, or join Burr's Discord server.



HeavyBit DevGuild: AI Summit II

LLM Evals & SDLC session I helped drive

I helped lead a session on "LLM Application Testing & Evaluation: Process, Tools, SDLC". It was a well-attended session.

Key points:

  • It's still early days for people getting to production, so there are lots of challenges and no consensus yet on the best way to do things.
  • Nobody likes their tools.

Things that are hard and people are dealing with:

  • Curation of evaluation data is hard.
  • Things can take a long time and can be costly, e.g. evaluating agents thoroughly.
  • Choosing the right metrics can be confusing. What do you focus on: model metrics or business metrics?
  • No consensus on tooling. Lots of vibe checks, etc.

Things to try if you're looking at LLM Evaluations:

  • Use domain experts, e.g. PMs, for data labeling.
  • Supplement with synthetic data / open-source data sets if you don't have enough data.
  • Caveat: your testing & evaluation approach will likely differ if you're using foundation model APIs versus hosting and fine-tuning your own models.


US PyCon 2024

Just a reminder that we'll be part of Start-Up Row at PyCon next week. We're excited to be in good company with dltHub, Exaloop, Pixee, and Martian, to name a few.

Start-ups at Start-Up Row

Please drop by our booth. We'll have stickers and would love to give you a demo.
