The Elmer Project, New Shiny Release for Python, Mastering NLP from Foundations to LLMs
Rami Krispin
Senior Manager - Data Science and Engineering at Apple | Docker Captain | LinkedIn Learning Instructor
This week's agenda:
I am also on Bluesky.
Open Source of the Week
Here are a couple of interesting projects I came across this week.
The Elmer Project
Elmer is a new R library that provides a user-friendly wrapper over core LLM frameworks. It supports LLM frameworks such as OpenAI ChatGPT, Anthropic Claude, Snowflake Cortex, Google Gemini, etc. This project is part of the Tidyverse framework and is currently at an early stage. It supports core LLM functionality such as streaming and async APIs, and text summarization.
This project is a collaboration between Hadley Wickham and Joe Cheng. You can find more details about the project in Hadley's post and in the project documentation:
Shiny 1.2 for Python
Another announcement this week from Posit PBC was the release of Shiny version 1.2 for Python. The main feature of this release is the integration of the Python narwhals library, which provides a unifying layer for working with different data frame objects (more details below). The following short video provides a more detailed explanation.
More details are available in the library release notes.
The Narwhals Library
Thanks to the above Shiny release, I learned about the narwhals Python library. This library provides a lightweight compatibility layer between different Python dataframe libraries such as Pandas, Polars, Arrow, Dask, etc.
You can find more details in the project documentation:
New Learning Resources
Here are some new learning resources that I came across this week.
GitHub Universe 2024
For those who missed GitHub Universe 2024, the annual GitHub developer conference, all the talks are now available online. This year, the conference was heavily focused on LLM and AI applications. Thanks to the GitHub team for making the talks available online.
PyData Amsterdam 2024
All the talks from the recent PyData Amsterdam 2024 conference are now available to watch online. This includes great talks about machine learning, data engineering, data visualization, AI, etc. Thanks to the PyData Amsterdam team for making the talks available online.
One talk I have watched so far, and that I highly recommend if you work on time series forecasting and Bayesian statistics, is Dr. Juan Camilo Orduz's talk, Time Series forecasting with NumPyro:
You can learn more about the NumPyro Python library in edition 8 of this newsletter.
Book of the Week
This week's spotlight is on Mastering NLP from Foundations to LLMs by Lior Gazit and Meysam Ghaffari, Ph.D. The book focuses, as the name implies, on the foundations of NLP and LLM modeling, and it covers the following topics:
The book is for people who are interested in starting with NLP and those who wish to explore LLM applications. The book is available to purchase on the publisher's website and Amazon:
Have any questions? Please comment below!
See you next Tuesday!
Thanks,
Rami