Issue #325 - The ML Engineer

Alejandro Saucedo

AI & Data Executive @ Zalando | Advisor @ UN, EU, ACM, etc | Join 70k+ ML Newsletter

Published: 9 March 2025

Thank you for being part of the 70,000+ ML professionals and enthusiasts who receive weekly articles & tutorials on Machine Learning & MLOps. You can join the newsletter for free at https://ethical.institute/mle.html

If you like the content, please support the newsletter by sharing it with your friends via Email, Twitter, LinkedIn and Facebook!

This week in Machine Learning:

Karpathy's Tips on LLMs
China's New Autonomous AI Agent
Reading Papers for Engineers
MIT Distributed Systems Course
Diffusion Training in Micro-Budget
Open Source ML Frameworks
Awesome AI Guidelines to check out this week
+ more

If you're looking for an interesting career opportunity, I'm hiring for a few roles including Data Science Manager (Forecasting), as well as Principal Product Manager (Forecasting). Check them out and please do share with your network!

Karpathy's Tips on LLMs

Andrej Karpathy has dropped a two-hour walkthrough on how he uses LLMs for day-to-day activities, and it is a masterclass on productivity: Karpathy dives into the features available in today's LLM applications (across Grok/Gemini/ChatGPT), including core prompt engineering tips, nuances between models and really solid insights on some of the add-ons and external applications. It also covers some of the more advanced / experimental features like deep research, internet search and code interpreters, and how the services compare to each other. It is interesting to see this analysis from a pragmatic perspective, as well as to compare some of the workflows that I have in place against the tools available. There is no doubt that there is a lot of potential, but many of these capabilities are still in a nascent state, with a lot more opportunity ahead!

China's New Autonomous AI Agent

Innovation from China keeps surprising us every week, this time with a new autonomous AI agent that can basically control your phone to deliver complex UI-driven tasks: Manus AI is a new multi-agent system that promises a task assistant which completes complex UI tasks through several specialized models with access to asynchronous cloud execution. There are some pretty intuitive live demos of the framework performing automated resume screening, market research, data analysis, and even web deployment. Looking at the growing activity in this space, it is clear that multi-agent services will only continue to blow our minds with ingenious releases this year.

Reading Papers for Engineers

A software engineer's guide to reading research papers: For software and ML practitioners, reading not just ML but also computer science papers can help drive your career to the next level! The approach is to start with a quick skim focusing on the abstract, intro, results, and conclusion to determine whether the paper is relevant. The next step is a deeper dive: flagging unfamiliar terms and capturing key references to build the context needed to really understand what is being presented. Finally, we read it one last time to connect the dots and clarify challenging concepts that may have been abstract on the first read. This is a great primer on research papers, and definitely recommended for any practitioner in the software or ML space!

MIT Distributed Systems Course

MIT has released one of the best university-level courses on Distributed Systems, diving into the tech giants' top large-scale systems as well as foundational concepts that are relevant even for ML systems: This free MIT course on distributed systems dives into the building blocks behind scalable fault-tolerant infrastructures, covering everything from the fundamentals (RPC, threading, etc) to advanced topics like consensus algorithms (Raft), distributed transactions, and optimistic concurrency control. There are often recommendations to read top compsci papers on distributed systems, but this course actually dives into some of them, including the Google File System (GFS), ZooKeeper, Aurora and Spanner, and even explores cache consistency in environments like Facebook’s Memcached. This is probably one of the best free courses on distributed systems out there; definitely recommend checking it out.

Diffusion Training in Micro-Budget

Every day we see ML models growing larger and requiring more compute, but there are also innovations that reduce the cost and compute required; this repo covers how to train a diffusion model on a micro-budget: This project introduces a clever and cost-effective strategy for training large-scale diffusion models using deferred patch masking, which basically starts with low-resolution images where 75% of patches are masked to save compute, then gradually unmasks and fine-tunes at higher resolutions. This approach lets us train a 1.16-billion-parameter sparse transformer on only 37 million real and synthetic images for just $1,890 while still achieving results competitive with the state of the art. This may still sound like a sizeable amount, but it is actually surprisingly efficient, and at this pace we will end up being able to train these types of models on commodity day-to-day hardware like our own personal computers.
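
To give a feel for why masking 75% of patches saves compute, here is a minimal sketch of random patch masking in plain Python. It is only an illustration of the masking ratio and is not taken from the repo: the function names, the 256x256 image size and the 16x16 patch size are all assumptions made for the example.

```python
import random
from typing import List, Optional


def sample_patch_mask(num_patches: int, mask_ratio: float,
                      seed: Optional[int] = None) -> List[bool]:
    """Return one flag per patch; True means the patch is masked (skipped).

    Hypothetical helper for illustration, not the repo's API.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    # Choose which patch positions to drop, without replacement.
    masked_idx = set(rng.sample(range(num_patches), num_masked))
    return [i in masked_idx for i in range(num_patches)]


def keep_visible(patches: list, mask: List[bool]) -> list:
    """Keep only the patches the transformer backbone would process."""
    return [p for p, is_masked in zip(patches, mask) if not is_masked]


# Assumed sizes: a 256x256 image cut into 16x16 patches gives a 16x16 grid.
patches_per_side = 256 // 16
num_patches = patches_per_side ** 2          # 256 patches in total
patch_ids = list(range(num_patches))         # stand-ins for patch embeddings

mask = sample_patch_mask(num_patches, mask_ratio=0.75, seed=0)
visible = keep_visible(patch_ids, mask)

print(f"total patches:   {num_patches}")
print(f"visible patches: {len(visible)}")
```

With these assumed sizes only 64 of the 256 patches remain visible, so the backbone processes a quarter of the tokens per image. Lowering `mask_ratio` over time mimics the gradual unmasking described above, with `mask_ratio=0.0` keeping every patch.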

Upcoming MLOps Events

The MLOps ecosystem continues to grow at breakneck speed, making it ever harder for us as practitioners to stay up to date with relevant developments. A fantastic way to keep on top of relevant resources is through the great community and events that the MLOps and Production ML ecosystem offers. This is why we have started curating a list of upcoming events in the space, which are outlined below.

Upcoming conferences where we're speaking:

WeAreDevelopers 2025 - 9th July @ Berlin

Other upcoming MLOps conferences in 2025:

ODSC East - May 13 @ Boston
Data & AI Summit - 9th June @ San Francisco
Data & AI Summit - 10th June @ USA
AI Summit London - 12th June @ UK
World Summit AI Europe - 08 Oct @ Amsterdam
MLOps World - Oct 8-9 @ Austin

Open Source MLOps Tools

Check out the fast-growing ecosystem of production ML tools & frameworks at the GitHub repository, which has reached over 10,000 GitHub stars. We are currently looking for more libraries to add - if you know of any that are not listed, please let us know or feel free to add a PR. Four featured libraries in the GPU acceleration space are outlined below.

Kompute - Blazing fast, lightweight and mobile phone-enabled GPU compute framework optimized for advanced data processing use cases.
CuPy - An implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it.
JAX - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more.
cuDF - Built on the Apache Arrow columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data.

If you know of any open source and open community events that are not listed do give us a heads up so we can add them!

OSS: Policy & Guidelines

As AI systems become more prevalent in society, we face bigger and tougher societal challenges. We have seen a large number of resources that aim to tackle these challenges in the form of AI Guidelines, Principles, Ethics Frameworks, etc. However, there are so many resources that it is hard to navigate them. Because of this we started an Open Source initiative that aims to map the ecosystem to make it simpler to navigate. You can find multiple principles in the repo - some examples include the following:

MLSecOps Top 10 Vulnerabilities - This is an initiative that aims to further the field of machine learning security by identifying the top 10 most common vulnerabilities in the machine learning lifecycle as well as best practices.
AI & Machine Learning 8 principles for Responsible ML - The Institute for Ethical AI & Machine Learning has put together 8 principles for responsible machine learning that are to be adopted by individuals and delivery teams designing, building and operating machine learning systems.
An Evaluation of Guidelines - The Ethics of Ethics; A research paper that analyses multiple Ethics principles.
ACM's Code of Ethics and Professional Conduct - This is the code of ethics that was put together in 1992 by the Association for Computing Machinery and updated in 2018.

If you know of any guidelines that are not in the "Awesome AI Guidelines" list, please do give us a heads up or feel free to add a pull request!

About us

The Institute for Ethical AI & Machine Learning is a European research centre that carries out world-class research into responsible machine learning.

Check out our website
