LAI #67: Reinforcement, Reasoning & the Next Wave of Image Generation Models
Towards AI
Making AI accessible to all with our courses, blogs, tutorials, books & community.
RFT, agentic reasoning, diffusion models, and the rise of Manus AI.
Good morning, AI enthusiasts! This week, we’re exploring Reinforcement Fine-Tuning. We’ll also get hands-on with agentic reasoning, building chatbots that don’t just respond but think and act, including a ReAct agent that blends reasoning with decision-making. Plus, we’re diving into diffusion models, Gaussian Mixture Models, and the latest AI sensation, Manus AI, which is making waves in the community.
Enjoy the read!
What’s AI Weekly
This week in What’s AI, I will break down Reinforcement Fine-Tuning (RFT), a technique that lets LLMs figure things out, get feedback, and improve over time. RFT changes how we customize AI models. Instead of retraining a model by feeding it examples of what we want and hoping it learns, as in classical supervised fine-tuning, we teach it by rewarding correct answers and penalizing wrong ones, much like training a dog, but with fewer treats and more math. Read the complete article here or watch the video on YouTube.
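The reward-and-penalize idea above can be illustrated with a deliberately tiny sketch. This is not how RFT updates an actual LLM (real RFT adjusts model weights with a policy-gradient-style objective against a grader); it is a toy "policy" over candidate answers whose preferences shift toward whatever a verifiable grader rewards. All names here are illustrative.

```python
import random

def grade(answer: str) -> float:
    """A verifiable grader: reward 1.0 for the right answer, 0.0 otherwise."""
    return 1.0 if answer == "4" else 0.0

candidates = ["3", "4", "5"]          # possible answers to "2 + 2 = ?"
prefs = {a: 0.0 for a in candidates}  # toy "policy": a preference per answer
lr = 0.5                              # learning rate for preference updates

random.seed(0)
for _ in range(200):
    # Sample an answer: mostly exploit the current best, sometimes explore.
    if random.random() < 0.2:
        answer = random.choice(candidates)
    else:
        answer = max(prefs, key=prefs.get)
    reward = grade(answer)
    # Move the sampled answer's preference toward the observed reward.
    prefs[answer] += lr * (reward - prefs[answer])

best = max(prefs, key=prefs.get)
print(best)
```

After a few rewarded samples, the correct answer dominates the preferences, which is the core feedback loop RFT scales up to language models.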
— Louis-François Bouchard, Towards AI Co-founder & Head of Community
Learn AI Together Community section!
AI poll of the week!
How much more development effort does building production-level RAG require compared to using frameworks like LangChain and LlamaIndex? And is it scalable in practice? Tell us in the thread!
Collaboration Opportunities
The Learn AI Together Discord community is flooded with collaboration opportunities. If you are excited to dive into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section, too; we share cool opportunities every week!
1. Jiraiya9027 is learning the mathematical and coding concepts behind GenAI models like GANs and diffusion models. If you would like to learn together, reach out in the thread!
2. Gere030199 is working on a project and is looking for collaborators. Contact him to learn more!
3. Kpr27 is looking for a group of beginner to intermediate AI developers to learn together and share resources. If this sounds relevant, contact him in the thread or add him to your group!
Meme of the week!
Meme shared by bin4ry_d3struct0r
TAI Curated Section
Article of the week
RAG Intelligent Upgraded: Agentic RAR + Nano-GraphRAG + Claude 3.7 Sonnet (Oxford Univ) By Gao Dalie (高達烈)
This article explores building an agentic reasoning chatbot using LangGraph, Agentic RAR, Nano-GraphRAG, and Claude 3.7 Sonnet. It addresses the limitations of Large Language Models (LLMs) in sequential reasoning and introduces Agentic RAR as a solution that enhances the familiar retrieval-augmented generation (RAG) model with a dynamic, reasoning-first approach. The author details the structure of the agent, explaining how it uses specialized agents to perform web searches, execute code, and create mind maps. It also discusses the challenges faced during development, emphasizing the importance of state management and knowledge integration, before presenting the solutions that were implemented. The process of setting up the environment, loading the dataset, initializing the model and trainer, and fine-tuning using Supervised Fine-Tuning (SFT) with LoRA is explained in detail. Furthermore, it describes the functions of Nano-GraphRAG and Agentic RAR, illustrating how the integration of these technologies can lead to the development of more sophisticated and efficient AI systems.
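The "specialized agents" idea can be sketched in plain Python, stripped of the LangGraph machinery: a coordinator inspects the task and dispatches it to a specialist. Every name below (the agents, the keyword router) is a hypothetical simplification for illustration; the article wires real LLM-backed agents together with LangGraph.

```python
# Illustrative sketch of routing a task among specialized agents.
# The agents here return placeholder strings; in the article they call
# an LLM, a search tool, a code sandbox, and a mind-map builder.

def web_search_agent(task: str) -> str:
    return f"[search results for: {task}]"

def code_exec_agent(task: str) -> str:
    return f"[executed code for: {task}]"

def mind_map_agent(task: str) -> str:
    return f"[mind map for: {task}]"

ROUTES = {
    "search": web_search_agent,
    "code": code_exec_agent,
    "map": mind_map_agent,
}

def route(task: str) -> str:
    """Pick a specialist by keyword; fall back to web search."""
    for keyword, agent in ROUTES.items():
        if keyword in task.lower():
            return agent(task)
    return web_search_agent(task)

print(route("code: compute fibonacci(10)"))
```

In the real system, the routing decision is itself made by a reasoning model rather than keyword matching, which is what makes the approach "reasoning-first."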
Our must-read articles
This blog presents a method for improving diffusion models using Autoregressive Priors (ARPs). Instead of starting with pure Gaussian noise, the proposed approach initializes the diffusion process with structured priors generated by autoregressive models, enhancing both the speed and quality of generated samples. It details two methods for integrating AR priors: AR Consistency Loss Diffusion, which uses an augmented loss function, and AR Prior Blending Diffusion, which blends the AR prior with Gaussian noise during inference. The methods are tested on the MNIST dataset, where AR Prior Blending Diffusion demonstrates superior performance in loss reduction and image sharpness. The findings suggest that ARPs can make models more data-aware and efficient, with potential applications in finance and healthcare.
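The blending step at the heart of AR Prior Blending Diffusion is simple to sketch: instead of initializing the reverse process from pure Gaussian noise, mix in a structured prior. The shapes, the mixing weight, and the stand-in `ar_prior` below are illustrative; in the article, the prior comes from a trained autoregressive model.

```python
import numpy as np

rng = np.random.default_rng(0)

def blended_init(ar_prior: np.ndarray, alpha: float) -> np.ndarray:
    """Blend a structured AR prior with Gaussian noise.

    alpha is the prior's weight: alpha=0 recovers the standard
    pure-noise initialization, alpha=1 starts from the prior alone.
    """
    noise = rng.standard_normal(ar_prior.shape)
    return alpha * ar_prior + (1.0 - alpha) * noise

# Stand-in for an AR model's output, shaped like an MNIST image.
ar_prior = rng.standard_normal((28, 28))
x_init = blended_init(ar_prior, alpha=0.7)
```

The diffusion model then denoises `x_init` as usual; the claim tested in the blog is that starting closer to the data manifold speeds up sampling and sharpens results.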
While deep learning gets a lot of attention, GMMs offer a valuable statistical approach, especially when data doesn't fit into clear, separated groups. The piece explores the Gaussian Mixture Model (GMM), positioning it as a strong tool for clustering and anomaly detection, sometimes superior to K-means. It explains GMM's components—means, covariance matrices, and mixing coefficients—and the Expectation-Maximization (EM) algorithm used to estimate these parameters. It then moves to practical applications, illustrating GMM's use in Python with scikit-learn and compares different covariance types to optimize performance. It also addresses how to determine the ideal number of clusters and introduces GMM variants like Bayesian, Robust, and Online GMM.
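Since the piece demonstrates GMMs with scikit-learn, here is a minimal sketch of that workflow on synthetic data: fit a two-component mixture and use BIC as one criterion for choosing the number of components. The data and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Two well-separated synthetic clusters in 2D.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=-3.0, scale=0.5, size=(100, 2)),  # cluster 1
    rng.normal(loc=+3.0, scale=0.5, size=(100, 2)),  # cluster 2
])

# covariance_type is one of the knobs the article compares
# ("full", "tied", "diag", "spherical").
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = gmm.fit_predict(X)

# Lower BIC across candidate n_components suggests a better model.
print("BIC:", gmm.bic(X))
```

Unlike K-means, the fitted model also gives soft assignments via `gmm.predict_proba(X)`, which is what makes GMMs useful for anomaly detection.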
3. Manus AI + Ollama: Build & Scrape ANYTHING (First-Ever General AI Agent) = OpenManus By Gao Dalie (高達烈)
This article introduces Manus, a general AI agent designed to independently think, plan, and execute tasks, delivering tangible results, unlike traditional AI assistants. The author highlights its superior performance on the GAIA benchmark and its ability to tackle real-world problems. It also explores OpenManus, an open-source alternative built upon MetaGPT. OpenManus features a modular agent system with distinct roles, like project managers and planning agents, integrating top models such as Claude 3.5 and Qwen VL Plus. It emphasizes real-time feedback and a powerful toolchain for tasks like code execution and web browsing. The author then offers a guide to local configuration using Conda and Ollama. Finally, it discusses how Manus differentiates itself from DeepSeek and ChatGPT, highlighting its action-oriented capabilities.
This article details the creation of a simple ReAct agent using Python and LangGraph. It is part of a series designed to take readers from basic to advanced LLM agent building. The author introduces the ReAct framework, which integrates reasoning with action-taking for more sophisticated AI. The article defines an agent class with a system prompt and action functions (like fetching weather or converting temperatures). It also provides a step-by-step guide to testing the agent with examples and automating its calls.
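The reason-act-observe cycle of a ReAct agent can be sketched without an LLM: tools are registered by name, and an "Action:" line (which the model would normally emit) is parsed and dispatched. The tool names and the `Action: name: arg` line format below are illustrative stand-ins, not the article's exact implementation.

```python
# Toy ReAct-style dispatch (illustrative; the article drives this loop
# with an LLM via LangGraph, which decides which Action line to emit).

def get_weather(city: str) -> str:
    return "20C in " + city  # hypothetical canned tool

def c_to_f(celsius: str) -> str:
    return str(float(celsius) * 9 / 5 + 32) + "F"

ACTIONS = {"get_weather": get_weather, "c_to_f": c_to_f}

def run_action(line: str) -> str:
    """Parse 'Action: name: arg' and dispatch to the matching tool."""
    _, name, arg = [part.strip() for part in line.split(":", 2)]
    return ACTIONS[name](arg)

# One action-observation step of the loop:
observation = run_action("Action: c_to_f: 20")
print("Observation:", observation)
```

In the full agent, this observation is fed back to the model, which reasons about it and either emits another action or a final answer.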
5. A Close Look at Image Generation Using p-PCA and Variational Autoencoders By Shivam Dattatray Shinde
This article explores techniques for image generation, contrasting autoencoders with latent variable models like Probabilistic Principal Component Analysis (p-PCA) and Variational Autoencoders (VAEs). It explains how standard autoencoders, primarily used for reconstruction, fall short in image generation due to an unknown latent space distribution. p-PCA, a linear model, uses a Gaussian distribution in the latent space but produces blurry images. VAEs, a non-linear extension, enforce an approximate normal distribution, generating sharper images. The author details the reparameterization trick used in VAEs and discusses the β hyperparameter's role in preventing posterior collapse. Finally, it visually demonstrates image transitions by interpolating between latent vectors, showcasing the potential for controlled image manipulation.
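The reparameterization trick the article details fits in a few lines: instead of sampling `z` directly from `N(mu, sigma^2)` (which blocks gradients), sample `eps ~ N(0, I)` and compute `z = mu + sigma * eps`, so `mu` and `log_var` stay differentiable. This NumPy sketch shows only the sampling step, with illustrative shapes.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu: np.ndarray, log_var: np.ndarray) -> np.ndarray:
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    sigma = exp(0.5 * log_var); the randomness lives entirely in eps,
    so gradients can flow through mu and log_var during training.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Illustrative 4-dimensional latent: mu = 0, log_var = 0 (sigma = 1).
mu = np.zeros(4)
log_var = np.zeros(4)
z = reparameterize(mu, log_var)
```

In a real VAE, the encoder network outputs `mu` and `log_var` per input, and this `z` is fed to the decoder.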
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.