Open-Source and Science in the Era of Foundation Models

Welcome to the summary of the tenth lecture of the LLM Agents course conducted by the University of California, Berkeley. Refer to this link for summaries of the previous lectures.

While the capabilities of LLMs have grown rapidly over the years, model accessibility has declined just as sharply, leading to a reduced understanding of how these models work. Model accessibility falls into three levels: 1) API access (e.g., GPT-4): a black-box view of the model through prompts and responses; 2) open-weight (e.g., Llama): the model weights are widely available, making it possible to study the model's internal mechanisms; and 3) open-source (e.g., StarCoder): users can use, study, modify, and share the model without permission. Chaining such API calls together yields agents that augment the capabilities of LLMs through reasoning and action (a minimal sketch follows below). There are two types of agents: problem-solving and simulation.
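To make the API-only setting concrete, here is a minimal, hypothetical sketch of an agent loop built purely on prompt-in/text-out access. The `call_llm` stub, the tool name, and the action format are illustrative assumptions, not any specific vendor API or agent framework.

```python
# Minimal sketch of an API-based agent loop (ReAct-style): the LLM is a
# black box reached only through prompt -> response, and the agent layer
# adds reasoning and action around it.

def call_llm(prompt: str) -> str:
    # Placeholder for a black-box API call (prompt in, text out).
    # Canned behavior so the sketch runs end to end: act once, then answer.
    if "Observation:" in prompt:
        return "Final answer: 9.8 is larger than 9.11."
    return "Action: calculator[9.8 - 9.11]"

TOOLS = {
    # Restricted eval used only as a toy arithmetic tool for this sketch.
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def run_agent(task: str, max_steps: int = 3) -> str:
    history = f"Task: {task}"
    for _ in range(max_steps):
        reply = call_llm(history)
        if reply.startswith("Action:"):
            # Parse "Action: tool[argument]" and execute the named tool.
            name, arg = reply[len("Action: "):].rstrip("]").split("[", 1)
            observation = TOOLS[name](arg)
            history += f"\n{reply}\nObservation: {observation}"
        else:
            return reply  # the model produced a final answer
    return history

if __name__ == "__main__":
    print(run_agent("Which is larger, 9.8 or 9.11?"))
```

Everything the agent "knows" about the model passes through that single text interface, which is exactly the black-box constraint the rest of the lecture contrasts with open-weight and open-source access.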

Given an ML problem, the objective of a problem-solving agent is to build the best possible model by writing code, editing model parameters, executing code, and analyzing the results. MLAgentBench is a suite of 13 tasks, ranging from improving model performance on CIFAR-10 to recent research problems such as BabyLM. It evaluates how well agents perform ML experimentation, i.e., iteratively adjusting a model and its training setup to improve performance (a toy sketch of this loop follows below). The results show that the success rate of the highly capable Claude v3 Opus model varies from 100% on older, well-studied datasets to 0% on datasets created after the model's training cutoff date. This approach points toward self-improvement, where agents learn to solve tasks better over time by improving the underlying model.
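As an illustration of the iterative experimentation loop described above, the following toy sketch keeps a training configuration, proposes an edit, re-runs the experiment, and retains the change only if the measured score improves. The `propose_edit` and `train_and_evaluate` helpers are hypothetical stand-ins for the agent's code-editing and code-execution actions; this is not the MLAgentBench implementation.

```python
import random

def propose_edit(config: dict) -> dict:
    # Stand-in for the agent asking an LLM to suggest a modification.
    new = dict(config)
    new["lr"] = round(config["lr"] * random.choice([0.5, 2.0]), 6)
    return new

def train_and_evaluate(config: dict) -> float:
    # Stand-in for executing the training script and reading the metric.
    # Here: a toy score that peaks near lr = 0.01.
    return 1.0 - abs(config["lr"] - 0.01) * 10

def experiment_loop(steps: int = 5) -> tuple[dict, float]:
    best_config = {"lr": 0.1}
    best_score = train_and_evaluate(best_config)
    for _ in range(steps):
        candidate = propose_edit(best_config)
        score = train_and_evaluate(candidate)
        if score > best_score:          # analyze results, keep improvements
            best_config, best_score = candidate, score
    return best_config, best_score

if __name__ == "__main__":
    print(experiment_loop())
```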

Generative agents are computational software agents that simulate believable human behavior. In this work, the agent architecture extends an LLM to store a complete record of the agent's experiences in natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically based on recency, importance, and relevance in order to plan behavior. The architecture's components of observation, planning, and reflection enable generative agents to produce believable individual and emergent social behaviors (a sketch of the retrieval scoring appears below).
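A sketch of the memory-retrieval idea behind this architecture: each stored memory is scored by a combination of recency, importance, and relevance to the current situation, and the top-scoring memories are surfaced for planning. The decay rate, equal weighting, and toy embeddings below are illustrative assumptions, not the exact values from the generative-agents paper.

```python
import math
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    created_at: float    # timestep when the observation was stored
    importance: float    # e.g. an LLM-assigned poignancy score in [0, 1]
    embedding: list[float]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def retrieve(memories: list[Memory], query_emb: list[float],
             now: float, k: int = 3) -> list[Memory]:
    def score(m: Memory) -> float:
        recency = 0.99 ** (now - m.created_at)       # exponential decay
        relevance = cosine(m.embedding, query_emb)   # similarity to query
        return recency + m.importance + relevance    # equal weights here
    return sorted(memories, key=score, reverse=True)[:k]

if __name__ == "__main__":
    memories = [
        Memory("had coffee with Maria", 1.0, 0.3, [1.0, 0.0]),
        Memory("planning a birthday party", 5.0, 0.8, [0.0, 1.0]),
    ]
    print([m.text for m in retrieve(memories, query_emb=[0.0, 1.0], now=6.0, k=1)])
```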

The key challenges with API-based LLM agents are reproducibility, interpretability, and limited long-term planning capabilities. Addressing them requires deeper access to the model. When LLMs behave unexpectedly (e.g., many LLMs incorrectly stating that 9.8 < 9.11), it is important to inspect the internal representations behind that behavior. To this end, Transluce developed Monitor, an observability interface designed to help humans observe, understand, and steer the internal computations of LLMs. In a separate line of work, significant compute savings and strong MMLU scores were demonstrated by pruning the Nemotron-4 15B model into Minitron (8B) through distillation-based retraining with less than 3% of the original training data (a sketch of such a distillation step appears below). Both of these findings are possible only because the model weights are widely available.

With open-source models, we additionally get full access to the training data and to the source code used for data processing, model training, and validation. This makes it possible to optimize the data mixture during training, allowing a model to reach baseline accuracy with 2.6x fewer training steps. Current research efforts span building small-scale models with the potential to scale well and harnessing distributed, heterogeneous, low-bandwidth compute environments for training models, to address the compute problem.
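As a rough illustration of distillation-based retraining, the sketch below trains a smaller (pruned) student to match the output distribution of a larger teacher while still fitting the ground-truth labels. The temperature, loss weighting, and toy tensor shapes are assumptions made for illustration; this is not the actual Minitron recipe.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-scaled teacher and
    # student distributions (scaled back by temperature**2).
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard targets: ordinary cross-entropy against the data labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy shapes: a batch of 4 positions over a 10-token vocabulary.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```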
