All You Need to Know About Small Language Models

In this issue:

  1. A survey on SLMs
  2. A way towards more brain-like inference
  3. How to better count the r’s in strawberry


MLOps/GenAI World is all about solving real-world problems and sharing genuine experiences with production-grade AI systems.

Join leaders and engineers from Microsoft, Hugging Face, BlackRock, and many more for the following tracks:

  • Real World Case Studies
  • Business & Strategy
  • Technical & Research (levels 1-7)
  • Workshops (levels 1-7)
  • In-person coding sessions

Get Access to 30+ virtual workshops, 60+ in-person talks and 90+ hours of recordings by claiming your personal discount.

Last Chance to Save $75 USD


1. A Survey of Small Language Models

Watching: “Small” Language Models (paper)

What problem does it solve? While Large Language Models (LLMs) have been dominating the headlines, Small Language Models (SLMs) are becoming increasingly important. SLMs are designed to deliver strong performance while requiring minimal computational resources, which makes them well suited for resource-constrained settings such as on-device, mobile, and edge deployments. As demand for language models in these environments grows, a comprehensive understanding of SLMs becomes crucial.

How does it solve the problem? The survey presents a novel taxonomy for categorizing the methods used to optimize SLMs. It covers a range of techniques, including model compression, pruning, and quantization: compression aims to reduce a model's size while maintaining its performance; pruning removes less important weights or connections, reducing the model's complexity; and quantization lowers the numerical precision of the model's parameters, yielding smaller models and faster inference. By systematically organizing these methods, the survey provides a clear overview of the approaches used to build efficient SLMs.
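To make pruning and quantization a bit more tangible, here is a minimal, self-contained sketch (my own illustration, not code from the survey) that applies magnitude-based pruning and dynamic int8 quantization to a toy PyTorch model; the layer sizes and the 30% sparsity level are arbitrary choices.

    # Toy illustration of two SLM compression levers: pruning and quantization.
    # Not from the survey; layer sizes and sparsity level are arbitrary.
    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    # A stand-in for one feed-forward block of a small language model.
    model = nn.Sequential(
        nn.Linear(512, 2048),
        nn.ReLU(),
        nn.Linear(2048, 512),
    )

    # Pruning: zero out the 30% of weights with the smallest magnitude in each Linear layer.
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=0.3)
            prune.remove(module, "weight")  # bake the sparsity into the weight tensor

    # Dynamic quantization: store Linear weights in int8 and dequantize on the fly at inference.
    quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

    x = torch.randn(1, 512)
    print(quantized(x).shape)  # torch.Size([1, 512])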

What's next? Despite these advances, several open challenges remain. They include further improving the efficiency-performance trade-off, developing more effective compression techniques, and ensuring that SLMs stay robust and generalize across tasks and domains. Standardized benchmark datasets and evaluation metrics tailored to SLMs are also needed to enable fair comparisons and track progress in the field.


2. A prescriptive theory for brain-like inference

Watching: Brain-like inference (paper)

What problem does it solve? The Evidence Lower Bound (ELBO) is a widely used objective function for training deep generative models like Variational Autoencoders (VAEs). While ELBO maximization has been useful in interpreting generative models, including diffusion models, it is often considered too broad to provide specific guidance for designing architectures in neuroscience or machine learning. This work aims to bridge the gap between ELBO maximization and prescriptive theories for NeuroAI.
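For orientation, the ELBO in question is the standard variational lower bound on the log-evidence, with encoder q_phi and decoder p_theta (written here in LaTeX notation); the first term rewards reconstruction, the second regularizes the approximate posterior toward the prior:

    \log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] \;-\; D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right) \;=\; \mathrm{ELBO}(\theta, \phi; x)

The paper's question is, roughly, what architecture falls out of maximizing this bound when the distributions involved are Poisson rather than Gaussian.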

How does it solve the problem? The authors show that maximizing the ELBO under Poisson assumptions for general sequence data leads to a spiking neural network called the iterative Poisson VAE (iP-VAE). This model performs Bayesian posterior inference through its membrane potential dynamics, establishing a closer connection to biological neurons than previous brain-inspired predictive coding models, which rest on Gaussian assumptions. The iP-VAE learns sparser representations and generalizes better to out-of-distribution samples than both amortized and iterative VAEs.
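To give a rough feel for what an ELBO with Poisson assumptions looks like in code, here is a minimal PyTorch sketch under my own simplifying assumptions. It is not the paper's iP-VAE (in particular it ignores the iterative, membrane-potential-style updates): just a one-step VAE whose reconstruction term is a Poisson negative log-likelihood and whose KL term compares latent Poisson rates against a fixed prior rate.

    # Hypothetical sketch of a Poisson-ELBO objective; NOT the iP-VAE from the paper.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyPoissonVAE(nn.Module):
        def __init__(self, x_dim=784, z_dim=64, prior_rate=1.0):
            super().__init__()
            self.encoder = nn.Linear(x_dim, z_dim)
            self.decoder = nn.Linear(z_dim, x_dim)
            self.prior_rate = prior_rate

        def forward(self, x):
            z_rate = F.softplus(self.encoder(x))   # non-negative latent firing rates
            z = torch.poisson(z_rate)              # integer spike counts (sampling is
                                                   # non-differentiable here; real models
                                                   # need surrogate gradients)
            x_rate = F.softplus(self.decoder(z))   # reconstructed Poisson rates
            # Negative ELBO = Poisson reconstruction NLL + closed-form KL between
            # Poisson(z_rate) and Poisson(prior_rate).
            recon = F.poisson_nll_loss(x_rate, x, log_input=False, reduction="sum")
            prior = torch.full_like(z_rate, self.prior_rate)
            kl = (z_rate * torch.log(z_rate / prior) - z_rate + prior).sum()
            return recon + kl

    model = TinyPoissonVAE()
    x = torch.poisson(torch.rand(8, 784))   # toy non-negative count data
    neg_elbo = model(x)                     # minimizing this maximizes the ELBO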

What's next? The findings suggest that optimizing ELBO with Poisson assumptions provides a solid foundation for developing prescriptive theories in NeuroAI. This approach could lead to more biologically plausible models that better capture the dynamics of real neurons while maintaining the benefits of deep generative models. Additionally, the insights gained from this work could inspire new architectures and training strategies in both neuroscience and machine learning.


3. Counting Ability of Large Language Models and Impact of Tokenization

Watching: Tokenization (paper)

What problem does it solve? Transformers, the architecture behind most modern Large Language Models (LLMs), have inherent limitations when it comes to reasoning. Unlike recurrent neural networks, Transformers lack recurrent connections, which bounds their computational depth and places them in the complexity class TC0, making them theoretically incapable of solving tasks whose required reasoning depth grows with input length. Counting, a fundamental component of many reasoning tasks, is one such task: performing it inductively requires reasoning depth that grows linearly with the length of the input.

How does it solve the problem? Recent work has shown that Chain-of-Thought (CoT) reasoning can alleviate some of these architectural limitations in counting tasks. However, the role of tokenization in these models has received little attention. Unlike specialized expert models, which often use character-level tokenization, LLMs typically rely on byte-level byte pair encoding (BPE) tokenizers, which fundamentally changes how reasoning is carried out. This study investigates the impact of tokenization on the counting abilities of LLMs and uncovers substantial performance variations driven purely by differences in input tokenization.
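As a concrete illustration of that gap (my own sketch, not from the paper), the snippet below contrasts the character-level view of a word with the view a byte-level BPE tokenizer gives the model; tiktoken and the cl100k_base vocabulary are just one example choice, and the exact split depends on the vocabulary used.

    # Character-level vs. BPE view of the same word; requires `pip install tiktoken`.
    import tiktoken

    text = "strawberry"

    # Character-level view: counting the r's is a trivial one-pass operation.
    print(list(text), "->", text.count("r"), "r's")

    # BPE view: the model only sees multi-character token IDs, so the count
    # must be inferred from opaque subword units rather than read off directly.
    enc = tiktoken.get_encoding("cl100k_base")
    token_ids = enc.encode(text)
    print([enc.decode([i]) for i in token_ids])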

What's next? The findings highlight the importance of tokenization choices when designing and evaluating LLMs for reasoning tasks. By understanding how tokenization can undermine the computational power that models can achieve in theory, researchers can develop new tokenization methods that strengthen reasoning in LLMs. This work opens up new avenues for improving the reasoning abilities of Transformer-based models and brings us closer to LLMs that handle reasoning tasks reliably.


Papers of the Week:


If you enjoyed this article, give it a like and share it with your peers.

