LLMs Opening Their Inner Eyes

In this issue:

  1. LLaMA-2 performance at 0.001x the price
  2. Trying to unify LLM evaluation
  3. How the “Mind’s Eye” might help LLMs to “think” better



1. JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars

Watching: JetMoE (report/code)

What problem does it solve? Training Large Language Models (LLMs) has been notoriously expensive, with some models like GPT-3 costing over $10 million to train. This has led to a concentration of LLM development in a few well-resourced labs, limiting the democratization and diversity of these powerful AI tools. JetMoE-8B demonstrates that high-performing LLMs can be trained at a fraction of the cost, potentially opening up LLM research and application to a much wider range of institutions and developers.

How does it solve the problem? JetMoE-8B leverages a sparsely activated architecture inspired by ModuleFormer. While the model has 8 billion parameters in total, only 2.2 billion are active during inference. This is achieved through Mixture of Experts (MoE) layers, specifically Mixture of Attention heads (MoA) and Mixture of MLP Experts. Each of these layers contains 8 experts, of which only 2 are activated for each input token. This sparse activation drastically reduces computational cost during inference while still allowing the model to learn from a large parameter space during training.
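The routing idea behind this sparsity can be sketched in a few lines. The following is a minimal NumPy illustration of top-2 expert routing, not JetMoE's actual implementation: the function name, shapes, and the plain linear "experts" are illustrative stand-ins for the real MoA/MLP expert layers.

```python
import numpy as np

def top2_moe_layer(x, expert_weights, gate_weights):
    """Sparse MoE forward pass: each token uses only its top-2 of N experts.

    x:              (tokens, d_model) input activations
    expert_weights: (num_experts, d_model, d_model), one linear map per expert
    gate_weights:   (d_model, num_experts) router projection
    """
    logits = x @ gate_weights                    # (tokens, num_experts)
    top2 = np.argsort(logits, axis=-1)[:, -2:]   # indices of the 2 best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        # Softmax over only the two selected experts' logits.
        sel = logits[t, top2[t]]
        probs = np.exp(sel - sel.max())
        probs /= probs.sum()
        # Weighted sum of the two chosen experts' outputs; the other
        # experts are never evaluated for this token.
        for p, e in zip(probs, top2[t]):
            out[t] += p * (x[t] @ expert_weights[e])
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))                 # 4 tokens, d_model = 16
experts = rng.normal(size=(8, 16, 16)) * 0.1  # 8 experts
gate = rng.normal(size=(16, 8))
y = top2_moe_layer(x, experts, gate)
```

The key property is that per-token compute scales with the 2 activated experts, not with all 8, which is how the 8B-parameter model runs at roughly 2.2B-parameter inference cost.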

What's next? The development of JetMoE-8B could mark a significant shift in the accessibility of LLM technology. By demonstrating that high-performing models can be trained at a relatively low cost using only publicly available resources, this work may inspire more labs to research model pre-training.


2. Evalverse: Unified and Accessible Library for Large Language Model Evaluation

Watching: Evalverse (paper/code)

What problem does it solve? Evaluating Large Language Models (LLMs) can be a challenging task, especially for individuals without extensive AI expertise. The process often involves using multiple disparate tools, which can be time-consuming and complex. This fragmented approach to LLM evaluation makes it difficult for researchers and practitioners to comprehensively assess the performance of these models, hindering progress in the field.

How does it solve the problem? Evalverse addresses this issue by providing a unified, user-friendly framework that integrates various evaluation tools into a single library. By centralizing the evaluation process, Evalverse simplifies the task of assessing LLMs, making it accessible to a wider audience. The library's integration with communication platforms like Slack further enhances its usability, allowing users to request evaluations and receive detailed reports with ease.
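The "single entry point over many benchmark harnesses" pattern can be sketched as follows. This is a generic illustration of the idea, not Evalverse's actual API; all names and the toy runners are hypothetical.

```python
def make_eval_hub(runners):
    """Return one evaluate() entry point that dispatches to registered
    benchmark runner functions, so callers never touch the tools directly."""
    def evaluate(model, benchmarks):
        missing = [b for b in benchmarks if b not in runners]
        if missing:
            raise KeyError(f"no runner registered for: {missing}")
        return {name: runners[name](model) for name in benchmarks}
    return evaluate

# Toy runners standing in for real benchmark harnesses.
runners = {
    "arc":       lambda model: {"accuracy": 0.51},
    "hellaswag": lambda model: {"accuracy": 0.73},
}
evaluate = make_eval_hub(runners)
report = evaluate("my-model", ["arc", "hellaswag"])
```

A report like this, keyed by benchmark name, is also a natural payload to post into a Slack channel, which is the kind of workflow the library's Slack integration enables.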

What's next? The introduction of Evalverse opens up new possibilities for the widespread adoption of LLM evaluation. As more researchers and practitioners begin to utilize this centralized framework, we can expect to see a proliferation of insights into the performance and capabilities of LLMs. This, in turn, may drive further advancements in the field, as the increased accessibility of evaluation tools enables a broader range of individuals to contribute to the development and refinement of these powerful models.


3. Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models

Watching: VoT (paper)

What problem does it solve? Spatial reasoning, the ability to understand and manipulate spatial relationships between objects, is a fundamental aspect of human cognition. While Large Language Models (LLMs) have shown remarkable performance in various language comprehension and reasoning tasks, their capabilities in spatial reasoning have not been extensively explored. The Mind's Eye, a cognitive process that allows humans to create mental images of unseen objects and actions, is a key component of spatial reasoning. Developing methods to enhance spatial reasoning abilities in LLMs could lead to more human-like reasoning and problem-solving capabilities.

How does it solve the problem? Visualization-of-Thought (VoT) prompting is a novel approach that aims to improve the spatial reasoning abilities of LLMs by visualizing their reasoning traces and using these visualizations to guide subsequent reasoning steps. VoT prompting draws inspiration from the Mind's Eye process, enabling LLMs to generate mental images that facilitate spatial reasoning. The researchers applied VoT prompting to multi-hop spatial reasoning tasks, such as natural language navigation, visual navigation, and visual tiling in 2D grid worlds. By visualizing the reasoning traces of LLMs, VoT prompting provides a means to elicit and enhance spatial reasoning capabilities.
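For the grid-world tasks, the core trick is asking the model to draw the state after every reasoning step. A rough sketch of such a prompt builder follows; the function names and the ASCII grid encoding are illustrative choices of mine, not the paper's exact format.

```python
def render_grid(width, height, agent, obstacles=frozenset()):
    """Draw a 2D grid as ASCII: A = agent, # = obstacle, . = empty cell."""
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            if (x, y) == agent:
                row.append("A")
            elif (x, y) in obstacles:
                row.append("#")
            else:
                row.append(".")
        rows.append("".join(row))
    return "\n".join(rows)

def vot_prompt(task, width, height, agent, obstacles=frozenset()):
    """Build a VoT-style prompt: the model must redraw the grid state
    after each step, interleaving visualization with reasoning."""
    return (
        f"{task}\n"
        "After every move, draw the current grid state before deciding "
        "the next move.\n"
        "Initial state:\n"
        f"{render_grid(width, height, agent, obstacles)}"
    )

prompt = vot_prompt("Navigate from A to the bottom-right corner.",
                    width=3, height=2, agent=(1, 0))
```

Interleaving these drawn states with the chain of thought is what lets the model "check" its spatial state instead of tracking positions purely in text.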

What's next? The experimental results demonstrate that VoT prompting significantly improves the spatial reasoning abilities of LLMs, even outperforming existing multimodal large language models (MLLMs) in the studied tasks. The success of VoT prompting in LLMs suggests its potential viability in MLLMs as well. Future research could focus on extending VoT prompting to more complex spatial reasoning tasks, exploring its applicability to other domains, and investigating the integration of VoT prompting with MLLMs to potentially get the best of both worlds.

