Expanding Context Lengths in LLMs; Towards CausalGPT; Perplexity vs. Bard vs. GPT; Meet TinyLlama; Leveraging QLoRA For Fine-Tuning; and More.


Editor's Paper Recommendations

On the Unexpected Abilities of Large Language Models: Large language models can display a wide range of abilities that are not directly connected with the task for which they are trained: predicting the next words of human-written texts. In this article, I discuss the nature of this indirect acquisition process and its relation to other known indirect processes. An important side effect of such indirect acquisition is the development of integrated abilities. I discuss the extent to which the abilities developed by large language models are predictable. Finally, I briefly discuss the relation between the cognitive skills acquired by these systems and human cognition.

Giraffe: Adventures in Expanding Context Lengths in LLMs: Modern large language models (LLMs) that rely on attention mechanisms are typically trained with fixed context lengths, which enforce upper limits on the length of input sequences they can handle at evaluation time. To use these models on sequences longer than the train-time context length, one might employ techniques from the growing family of context length extrapolation methods -- most of which focus on modifying the system of positional encodings used in the attention mechanism to indicate where tokens or activations are located in the input sequence. We conduct a wide survey of existing methods of context length extrapolation on a base LLaMA or LLaMA 2 model and introduce some designs of our own as well -- in particular, a new truncation strategy for modifying the basis of the position encoding. We test these methods using three new evaluation tasks (FreeFormQA, AlteredNumericQA, and LongChat-Lines) as well as perplexity, which we find to be a less fine-grained measure of the long-context performance of LLMs. We release the three tasks publicly as datasets on HuggingFace. We discover that linear scaling is the best method for extending context length and show that further gains can be achieved by using longer scales at evaluation time. We also observe promising extrapolation capabilities with the truncated basis. To support further research, we release three new 13B-parameter long-context models called Giraffe: 4k and 16k context models trained from base LLaMA-13B, and a 32k context model trained from base LLaMA2-13B. We also release the code to replicate our results.
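The linear-scaling result above refers to position-interpolation-style rescaling of rotary position embeddings, in which evaluation-time positions are compressed so they stay within the range seen during training. Below is a minimal sketch of that idea, assuming a standard RoPE setup; the function names and the chosen scale factor are illustrative and are not taken from the Giraffe code release.

```python
import torch

def rope_frequencies(head_dim: int, base: float = 10000.0) -> torch.Tensor:
    """Standard RoPE inverse frequencies for a given attention head dimension."""
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

def rotary_angles(seq_len: int, head_dim: int, scale: float = 1.0) -> torch.Tensor:
    """Angles for each (position, frequency) pair.

    scale > 1 implements linear position interpolation: positions are divided
    by `scale`, so a model trained with a 4k context can be evaluated on
    scale * 4k tokens without any position exceeding the training range.
    """
    inv_freq = rope_frequencies(head_dim)
    positions = torch.arange(seq_len).float() / scale   # the linear-scaling step
    return torch.outer(positions, inv_freq)              # (seq_len, head_dim // 2)

# Example: a model trained with a 4k context evaluated on 16k tokens.
angles_train = rotary_angles(4096, head_dim=128, scale=1.0)
angles_eval = rotary_angles(16384, head_dim=128, scale=4.0)
assert torch.isclose(angles_eval.max(), angles_train.max(), rtol=1e-3)
```

The point of the final assertion is that the rescaled 16k positions occupy roughly the same angular range as the original 4k positions, which is why the model's learned positional behaviour still applies.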

Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs: Despite advancements in LLMs, knowledge-based reasoning remains a longstanding issue due to the fragility of knowledge recall and inference. Existing methods primarily encourage LLMs to plan and solve problems autonomously or to extensively sample reasoning chains, without addressing the conceptual and inferential fallacies involved. To alleviate inferential fallacies, and drawing inspiration from multi-agent collaboration, we present a framework to increase the faithfulness and causality of knowledge-based reasoning. Specifically, we propose employing multiple intelligent agents (i.e., reasoners and a causal evaluator) that work collaboratively in a reasoning-and-consensus paradigm for elevated reasoning faithfulness. The reasoners focus on providing solutions with human-like causality for open-domain problems. The causal evaluator agent, on the other hand, scrutinizes whether the answer in a solution is causally deducible from the question and vice versa, substituting a counterfactual answer for the original to test the reverse direction. In extensive and comprehensive evaluations on various knowledge reasoning tasks (e.g., science question answering and commonsense reasoning), our framework outperforms all compared state-of-the-art approaches by large margins.
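A rough sketch of the reasoning-and-consensus loop described above follows, with a placeholder ask_llm function standing in for an actual model call; the prompts, the decomposition into `reasoner` and `causal_evaluator`, and the acceptance rule are assumptions made for illustration, not the authors' implementation.

```python
def ask_llm(prompt: str) -> str:
    """Placeholder for a call to whatever LLM API is actually used."""
    raise NotImplementedError

def reasoner(question: str) -> str:
    """Reasoner agent: produce a solution with an explicit, human-like causal chain."""
    return ask_llm(
        "Answer the question step by step, making each causal link explicit.\n"
        f"Question: {question}\nAnswer:"
    )

def causal_evaluator(question: str, solution: str) -> bool:
    """Evaluator agent: check the answer is causally deducible from the question,
    and that a counterfactual answer would not be supported by the same reasoning."""
    forward = ask_llm(
        f"Question: {question}\nProposed solution: {solution}\n"
        "Is the final answer causally deducible from the question? Reply YES or NO."
    )
    reverse = ask_llm(
        f"Question: {question}\nProposed solution: {solution}\n"
        "Replace the final answer with a plausible counterfactual answer. "
        "Would the same reasoning still support that answer? Reply YES or NO."
    )
    return forward.strip().upper().startswith("YES") and \
           reverse.strip().upper().startswith("NO")

def reason_with_consensus(question: str, n_reasoners: int = 3, max_rounds: int = 2) -> str:
    """Sample several reasoners and accept the first solution the evaluator validates."""
    fallback = ""
    for _ in range(max_rounds):
        for _ in range(n_reasoners):
            solution = reasoner(question)
            fallback = fallback or solution
            if causal_evaluator(question, solution):
                return solution
    return fallback  # no causally validated solution was found
```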

Boosting Logical Reasoning in Large Language Models through a New Framework: The Graph of Thought: Recent advancements in large-scale models, such as GPT-4, have showcased remarkable capabilities in addressing standard queries. However, when facing complex problems that require multi-step logical reasoning, their accuracy decreases dramatically. Current research has explored prompt engineering to bolster the inferential capacities of these models. Our paper unveils a pioneering prompting technique, dubbed Graph of Thoughts (GoT). Through testing on a trio of escalating challenges -- the 24-point game, resolution of high-degree polynomial equations, and derivation of formulas for recursive sequences -- our method outperformed GPT-4, achieving accuracy improvements of 89.7%, 86%, and 56% on the respective tasks. Moreover, when compared with the state-of-the-art (SOTA) prompting method, Tree of Thought (ToT), our approach registered average accuracy boosts of 23%, 24%, and 15%, respectively.
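As a rough illustration of what a graph-structured prompting loop can look like, here is a minimal sketch in which partial solutions are graph nodes that can be expanded and, crucially, merged across branches (the operation a tree-structured search lacks); the node structure, prompts, scoring scheme, and beam size are assumptions for illustration, not the paper's implementation.

```python
from dataclasses import dataclass, field

def ask_llm(prompt: str) -> str:
    """Placeholder for an actual LLM call."""
    raise NotImplementedError

@dataclass
class Thought:
    content: str
    parents: list = field(default_factory=list)  # edges: thoughts this one builds on
    score: float = 0.0

def expand(problem: str, thought: Thought, k: int = 2) -> list:
    """Ask the model for k distinct next reasoning steps extending this thought."""
    reply = ask_llm(
        f"Problem: {problem}\nCurrent reasoning: {thought.content}\n"
        f"Propose {k} distinct next steps, one per line."
    )
    lines = [line.strip() for line in reply.splitlines() if line.strip()]
    return [Thought(line, parents=[thought]) for line in lines[:k]]

def merge(problem: str, a: Thought, b: Thought) -> Thought:
    """Graph-specific operation: combine two partial lines of reasoning into one node."""
    reply = ask_llm(
        f"Problem: {problem}\nCombine these partial solutions into a single, stronger one:\n"
        f"1) {a.content}\n2) {b.content}"
    )
    return Thought(reply, parents=[a, b])

def score(problem: str, thought: Thought) -> float:
    """Ask the model to rate how promising a partial solution is (0-10)."""
    reply = ask_llm(
        f"Problem: {problem}\nPartial solution: {thought.content}\nRate 0-10, digits only."
    )
    try:
        return float(reply.strip())
    except ValueError:
        return 0.0

def graph_of_thoughts(problem: str, rounds: int = 3, beam: int = 4) -> Thought:
    """Expand, merge, score, and prune thoughts for a fixed number of rounds."""
    frontier = [Thought("Start")]
    for _ in range(rounds):
        children = [child for t in frontier for child in expand(problem, t)]
        if len(children) >= 2:  # merging across branches is what turns the tree into a graph
            children.append(merge(problem, children[0], children[1]))
        for child in children:
            child.score = score(problem, child)
        frontier = sorted(children, key=lambda t: -t.score)[:beam]
    return frontier[0]
```

Without the merge step, the loop above reduces to a beam-searched tree of thoughts; allowing nodes to have multiple parents is what the graph formulation adds.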

Industry Insights


--

Are you looking to advertise a product, job opening, or event to an audience of over 35,000 AI researchers and engineers? Get in touch with us on LinkedIn to explore your options.

Enjoy the newsletter? Help us make it bigger and better by sharing it with colleagues and friends.

--

Growth Zone

