To CoT or not to CoT? Which tasks does chain-of-thought (CoT) prompting benefit most? In this paper, the authors show that CoT provides strong performance benefits primarily on tasks involving math or logic, with much smaller gains on other types of tasks. CoT can thus be applied selectively and cost-effectively, and new paradigms that leverage intermediate computation should continue to be developed for LLM applications. The authors are here to discuss their work. Leave your thoughts here! https://lnkd.in/ghTtwU3W Zayne Sprague Fangcong Yin @Juan Diego R. Dongwei Jiang Manya Wadhwa Prasann Singhal @Xinyu Zhao Xi Ye Kyle Mahowald Greg Durrett
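A toy illustration of that selective application (my own Python sketch, not from the paper; the keyword router and prompt templates are assumptions standing in for a real task classifier):

import re

# Crude keyword router standing in for a real math/logic task classifier.
MATH_LOGIC_HINTS = re.compile(r"\b(sum|solve|prove|how many|remainder)\b", re.I)

def build_prompt(question: str) -> str:
    # Use CoT only where it tends to pay off (math/logic); go direct elsewhere.
    if MATH_LOGIC_HINTS.search(question):
        # CoT prompting: elicit intermediate reasoning before the final answer.
        return f"{question}\nLet's think step by step."
    # Direct prompting: save the extra reasoning tokens on other tasks.
    return f"{question}\nAnswer concisely."

print(build_prompt("How many primes are below 30?"))
print(build_prompt("Summarize this review in one sentence."))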
-
LLMs will always hallucinate, and we have to learn to live with it. According to this paper: https://lnkd.in/dd3pNnH4
LLMs Will Always Hallucinate, and We Need to Live With This
arxiv.org
-
Adding this one to my reading list!
Very excited about a paper (to appear) in POPL 2025 with Margus Veanes and colleagues: https://lnkd.in/gMPNyZYQ. This brings together work on symbolic automata and derivatives, extended regular expressions, LTL, and last but not least, satisfiability modulo theories. It has been a while since my last POPL paper (2003). Looking forward to visiting sunny Denver in January!
Symbolic Automata: omega-Regularity Modulo Theories (POPL 2025)
popl25.sigplan.org
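For readers who haven't met the "derivatives" mentioned above: a toy Python sketch of Brzozowski derivatives for plain regexes (my own illustration, requiring Python 3.10+; the paper generalizes this idea to symbolic alphabets and omega-regular languages):

from dataclasses import dataclass

class Re: pass

@dataclass(frozen=True)
class Empty(Re): pass          # the empty language

@dataclass(frozen=True)
class Eps(Re): pass            # the empty string

@dataclass(frozen=True)
class Chr(Re):
    c: str                     # a single character

@dataclass(frozen=True)
class Cat(Re):
    left: Re
    right: Re                  # concatenation

@dataclass(frozen=True)
class Alt(Re):
    left: Re
    right: Re                  # union

@dataclass(frozen=True)
class Star(Re):
    inner: Re                  # Kleene star

def nullable(r: Re) -> bool:
    # Does r accept the empty string?
    match r:
        case Eps() | Star(_):
            return True
        case Cat(l, rr):
            return nullable(l) and nullable(rr)
        case Alt(l, rr):
            return nullable(l) or nullable(rr)
        case _:
            return False

def deriv(r: Re, a: str) -> Re:
    # Derivative of r w.r.t. a: the language of suffixes of r's words
    # that begin with a.
    match r:
        case Chr(c):
            return Eps() if c == a else Empty()
        case Alt(l, rr):
            return Alt(deriv(l, a), deriv(rr, a))
        case Star(inner):
            return Cat(deriv(inner, a), r)
        case Cat(l, rr):
            d = Cat(deriv(l, a), rr)
            return Alt(d, deriv(rr, a)) if nullable(l) else d
        case _:
            return Empty()

def matches(r: Re, s: str) -> bool:
    # Match by repeated differentiation: accept iff the residual is nullable.
    for ch in s:
        r = deriv(r, ch)
    return nullable(r)

ab_star = Star(Cat(Chr("a"), Chr("b")))                   # (ab)*
print(matches(ab_star, "abab"), matches(ab_star, "aba"))  # True False

Matching by repeated differentiation is what makes derivatives attractive for automata constructions: each derivative is a state, and nullability marks the accepting ones.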
-
For a clear and accessible introduction to LLM fine-tuning with Low Rank Adaptation (LoRA), don't miss Matthew Gunton's latest paper walkthrough.
Understanding Low Rank Adaptation (LoRA) in Fine Tuning LLMs
towardsdatascience.com
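As a companion to the walkthrough, a minimal PyTorch sketch of the core idea (my own illustration, not Gunton's code; the class name, rank r, and scaling are assumed conventions): freeze the pretrained weight W and learn only a low-rank update B @ A.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Wraps a frozen pretrained linear layer with a trainable low-rank
    # update, so the effective weight is W + (alpha / r) * B @ A.
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                         # freeze W and bias
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # small random init
        self.B = nn.Parameter(torch.zeros(d_out, r))        # zero init: no-op at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 12,288 trainable parameters vs ~590k in the frozen base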
-
What does entropy have to do with LLMs? Entropy is a measure of uncertainty in a system. In information theory, entropy measures the uncertainty of a probability distribution. For LLMs, entropy is a crucial concept: it quantifies the amount of information (and uncertainty) in a language model's output distribution, and lower-entropy outputs tend to be more confident and coherent. Two related mathematical concepts are worth sharing here:
1) KL-divergence: the Kullback-Leibler divergence measures the difference between two probability distributions. Minimizing the KL-divergence between the model's output distribution and the true distribution of the language is a key factor, and it leads to more accurate LLMs.
2) Mutual information: mutual information measures the dependence between two random variables. In the context of LLMs, it can quantify the amount of information shared between the input and the output; optimizing it can improve the relevance and contextual understanding of LLMs.
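A small numerical illustration of all three quantities with NumPy/SciPy (toy distributions, not taken from any real model):

import numpy as np
from scipy.stats import entropy

# Toy next-token distributions over a 4-word vocabulary.
p = np.array([0.7, 0.2, 0.05, 0.05])    # a confident model
q = np.array([0.25, 0.25, 0.25, 0.25])  # a maximally uncertain model

# Shannon entropy H(p) = -sum_i p_i log p_i (in nats here).
print("H(p):", entropy(p))  # low: confident predictions
print("H(q):", entropy(q))  # high: log(4), maximal uncertainty

# KL-divergence D_KL(p || q) = sum_i p_i log(p_i / q_i).
print("KL(p||q):", entropy(p, q))

# Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y), from a toy joint table.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])          # P(X, Y)
px, py = joint.sum(axis=1), joint.sum(axis=0)
print("I(X;Y):", entropy(px) + entropy(py) - entropy(joint.ravel()))  # > 0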
-
All about retrieval-augmented generation (RAG) and how to use it to get the best output from an LLM.
From Concept to Real-World Impact: How RAG is Revolutionizing Large Language Models
neuromentor.medium.com
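For readers new to the pattern, a minimal retrieve-then-generate sketch (purely illustrative; real systems use embedding similarity and a vector store rather than the word-overlap scoring below):

CORPUS = [
    "LoRA fine-tunes LLMs by learning low-rank weight updates.",
    "RAG grounds LLM answers in retrieved documents.",
    "Chain-of-thought prompting elicits step-by-step reasoning.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Toy relevance score: count shared lowercase words.
    def overlap(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(CORPUS, key=overlap, reverse=True)[:k]

def build_rag_prompt(query: str) -> str:
    # Stuff the top-k passages into the prompt so the LLM answers from
    # retrieved evidence rather than from its parameters alone.
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_rag_prompt("How does RAG ground LLM output?"))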
-
Duality in 19th and 20th Century Mathematical Thinking is the 10th book on the October-December reading list (and will probably spill into another few months, given its size of almost 1,000 pages). I originally considered it as a birthday present but couldn't order it then due to traveling; a few weeks ago I finally ordered it with a 50% discount. I have been interested in duality for the last 15 years, and it has influenced a few of my memory dump and trace-and-log analysis patterns. At last I have some organized and summarized material to digest. The last 100 pages cover a historical perspective on duality and category theory, so I was tempted to start reading from there; still, I will start from the beginning to appreciate the earlier context. I also have an earlier book from one of the editors: Tool and Object: A History and Philosophy of Category Theory.
-
New preprint out: "LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweigh the Costs?", a joint effort with my co-authors Jakub Šimko and Peter Brusilovsky.
• LLM-based augmentation methods leveraging the new generation of LLMs have become very popular in the last few years.
• Previous studies that compared these newer LLM-based augmentation methods with older, established augmentation methods in terms of performance reached conflicting results.
• We adopted a finer-grained methodology for the comparison and also considered the costs (CO2 emitted, money, and time) of the newer LLM-based methods.
• Main finding: the benefit of using costly LLM-based text augmentation methods instead of much cheaper established methods diminishes as more seed samples are used. LLM-based text augmentation seems justified only when few (fewer than 20) seed samples per label are available.
More exciting papers to come at the Kempelen Institute of Intelligent Technologies. Check out the paper at:
LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweigh the Costs?
arxiv.org
-
Reasoning is one of the greatest challenges for LLMs. Current datasets typically propose just one path to solve each problem, a chain-of-thought (CoT). A recently proposed approach named ReFT expands on CoT training to increase the reasoning ability of LLMs. The scheme has two phases:
1) Warm-up: train the LLM on examples of the form (query as input; CoT and final answer as output), i.e., given a query, we expect the LLM to generate a CoT and the final answer. This phase runs for a limited number of epochs (we do not want to overfit the model here).
2) Reinforcement learning: define an RL problem in which the policy receives the query as input and must generate the final answer at the end of its output. The reward is 1 if the generated answer is correct; to avoid a sparse reward function, a wrong but extractable answer still earns a partial reward of 0.1, and anything else earns 0 (see the sketch below). For more details: https://lnkd.in/dn_H6NqV #LLM #Reasoning #RL #paper_in_a_minute
ReFT: Reasoning with Reinforced Fine-Tuning
arxiv.org
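A minimal sketch of the phase-2 reward described above (my own illustration; the extract_final_answer parser and the "Answer:" convention are assumptions, not the paper's code):

def extract_final_answer(text: str) -> str | None:
    # Toy parser: take whatever follows the last "Answer:" marker.
    _, sep, tail = text.rpartition("Answer:")
    if not sep:
        return None
    return tail.strip() or None

def reward(generated: str, gold: str) -> float:
    # 1.0 for a correct final answer; a wrong but parseable answer still
    # earns 0.1 to densify the otherwise sparse signal; 0.0 otherwise.
    answer = extract_final_answer(generated)
    if answer is None:
        return 0.0
    return 1.0 if answer == gold else 0.1

print(reward("step by step... Answer: 42", "42"))  # 1.0
print(reward("step by step... Answer: 41", "42"))  # 0.1
print(reward("no final marker here", "42"))        # 0.0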
-
In this episode, we discuss Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer? by Nishant Balepur, Feng Gu, Abhilasha Ravichander, Shi Feng, Jordan Boyd-Graber, Rachel Rudinger. The paper investigates the reverse question answering (RQA) task where a question is generated based on a given answer and examines how 16 large language models (LLMs) perform on this task compared to traditional question answering (QA). The study reveals that LLMs are less accurate in RQA for numerical answers but perform better with textual ones, and they often can answer their incorrectly generated questions accurately in traditional QA, indicating that errors are not solely due to knowledge gaps. Findings also highlight that RQA errors correlate with question difficulty and are inversely related to the frequency of answers in the data corpus, presenting challenges in generating valid multi-hop questions and suggesting areas for improvement in LLM reasoning for RQA.
Arxiv Paper - Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can’t Answer?
podbean.com
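To make the RQA setup concrete, a hedged sketch of the round-trip the paper studies (llm is a hypothetical completion helper to be replaced by a real client; this is not the authors' code):

def llm(prompt: str) -> str:
    # Hypothetical stand-in for any chat-completion client.
    raise NotImplementedError("plug in a real LLM client")

def rqa_roundtrip(answer: str) -> bool:
    # RQA: have the model write a question for a given answer, then ask it
    # its own question and check whether it recovers that answer.
    question = llm(f"Write a question whose answer is: {answer}")
    model_answer = llm(question)
    return answer.strip().lower() in model_answer.strip().lower()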
-
The D language has a library function schwartzSort that implements the Schwartzian Transform: https://lnkd.in/gHisPWNf
std.algorithm.sorting
dlang.org
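The same idea in Python, for readers who don't know D (schwartzSort itself lives in std.algorithm.sorting): decorate each element with its precomputed key, sort the pairs, then strip the keys, so an expensive key function runs once per element instead of once per comparison.

words = ["banana", "Apple", "cherry"]

decorated = [(w.lower(), w) for w in words]  # decorate: compute each key once
decorated.sort(key=lambda pair: pair[0])     # sort on the cached key
result = [w for _, w in decorated]           # undecorate

print(result)  # ['Apple', 'banana', 'cherry']
# Python's sorted(words, key=str.lower) performs the same
# decorate-sort-undecorate transform internally.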
Stanford University
4 weeks ago: I CoT therefore I am