"Take a deep breath" applies to LLMs as well
I recently reviewed the academic paper 'Large Language Models as Optimizers', published by Google DeepMind.
The paper discusses "Optimization by PROmpting (OPRO)", a proposed method for utilizing large language models (LLMs) to overcome challenges posed by the absence of gradients in various optimization problems. In OPRO, optimization tasks are described in natural language and LLMs generate new solutions during each optimization step based on a prompt containing previously generated solutions and their values. The new solutions are then evaluated and added to the prompt for subsequent steps. The authors demonstrated the effectiveness of OPRO in linear regression and traveling salesman problems, and also in optimizing prompts to maximize task accuracy, with results showing OPRO-optimized prompts outperforming those designed by humans.
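To make that loop concrete, here is a minimal sketch of an OPRO-style iteration in Python. The optimizer_llm, evaluate, and build_meta_prompt helpers are hypothetical placeholders standing in for real LLM calls and scoring, not the authors' actual implementation:

```python
# Minimal sketch of the OPRO loop described above.
# optimizer_llm, evaluate, and build_meta_prompt are hypothetical
# placeholders for real LLM calls, scoring, and prompt assembly.

def opro(train_examples, num_steps=50, candidates_per_step=8):
    trajectory = []  # (instruction, score) pairs generated so far

    for step in range(num_steps):
        # 1. Describe the task and the best solutions so far in natural language.
        meta_prompt = build_meta_prompt(trajectory, train_examples)

        # 2. Ask the optimizer LLM for new candidate instructions.
        candidates = [optimizer_llm(meta_prompt, temperature=1.0)
                      for _ in range(candidates_per_step)]

        # 3. Evaluate each candidate and add it to the trajectory
        #    so it appears in the next step's meta-prompt.
        for instruction in candidates:
            score = evaluate(instruction, train_examples)
            trajectory.append((instruction, score))

    # Return the highest-scoring instruction found.
    return max(trajectory, key=lambda pair: pair[1])[0]
```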
Large Language Models (LLMs) are beneficial for optimization tasks due to their capacity to understand natural language. This allows individuals to describe their optimization tasks informally, without needing formal specifications. An example is prompt optimization, where the objective is to find a prompt that maximizes task accuracy; this can be done by providing a high-level text summary accompanied by input-output examples. This natural language capability makes LLMs user-friendly and accessible for optimization tasks.
A few excerpts from the paper:
Benchmarks: The primary evaluation benchmarks are GSM8K (Cobbe et al., 2021) and Big-Bench Hard (BBH) (Suzgun et al., 2022). GSM8K is a benchmark of grade school math word problems with 7,473 training samples and 1,319 test samples, where chain-of-thought prompting (Wei et al., 2022) and the zero-shot instruction “Let’s think step by step.” (Kojima et al., 2022) have drastically improved the performance over the standard prompting. BBH is a suite of 23 challenging BIG-Bench tasks (Srivastava et al., 2022) that covers a wide range of topics beyond arithmetic reasoning, including symbolic manipulation and commonsense reasoning. Each task contains up to 250 examples in total.
Implementation details: We set the temperature to be 0 when evaluating the performance of generated instructions, in which case the scorer LLM greedily decodes. Unless otherwise specified, we set the default temperature to be 1.0 for optimizer LLMs to generate diverse and creative instructions. At each optimization step, we prompt the optimizer LLM with the meta-prompt 8 times to generate 8 instructions, then we add these instructions with their training scores to the optimization trajectory in the meta-prompt. Our meta-prompt at each step contains the best 20 instructions so far and 3 randomly picked exemplars from the training set. We study the effect of different hyperparameters in ablation studies (Section 5.3). Appendix C.2 presents the full meta-prompts for different optimizer LLMs.
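Putting those defaults into code, the build_meta_prompt placeholder from the earlier sketch could be filled in roughly like this (best 20 instructions so far plus 3 random training exemplars). The prompt wording here is illustrative only; the real templates are in Appendix C.2 of the paper:

```python
import random

def build_meta_prompt(trajectory, train_examples, top_k=20, num_exemplars=3):
    """Assemble a meta-prompt from the optimization trajectory.

    Follows the paper's defaults: keep the best 20 instructions so far
    and 3 randomly picked training exemplars. The surrounding wording is
    illustrative, not the exact template from Appendix C.2.
    """
    # Best instructions so far, sorted ascending so the best appears last.
    top = sorted(trajectory, key=lambda pair: pair[1])[-top_k:]
    history = "\n".join(f"text: {ins}\nscore: {score}" for ins, score in top)

    # A few concrete input-output examples of the task.
    exemplars = random.sample(train_examples, k=min(num_exemplars, len(train_examples)))
    examples = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in exemplars)

    return (
        "Below are previous instructions with their training scores.\n\n"
        f"{history}\n\n"
        "Here are example problems from the task:\n\n"
        f"{examples}\n\n"
        "Write a new instruction that is different from the ones above "
        "and achieves a higher score."
    )
```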
Key findings: "Take a deep breath and work on this problem step-by-step" is the most impactful top instruction for the PaLM 2 model. Use it carefully; there is no guarantee it will work on every LLM. In some cases, I have found it to improve accuracy on OpenAI GPT models.
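If you want to check whether it helps on your own model, a quick A/B comparison on a small evaluation set is enough. This is a minimal sketch; call_llm, is_correct, and eval_set are hypothetical stand-ins for your model call, answer checker, and (question, answer) pairs:

```python
# Hypothetical sketch: measure whether the "take a deep breath" instruction
# helps on your own eval set. call_llm, is_correct, and eval_set are
# stand-ins for your model call, answer checker, and data.

INSTRUCTION = "Take a deep breath and work on this problem step-by-step."

def accuracy(questions_and_answers, instruction=""):
    correct = 0
    for question, gold_answer in questions_and_answers:
        prompt = f"{instruction}\n\n{question}" if instruction else question
        reply = call_llm(prompt, temperature=0)  # greedy decoding when scoring
        correct += is_correct(reply, gold_answer)
    return correct / len(questions_and_answers)

baseline = accuracy(eval_set)
with_instruction = accuracy(eval_set, INSTRUCTION)
print(f"baseline={baseline:.3f}  with instruction={with_instruction:.3f}")
```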
From academia to the corporate world:
The key takeaway is that LLMs are complex and their inner workings can be hard to conceptualize at times. They are auto-regressive, meaning they generate sequences of text one token at a time by conditioning each new token on the previous ones. Therefore, learning prompting strategies, iterating, and experimenting is absolutely key. Empower your employees by providing the right tools and environment for them to iterate and experiment, create their own learnings and best practices, and share them.
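As a rough mental model of "auto-regressive", the sketch below generates text one token at a time, each choice conditioned on everything produced so far; model and tokenizer are hypothetical placeholders, not any particular library:

```python
# Rough mental model of auto-regressive generation: each new token is
# chosen conditioned on everything generated so far.
# `model` and `tokenizer` are hypothetical placeholders.

def generate(prompt, max_new_tokens=50):
    tokens = tokenizer.encode(prompt)
    for _ in range(max_new_tokens):
        next_token_probs = model(tokens)        # distribution over the vocabulary
        next_token = next_token_probs.argmax()  # greedy choice (temperature 0)
        tokens.append(next_token)
        if next_token == tokenizer.eos_token_id:
            break
    return tokenizer.decode(tokens)
```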
Reach out if you have questions.
#GPT #AI #GenerativeAI