LLMs and the False Promise of Creativity; LLMs as Optimizers; Running Thousands of LLMs on One GPU; 10 GPTs You Should Know; and More
Danny Butvinik
Chief Data Scientist | 100K+ Followers | FinCrime | Writer | Author of AI Vanguard Newsletter
Editor's Paper Recommendations
Art or Artifice? Large Language Models and the False Promise of Creativity: Researchers have argued that large language models (LLMs) exhibit high-quality writing capabilities, from blogs to stories. However, objectively evaluating the creativity of a piece of writing is challenging. Inspired by the Torrance Test of Creative Thinking (TTCT), which measures creativity as a process, we use the Consensual Assessment Technique [3] and propose the Torrance Test of Creative Writing (TTCW) to evaluate creativity as a product. TTCW consists of 14 binary tests organized into the original dimensions of Fluency, Flexibility, Originality, and Elaboration. We recruit 10 creative writers and conduct a human assessment of 48 stories written by professional authors or LLMs using TTCW. Our analysis shows that LLM-generated stories pass 3-10 fewer TTCW tests than stories written by professionals. In addition, we explore using LLMs as assessors to automate the TTCW evaluation, revealing that none of the LLMs positively correlate with the expert assessments.
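To make the evaluation protocol concrete, here is a minimal Python sketch of the LLM-as-assessor idea the paper explores. All names are hypothetical (`ask_llm` stands in for any chat-model client, and the questions only paraphrase the four TTCW dimensions; the full protocol uses 14 binary tests):

```python
# Hypothetical paraphrases of the four TTCW dimensions; the actual
# protocol defines 14 binary tests across these dimensions.
TTCW_QUESTIONS = {
    "Fluency": "Is the story coherent, well-formed prose? Answer Yes or No.",
    "Flexibility": "Does the story shift perspective or tone meaningfully? Answer Yes or No.",
    "Originality": "Does the story avoid cliche in premise and language? Answer Yes or No.",
    "Elaboration": "Are scenes and characters developed in convincing detail? Answer Yes or No.",
}

def ask_llm(prompt: str) -> str:
    # Stand-in for a real LLM call; hard-coded so the sketch runs.
    return "Yes"

def assess_story(story: str) -> dict[str, bool]:
    # One binary verdict per question; a story's creativity score is
    # simply the number of tests it passes.
    return {
        dimension: ask_llm(f"{question}\n\nStory:\n{story}")
        .strip()
        .lower()
        .startswith("yes")
        for dimension, question in TTCW_QUESTIONS.items()
    }

print(assess_story("Once upon a time..."))
```

A real study would compare these automated verdicts against the expert judgments test by test; the paper reports that no LLM assessor correlated positively with the experts.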
Large Language Models as Optimizers: Optimization is ubiquitous. While derivative-based algorithms have been powerful tools for various problems, the absence of gradients poses challenges for many real-world applications. In this work, we propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as optimizers, where the optimization task is described in natural language. In each optimization step, the LLM generates new solutions from a prompt that contains previously generated solutions with their values. The new solutions are evaluated and added to the prompt for the next optimization step. We first showcase OPRO on linear regression and traveling salesman problems, then move on to prompt optimization, where the goal is to find instructions that maximize task accuracy. With various LLMs, we demonstrate that the best prompts optimized by OPRO outperform human-designed prompts by up to 8% on GSM8K and up to 50% on Big-Bench Hard tasks.
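The OPRO loop is simple enough to sketch. Below is a hypothetical Python illustration on a toy one-dimensional objective: `query_llm` is a random-sampling placeholder for a real LLM call (an actual run would parse candidate solutions out of the model's completion), and the meta-prompt format is illustrative, not taken from the paper:

```python
import random

def evaluate(x: float) -> float:
    # Toy objective: maximize f(x) = -(x - 3)^2, optimum at x = 3.
    return -(x - 3) ** 2

def query_llm(meta_prompt: str) -> float:
    # Placeholder for a real LLM call; a genuine OPRO step would send
    # the meta-prompt to the model and parse a new candidate from it.
    return random.uniform(-10.0, 10.0)

def opro(num_steps: int = 20, top_k: int = 5) -> float:
    history: list[tuple[float, float]] = []  # (solution, value) pairs
    for _ in range(num_steps):
        # Build the meta-prompt: the task description plus the best
        # previous solutions with their values, sorted ascending so the
        # strongest examples sit closest to the final query.
        history.sort(key=lambda pair: pair[1])
        trajectory = "\n".join(
            f"x={s:.3f}, value={v:.3f}" for s, v in history[-top_k:]
        )
        meta_prompt = (
            "Propose a new x with a higher value than these:\n"
            f"{trajectory}\nNew x:"
        )
        candidate = query_llm(meta_prompt)
        # Score the new solution and feed it back into the next prompt.
        history.append((candidate, evaluate(candidate)))
    return max(history, key=lambda pair: pair[1])[0]

print(f"Best x found: {opro():.3f}")
```

The key design choice is that the optimizer never sees gradients, only a natural-language trajectory of (solution, value) pairs, which is what lets the same loop drive prompt optimization as well.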
A Practical Survey on Zero-shot Prompt Design for In-context Learning: The remarkable advancements in large language models (LLMs) have significantly improved Natural Language Processing (NLP) tasks. This paper comprehensively reviews in-context learning techniques, focusing on different types of prompts, including discrete, continuous, few-shot, and zero-shot, and their impact on LLM performance. We explore various approaches to prompt design, such as manual design, optimization algorithms, and evaluation methods, to optimize LLM performance across diverse tasks. Our review covers key research studies in prompt engineering, discussing their methodologies and contributions to the field. We also delve into the challenges faced in evaluating prompt performance, given the absence of a single "best" prompt and the importance of considering multiple metrics. In conclusion, the paper highlights the critical role of prompt design in harnessing the full potential of LLMs. It provides insights into the combination of manual design, optimization techniques, and rigorous evaluation for more effective and efficient use of LLMs in various NLP tasks.
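To ground the zero-shot versus few-shot distinction the survey draws, here is a small illustrative Python sketch of the two prompt styles for the same sentiment task (the wording is hypothetical, not drawn from the paper):

```python
review = "The battery dies within an hour."

# Zero-shot: the instruction alone, with no solved examples.
zero_shot_prompt = (
    "Classify the sentiment of this review as positive or negative.\n"
    f"Review: {review}\nSentiment:"
)

# Few-shot: the same instruction preceded by in-context demonstrations,
# so the model can infer the expected format and label set.
few_shot_prompt = (
    "Classify the sentiment of each review as positive or negative.\n"
    "Review: Works perfectly, highly recommend.\nSentiment: positive\n"
    "Review: Broke after two days.\nSentiment: negative\n"
    f"Review: {review}\nSentiment:"
)

print(zero_shot_prompt)
print(few_shot_prompt)
```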
--
Are you looking to advertise a product, job opening, or event to an audience of over 40,000 AI researchers and engineers? Get in touch with us on LinkedIn to explore your options.
Enjoy the newsletter? Help us make it bigger and better by sharing it with colleagues and friends.
--
Industry Insights
Growth Zone
Expert Advice
Principal Data Scientist | AI | Data Science | Digital | Automation
1 year ago: Glad to learn about LLMs as optimizers.
Data Scientist & Analyst | AI enthusiast, leveraging the power of Generative AI tools like ChatGPT and Bard to drive business success | #AI tools #Data Analysis #Data Science #music (Smule)
1 year ago: Very glad to learn about the latest advancements in LLM optimizers and GPTs. Thank you for sharing a much-needed newsletter in the current AI era, Danny Butvinik.