Top LLM Papers of the Week (July Week 3, 2024)
[1] Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
The authors release the Spectra LLM suite consisting of 54 language models ranging from 99M to 3.9B parameters, trained on 300B tokens. Spectra includes FloatLMs, post-training quantized QuantLMs (3, 4, 6, and 8 bits), and ternary LLMs (TriLMs). [Paper]
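TriLMs constrain weights to three values plus a scale. The Spectra training recipe is more involved, but the following minimal NumPy sketch (an absmean-style mapping borrowed from the BitNet b1.58 line of work, an assumption on my part, not the paper's procedure) shows what ternary-plus-scale storage looks like:

```python
# Minimal sketch of ternary weight quantization in the spirit of TriLMs.
# Per-tensor absmean scaling is an assumption; papers also use per-channel scales.
import numpy as np

def ternarize(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Quantize a weight tensor to {-1, 0, +1} plus one floating-point scale."""
    scale = np.abs(w).mean()
    q = np.clip(np.round(w / (scale + 1e-8)), -1, 1)
    return q.astype(np.int8), float(scale)

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = ternarize(w)
print(q)                 # entries in {-1, 0, 1}
print(dequantize(q, s))  # coarse reconstruction of w
```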
[2] Better RAG using Relevant Information Gain
The paper introduces a new method based on relevant information gain to improve retrieval-augmented generation (RAG) for large language models (LLMs). The proposed approach achieves state-of-the-art performance on question-answering tasks from the Retrieval-Augmented Generation Benchmark (RGB). [Paper]
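The paper frames retrieval as optimizing an information-gain objective over the retrieved *set*, not each passage alone. As a rough illustration of why that differs from plain top-k similarity, here is a greedy, MMR-style proxy (my simplification, not the paper's actual probabilistic objective): each pick rewards query relevance and penalizes redundancy with passages already chosen.

```python
# Hedged sketch: greedy passage selection trading off query relevance
# against redundancy with already-selected passages.
import numpy as np

def greedy_select(query_vec, passage_vecs, k=3, lam=0.7):
    """Pick k passage indices maximizing relevance minus redundancy."""
    sims = passage_vecs @ query_vec          # relevance to the query
    chosen = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for i in range(len(passage_vecs)):
            if i in chosen:
                continue
            redundancy = max(
                (passage_vecs[i] @ passage_vecs[j] for j in chosen),
                default=0.0,
            )
            gain = lam * sims[i] - (1 - lam) * redundancy
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
    return chosen

rng = np.random.default_rng(0)
P = rng.normal(size=(10, 8)); P /= np.linalg.norm(P, axis=1, keepdims=True)
q = P[0] + 0.1 * rng.normal(size=8); q /= np.linalg.norm(q)
print(greedy_select(q, P))
```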
[3] Context Embeddings for Efficient Answer Generation in RAG
The paper introduces COCOM, a context compression method for RAG pipelines in large language models (LLMs). COCOM compresses long retrieved inputs into a small number of context embeddings, addressing the slower decoding caused by extended inputs, and offers flexible compression rates to trade decoding speed against answer quality. [Paper]
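COCOM learns its compressor end to end; the sketch below fakes it with chunked mean-pooling purely to show the interface and the speedup lever: higher compression rates shorten the decoder's input (faster decoding) at the cost of detail (answer quality).

```python
# Hedged sketch of the COCOM idea: replace a long retrieved context with a
# small, fixed number of "context embeddings" prepended to the decoder input.
# Mean-pooling stands in for the learned compressor.
import torch

def compress_context(ctx_embeds: torch.Tensor, n_out: int) -> torch.Tensor:
    """(n_tokens, d) -> (n_out, d) by pooling contiguous chunks."""
    chunks = ctx_embeds.chunk(n_out, dim=0)
    return torch.stack([c.mean(dim=0) for c in chunks])

d = 64
ctx = torch.randn(512, d)         # embeddings of a long retrieved context
question = torch.randn(16, d)     # embeddings of the question tokens
for rate in (4, 16, 64):          # higher compression -> faster decoding
    compressed = compress_context(ctx, n_out=512 // rate)
    decoder_input = torch.cat([compressed, question], dim=0)
    print(rate, decoder_input.shape)  # e.g. rate 64 -> (24, 64) instead of (528, 64)
```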
[4] Putting GPT-4o to the Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency
This study conducts a comprehensive evaluation of GPT-4o's capabilities across language, vision, speech, and multimodal tasks. Results show that GPT-4o performs well in language and reasoning, especially in few-shot learning scenarios, and demonstrates improvements in multimodal tasks compared to earlier models. However, the model shows some limitations with complex inputs, particularly in audio and vision tasks. [Paper]
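For a concrete picture of the few-shot setting where GPT-4o did well, this is the standard way such probes are run against the OpenAI chat API; the task and demonstrations are invented for illustration, only the API shape and the "gpt-4o" model name are real.

```python
# Hedged sketch of a few-shot probe: in-context examples are supplied as
# prior chat turns before the test item. Requires OPENAI_API_KEY.
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Answer with a single word."},
        # few-shot demonstrations as prior turns
        {"role": "user", "content": "Sentiment: 'Great movie!'"},
        {"role": "assistant", "content": "positive"},
        {"role": "user", "content": "Sentiment: 'Waste of time.'"},
        {"role": "assistant", "content": "negative"},
        # the actual test instance
        {"role": "user", "content": "Sentiment: 'Surprisingly decent.'"},
    ],
)
print(resp.choices[0].message.content)
```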
[5] Qwen2 Technical Report
The paper introduces the Qwen2 series, a new set of large language and multimodal models ranging from 0.5 to 72 billion parameters. These models, including dense models and a Mixture-of-Experts model, outperform most previous open-weight models and compete with proprietary models across various benchmarks. Qwen2 also demonstrates strong multilingual capabilities, supporting about 30 languages. [Paper]
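The open-weight Qwen2 checkpoints can be run with ordinary transformers code; the snippet below is standard library usage against the public Hugging Face release (the 0.5B instruct checkpoint is just the smallest option), not code from the report itself.

```python
# Hedged how-to: load a Qwen2 chat checkpoint and generate a reply.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen2-0.5B-Instruct"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

messages = [{"role": "user", "content": "Say hello in three languages."}]
prompt = tok.apply_chat_template(messages, tokenize=False,
                                 add_generation_prompt=True)
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:],
                 skip_special_tokens=True))
```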
For NLP Research and NLP Project guidance, please check
[6] RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems
The paper introduces RAGBench, the first comprehensive benchmark dataset for evaluating Retrieval-Augmented Generation (RAG) systems. RAGBench contains 100,000 examples covering five industry-specific domains and various RAG task types, sourced from real-world industry corpora. [Paper]
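A minimal loading sketch with the `datasets` library is below; the hub id "rungalileo/ragbench", the "hotpotqa" subset, and the field names all reflect my reading of the public release and should be treated as assumptions to verify against the dataset card.

```python
# Hedged sketch: load one RAGBench domain subset.
from datasets import load_dataset

ds = load_dataset("rungalileo/ragbench", "hotpotqa", split="test")
ex = ds[0]
print(ex["question"])        # field names assumed from the dataset card
print(len(ex["documents"]))  # retrieved documents for this example
```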
[7] The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators
The paper introduces Alchemist, a novel approach to using large pretrained models for data annotation. Instead of directly querying these models for labels, Alchemist tasks them with generating programs that can produce labels. This is significantly cheaper than per-example API labeling, reducing costs by roughly 500x. [Paper]
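The core cost argument is easy to see in code: pay for one model call that writes a labeling program, then run that program locally over the whole dataset. The prompt below is illustrative and the paper additionally aggregates several generated programs with weak-supervision techniques; this sketch shows only the one-call pattern.

```python
# Hedged sketch of the Alchemist pattern: one API call buys a reusable
# labeling program; every subsequent label is free.
from openai import OpenAI

client = OpenAI()
task = "Label a product review as 'positive' or 'negative'."
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": f"{task}\nWrite a Python function label(text) -> str "
                   "using only the standard library. Return code only.",
    }],
)
program = resp.choices[0].message.content  # assumes the model returns bare code

namespace: dict = {}
exec(program, namespace)        # CAUTION: run generated code in a sandbox
label = namespace["label"]
reviews = ["Loved it", "Broke after a day"] * 10_000
labels = [label(r) for r in reviews]   # zero further API cost
```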
[8] Autonomous Prompt Engineering in Large Language Models
This paper introduces the Automatic Prompt Engineering Toolbox (APET), a novel system that enables GPT-4 to autonomously apply prompt engineering techniques. APET utilizes advanced strategies like Expert Prompting, Chain of Thought, and Tree of Thoughts to optimize prompts dynamically. [Paper]
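As a rough picture of what "autonomously applying prompt engineering techniques" can look like, here is a toy version: the technique templates are generic renderings of Expert Prompting and Chain of Thought, not APET's actual prompts, and the keyword selector is a trivial stand-in for the GPT-4 step that APET delegates the choice to.

```python
# Hedged sketch: pick a prompt-engineering technique, then render the task
# through its template.
TECHNIQUES = {
    "expert": ("You are a world-class expert in the task domain. "
               "Answer as that expert.\n\nTask: {task}"),
    "cot": "Task: {task}\n\nLet's think step by step.",
}

def build_prompt(task: str, technique: str) -> str:
    return TECHNIQUES[technique].format(task=task)

def pick_technique(task: str) -> str:
    """Trivial stand-in; APET has GPT-4 itself make this choice."""
    reasoning_cues = ("why", "prove", "steps")
    return "cot" if any(w in task.lower() for w in reasoning_cues) else "expert"

task = "Why does quantization speed up inference?"
print(build_prompt(task, pick_technique(task)))
```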
[9] The Oscars of AI Theater: A Survey on Role-Playing with Language Models
This paper presents a comprehensive survey on role-playing with language models, tracing their development from early persona-based models to advanced character simulations using Large Language Models (LLMs). [Paper]
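The simplest setup the survey covers, persona conditioning, amounts to injecting a character profile as the system turn; the profile below is invented for illustration, and surveyed systems layer memory, retrieval, and style control on top of this.

```python
# Hedged sketch of basic persona prompting for role-play.
persona = {
    "name": "Captain Mora",
    "profile": "A weary starship captain; terse, dry humor, never breaks character.",
}
messages = [
    {"role": "system",
     "content": f"You are {persona['name']}. {persona['profile']} "
                "Stay in character in every reply."},
    {"role": "user", "content": "Captain, the engines are failing again."},
]
# `messages` can be sent to any chat-completion style API.
```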
[10] Characterizing Prompt Compression Methods for Long Context Inference
This study provides a comprehensive comparison of various prompt compression methods for long context inference in language models. [Paper]
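One family the study compares is extractive, query-aware compression: drop the sentences least relevant to the query until a token budget is met. The sketch below scores by bag-of-words overlap purely for self-containedness; real methods use embeddings or a small LM (token-level, LLMLingua-style approaches are another family entirely).

```python
# Hedged sketch of extractive, query-aware prompt compression.
def compress(query: str, context: str, budget_words: int) -> str:
    q = set(query.lower().replace("?", "").split())
    sents = [s.strip() for s in context.split(".") if s.strip()]
    scored = sorted(sents, key=lambda s: -len(q & set(s.lower().split())))
    kept, used = [], 0
    for s in scored:
        n = len(s.split())
        if used + n <= budget_words:
            kept.append(s)
            used += n
    kept.sort(key=sents.index)  # restore original order for coherence
    return ". ".join(kept) + "."

ctx = ("The cathedral was built in 1163. Its bells ring daily. "
       "Construction finished in 1345. Tourists visit every year.")
print(compress("When was the cathedral built?", ctx, budget_words=12))
```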
If you like this, subscribe to the newsletter so you don't miss interesting LLM papers.