Top LLM Papers of the Week (March First Week 2024)

Kalyan KS

Published: March 10, 2024

For video tutorials on top LLM papers, check my YouTube Channel.

[1] Not all Layers of LLMs are Necessary during Inference

LLM inference is computationally expensive, which poses problems for real-time applications. According to the authors' statistical analysis, not every layer of an LLM is needed for every input. AdaInfer is a new algorithm that decides when to stop inference based on the difficulty of the input. It doesn’t change the LLM’s parameters and works across multiple tasks.
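To make the early-stopping idea concrete, here is a minimal sketch. It is not the AdaInfer implementation: a plain confidence threshold stands in for whatever stopping rule the paper uses, and the names `early_exit_forward`, `sharpen`, and `softmax`, as well as the toy "layers", are assumptions made up for illustration. The model parameters are never modified; only the number of layers that run changes.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Vector) -> Vector:
    """Turn logits into probabilities (shifted by the max for stability)."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_forward(
    hidden: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    readout: Callable[[Vector], Vector],
    threshold: float = 0.9,
) -> Tuple[int, int]:
    """Run layers one at a time and stop once the readout is confident.

    Returns (predicted_index, number_of_layers_used).
    """
    probs = readout(hidden)
    used = 0
    for layer in layers:
        hidden = layer(hidden)
        used += 1
        probs = readout(hidden)
        # Hypothetical stopping rule: exit when the top probability is high enough.
        if max(probs) >= threshold:
            break
    return probs.index(max(probs)), used


def sharpen(hidden: Vector) -> Vector:
    """Toy stand-in for a transformer layer: doubles the logits."""
    return [2.0 * x for x in hidden]


toy_layers = [sharpen] * 6
print(early_exit_forward([2.0, 0.0], toy_layers, softmax))  # "easy" input
print(early_exit_forward([0.2, 0.0], toy_layers, softmax))  # "hard" input
```

In this toy setup the easy input exits after fewer layers than the hard one, which is the behaviour the summary describes.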

Tweet - - Summary - - Paper

[2] SaulLM-7B: A pioneering Large Language Model for Law

SaulLM-7B is a large language model (LLM) specifically designed to understand and generate legal text. It is based on the Mistral 7B LLM. SaulLM-7B was trained on a massive dataset of English legal documents (over 30 billion tokens). SaulLM-7B exhibits state-of-the-art proficiency in understanding and processing legal documents.

Tweet - - Summary - - Paper

[3] ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Existing pruning methods either require extra information (such as gradients) or are complex. The authors propose a new approach that removes less important layers based on a new metric called Block Influence (BI).
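A rough sketch of layer pruning by an influence score follows. It assumes BI for a layer is one minus the average cosine similarity between the hidden states entering and leaving that layer; check the paper for the exact definition. The function names and the toy hidden states are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def block_influence(inputs: Sequence[Vector], outputs: Sequence[Vector]) -> float:
    """Assumed BI: 1 - mean cosine similarity over token positions."""
    sims = [cosine(x, y) for x, y in zip(inputs, outputs)]
    return 1.0 - sum(sims) / len(sims)


def layers_to_prune(
    hidden_states: Sequence[Sequence[Vector]], n_remove: int
) -> Tuple[List[int], List[float]]:
    """hidden_states[i] holds the per-token vectors entering layer i;
    the last entry is the output of the final layer."""
    scores = [
        block_influence(hidden_states[i], hidden_states[i + 1])
        for i in range(len(hidden_states) - 1)
    ]
    # Layers that barely change their input get the lowest scores.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:n_remove]), scores


# Toy example: 3 layers, 2 tokens, 2-dimensional hidden states.
toy_states = [
    [[1.0, 0.0], [0.0, 1.0]],  # input to layer 0
    [[0.0, 1.0], [1.0, 0.0]],  # layer 0 rotates its input
    [[0.0, 2.0], [2.0, 0.0]],  # layer 1 only rescales its input
    [[1.0, 1.0], [1.0, 1.0]],  # layer 2 changes direction moderately
]
pruned, bi_scores = layers_to_prune(toy_states, n_remove=1)
print(pruned, bi_scores)
```

Under this assumed definition, a layer that only rescales its input scores close to zero and is the first candidate for removal.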

Tweet - - Summary - - Paper

[4] GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Training Large Language Models (LLMs) presents significant memory challenges because of their large sizes. This paper introduces GaLore, a new memory-efficient LLM training method.
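The "gradient low-rank projection" in the title can be illustrated with a small sketch. This is a simplification under stated assumptions, not the GaLore implementation: it uses rank 1 instead of a general rank, power iteration instead of a full SVD, and plain SGD instead of a stateful optimizer. All function names are hypothetical. The point is that the compressed gradient `r` has n entries instead of m x n, so any optimizer state kept at that size is smaller.

```python
import math
from typing import List

Matrix = List[List[float]]


def top_left_singular_vector(grad: Matrix, iters: int = 50) -> List[float]:
    """Power iteration on G G^T; returns a unit vector of length m."""
    m, n = len(grad), len(grad[0])
    u = [1.0] * m
    for _ in range(iters):
        v = [sum(grad[i][j] * u[i] for i in range(m)) for j in range(n)]  # G^T u
        u = [sum(grad[i][j] * v[j] for j in range(n)) for i in range(m)]  # G v
        norm = math.sqrt(sum(x * x for x in u))
        if norm == 0.0:  # zero gradient (or unlucky start): nothing to project
            return [0.0] * m
        u = [x / norm for x in u]
    return u


def project(grad: Matrix, u: List[float]) -> List[float]:
    """Compress an m x n gradient to a length-n vector: r = u^T G."""
    m, n = len(grad), len(grad[0])
    return [sum(u[i] * grad[i][j] for i in range(m)) for j in range(n)]


def project_back(r: List[float], u: List[float]) -> Matrix:
    """Expand the compressed gradient to full size: u r^T."""
    return [[ui * rj for rj in r] for ui in u]


def sgd_step(weights: Matrix, grad: Matrix, lr: float) -> Matrix:
    """One SGD step using the rank-1 approximation of the gradient."""
    u = top_left_singular_vector(grad)
    r = project(grad, u)  # optimizer state would be kept at this size
    approx = project_back(r, u)
    return [
        [w - lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, approx)
    ]


toy_grad = [[3.0, 4.0, 5.0], [6.0, 8.0, 10.0]]  # a rank-1 gradient
toy_weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(sgd_step(toy_weights, toy_grad, lr=0.1))
```

Because the toy gradient is exactly rank 1, the projection loses nothing here; for real gradients the low-rank approximation discards some information in exchange for memory.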

Tweet - - Summary - - Paper

[5] Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Chatbot Arena is a new open platform for evaluating LLMs based on human preferences. It uses a pairwise comparison method, gathering human preferences through crowdsourcing. The platform has gathered over 240K votes and, according to the authors, has become a widely referenced LLM leaderboard.
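As an illustration of how pairwise votes can be turned into a ranking, here is a minimal sketch using Elo-style updates. Elo is used only as a simple stand-in; the paper's statistical ranking method may differ. The model names, the `k` factor, and the starting rating are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def elo_ratings(
    votes: Iterable[Tuple[str, str]], k: float = 32.0, base: float = 1000.0
) -> Dict[str, float]:
    """Each vote is (winner, loser) from one human pairwise comparison."""
    ratings: Dict[str, float] = defaultdict(lambda: base)
    for winner, loser in votes:
        r_win, r_lose = ratings[winner], ratings[loser]
        # Probability the winner was expected to win, given current ratings.
        expected = 1.0 / (1.0 + 10 ** ((r_lose - r_win) / 400.0))
        delta = k * (1.0 - expected)
        ratings[winner] = r_win + delta
        ratings[loser] = r_lose - delta
    return dict(ratings)


# Hypothetical votes between three made-up models.
toy_votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
]
toy_ratings = elo_ratings(toy_votes)
for name in sorted(toy_ratings, key=toy_ratings.get, reverse=True):
    print(name, round(toy_ratings[name], 1))
```

An upset (a low-rated model beating a high-rated one) moves the ratings more than an expected result, so the ranking adapts as votes accumulate.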

Tweet - - Summary - - Paper

[6] Apollo: Lightweight Multilingual Medical LLMs towards Democratizing Medical AI to 6B People

While so much medical knowledge exists in English, delivering effective healthcare often requires local languages, especially in regions with fewer medical resources. Multilingual medical LLMs (Apollo) are being developed to improve healthcare access in regions with limited resources and non-English speakers.

Tweet - - Summary - - Paper

[7] Birbal: An efficient 7B instruct-model fine-tuned with curated datasets

Birbal is based on the Mistral-7B architecture and was fine-tuned in 16 hours on a single RTX 4090 GPU. It outperformed the Qwen-14B model by a significant 35%. Birbal’s success can be attributed to focused, high-quality instructions covering a wide range of tasks.

Tweet - - Summary - - Paper

[8] A Survey of AI-generated Text Forensic Systems: Detection, Attribution, and Characterization

Along with remarkable text generation capabilities, LLMs pose serious risks like facilitating the spread of propaganda, misinformation, and disinformation at an alarming scale. In response to these dangers, a new field is rapidly developing called “AI-generated text forensics”. This area includes tools and techniques to fight the potential misuse of LLMs.

Tweet - - Summary - - Paper

[9] LLMGuard: Guarding against Unsafe LLM Behavior

LLMs can sometimes generate inappropriate, biased, or factually incorrect responses. This can violate regulations and lead to legal issues. LLMGuard is a tool that has the potential to address these risks. It monitors user interactions with an LLM application and flags content against specific behaviours or conversation topics.

Tweet - - Summary - - Paper

[10] Data Augmentation using LLMs: Data Perspectives, Learning Paradigms and Challenges

Data augmentation (DA) involves generating more labelled data to train deep learning models. Large language models can generate large amounts of realistic text data. This survey paper discusses the positive impact of LLMs on DA, including various strategies for using LLMs to generate new training data.

Tweet - - Summary - - Paper

If you like this, do subscribe to the newsletter so that you won't miss reading interesting LLM papers.

If you are interested in learning LLM prompt engineering, here is an excellent book (free and available to read online).

Book link: https://github.com/AkmmusAI/LLM-Prompt-Engineering-Simplified-Book

Enjoy learning and using LLMs. See you next week with another set of interesting LLM papers.

Let me know in the comments which paper you find most interesting out of these ten papers and why.
