ML Papers of The Week (Jan 1-8)

Welcome to the first edition of ML Papers of the Week, where we highlight the top machine learning papers every week. Below are the top papers from last week (January 1-8):

[Figure] Source: https://arxiv.org/abs/2301.00704

1) Muse: Text-To-Image Generation via Masked Generative Transformers

Google AI introduces Muse, a new text-to-image generation model based on masked generative transformers; it is significantly more efficient than diffusion models such as Imagen and DALL-E 2. Paper, Project, Code, Tweet
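
To make the masked-token idea concrete, below is a minimal PyTorch sketch of the training objective used by masked generative transformers. The shapes and modules are toy placeholders, not Muse's actual VQ tokenizer, text encoder, or architecture.

```python
# Toy sketch of masked image-token prediction conditioned on text
# (hypothetical shapes/modules; Muse uses a VQ image tokenizer and a
# frozen T5 text encoder).
import torch
import torch.nn as nn

vocab_size, seq_len, d_model, batch = 1024, 256, 512, 4
mask_id = vocab_size  # extra id reserved for [MASK]

image_tokens = torch.randint(0, vocab_size, (batch, seq_len))  # stand-in for tokenizer output
text_cond = torch.randn(batch, 32, d_model)                    # stand-in for text embeddings

# Mask a random subset of the image tokens.
mask = torch.rand(batch, seq_len) < 0.5
inputs = image_tokens.masked_fill(mask, mask_id)

embed = nn.Embedding(vocab_size + 1, d_model)
layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(layer, num_layers=2)
to_logits = nn.Linear(d_model, vocab_size)

# Predict the original ids at the masked positions, conditioned on the text.
hidden = decoder(embed(inputs), memory=text_cond)
loss = nn.functional.cross_entropy(to_logits(hidden)[mask], image_tokens[mask])
print(loss.item())
```

At inference, models of this kind start from an all-masked canvas and fill in tokens over a small number of parallel decoding steps, which is where the efficiency advantage over step-by-step diffusion or autoregressive decoding comes from.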

2) VALL-E: Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

Microsoft introduces VALL-E, a neural codec language model that achieves state-of-the-art zero-shot text-to-speech performance; the text-to-speech synthesis task is treated as a conditional language modeling task over discrete audio codec codes. Paper, Project, Tweet
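
A minimal sketch of "TTS as conditional language modeling", assuming the speech has already been converted to discrete codec tokens. The single causal transformer below is a toy, not VALL-E's actual setup (which operates on EnCodec codes and combines autoregressive and non-autoregressive stages).

```python
# Toy sketch: predict discrete audio codec tokens autoregressively,
# conditioned on phoneme tokens (hypothetical shapes, not VALL-E's model).
import torch
import torch.nn as nn

phoneme_vocab, codec_vocab, d_model = 100, 1024, 512
phonemes = torch.randint(0, phoneme_vocab, (1, 20))    # text, as phoneme ids
codes = torch.randint(0, codec_vocab, (1, 150))        # target speech, as codec ids

embed_ph = nn.Embedding(phoneme_vocab, d_model)
embed_ac = nn.Embedding(codec_vocab, d_model)
lm = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True), num_layers=2)
head = nn.Linear(d_model, codec_vocab)

# Concatenate text and acoustic tokens; each codec token is predicted from
# everything before it (causal mask), i.e. p(c_t | c_<t, phonemes).
x = torch.cat([embed_ph(phonemes), embed_ac(codes[:, :-1])], dim=1)
causal = nn.Transformer.generate_square_subsequent_mask(x.size(1))
hidden = lm(x, mask=causal)
logits = head(hidden[:, phonemes.size(1) - 1:])        # last phoneme predicts the first code
loss = nn.functional.cross_entropy(
    logits.reshape(-1, codec_vocab), codes.reshape(-1))
print(loss.item())
```

Zero-shot voice cloning then amounts to prepending a few seconds of the target speaker's codec tokens as an acoustic prompt and continuing the sequence.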

3) Rethinking with Retrieval: Faithful Large Language Model Inference

A new paper shows the potential of enhancing LLMs by retrieving relevant external knowledge based on decomposed reasoning steps obtained through chain-of-thought prompting. Paper, Tweet
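
The overall flow is easy to sketch. In the snippet below, `generate`, `retrieve`, and `answer_with_evidence` are hypothetical stand-ins for an LLM call and a knowledge-base retriever, not the paper's implementation.

```python
# Sketch of retrieval conditioned on chain-of-thought steps (hypothetical
# helper functions passed in by the caller).
def rethink_with_retrieval(question, generate, retrieve, answer_with_evidence):
    # 1) Decompose: ask the LLM for step-by-step reasoning.
    cot = generate(f"{question}\nLet's think step by step.")
    steps = [s.strip() for s in cot.split("\n") if s.strip()]

    # 2) Retrieve: fetch external knowledge relevant to each reasoning step.
    evidence = [doc for step in steps for doc in retrieve(step, top_k=3)]

    # 3) Re-answer: ground the final prediction in the retrieved evidence,
    #    correcting reasoning steps that conflict with it.
    return answer_with_evidence(question, steps, evidence)
```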

4) SparseGPT: Massive Language Models Can Be Accurately Pruned In One-Shot

Presents a technique for compressing large language models without sacrificing performance; models can be "pruned to at least 50% sparsity in one-shot, without any retraining." Paper, Tweet
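
For intuition only, the snippet below shows what one-shot, unstructured 50% sparsity means for a single weight matrix via naive magnitude pruning. This is not SparseGPT's algorithm, which uses an approximate second-order, layer-wise reconstruction so that accuracy survives at GPT scale.

```python
# Naive one-shot magnitude pruning to a target sparsity (illustration only;
# SparseGPT solves a layer-wise reconstruction problem instead).
import torch

def prune_to_sparsity(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Zero out the smallest-magnitude entries so that `sparsity` of them are zero."""
    k = int(weight.numel() * sparsity)
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight * (weight.abs() > threshold)

w = torch.randn(1024, 1024)           # stand-in for one linear layer's weights
w_sparse = prune_to_sparsity(w, 0.5)
print(f"sparsity: {(w_sparse == 0).float().mean().item():.2%}")
```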

[Figure] Performance of ConvNeXt across a wide range of model sizes. Source: https://arxiv.org/abs/2301.00808

5) ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

ConvNeXt V2 is a performant model based on a fully convolutional masked autoencoder framework and other architectural improvements. CNNs are striking back! Paper, Code, Tweet
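
A toy sketch of the masked-autoencoder pretraining idea behind it: hide most of the image, encode the visible pixels with a convnet, and score the reconstruction only on the masked patches. The encoder/decoder below are placeholders; the paper pairs a ConvNeXt backbone and sparse convolutions with a new Global Response Normalization (GRN) layer.

```python
# Toy fully convolutional masked autoencoder (placeholder encoder/decoder,
# not ConvNeXt V2's architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

patch = 32
img = torch.randn(2, 3, 224, 224)

# Random mask over a grid of 32x32 patches (1 = masked), upsampled to pixels.
grid = (torch.rand(2, 1, 224 // patch, 224 // patch) < 0.6).float()
mask = F.interpolate(grid, scale_factor=patch, mode="nearest")

encoder = nn.Sequential(nn.Conv2d(3, 64, 4, stride=4), nn.GELU(),
                        nn.Conv2d(64, 128, 2, stride=2))
decoder = nn.Sequential(nn.Conv2d(128, 3 * 64, 1), nn.PixelShuffle(8))

# Encode only the visible pixels, reconstruct the full image,
# and compute the loss on the masked region only.
recon = decoder(encoder(img * (1 - mask)))
per_pixel = F.mse_loss(recon, img, reduction="none") * mask
loss = per_pixel.sum() / (mask.sum() * 3)   # average over masked pixels, 3 channels
print(loss.item())
```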

6) Large Language Models as Corporate Lobbyists

As capabilities grow, we are starting to see a wider range of applications for LLMs. This paper uses large language models to conduct corporate lobbying activities. Paper, Code, Tweet

7) Superposition, Memorization, and Double Descent

This work aims to better understand how deep learning models overfit or memorize examples; it observes interesting phenomena and is an important step toward a mechanistic theory of memorization. Paper, Tweet

[Figure] Source: https://arxiv.org/abs/2301.01947

8) StitchNet: Composing Neural Networks from Pre-Trained Fragments

This work proposes building new, coherent neural networks by reusing pretrained fragments of existing NNs. It is not straightforward, but there is potential for efficiently reusing the knowledge learned by pre-trained networks on complex tasks. Paper, Tweet
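
As a structural illustration, the snippet below stitches the early stages of one torchvision ResNet to the later stages of another, with a 1x1-conv adapter at the seam. The fragment boundaries and adapter are assumptions for the example; StitchNet itself selects fragments by scoring their compatibility before composing them, and would start from pretrained weights.

```python
# Toy stitch of fragments from two ResNets (randomly initialized here; in
# practice the fragments would come from pretrained checkpoints).
import torch
import torch.nn as nn
from torchvision import models

net_a = models.resnet18()
net_b = models.resnet34()

# Fragment 1: stem + first two stages of net_a (output: 128 channels).
frag_a = nn.Sequential(net_a.conv1, net_a.bn1, net_a.relu, net_a.maxpool,
                       net_a.layer1, net_a.layer2)
# Fragment 2: last two stages + classifier head of net_b (expects 128 channels).
frag_b = nn.Sequential(net_b.layer3, net_b.layer4, net_b.avgpool,
                       nn.Flatten(), net_b.fc)

# A small adapter reconciles activations at the stitch point.
adapter = nn.Conv2d(128, 128, kernel_size=1)

stitched = nn.Sequential(frag_a, adapter, frag_b)
print(stitched(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 1000])
```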

9) Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes

Proposes iterated decomposition, an approach to improving science Q&A through a human-in-the-loop workflow for refining compositional LM programs. Paper, Code, Tweet

10) A Succinct Summary of Reinforcement Learning

This is a nice little overview of some important ideas in RL. Paper, Tweet

---

We also created a repo that contains the full list of papers.

Follow us at DAIR.AI to catch the top ML papers of next week.
