Top ML Papers of the Week

DAIR.AI

Democratizing Artificial Intelligence Research, Education, and Technologies

Published: March 12, 2023

This issue highlights the top ML Papers of the Week (Mar 6 - Mar 12).

1). PaLM-E - incorporates real-world continuous sensor modalities into a language model, resulting in an embodied LM that performs tasks such as robotic manipulation planning, visual QA, and other embodied reasoning tasks. (paper | demo)

2). Prismer - a parameter-efficient vision-language model powered by an ensemble of domain experts; it efficiently pools expert knowledge from different domains and adapts it to various vision-language reasoning tasks. (paper | code)

3). Visual ChatGPT - connects ChatGPT to different visual foundation models so that users can interact with ChatGPT beyond text alone. (paper | code)

4). A History of Generative AI - an overview of generative AI - from GAN to ChatGPT. (paper)

5). LLMs do In-Context Learning Differently - shows that with scale, LLMs can override semantic priors when presented with enough flipped labels; these models also perform well when the targets are replaced with semantically unrelated ones. (paper)

6). Foundation Models for Decision Making - provides an overview of foundation models for decision making, including tools, methods, and new research directions. (paper)

7). Hyena Hierarchy - a subquadratic drop-in replacement for attention that interleaves implicit long convolutions and data-controlled gating; it can learn on sequences 10x longer and is up to 100x faster than optimized attention. (paper | code)

8). OpenICL - a new open-source toolkit for in-context learning and LLM evaluation; supports various state-of-the-art retrieval and inference methods, tasks, and zero-/few-shot evaluation of LLMs. (paper | code)

9). MathPrompter - a technique that improves LLM performance on mathematical reasoning problems; it uses zero-shot chain-of-thought prompting and verification to ensure generated answers are accurate. (paper)

10). GigaGAN - a new architecture that enables scaling up GAN models to benefit from large datasets for text-to-image synthesis; it’s found to be orders of magnitude faster at inference time, can synthesize high-resolution images, and supports various latent space editing applications. (paper | demo)
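
To make the flipped-label setup mentioned in item 5 more concrete, here is a minimal sketch of how such a few-shot prompt could be assembled. It is an illustration only, not the paper's code: the function name `build_flipped_prompt`, the `Input:`/`Label:` template, and the two sentiment examples are all assumptions made up for this example.

```python
# Hypothetical few-shot examples (not taken from the paper).
EXAMPLES = [
    ("I loved this movie.", "positive"),
    ("The food was terrible.", "negative"),
]

# Mapping used to swap each label for its opposite.
SWAP = {"positive": "negative", "negative": "positive"}


def build_flipped_prompt(examples, query, flip=True):
    """Format few-shot examples as a prompt, optionally flipping every label."""
    blocks = []
    for text, label in examples:
        shown = SWAP[label] if flip else label
        blocks.append(f"Input: {text}\nLabel: {shown}\n")
    # The query is left unlabeled so the model completes it.
    blocks.append(f"Input: {query}\nLabel:")
    return "\n".join(blocks)


prompt = build_flipped_prompt(EXAMPLES, "A dull, lifeless film.")
print(prompt)
```

A model that follows the in-context examples would answer "positive" for the final query, while one that relies on its semantic prior would answer "negative"; comparing the two outcomes across model sizes is the kind of measurement the summary describes.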