Your Daily AI Research tl;dr | 2022-06-02

Image from the first paper.

Welcome to your official daily AI research tl;dr (and news) intended for AI professionals and enthusiasts.

In this newsletter, I share the most exciting papers I find each day, along with a short summary to help you quickly gauge whether a paper is worth investigating. I will also take this opportunity to share interesting news from the field. I hope you enjoy the format of this newsletter, and I would gladly take any feedback you have in the comments to improve it.

Now, let's get started with this iteration!

1️⃣ CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers

We first saw large language models such as GPT-3 generate text. Then, similar Transformer-based architectures were adapted to images, yielding a lot of progress in text-to-image generation, especially recently with DALL-E 2 and Imagen. Now, let's jump to the next logical step: text-to-video. Generating videos isn't just a matter of adding a time dimension to images and stitching them together. Each frame needs to follow the previous ones and make sense, respecting physics and our world's laws. Not only does the video need to stay physically coherent, it also needs to stay aligned with the input text. This task is extremely difficult even for humans, so imagine how hard it is for a machine.

Side note regarding "human-created videos": I recommend reading Creativity Inc. by Pixar's CEO. A great and really insightful read.

In this paper, Wenyi Hong et al. tackle the text-to-video task with CogVideo, once again a large Transformer-based model, and report impressive results: "CogVideo outperforms all publicly available models at a large margin in machine and human evaluations."

Link to the paper: https://arxiv.org/pdf/2205.15868.pdf

Code: https://github.com/THUDM/CogVideo
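
To give a rough intuition for how these Transformer-based generators work: frames are first turned into discrete tokens (e.g., by an image tokenizer), and the model then predicts the whole video as one long token sequence conditioned on the text prompt. Below is a minimal conceptual sketch of that flattened autoregressive loop; `model`, its signature, and the token counts are hypothetical placeholders, not the actual CogVideo API (which builds on the pretrained CogView2 and adds a multi-frame-rate hierarchical training scheme on top of this basic idea).

```python
import torch

def generate_video_tokens(model, text_tokens, frames=8, tokens_per_frame=400):
    """Conceptual sketch of autoregressive text-to-video generation.

    `model` is a hypothetical decoder-only Transformer that maps a token
    sequence of shape (1, L) to next-token logits of shape (1, L, vocab).
    This is an illustration of the general approach, not CogVideo's code.
    """
    seq = text_tokens.clone()  # (1, T_text): the text prompt tokens
    for _ in range(frames * tokens_per_frame):
        logits = model(seq)[:, -1, :]          # distribution over next token
        probs = torch.softmax(logits, dim=-1)
        next_tok = torch.multinomial(probs, num_samples=1)  # sample one token
        seq = torch.cat([seq, next_tok], dim=1)
    # Drop the prompt and reshape the generated tokens into per-frame grids
    video = seq[:, text_tokens.size(1):].view(1, frames, tokens_per_frame)
    return video
```

The sampled token grids would then be decoded back into pixels by the tokenizer's decoder; the hard part the paper addresses is keeping those frames temporally coherent at scale.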

2️⃣ CyCLIP: Cyclic Contrastive Language-Image Pretraining

This new paper from UCLA and Adobe Research suggests that the image and text representations learned by CLIP are not interchangeable and can lead to inconsistent downstream predictions, which is bad news for all CLIP-based applications. And we know there are many of them.

They introduce CyCLIP, "a framework for contrastive representation learning that explicitly optimizes for the learned representations to be geometrically consistent in the image and text space."

From the abstract: "we show that the improved consistency in CYCLIP translates to significant gains over CLIP, with gains ranging from 10%–24% for zero-shot classification accuracy on standard benchmarks (CIFAR-10, CIFAR-100, ImageNet1K) and 10%–27% for robustness to various natural distribution shifts."

Link to the paper: https://arxiv.org/pdf/2205.14459.pdf

Code: https://github.com/goel-shashank/CyCLIP
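
To make the "geometric consistency" idea concrete, here is a minimal PyTorch sketch of a CyCLIP-style training objective: the standard CLIP contrastive loss plus a cross-modal term (the similarity of image j to text k should match that of image k to text j) and an in-modal term (image-image similarities should match the corresponding text-text similarities). The loss weights and temperature below are illustrative placeholders, not the authors' settings; see the official repo for the exact formulation.

```python
import torch
import torch.nn.functional as F

def cyclip_loss(image_emb, text_emb, temperature=0.07,
                lambda_cross=0.25, lambda_in=0.25):
    """CLIP's symmetric InfoNCE loss plus CyCLIP-style cyclic
    consistency regularizers, for a batch of N paired embeddings."""
    # L2-normalize so dot products are cosine similarities
    I = F.normalize(image_emb, dim=-1)  # (N, d)
    T = F.normalize(text_emb, dim=-1)   # (N, d)

    # Standard CLIP loss: matched pairs on the diagonal are the targets
    logits = I @ T.t() / temperature    # (N, N)
    targets = torch.arange(I.size(0), device=I.device)
    contrastive = (F.cross_entropy(logits, targets) +
                   F.cross_entropy(logits.t(), targets)) / 2

    # Cross-modal consistency: sim(I_j, T_k) should equal sim(I_k, T_j)
    sim_it = I @ T.t()
    cross_modal = ((sim_it - sim_it.t()) ** 2).mean()

    # In-modal consistency: the image-space and text-space similarity
    # structures should agree with each other
    in_modal = ((I @ I.t() - T @ T.t()) ** 2).mean()

    return contrastive + lambda_cross * cross_modal + lambda_in * in_modal
```

Intuitively, the two extra terms push the image and text embedding spaces toward the same geometry, which is what makes the representations more interchangeable downstream.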


The 5 Best AI Articles of the Month!

In this iteration of the weekly newsletter, we are diving into five amazing articles written by the AI community!

With great commentary by Lauren Keegan, as always!

Most of them come from people exchanging daily with us on the Towards AI Discord community, and we would love to see more creative people join us and share their pieces. If you work with AI, whether as a blogger, YouTuber, or coder, or if you're simply learning AI, consider joining the Learn AI Together Discord server!

Or watch the video here...


And we are already at the end of this iteration! If you've enjoyed it, please subscribe and share it with your techy friends. Once again, let me know how to improve this format: it's something I have wanted to do for quite some time without figuring out the best way to do it. I hope you like the decisions made here, and I would be glad to hear from you to make it even better over time.

Thank you for reading! A fellow AI enthusiast and researcher.
