Facebook's AI at Work Produces ConViT

Facebook AI says it has developed a new computer vision model called ConViT, which combines two widely used AI architectures, convolutional neural networks (CNNs) and Transformer-based models, in order to overcome some important limitations of each approach on its own.

‘ConViT’, A Computer Vision Model That Improves Vision Transformers (ViT) With Soft Convolutional Inductive Biases

By leveraging both techniques, this vision Transformer-based model can outperform existing architectures, especially in the low data regime, while achieving similar performance in the large data setting.

While debates about bias in AI usually focus on its harms, researchers have long relied on a useful kind of bias: built-in assumptions (inductive biases) that help machine learning models train efficiently.

You can read their July 2021 paper via the link below.

  • CNNs, which have proved extremely successful for vision tasks, rely on two of these inductive biases built into the architecture itself: that pixels near one another are related (locality) and that different portions of an image should be processed identically regardless of their absolute location (weight sharing).
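
As a minimal illustration of those two biases (not from the paper, and with arbitrary layer sizes), the snippet below contrasts a small convolution, which is local and weight-shared, with a fully connected layer over the same pixels, which is neither:

```python
import torch.nn as nn

# A 3x3 convolution applies the same small kernel at every spatial location:
# locality (each output sees only a 3x3 neighbourhood) plus weight sharing.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

# A fully connected layer over the same 3x32x32 pixels has neither bias:
# every output depends on every pixel, with its own independent weights.
fc = nn.Linear(3 * 32 * 32, 16 * 32 * 32)

print(sum(p.numel() for p in conv.parameters()))  # 448 parameters
print(sum(p.numel() for p in fc.parameters()))    # roughly 50.3 million parameters
```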

Facebook's ConViT is significant because Vision Transformers (ViTs), which rely on more flexible self-attention layers, have recently outperformed CNNs for image classification.

ConViT is a New Computer Vision Model

The Facebook AI Research team has developed a new computer vision model called ConViT. The ConViT system combines two widely used architectures, convolutional neural networks (CNNs) and Transformer-based models, to overcome some important limitations of each approach on its own.

The resulting convolution-like ViT architecture, ConViT, outperforms DeiT on ImageNet while offering much improved sample efficiency.

The goal of ConViT was to modify vision Transformers to encourage them to act convolutionally. The researchers introduced a soft inductive bias that lets the network itself decide whether it wants to remain convolutional or not.

They did so by introducing gated positional self-attention (GPSA), in which each attention head learns a gating parameter that controls how much weight is placed on standard content-based attention versus a position-based attention initialized to mimic a convolution.
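
As a rough sketch of that gating idea, here is a minimal PyTorch-style module, assuming a simplified formulation: the class name, layer sizes, gate initialization, and especially the random stand-in for the positional scores are illustrative choices, not Facebook's actual implementation (which lives in the GitHub repository linked below).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedPositionalSelfAttention(nn.Module):
    """Simplified GPSA sketch: blends content and positional attention with a learned per-head gate."""

    def __init__(self, dim: int, num_heads: int, num_patches: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qk = nn.Linear(dim, 2 * dim, bias=False)  # content queries and keys
        self.v = nn.Linear(dim, dim, bias=False)
        # One learnable gating scalar per head; sigmoid(gate) is the weight on positional attention.
        # Initialized to 1 so the convolution-like positional attention dominates early in training.
        self.gate = nn.Parameter(torch.ones(num_heads))
        # Stand-in positional scores. ConViT derives these from relative patch positions so that,
        # at initialization, each head attends to a fixed nearby patch.
        self.register_buffer("pos_scores", torch.randn(num_heads, num_patches, num_patches))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, patches, dim)
        b, n, d = x.shape
        q, k = self.qk(x).chunk(2, dim=-1)
        q = q.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        content_attn = F.softmax(q @ k.transpose(-2, -1) / self.head_dim ** 0.5, dim=-1)
        pos_attn = F.softmax(self.pos_scores[:, :n, :n], dim=-1)
        g = torch.sigmoid(self.gate).view(1, -1, 1, 1)    # per-head blend factor in (0, 1)
        attn = (1.0 - g) * content_attn + g * pos_attn    # gated mix of the two attention maps
        v = self.v(x).view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        return (attn @ v).transpose(1, 2).reshape(b, n, d)

# Quick shape check on dummy patch embeddings.
layer = GatedPositionalSelfAttention(dim=64, num_heads=4, num_patches=196)
out = layer(torch.randn(2, 196, 64))
print(out.shape)  # torch.Size([2, 196, 64])
```

Roughly speaking, because the gate can be driven toward zero during training, each head is free to discard the convolutional behaviour once it has seen enough data, which is what makes the inductive bias "soft" rather than hard-wired.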

Github: https://github.com/facebookresearch/convit

Paper: https://arxiv.org/pdf/2103.10697.pdf

While the future of Facebook is likely the Facebook Metaverse, Facebook AI will have to get moving if it wants to challenge ByteDance (Beijing's alternative to Facebook) in shaping how the metaverse actually works.

While current AI remains deeply biased for various reasons, the researchers hope that their ConViT approach will encourage the community to explore other ways of moving from hard inductive biases to soft ones.

The success of deep learning over the last decade has largely been fueled by models with strong inductive biases, allowing efficient training across domains. What will the deep learning of the future bring?

Facebook is Hyping the Metaverse

Facebook has now successfully demonstrated a way to bridge the gap between CNNs and Transformers by presenting a new method to “softly” introduce a convolutional inductive bias into the ViT.

The Metaverse thanks you for your attention. Zuckerberg (who now holds more centralized power at Facebook since the pandemic) envisions a metaverse as an all-encompassing online world where people can game, work and communicate in a virtual environment, often using VR headsets, a vision that has until now largely been depicted in science fiction movies, not reality.

Nina Palolahti

International Business Law / Business Management #MBA #LLM #IBL

3 years

Sounds interesting! Is there any precise data on how it might affect people's interaction and behavior on the platform? :)

Udayanathan Vettikkat

CMO (Global) & Head-Global Channels, Investor, Analyst, Public Relations at SEQURETEK. 38 years of Marketing, Sales and Technology Leadership experience @Cisco, IBM, HCL Tech, Novell, NTT Netmagic, Star TV, CSS Corp

3 years

Wow

Michael Spencer

A.I. Writer, researcher and curator - full-time Newsletter publication manager.

3 years

The concentration of trillion-dollar market caps and AI talent is actually very dangerous for the future of technology: it creates unbalanced wealth inequality and makes central bank stimulus less effective, since the U.S. stock market is weighted too heavily toward just a few companies, which consolidate too much of the added liquidity. It turns out China understands what antitrust regulation actually means, not America; this not only creates an innovation bottleneck for the U.S. but also serious economic issues around how unbalanced capitalism is becoming. Data capitalism, it turns out, is risky for the entire system even as these duopolies want to maintain their surveillance capitalism dominance; the more dominant they become, the worse it is for everyone.

Jyotsna Sharma

Product Engineering/Management: Samsung/ex-Motorola/Application Development/Health/Security

3 years

I believe it's a good breakthrough, though I am still not convinced by the idea of always wearing VR headsets. Maybe more innovation to make VR glasses available as simple lenses would help!

Randall Hoogerhyde

CAD Senior Software Engineer/Project Manager | Windows/Web Software Developer | Full Stack MCPD Consultant - DISABLED | Retired 2013

3 years

Good luck with all that. Just keep trying and see where you get. You never know.

