
Bleeding-Edge Computer Vision: CVPR 2020

Roshan Ram

ml @ bland (YC23) | prev: ml @ boeing, apple, iqt | machine learning & information systems @ carnegie mellon '22 | 11-785 ta | nlp/rl research @ cmu robotics, lti, mld

Published: July 8, 2020

By Roshan Ram

We’ve come a long way from edge detection, the process of finding edges in an image, which was developed around the 1970s. In fact, we’ve even come quite far from the renowned AlexNet, the winning model of the famous ImageNet competition of 2012. We now live in a world that is building scalable cloud computing architectures, developing and deploying autonomous vehicles, and working with quantum computing, with companies like Amazon, Tesla, and IBM at the forefront.

A few weeks ago, I attended the 2020 Conference on Computer Vision and Pattern Recognition, better known as CVPR 2020.

From fireside chats with Satya Nadella and Charlie Bell, to workshops on Explainable AI, to detailed presentations on Dynamic Fluid Surface Reconstruction with Deep Learning, CVPR 2020 was a great experience.

---------------------------------------------------------------------------------------------------------------

My abridged takeaways:

• Generative Adversarial Networks (GANs) hold many possibilities due to their generative nature

• Some new GAN models/flavors include:

--------> Deep Convolutional GANs (DCGANs): typically used to generate high-quality images.

--------> Conditional GANs (cGANs): medium- to high-quality outputs, with more focus on constraining and controlling the generated images.

--------> StackGANs: as their name suggests, composed of several stacked networks. Can be used to generate realistic images from text (in a way, the reverse of image captioning with RNNs/LSTMs trained on datasets like Microsoft’s MS-COCO).

• Explainable AI can be advanced beyond primitive techniques: emerging techniques include extraction of feature maps/attributes through “CNN Circuits.”
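To make the "constraining and controlling" idea behind cGANs a little more concrete, here is a minimal sketch of the conditioning step only. It assumes the common setup where a one-hot class label is appended to the noise vector before it enters the generator. The names `one_hot`, `make_toy_generator`, and `generate` are hypothetical, and the single untrained linear layer is a stand-in for the deep network a real cGAN would use.

```python
import random


def one_hot(label, num_classes):
    """Return a one-hot list encoding `label` among `num_classes` classes."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def make_toy_generator(noise_dim, num_classes, out_dim, seed=0):
    """Build a toy single-layer 'generator' with fixed random weights.

    Assumption: a real cGAN generator is a trained deep network. This
    untrained linear map only shows how the condition enters the input.
    """
    rng = random.Random(seed)
    in_dim = noise_dim + num_classes
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def generate(noise, label):
        # Conditioning step: append the one-hot label to the noise vector.
        conditioned = list(noise) + one_hot(label, num_classes)
        # Plain matrix-vector product standing in for the network.
        return [sum(w * x for w, x in zip(row, conditioned))
                for row in weights]

    return generate


noise_dim, num_classes, out_dim = 4, 3, 5
generate = make_toy_generator(noise_dim, num_classes, out_dim)

rng = random.Random(42)
z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]

# Same noise, different labels: only the condition changes the output.
sample_a = generate(z, label=0)
sample_b = generate(z, label=1)
```

Holding `z` fixed and changing only `label` isolates the effect of the condition, which is the lever a cGAN exposes for steering what gets generated.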

---------------------------------------------------------------------------------------------------------------

A select few of my favorite papers and workshops from the Computer Vision and Pattern Recognition conference this year:

Mitigating Bias in Face Recognition Using Skewness-Aware Reinforcement Learning [Paper]

--------> Bias in datasets is certainly a prevalent issue, and creative approaches like the use of Generative Adversarial Networks and Reinforcement Learning to “correct” a dataset could have enormous implications for a multitude of issues.

Recent Advances in Vision-and-Language Research [Workshop]

--------> This workshop covered a variety of contemporary topics, including advanced attention mechanisms, ensemble models, variations on popular models such as Bidirectional Encoder Representations from Transformers (BERT), and language bias reduction.

Interpretable Machine Learning for Computer Vision [Workshop]

--------> This workshop worked on breaking into the black box of advanced deep learning architectures, such as deep CNNs and recurrent neural networks, to understand the seemingly elusive logic behind object/scene recognition, image captioning, and visual question answering.
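As one small illustration of what "breaking into the black box" can look like, here is a sketch of occlusion sensitivity. This is a simple, well-known probing technique that I am using as a stand-in, not something taken from the workshop itself. The idea is to blank out one patch of the image at a time and record how much the model's score drops. `toy_model` and `occlusion_map` are hypothetical names, and the "model" is a hand-written scorer rather than a trained network.

```python
def toy_model(image):
    """Hypothetical black-box scorer: sums the top-left 2x2 pixels."""
    return sum(image[r][c] for r in range(2) for c in range(2))


def occlusion_map(model, image, patch=2, fill=0.0):
    """Return the score drop for each patch-sized block of the image.

    Each block is replaced by `fill` in turn. A large drop means the
    model relied heavily on that region.
    """
    height, width = len(image), len(image[0])
    baseline = model(image)
    drops = []
    for top in range(0, height, patch):
        row_drops = []
        for left in range(0, width, patch):
            # Copy the image so each occlusion starts from the original.
            occluded = [row[:] for row in image]
            for r in range(top, min(top + patch, height)):
                for c in range(left, min(left + patch, width)):
                    occluded[r][c] = fill
            row_drops.append(baseline - model(occluded))
        drops.append(row_drops)
    return drops


# A uniform 4x4 "image": any difference in the map comes from the model.
image = [[1.0] * 4 for _ in range(4)]
heatmap = occlusion_map(toy_model, image)
```

With this toy scorer, only the top-left block of `heatmap` is nonzero, which matches the region the scorer actually reads. On a real network, the same loop would be run against the predicted class score.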

---------------------------------------------------------------------------------------------------------------

What new tools, heuristics, or insights have you come across in the fields of machine learning and computer vision? Do you agree/disagree with any of the above? Do you see certain methodologies distinctly rising to the foreground and others retreating in effectiveness? Don’t hesitate to comment or connect — I’m always eager to learn and discuss!
