Bleeding-Edge Computer Vision: CVPR 2020
Roshan Ram
We’ve come a long way from edge detection, the process of finding edges in an image, which was developed around the 1970s. In fact, we’ve even come quite far from the renowned AlexNet, the winning model of the famous ImageNet competition of 2012. We now live in a world that is building scalable cloud computing architectures, developing and deploying autonomous vehicles, and working with quantum computing, with companies like Amazon, Tesla, and IBM at the forefront.
A few weeks ago, I attended the Computer Vision and Pattern Recognition conference of 2020 — better known as CVPR2020.
From Fireside Chats with Satya Nadella and Charlie Bell, to workshops regarding Explainable AI, to detailed presentations on Dynamic Fluid Surface Reconstruction with Deep Learning, CVPR2020 was a great experience.
---------------------------------------------------------------------------------------------------------------
My abridged takeaways:
- Generative Adversarial Networks (GANs) open up many possibilities because they learn to generate new samples rather than only label existing ones
- Some notable GAN variants include:
--------> Deep Convolutional GANs (DCGANs): typically used to generate high-quality images.
--------> Conditional GANs (cGANs): medium- to high-quality outputs, with more focus on constraining and controlling the generated images, for example by conditioning on a class label.
--------> StackGANs: as the name suggests, composed of several stacked networks. Can be used to generate realistic images from text (in a way, the reverse of image captioning with RNNs/LSTMs on datasets like Microsoft’s MS-COCO).
- Explainable AI can be advanced beyond primitive techniques; emerging techniques include extraction of feature maps/attributes through “CNN Circuits.”
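To make the "constraining and controlling" idea behind cGANs a bit more concrete, here is a minimal sketch of one common way to condition a generator: one-hot encode the class label and concatenate it with the random noise vector. The function names, the ten-class setting, and the noise size are illustrative assumptions, not taken from any particular paper or library, and a real model would pass this vector into a neural network.

```python
import random
from typing import List


def one_hot(label: int, num_classes: int) -> List[float]:
    """Encode a class label as a one-hot vector."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def generator_input(label: int, num_classes: int, noise_dim: int,
                    rng: random.Random) -> List[float]:
    """Build a conditional generator input: noise z followed by one-hot label y.

    Hypothetical helper for illustration; the conditioning signal is simply
    appended to the noise so the generator "sees" which class to produce.
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return z + one_hot(label, num_classes)


# Example: ask for class 3 out of 10, with a 4-dimensional noise vector.
rng = random.Random(0)
x = generator_input(label=3, num_classes=10, noise_dim=4, rng=rng)
print(len(x))   # 4 noise values + 10 label slots
print(x[4:])    # the one-hot part that encodes the requested class
```

Changing only the label while keeping the same noise is what lets a trained cGAN produce "the same style, different class," which is the kind of control an unconditional GAN does not offer.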
---------------------------------------------------------------------------------------------------------------
A select few of my favorite papers and workshops from the Computer Vision and Pattern Recognition conference this year:
- Mitigating Bias in Face Recognition Using Skewness-Aware Reinforcement Learning [Paper]
--------> Bias in datasets is certainly a prevalent issue, and creative approaches like the use of Generative Adversarial Networks and Reinforcement Learning to “correct” a dataset could have enormous implications for a multitude of problems.
- Recent Advances in Vision-and-Language Research [Workshop]
--------> This workshop covered a variety of contemporary topics, including advanced attention, ensemble models, variations on popular models such as Bidirectional Encoder Representations from Transformers (BERT), and language bias reduction.
- Interpretable Machine Learning for Computer Vision [Workshop]
--------> This workshop worked on opening up the black box of advanced deep learning architectures such as deep CNNs and recursive neural networks to understand the seemingly elusive logic behind object/scene recognition, image captioning, and visual question answering.
---------------------------------------------------------------------------------------------------------------
What new tools, heuristics, or insights have you come across in the fields of machine learning and computer vision? Do you agree/disagree with any of the above? Do you see certain methodologies distinctly rising to the foreground and others retreating in effectiveness? Don’t hesitate to comment or connect — I’m always eager to learn and discuss!