FOD#51: No AGI without Computer Vision
TuringPost
Newsletter about AI and ML. Sign up for free to get your list of essential AI resources.
We recently started the computer vision (CV) history series, believing that the next big breakthroughs in the pursuit of Artificial General Intelligence (AGI) critically depend on advancements in CV, a field spearheaded by pioneers like Stanford’s Professor Fei-Fei Li. And Professor Li didn’t make us wait long. Known for developing ImageNet, which has been foundational to spatial AI development, she launched a venture last week (already backed with funding from a16z) aimed at enhancing AI's reasoning through spatial intelligence. This approach allows AI to comprehend three-dimensional spaces and dynamics, vital for complex tasks in diverse environments.
Fei-Fei Li wants to bridge gaps in AI's environmental interactions, similar to Yann LeCun’s efforts with his JEPA family. I-JEPA, Meta's advanced image processing model, leverages self-supervised learning to excel in tasks like object detection and image classification, without needing labeled datasets. Similarly, V-JEPA revolutionizes video analysis by predicting video sequence gaps and supporting applications in automated video editing, surveillance, and educational tools. LeCun always insists that despite advancements in natural language processing (NLP) with models like GPT, visual perception remains crucial for AI's interaction with the world. With that “in mind,” an AI will be able to plan and reason based on visual inputs. With spatial intelligence, Fei-Fei Li plans to enhance AI's ability to emulate human cognitive skills in perceiving and engaging with the physical world.
The field's growth, driven by deep learning and convolutional neural networks, has made it possible for AI to process visual information akin to human sight, setting the stage for future breakthroughs that could seamlessly integrate AI into our daily life.
The rhetoric that comes from academics differs drastically from that of Sam Altman, who, in a recent interview with another Stanford professor, stated that it doesn’t matter to him whether the annual expenditure is $5 billion or $50 billion; his focus is on creating AGI. What AGI (or Superintelligence, which OpenAI recently adopted as the main term and goal) entails is not described. So far, it seems that it involves the rollout of more sophisticated language models such as GPT-5 and GPT-6. For sure, both Altman and the GPTs are phenomenal in generating text, but as the push for spatial intelligence reminds us, human cognitive prowess isn't just about mastering language; it's about understanding the whole scene.
The AI Quality conference
Our friends from the MLOps Community are hosting a conference, and it’s a must-visit. First: the quality of speakers and content. Second: the vibe. You will learn, make important contacts, and enjoy your time.
Many people say: “The field is moving so fast, it's hard to tell what is true vs false, what is good practice vs outdated.” The AI Quality conference, hosted on June 25th in San Francisco, aims to spotlight common problems, answer questions, and outline solutions for you and your team to be more successful with your AI endeavors. Among the speakers will be practitioners from OpenAI, Anthropic, LlamaIndex, W&B, Reddit, and others! →agenda
Twitter Library
News from The Usual Suspects
Microsoft Expands Its AI Safety Roster from 350 to 400 personnel to enhance trust in AI-generated content. This initiative includes deploying 30 responsible AI features and aligns with the National Institute of Standards and Technology's guidelines →read more
Cohere:
JPMorgan Taps AI for Thematic Investment with IndexGPT, an AI-driven tool that utilizes OpenAI's GPT-4 for creating thematic investment baskets. This innovation reflects Wall Street's continued foray into AI-enhanced financial solutions, aimed primarily at institutional clients →explore details
Alibaba Unveils Qwen1.5-110B, marking its entry into the 100B+ parameter model echelon. The model boasts multilingual support, efficient serving, and a competitive edge against current SOTA models, promising enhanced scalability and performance →discover more
Additional reading: One Year of Ranking Chinese LLMs by ChinAI
AI21's Enterprise Move with Jamba-Instruct: AI21 has rolled out Jamba-Instruct, an enterprise-optimized version of its Jamba model, now available for commercial use. This model stands out in tasks requiring extensive context and promises reliable performance for enterprise applications →read announcement
OpenAI Partners with Stack Overflow to Boost Developer Tools: In a strategic move, OpenAI teams up with Stack Overflow to integrate OverflowAPI into its services. This partnership will enrich OpenAI’s models with Stack Overflow’s trusted content, enhancing both developer productivity and AI accuracy. The planned OverflowAI project is set to launch in 2024, marking a significant advancement in developer resources →read more
DrEureka (Nvidia):
The freshest research papers, categorized for your convenience: