OpenAI and the Strawberry Project, The Guma City Robot Suicide Mystery, Clock Starts on EU AI Act Deadlines ... and more
Welcome to AI Weekly Breakthroughs, a roundup of the news, technologies, and companies changing the way we work and live.
OpenAI and the Strawberry Project
OpenAI is developing a new AI technology under the code name “Strawberry,” aimed at enhancing model reasoning capabilities. Internal documents reviewed by Reuters reveal that Strawberry is designed for AI to navigate the internet autonomously and perform deep research, capabilities current models lack. The project uses post-training methods to fine-tune performance, potentially similar to Stanford’s “Self-Taught Reasoner" method. While specific details remain undisclosed, Strawberry focuses on complex, long-horizon tasks, improving AI’s planning and multi-step problem-solving.
The Gumi City Robot Suicide Mystery
In a unique and tragic incident, a robot working for the Gumi City Council in South Korea may have committed suicide by throwing itself down a 2-meter staircase. Nicknamed Robot Supervisor, the robot worked regular shifts, had an employee card, and was integrated into the council’s daily operations. The cause of the suicide is unknown, though it is speculated that it may not have been an accident due to the robot’s familiarity with using elevators. The broken pieces are being examined to determine the reason behind the incident and the Gumi City Council has decided not to replace the robot immediately.
AI Pioneer Karpathy Launches Eureka Labs
Andrej Karpathy, former AI lead at Tesla and OpenAI, has launched Eureka Labs, an "AI native" education platform to integrate AI assistants into traditional teaching methods. Based in San Francisco, Eureka Labs aims to create AI teaching assistants to guide students through course materials while human teachers design the content. The first offering, an AI course called LLM101n to teach students how to train their own AI, will be available online with digital and physical cohorts. Karpathy’s new venture reflects his long-standing passion for AI and education, but many details about the startup’s funding, partnerships, and business model remain unclear.
Big Tech Avoids OpenAI Board Amid Regulations
Microsoft and Apple have decided not to take board seats at OpenAI amid increasing regulatory scrutiny of big tech’s role in AI. Microsoft, having invested $13 billion in OpenAI, stated their observer role is no longer necessary due to significant progress by OpenAI’s new board. Contrary to reports, Apple will also not take an observer role. OpenAI thanked Microsoft for their support and confirmed their ongoing partnership. This move comes as EU regulators consider an antitrust investigation into OpenAI's partnership with Microsoft, reflecting broader concerns about big tech’s influence on AI. Legal experts suggest that stepping back from direct board involvement might be a strategic move by Microsoft and Apple to mitigate potential regulatory challenges.
Cohere and Fujitsu Partner for Japanese Enterprise AI Services
Cohere and Fujitsu announced a strategic partnership to develop enterprise AI services with advanced Japanese language capabilities. This collaboration will create secure, enterprise-grade LLMs for global businesses, focusing on privacy and regulatory compliance, particularly in finance, public sector, and R&D. Fujitsu will exclusively provide these services, leveraging Cohere’s state-of-the-art AI models to enhance business productivity and efficiency.
Perplexity and AWS Launch Enterprise Pro AI Service
Perplexity has teamed up with Amazon Web Services to launch Perplexity Enterprise Pro, utilizing Amazon Bedrock for generative AI capabilities. This collaboration aims to enhance AI-powered research tools for businesses, boosting efficiency and productivity while maintaining security. The partnership includes joint events, co-sell engagements, and co-marketing efforts, marking a significant step in expanding Perplexity’s global reach.
Clock Starts on EU's AI Act Deadlines
The EU’s AI Act has been published in the bloc’s Official Journal, with the new law set to come into force on August 1, 2024. The regulation will be fully applicable by mid-2026 and takes a phased approach to implementation, addressing different risk levels for AI applications. High-risk AI use cases face strict obligations, while a small number of AI applications are banned. The AI Act aims to ensure transparency, data quality, and anti-bias measures, marking a significant step in regulating AI within the EU.
Anthropic Launches Claude Android App
Anthropic has launched its Claude Android app, aiming to expand the reach of its AI chatbot and compete with ChatGPT. The app, similar to its iOS version released in May, offers free access to the Claude 3.5 Sonnet model and additional features through Pro and Team subscriptions. It includes functionalities like syncing conversations across devices, real-time image analysis, and language translation. Despite Anthropic’s claims of its AI models being on par with those of OpenAI and Google, the company has struggled to attract consumers. The Claude iOS app had a modest start with 157,000 downloads in its first week, significantly less than ChatGPT’s 480,000 installs in the same period.
Samsung Releases Galaxy AI
Samsung has released Galaxy AI, a new innovation to enhance creativity and productivity across Galaxy devices. Key features include an intuitive search function with a circular gesture, FlexCam with auto-zoom for selfies, and Note Assist for lecture note transcription and summarization. Interpreter and Live Translate facilitate seamless multilingual communication, while Photo Assist offers advanced AI photo editing. Galaxy AI also integrates with Galaxy Watch Ultra and Galaxy Watch7 for detailed well-being tracking via an Energy Score.
Amazon's Rufus AI Shopping Assistant Now Available for All U.S. Customers
Amazon’s AI-powered shopping assistant, Rufus, is now available to all U.S. customers via the Amazon Shopping app. Rufus helps users make informed purchase decisions by answering product questions, offering recommendations, comparing options, and providing updates on orders. Since its introduction, customers have asked millions of questions, enhancing their shopping experience with personalized, real-time assistance.
Anthropic Doubles Token Limit for Claude 3.5 Sonnet
Anthropic has doubled the max output token limit for Claude 3.5 Sonnet from 4096 to 8192 in the Anthropic API. To access this beta feature, add the header "anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15" to your API calls. This upgrade supports longer writing and coding tasks and is expected to be available on claude.ai soon, reflecting Anthropic’s quick shipping speed.
领英推荐
Google Introduces Vids to Workspace Labs
Google has introduced Vids, an AI-powered video creation app, to Workspace Labs. Designed for professional use, Vids is deeply integrated with the Workspace suite, allowing users to create videos without leaving the platform. Key features of Vids include high-quality templates and Gemini, an AI tool that helps users quickly draft initial content. The app also offers access to a royalty-free stock content library and a comprehensive recording studio for polishing videos.
Robot Navigates Google DeepMind Offices Using Gemini
Google DeepMind has showcased how its new robot, powered by the Gemini 1.5 Pro, navigates the office environment by responding to natural language commands. Utilizing the “Mobility VLA” system, the robot combines long-context vision-language models and topological graphs for navigation. Demonstrated through a series of videos, the robot guides employees to locations within the 9,000-square-foot office by interpreting commands and following mapped routes. The robot, familiarized with the space through demonstration tours, integrates multimodal instruction navigation and hierarchical Vision-Language-Action to understand and execute tasks. Google reports a high success rate of around 90% across over 50 interactions.
Thrive AI Health Scrutinized
In The Atlantic, Charlie Warzel examines Thrive AI Health, a new venture by Sam Altman and Arianna Huffington. Aiming to revolutionize health management with AI, the company promises to transform health outcomes and reduce costs. Warzel critiques these claims as speculative, highlighting that much of the AI discourse is based on future visions rather than present realities. He questions the practicality of AI health coaches and raises concerns about privacy and public trust. Warzel concludes that while faith in technological progress is important, blind faith in speculative promises can be dangerous.
LlamaIndex's Jerry Liu on the Future of Knowledge Assistants
In this talk, LlamaIndex founder & CEO Jerry Liu covers how we go beyond single-LLM prompt calls. He discusses advanced single-agent flows, Agentic RAG, multi-agent task-solvers & service architectures, and more. Liu also announces Llama Agents: Agents as microservices that are easily deployed and communicate via a single API.
Cognition AI's Scott Wu on Devin
In a live demo at the AI Engineer World's Fair in San Francisco, Scott Wu, co-founder and CEO of Cognition AI, introduced Devin, a cutting-edge AI software agent designed to help developers save time and boost productivity. Wu showcased Devin’s capabilities and shared insights from his team’s journey in building the agent. The presentation highlighted how Devin can streamline development tasks, drawing from Wu's extensive experience in AI and programming. The event emphasized the growing importance of AI tools in engineering.
SpreadsheetLLM Enhances AI Processing
A new study introduces "SpreadsheetLLM," an innovative method to enhance AI's capability in processing spreadsheets. The study also proposes the SheetCompressor framework, which compresses spreadsheet data effectively, improving performance by 25.6% over conventional methods in GPT-4’s in-context learning setting. Fine-tuned with SheetCompressor, LLMs achieved a 78.9% F1 score, surpassing existing models by 12.3%. This advancement highlights significant improvements in spreadsheet understanding and tasks.
Fireworks AI raises $52M Series B to lead industry shift to compound AI systems
Buildots, an AI-driven construction technology leader, secures $15M Intel Capital-led investment to fuel strategic growth
Deepfake-detecting firm Pindrop lands $100M loan to grow its offerings
AI startup Hebbia raised $130M at a $700M valuation on $13 million of profitable revenue
Defense AI startup Helsing raises $487M
Exa raises $17M from Lightspeed, Nvidia, Y Combinator to build a Google for AIs
Menlo Ventures and Anthropic team up on a $100M AI fund
AI4 – Las Vegas – August 12 – 14
The AI Conference 2024 – San Francisco – September 10 – 11
World Summit AI – Amsterdam – October 9 – 10
Gitex Global – Dubai – October 14 – 18
Big Data Conference Europe – Vilnius – November 19 – 22