Genie – The "World’s Best AI Software Engineer", The Dawn of Automated Science, Grok-2 Released … and more

Shelf

The Shelf platform structures and updates raw corporate data, empowering GenAI and LLMs with sharper decision-making.

Published: August 20, 2024

Welcome to AI Weekly Breakthroughs, a roundup of the news, technologies, and companies changing the way we work and live.

Grok-2 Beta Released

The beta release of Grok-2, a cutting-edge language model, introduces two models, Grok-2 and Grok-2 mini, both available on the X platform. Grok-2 is a significant upgrade from Grok-1.5, excelling in chat, coding, and reasoning. It outperforms competitors like GPT-4-Turbo and Claude 3.5 Sonnet in various benchmarks, including math and science reasoning. While Grok-2 is optimized for detailed tasks, Grok-2 mini balances speed and quality. Both models will soon be accessible via an enterprise API with advanced security features. Grok-2’s rollout highlights enhanced real-time information processing and vision understanding, promising future multimodal capabilities.

Anthropic’s Prompt Caching for Developers

Anthropic has introduced prompt caching for developers using Claude 3.5 Sonnet and Claude 3 Haiku, with support for Claude 3 Opus coming soon. Prompt caching allows users to cache large amounts of context, improving efficiency by reducing latency by up to 85% and costs by up to 90%. It is particularly useful for tasks like conversational agents, coding assistants, and large document processing, where repeated context is required. Cached prompts are priced based on token usage, offering significant cost savings compared to traditional input tokens. Notion has adopted this feature, optimizing its AI assistant for faster and cheaper performance.
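To make the cost claim concrete, here is a rough back-of-the-envelope sketch in Python. It is not Anthropic's billing logic. The two multipliers (cache writes at 1.25x the base input price, cache reads at 0.10x), the $3 per million tokens base price, and the token counts are illustrative assumptions, and the function name `input_cost` is hypothetical. It also assumes the cache stays warm between requests, so only the first call pays the write price.

```python
# Illustrative multipliers relative to the base input-token price.
# These are assumptions for this sketch; check current pricing before relying on them.
CACHE_WRITE_MULTIPLIER = 1.25  # first request writes the shared prefix to the cache
CACHE_READ_MULTIPLIER = 0.10   # later requests read the cached prefix


def input_cost(base_price_per_mtok: float, shared_tokens: int,
               fresh_tokens: int, requests: int, use_cache: bool) -> float:
    """Total input-token cost in dollars for `requests` calls that share one prefix."""
    per_token = base_price_per_mtok / 1_000_000
    # Tokens unique to each request are always billed at the base price.
    fresh = fresh_tokens * per_token * requests
    if not use_cache:
        # Without caching, the shared prefix is re-billed in full on every call.
        return fresh + shared_tokens * per_token * requests
    # With caching, the prefix is written once and read on the remaining calls.
    write = shared_tokens * per_token * CACHE_WRITE_MULTIPLIER
    reads = shared_tokens * per_token * CACHE_READ_MULTIPLIER * (requests - 1)
    return fresh + write + reads


# Hypothetical workload: a 100,000-token document queried 50 times,
# with 200 new tokens per question, at an assumed $3 per million input tokens.
baseline = input_cost(3.0, 100_000, 200, 50, use_cache=False)
cached = input_cost(3.0, 100_000, 200, 50, use_cache=True)
saving = 1 - cached / baseline

print(f"without cache: ${baseline:.2f}")
print(f"with cache:    ${cached:.2f}")
print(f"saving:        {saving:.1%}")
```

Under these assumptions the saving comes to roughly 87.5% of input cost. The figure approaches the 90% ceiling only when the shared prefix dominates each request and is reused many times, and output tokens are not covered by this estimate.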

AI Dominates at Pixel Event

Google's recent Pixel event, while expected to focus on hardware like the Pixel 9 lineup, was heavily dominated by AI discussions. Rick Osterloh kicked off the event by emphasizing Google's AI efforts, with much of the first 25 minutes dedicated to the company’s Gemini AI models and their integration across Google’s major platforms like Search, Gmail, and Android. One highlight was Gemini Live, a conversational AI tool for brainstorming and practicing interviews, available to Android users. Even when the new Pixel devices were discussed, AI remained central, from Gemini features on screens to AI-driven photo enhancements. Google seems to be positioning AI as its key differentiator from competitors like Apple and Samsung. However, some remain skeptical about the practical appeal of these AI features.

AMD Will Acquire Infrastructure Company ZT Systems for $4.9B

AMD has announced its acquisition of ZT Systems for $4.9 billion, aiming to enhance its AI ecosystem and compete more effectively with Nvidia. This deal, consisting of cash, stock, and a potential $400 million contingent payment, will integrate ZT Systems' expertise in computing infrastructure design into AMD's portfolio. The acquisition, expected to close in the first half of 2025, will strengthen AMD's capabilities in AI systems design, data center infrastructure, and customer support. AMD plans to leverage ZT Systems' experience to boost its AI hardware and software offerings, aiming to provide comprehensive data center solutions for cloud and enterprise clients.

World Labs, Fei-Fei Li’s New Startup, Snags $100M Funding

World Labs, a new AI startup founded by Stanford professor Fei-Fei Li, has recently closed a $100 million funding round led by NEA, elevating its valuation to over $1 billion. This latest round follows an initial April financing that valued the company at $200 million. World Labs aims to advance AI by developing models capable of creating detailed 3D digital replicas of real-world objects and environments, which could significantly impact fields such as gaming and robotics. Li, renowned for her pioneering work on ImageNet, seeks to address the challenge of limited 3D data collection in AI applications.

Genie – The World’s Best AI Software Engineer

Cosine, a UK-based AI startup, has announced a groundbreaking advancement in AI software engineering with its model, Genie, which it claims is the "world's best AI software engineer." Genie has achieved a record-breaking score of 30.08% on SWE-Bench, surpassing the previous best of 19.27% by Factory Code Droid, and significantly outstripping other models like GPT-4. This achievement is attributed to Cosine's innovative approach of emulating human reasoning and training Genie on proprietary data from real-world software engineering scenarios. The company has also secured $2.5 million in seed funding, led by SOMA and Uphonest Capital, to further enhance Genie's capabilities and integrate it with tools like GitHub.

Google Updates AI Overviews

Google has introduced several updates to its AI Overviews feature in Search, aimed at enhancing user experience. The first update allows users to save AI Overviews for future reference, which can be accessed under their profile's Interests page. The second feature simplifies complex AI-generated responses by providing a "Simpler" button, making answers more concise and easier to understand. Additionally, Google is testing a right-hand link display on desktop to help users access more relevant websites. These updates, available via Search Labs, are rolling out globally and expanding AI Overviews to six more countries, including the UK, India, and Japan.

Google Releases Pixel Buds Pro 2, Built for Gemini

The Pixel Buds Pro 2 are Google's latest earbuds, featuring the new Tensor A1 chip for enhanced audio performance and AI integration. They offer twice the noise cancellation of the previous model, thanks to advanced adaptive technology that adjusts to your environment. The design is 24% lighter and 27% smaller, ensuring a comfortable and secure fit with customizable eartips. Equipped with AI-powered Gemini, the buds provide hands-free assistance for tasks like navigation and reminders, even when your phone is locked. Additional features include spatial audio with head tracking, clear calling, and improved battery life of up to 8 hours.

Google Launches Imagen 3 AI Image Generator

Google has launched Imagen 3, its latest AI text-to-image generator, for users in the US through the AI Test Kitchen and Vertex AI platforms. Imagen 3 offers improved detail, lighting, and fewer artifacts compared to previous versions. Users can generate and edit images by highlighting specific areas, but the tool has restrictions against creating images of public figures and copyrighted characters. Despite these limitations, users have found ways to generate images resembling popular characters like Sonic and Mario. The launch of Imagen 3 comes amidst competition with other AI tools, such as Elon Musk's Grok, which has fewer content restrictions.

Grammarly Launches Authorship, a New AI Detection Tool

Grammarly is launching a new tool called Grammarly Authorship, aimed at detecting whether text was written by a human, generated by AI, or a combination of both. Unlike traditional AI detectors, Authorship tracks the entire writing process, identifying text that was typed, copied, or created by AI. Targeted at the education sector, the tool seeks to address issues like false positives in student work flagged as AI-generated. Authorship will be available in Google Docs in beta next month, expanding to Microsoft Word and Apple's Pages by year-end, and will be accessible across all Grammarly plans, including the free version.

MIT Researchers Launch AI Risk Repository

MIT researchers have launched a comprehensive AI risk repository to address gaps in existing frameworks and assist policymakers, companies, and researchers in identifying and managing AI-related risks. This extensive database, which catalogs over 700 AI risks across various domains and subdomains, aims to provide a thorough and accessible resource for understanding AI risks beyond what is currently covered by existing frameworks. By analyzing and categorizing risks such as privacy, security, misinformation, and discrimination, the repository seeks to enhance oversight and inform regulatory efforts. The MIT team plans to use this repository to evaluate how effectively different risks are addressed and to highlight areas needing greater attention in AI safety and regulation.

OpenAI Releases SWE-bench Verified

OpenAI has released SWE-bench Verified, an improved and human-validated subset of the original SWE-bench, which evaluates AI models' capabilities in solving real-world software issues. SWE-bench has been updated to address problems like overly specific unit tests, ambiguous issue descriptions, and difficulties in setting up development environments. The new dataset, curated with the help of professional software developers, filters out problematic samples to ensure more accurate benchmarking. On SWE-bench Verified, models like GPT-4 perform significantly better, with improved scoring that reflects the true capabilities of AI in software engineering tasks. This effort is part of OpenAI’s Preparedness Framework for assessing AI model autonomy.

Framework for Fully Automated Scientific Discovery

One of the larger challenges of AGI is developing agents capable of conducting scientific research and discovering new knowledge. While frontier models have already been used to help human scientists (e.g. for brainstorming ideas, writing code, or prediction tasks), they still conduct only a small part of the scientific process. This paper presents the first comprehensive framework for fully automatic scientific discovery, enabling frontier LLMs to perform research independently and communicate their findings. The authors introduce The AI Scientist, which generates novel research ideas, writes code, executes experiments, visualizes results, describes its findings by writing a full scientific paper, and then runs a simulated review process for evaluation. The code is open-sourced at GitHub.

EliseAI lands $75M for chatbots that help property managers deal with renters

NEA led a $100M round into Fei-Fei Li’s new AI startup

The AI Conference 2024 - San Francisco - September 10 - 11

Dreamforce - San Francisco - September 17-19

World Summit AI - Amsterdam - October 9 - 10

Gitex Global - Dubai - October 14 - 18

Big Data Conference Europe - Vilnius - November 19 - 22

AWS re:Invent 2024 - Las Vegas - December 2 - 6

The AI Weekly Breakthrough

1,012 followers

Godwin Josh

Co-Founder of Altrosyn and Director at CDTECH | Inventor | Manufacturer

3 months ago

The emphasis on "World's Best AI Software Engineer" for Cosine's Genie raises questions about the criteria used to define such a title. Benchmarking AI performance in software engineering tasks requires standardized metrics beyond traditional accuracy, encompassing factors like code quality, efficiency, and adaptability to evolving requirements. Given Anthropic's focus on prompt caching, how might this functionality be integrated with Genie to enhance its ability to generate more contextually relevant and efficient code solutions?