AI Week In Review: AI Agents are HERE, 5+ NEW AI Models & Much More!

Welcome, AI entrepreneurs & enthusiasts.

What a wild week in AI! AI agents have taken center stage, with every major player rolling out advanced frameworks that turn AI from passive assistant to proactive operator. Microsoft’s autonomous Copilot agents and Anthropic’s new model that navigates digital interfaces signal a shift towards fully integrated, action-ready AI.

In parallel, the world of image and video generation has seen a cascade of breakthroughs, with new releases from nearly every industry leader. Genmo, Meta, and Runway are among those pushing the boundaries, democratizing high-quality visual creation for teams of all sizes.

With over ten game-changing models hitting the scene, this week has brought a tidal wave of innovation to the AI landscape—let’s dive into what’s new!


Anthropic's AI now navigates computers like a human

The News: Anthropic just introduced a new capability called ‘computer use’, alongside upgraded versions of its AI models. The feature enables Claude to interact with computers by viewing screens, typing, moving cursors, and executing commands.

The details:

  • Claude can now autonomously navigate computer interfaces, performing complex tasks across multiple applications and websites.
  • Anthropic said it taught the model ‘general computer skills’ instead of creating a standalone tool, helping it operate more like a human.
  • The upgraded Claude 3.5 Sonnet significantly improves coding and tool use, outperforming other models (including o1-preview) on key benchmarks.
  • A new Claude 3.5 Haiku model matches the capabilities of previous high-end models at lower cost and higher speed.
  • Anthropic highlighted that computer use is still imperfect (including some hilarious examples), encouraging testing on low-risk tasks until skills improve.
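Under the hood, tools like this typically run an observe-act loop: capture a screenshot, ask the model for the next action, execute it, and repeat until the goal is reached. Here is a minimal sketch of that loop with a stubbed-out model standing in for a real Claude call — every function and action name below is illustrative, not Anthropic's actual API:

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str            # "click", "type", or "done"
    payload: tuple = ()  # coordinates or text, depending on kind

def fake_model(screenshot: str, goal: str) -> Action:
    """Stand-in for a real model call; a real agent would send the
    screenshot to the model and parse the action it chooses."""
    if "search box" not in screenshot:
        return Action("click", (120, 80))   # focus the search box
    if goal not in screenshot:
        return Action("type", (goal,))      # type the query
    return Action("done")

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    """Observe-act loop: screenshot -> model -> action -> new screen."""
    screen, log = "blank page", []
    for _ in range(max_steps):
        action = fake_model(screen, goal)
        log.append(action.kind)
        if action.kind == "done":
            break
        # "Execute" the action by simulating the resulting screen state.
        if action.kind == "click":
            screen = "page with search box"
        elif action.kind == "type":
            screen = f"page with search box showing {action.payload[0]}"
    return log

print(run_agent("weather in Paris"))  # ['click', 'type', 'done']
```

A production loop would add timeouts, action validation, and exactly the low-risk-task guardrails Anthropic recommends above.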

Why it matters: While many hoped for Claude 3.5 Opus, Anthropic's Sonnet and Haiku upgrades pack a serious punch. Plus, with computer use embedded right into its foundation models, Anthropic just sent a warning shot to tons of automation startups, even if the capabilities aren't earth-shattering... yet.


Microsoft reveals autonomous Copilot agents

The News: Microsoft just announced that new agentic capabilities are coming to Copilot and Dynamics 365, allowing users to create their own or utilize pre-built agents to enhance processes across the platforms.

The details:

  • Ten pre-built agents will be introduced to Dynamics 365, specializing in areas like sales, service, finance, supply chain, and more.
  • The agents can operate independently, initiating tasks and responding to business signals without constant human oversight.
  • Copilot Studio will also allow users to create their own autonomous agents, moving from private to public preview next month.
  • The agents utilize OpenAI’s o1 model series, with features like encryption, data loss prevention measures, and guardrails for enterprise safety.

Why it matters: The agent revolution has felt close for a while now, and this Copilot infusion might be the first major step over the line. Microsoft calls them “the new apps for an AI-powered world”, which feels like a sharp analogy — soon workflows may simply be a matter of choosing which agent a user wants to call on for a specific task.


Inflection AI Introduces Agentic Workflows

The News: Inflection AI just introduced Agentic Workflows as part of its Inflection for Enterprise platform, a major step toward empowering AI systems to take action on behalf of businesses. The release comes alongside the acquisition of automation experts Boundaryless, signaling Inflection's focus on global enterprise-scale solutions.

The Details:

  • Agentic Workflows merge AI intelligence with deterministic automation, creating business-aligned autonomous systems
  • Strategic UiPath partnership enables AI access to 1,400+ enterprise systems for real-time action
  • Boundaryless acquisition supercharges Fortune 500 deployment capabilities
  • Pioneering AQ (Action Quotient) as the new metric for AI effectiveness: measuring not just intelligence, but impact

Why it matters: This is AI's evolution from advisor to actor. Agentic Workflows transform AI from a conversational tool into an autonomous force that thinks and acts within enterprise systems. By bridging intelligence with execution, Inflection AI isn't just automating tasks; it's creating AI colleagues that understand context, make decisions, and drive business outcomes. Welcome to the age of AI that doesn't just suggest; it delivers.


Meta reveals new AI models, tools

The News: Meta FAIR just introduced a collection of new research models and datasets, including an upgraded image segmentation tool, a cross-modal language model, solutions to accelerate LLM performance, and more.

The details:

  • Spirit LM is an open-source multimodal language model that integrates speech and text to generate more natural-sounding and expressive speech.
  • Meta’s SAM 2.1 update offers improved image and video segmentation over its popular predecessor, which saw over 700,000 downloads in 11 weeks.
  • Layer Skip provides an end-to-end solution for accelerating LLM generation times by nearly 2x without specialized hardware.
  • Other artifacts include SALSA for security testing, Meta Lingua for language model training, a synthetic data generation tool, and more.
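Layer Skip builds on the early-exit idea: if an intermediate layer is already confident about the output, the remaining layers can be skipped, saving compute. A toy numeric illustration of that exit rule — the "layers" and threshold below are invented for demonstration and are nothing like Meta's actual implementation:

```python
def classify_with_early_exit(x: float, layers, threshold: float = 0.9):
    """Run x through a stack of layer functions, stopping as soon as
    a layer's confidence score clears the threshold."""
    score = 0.0
    for depth, layer in enumerate(layers, start=1):
        score = layer(x, score)
        if score >= threshold:
            return score, depth  # early exit: deeper layers are skipped
    return score, len(layers)

# Each toy "layer" nudges the confidence upward by the same amount.
layers = [lambda x, s: s + 0.4 * x for _ in range(8)]

score, depth = classify_with_early_exit(1.0, layers)
print(depth)  # 3: exits after 3 of 8 layers (0.4 -> 0.8 -> 1.2 >= 0.9)
```

In a real LLM the "confidence" would come from the model's own intermediate predictions, and skipped layers translate directly into faster token generation.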

Why it matters: Meta continues to push the AI bar forward with big releases across various areas. Given the company’s impressive open-source systems, it's hard to envision a future where closed models and tools have a significant advantage — and the moat between the two seems to be shrinking with each release.


Runway Launches 'Act-One' Transforming Character Animation

The News: Runway just introduced Act-One, a groundbreaking tool for creating expressive character performances from simple video inputs. The innovation is part of its Gen-3 Alpha platform, now rolling out to select users.

The Details:

  • Act-One allows creators to animate characters using basic video footage, simplifying the traditional animation pipeline.
  • It captures intricate performance details like eye-lines and facial micro-expressions, translating them into highly realistic character movements.
  • The tool can work across various character designs, making it adaptable for diverse animation styles and use cases.
  • Safety measures include content moderation tools to block unauthorized use of public figures and ensure the ethical use of generated voices.

Why it matters: Act-One marks a significant leap in generative AI applications for the media and entertainment industry. By simplifying complex workflows and maintaining high fidelity, this tool democratizes advanced animation techniques, traditionally reserved for big studios. Its versatility could shift the landscape, enabling smaller creators to produce high-quality content, expanding creative possibilities. As AI-driven tools like this grow, the barrier to entry for high-level animation continues to drop, creating a more inclusive creative ecosystem.


Genmo drops open-source AI video model

The News: AI startup Genmo just launched Mochi 1, a new open-source video generation model that claims to rival closed competitors like Runway, Pika, and Kling — while being freely available to developers and researchers.

The details:

  • Mochi is built on a new 10B parameter architecture called AsymmDiT, making it the largest open-source video generation model ever released.
  • The model focuses heavily on motion quality and prompt adherence, generating 480p videos at 30fps for up to 5.4 seconds.
  • Mochi surpassed top models like Kling, Runway Gen-3, Luma’s Dream Machine, and Pika in motion quality and prompt adherence during testing.
  • A higher-definition version, Mochi 1 HD, with 720p support and image-to-video capabilities, is planned for release later this year.
  • Genmo also announced that it secured $28.4M in Series A funding, with Mochi 1 being the company’s first step toward building ‘world simulators.’

Why it matters: Open-source AI video is officially competing with the top of the market. Genmo’s Mochi is an extremely impressive release that showcases how competitive the video generation landscape is about to become — especially with the major dominos (Sora, Midjourney?) still to come.


Ideogram debuts AI Canvas workspace

The News: Ideogram just unveiled a new AI-powered workspace called Canvas, introducing advanced tools like Magic Fill and Extend to combine image editing and generation for new creative workflows.

The details:

  • Canvas provides an endless digital board on which users can generate, organize, and seamlessly blend AI-generated and uploaded images.
  • Magic Fill allows precise editing of selected image areas, enabling tasks like object replacement, text addition, and background alteration.
  • The Extend feature expands images beyond their original dimensions while maintaining style consistency, even with text.
  • Ideogram also features an API, allowing developers to incorporate the new features into their own applications.

Why it matters: The design industry is no stranger to AI tools (Photoshop, Canva), but Ideogram's latest release feels like exactly the kind of tool that lets AI and design novices make magic. The examples shown also illuminate how drastically creative workflows are changing in the AI era.


Stability AI's Stable Diffusion 3.5 Goes PRO

The News: Stability AI just launched Stable Diffusion 3.5, its most advanced image generation model yet, packed with features designed to empower everyone from hobbyists to professionals. The release includes multiple customizable variants, all available for free under Stability AI’s community license and optimized to run on consumer hardware.

The Details:

  • Stable Diffusion 3.5 includes models like Large and Large Turbo, with a Medium version arriving on October 29. Each model is customizable, providing flexibility across a range of visual styles and use cases.
  • These models are optimized for both consumer and professional hardware, making high-end image generation accessible to a much wider audience.
  • Features like prompt adherence and diverse output capabilities mean users can generate everything from photorealistic images to creative 3D art with ease.

Why it matters: The world of AI image generation just got a serious upgrade with Stable Diffusion 3.5. For creators, startups, and hobbyists, this release offers an exciting toolkit that’s both powerful and accessible. It’s designed to run on everyday hardware, meaning you don’t need a high-end setup to generate high-quality visuals. This is the kind of release that could fundamentally shift creative workflows, making cutting-edge AI image generation available to everyone, from indie creators to professional designers.


Midjourney launches new image editor


The News: Midjourney just debuted a new AI-powered web editor that allows users to easily modify, retexture, expand, and stylize both generated and uploaded images using text prompts.

The details:

  • The anticipated update enables features like expand, crop, repaint, and modify for both Midjourney-created and uploaded images using natural prompting.
  • The new editor also works with Midjourney’s previous features like personalization, style references, and more.
  • A new re-texturing tool allows users to change aspects like lighting, texture, materials, and more while maintaining the original shape of the image.
  • Access is initially limited to yearly subscribers, members of 12+ months, and users with 10,000+ generations for testing before a broader rollout.

Why it matters: If you were already concerned about discerning between AI and ‘real’ photos, things are about to get a lot more difficult. This new editor brings massive capabilities and use cases for creatives – but also unlocks a powerful deepfake and manipulation tool that makes nearly every image need to be questioned going forward.


DeepMind open-sources AI watermarking tool

The News: Google DeepMind just announced the open-sourcing and availability of SynthID, an advanced watermarking system for AI-generated content, and revealed the tool is already being used in Gemini and other Google products.

The details:

  • The system uses a ‘tournament sampling’ approach that embeds undetectable watermarks while preserving the quality of text outputs.
  • Tests across 20M real Gemini user interactions showed no impact on response quality or user satisfaction.
  • SynthID's watermarks also work across different content types, including embedding into audio waveforms, image pixels, and video frames.
  • DeepMind open-sourced the SynthID code for use by other companies and developers, hoping the tool becomes standard across the industry.
  • The tool is already integrated into Gemini, ImageFX, VideoFX, and Vertex AI's image tools — marking the first large-scale deployment of AI watermarking.
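Tournament sampling can be illustrated with a toy version: sample several candidate tokens, let them compete in knockout rounds judged by a keyed pseudorandom score, and emit the winner; detection then checks whether a text's tokens score suspiciously high under the same key. The sketch below is a heavy simplification — the per-position context is dropped and the scoring function is ours, not DeepMind's:

```python
import hashlib
import random

def g(token: str, context: str, key: str) -> float:
    """Keyed pseudorandom score in [0, 1); watermarked text ends up
    biased toward tokens with high g-values under the secret key."""
    digest = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def tournament_pick(candidates: list[str], context: str, key: str) -> str:
    """Pairwise knockout rounds: the higher-g token advances each round."""
    pool = list(candidates)
    while len(pool) > 1:
        pool = [max(a, b, key=lambda t: g(t, context, key))
                for a, b in zip(pool[::2], pool[1::2])]
    return pool[0]

def mean_score(tokens: list[str], key: str) -> float:
    """Detection statistic: average g-value of a text's tokens."""
    return sum(g(t, "", key) for t in tokens) / len(tokens)

rng = random.Random(0)
vocab = [f"tok{i}" for i in range(200)]
KEY = "secret-key"

# Watermarked "text": each token is a tournament winner among 4 samples.
watermarked = [tournament_pick(rng.sample(vocab, 4), "", KEY) for _ in range(50)]
# Plain "text": tokens drawn with no watermark bias.
plain = [rng.choice(vocab) for _ in range(50)]

print(mean_score(watermarked, KEY) > mean_score(plain, KEY))  # True
```

The real scheme layers this on top of the model's probability distribution so output quality is preserved, which is exactly the property the Gemini-scale tests above were measuring.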

Why it matters: If you’ve been following the AI boom, it’s clear that the lines between ‘real’ and AI-generated are already completely blurred. Google hopes that open-sourcing SynthID will lead to an industry standard, but rival labs have been working on similar tools. Either way, the watermarking problem appears to be nearing a solution.


AI reaches expert level in medical scans

The News: Researchers at UCLA just developed SLIViT, a new AI model that can analyze complex 3D medical scans with expert-level accuracy in a fraction of the time required by human specialists.

The details:

  • SLIViT (SLice Integration by Vision Transformer) can efficiently analyze various 3D imaging types, including MRIs, CT scans, and ultrasounds.
  • The model matches clinical expert accuracy while reducing analysis time by a mind-blowing factor of 5,000.
  • Unlike other AI models, SLIViT requires only hundreds of training samples, making it more practical for real-world applications.
  • The framework leverages transfer learning, using prior knowledge from 2D medical data for efficient training with smaller 3D datasets.
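The transfer-learning trick described above — run each 2D slice of a 3D scan through a 2D-pretrained backbone, then integrate the per-slice features — can be sketched in a few lines. The "backbone" here is a stub that returns mean intensity; the real model uses a pretrained vision transformer and a learned classification head:

```python
def backbone_2d(slice_2d: list[list[float]]) -> float:
    """Stub for a pretrained 2D feature extractor: here, mean intensity."""
    flat = [v for row in slice_2d for v in row]
    return sum(flat) / len(flat)

def slivit_like_predict(volume: list[list[list[float]]],
                        threshold: float = 0.5) -> int:
    """Extract a feature per 2D slice, integrate across slices, classify."""
    features = [backbone_2d(s) for s in volume]   # per-slice features
    integrated = sum(features) / len(features)    # slice integration
    return int(integrated > threshold)            # toy binary head

# A tiny 3-slice "scan": two dim slices and one bright one.
volume = [[[0.1, 0.2], [0.1, 0.2]],
          [[0.2, 0.2], [0.3, 0.3]],
          [[0.9, 1.0], [0.9, 1.0]]]
print(slivit_like_predict(volume))  # 0: mean intensity 0.45 is below 0.5
```

Because the heavy lifting happens in the (pre-trained) 2D backbone, only the lightweight integration step needs 3D training data, which is why so few 3D samples suffice.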

Why it matters: With the growing demand for faster diagnostics, SLIViT’s ability to rapidly and accurately analyze imaging offers a potential game-changer for healthcare. The model’s ability to work with small datasets also makes it more accessible for providers with limited resources, potentially democratizing expert medical imaging.


Biden orders AI push with new security safeguards

The News: The White House just issued a new national security memorandum directing federal agencies to accelerate AI adoption – while establishing clear boundaries for its use in sensitive government areas like defense and intelligence.

The details:

  • The memo outlined the government’s strategy for leveraging technology across national security departments while managing its risks.
  • The directive prohibits AI from tasks like making autonomous nuclear weapons decisions, targeting systems without human oversight, and privacy violations.
  • Biden highlighted the need for improvements to the U.S. AI chip infrastructure, also directing agencies to assist AI labs in defending against foreign espionage.
  • OpenAI published a blog alongside the memo explaining its stance on national security and the company’s partnerships and government initiatives.

Why it matters: AI safety and development are national security issues, and this memo is the U.S. government's most comprehensive attempt yet to craft guardrails for the evolution to come. The memo's emphasis on protecting private-sector innovations also signals a major shift toward treating commercial AI as a crucial national security asset.


OpenAI Stuns TED AI: “20 Seconds of Thinking Beats 100,000x Data!”

The News: OpenAI scientist Noam Brown made waves at the TED AI conference, introducing a transformative perspective on AI evolution that prioritizes "system two thinking" over raw data scaling.

The Details:

  • Brown emphasized that AI must move beyond simply scaling data and computational power, advocating for “system two thinking”—a deliberate, human-like reasoning approach that he believes is the next frontier for AI.
  • Reflecting on his early work, Brown revealed how adding just 20 seconds of decision time to Libratus (a poker-playing AI) achieved the impact of scaling the model by 100,000x.
  • Brown showcased OpenAI’s o1 model, a revolutionary new AI capable of system two thinking, which is designed for critical applications like scientific research and strategic decision-making.
  • The o1 model, launched in September, demonstrated breakthrough performance, scoring 83% on a qualifying exam for the International Mathematics Olympiad, far exceeding previous models.
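Brown's point — that extra inference-time computation can substitute for enormous training-time scaling — can be felt even in a toy setting: a solver that "thinks" for more steps gets closer to the truth than one that answers immediately. Newton's method for square roots is only an analogy, not OpenAI's technique, but it shows the trade-off:

```python
import math

def estimate_sqrt(x: float, thinking_steps: int) -> float:
    """Newton's method: each extra 'thinking' step refines the answer."""
    guess = x
    for _ in range(thinking_steps):
        guess = 0.5 * (guess + x / guess)
    return guess

# Same "model", two different inference-time budgets.
quick = abs(estimate_sqrt(1337.0, 2) - math.sqrt(1337.0))
slow = abs(estimate_sqrt(1337.0, 10) - math.sqrt(1337.0))
print(slow < quick)  # True: more deliberation, better answer
```

The analogy to o1 is that the per-step cost is fixed; what changes the quality of the result is how long the system is allowed to iterate before committing to an answer.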

Why it matters: This insight marks a pivotal shift in AI, challenging the industry's focus on ever-faster responses. Brown's push for deliberate AI models points to a future where slower, more thoughtful systems excel in fields like healthcare, finance, and renewable energy. By prioritizing accuracy and deep reasoning, the "system two thinking" approach promises more reliable and impactful AI solutions, setting OpenAI apart as it redefines what's possible in enterprise-grade AI.


Apple $1M Bug Bounty for Apple Intelligence

The News: Ahead of its major AI cloud release, Apple has launched a $1 million bug bounty, challenging security researchers to identify vulnerabilities in its Private Cloud Compute (PCC) infrastructure for Apple Intelligence.

The Details:

  • PCC, Apple’s secure cloud solution for running intensive AI operations, integrates advanced privacy protections into the cloud, mirroring the company’s hardware security model.
  • Apple’s newly public PCC Virtual Research Environment (VRE) offers researchers hands-on access to inspect PCC’s security design.
  • Rewards in the expanded Apple Security Bounty program now reach $1 million for discoveries that highlight vulnerabilities compromising PCC’s privacy and security promises.
  • Key PCC components, including critical security code, are available on GitHub for external review and independent validation.

Why it matters: The eye-popping $1 million bounty shows just how high the stakes are in AI security. Apple isn't just protecting code; it's safeguarding its AI future and customer trust. By inviting the world's top security experts to stress-test PCC, Apple is setting a new industry standard for transparent, secure AI development. This aggressive move by a traditionally private company signals a shift in how Big Tech approaches AI security: no longer just another feature, but a make-or-break foundation for its AI ambitions.


Thanks for reading! Stay ahead of the curve:

  1. Subscribe to our newsletter for regular updates.
  2. Book a free consultation to explore our AI services.

With 400+ successful client partnerships and a team of 4,500+ experts, we're ready to help your business harness cutting-edge AI technology.

Interested in learning more? Let's connect.


