Anthropic - Introducing New Computer Capabilities with Claude 3.5 Sonnet and Claude 3.5 Haiku

Anthropic - Introducing New Computer Capabilities with Claude 3.5 Sonnet and Claude 3.5 Haiku

In this newsletter, discover how Anthropic’s latest AI models, Claude 3.5 Sonnet and Haiku, are revolutionizing automation, coding, and AI-driven tasks like never before. Read the full blog here to explore the breakthroughs that could reshape your business operations.

Anthropic has launched two advanced AI models; Claude 3.5 Sonnet and Claude 3.5 Haiku, alongside a new computer use feature in a public beta. These innovations push the limits of automation, coding, and system navigation, empowering developers and businesses with faster, smarter tools.

Claude 3.5 Sonnet: Smarter Coding amp; Automation

Claude 3.5 Sonnet offers upgraded capabilities in software engineering. It improves performance on SWE-bench Verified from 33.4% to 49.0%, outperforming other models like OpenAI’s o1-preview. It has also improved on TAU-bench:

  • Retail domain: From 62.6% to 69.2%
  • Airline domain: From 36.0% to 46.0%

With no extra cost or delay, Sonnet excels in multi-step coding tasks. GitLab reported a 10% boost in DevSecOps reasoning, and The Browser Company found it ideal for automating web workflows.

Anthropic ensures Sonnet meets strict safety standards, working with AI Safety Institutes and aligning with its ASL-2 Standard under the Responsible Scaling Policy.

Claude 3.5 Haiku

Designed for fast, cost-efficient tasks, Claude 3.5 Haiku rivals Claude 3 Opus in performance. It scores 40.6% on SWE-bench Verified, surpassing earlier models and GPT 4o in key areas.

Haiku is perfect for real-time applications like data-intensive tasks and personalization from datasets, for instance, purchase history. Available later this month, it will initially support text-based tasks, with image input coming soon.

AI-Driven Computer Use

Anthropic’s public beta lets Claude perform tasks like a human; typing, clicking, navigating screens, and automating repetitive processes. Early adopters, like Replit, use it to automate UI testing. In tests, Claude 3.5 Sonnet outperformedther models with a 22% success rate under extended task conditions, compared to 7.8% by others.

While still in development, this feature faces challenges with scrolling or dragging tasks. Developers are encouraged to start with low-risk projects to explore its full potential. To prevent risks like spam or fraud, Anthropic has built classifiers to monitor and stop misuse. These measures align with its commitment to responsible AI deployment and automation.

What’s Next?

Claude 3.5 Sonnet is now available, and Claude 3.5 Haiku will launch later this month. Both models and the new computer use feature can be accessed via Anthropic’s API, Amazon Bedrock, and Google Cloud Vertex AI. Developers are encouraged to provide feedback as these tools evolve to unlock new possibilities.

要查看或添加评论,请登录

Arbisoft的更多文章

社区洞察

其他会员也浏览了