First contact with Claude 3.5 V2 Computer Use
This article represents my own views and opinions and does not represent those of my employer.
Specifically, using the Computer Use functionality.
tl;dr: it's amazing. It can code, debug, rationalize and work around issues.
What’s the focus here?
On the 22nd of October the new Claude Sonnet 3.5 V2 was announced (read here). Inside this announcement was a public beta for Computer Use: tooling for Claude that gives it access to desktop actions, text editing and bash commands. In this article I dive into general web usage, adding a mustache to a goat, and developing an entire three-tiered web application using React, Vite, Express, SQLite, Prisma and TypeScript. All just by using a handful of vague prompts.
"Claude 3.5 Sonnet now offers computer use capabilities in Amazon Bedrock in public beta, allowing Claude to perceive and interact with computer interfaces."
As of this release, Claude now has:
Claude was already able to review images using its multi-modal LLM capabilities. With Computer Use, it looks at screenshots and the bash output to review the result of each action before beginning the next one.
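To make that review-then-act cycle concrete, here is a minimal sketch of such a loop. It is not Anthropic's implementation and does not call any real API: the `Observation`, `Action`, `Model` and `Environment` types, the `runAgentLoop` function and the `imagetool` command are all hypothetical names I made up for illustration, and the model is replaced by a scripted stand-in so the example runs on its own.

```typescript
// Hypothetical types for illustration only; none of these come from a real SDK.
type Observation = { screenshot: string; bashOutput: string };

type Action =
  | { kind: "bash"; command: string }
  | { kind: "done"; summary: string };

// A "model" looks at everything observed so far and picks the next action.
type Model = (goal: string, history: Observation[]) => Action;

// An "environment" executes an action and reports what the screen and shell show afterwards.
type Environment = (action: Action) => Observation;

function runAgentLoop(
  goal: string,
  model: Model,
  env: Environment,
  maxSteps = 10
): { actions: Action[]; summary: string | null } {
  const history: Observation[] = [{ screenshot: "initial desktop", bashOutput: "" }];
  const actions: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = model(goal, history); // decide based on the latest observations
    actions.push(action);
    if (action.kind === "done") return { actions, summary: action.summary };
    history.push(env(action)); // act, then record the new observation for review
  }
  return { actions, summary: null }; // step budget exhausted without finishing
}

// Scripted stand-in for the model: it reacts to a "command not found" error
// by installing the missing (made-up) tool and then retrying.
const scriptedModel: Model = (_goal, history) => {
  const last = history[history.length - 1];
  const output = last ? last.bashOutput : "";
  if (history.length === 1) return { kind: "bash", command: "imagetool edit" };
  if (output.includes("command not found")) return { kind: "bash", command: "install imagetool" };
  if (output.includes("installed")) return { kind: "bash", command: "imagetool edit" };
  return { kind: "done", summary: "task finished" };
};

// Fake environment that only "knows" whether the made-up tool has been installed.
function makeFakeEnvironment(): Environment {
  let toolInstalled = false;
  return (action) => {
    if (action.kind !== "bash") return { screenshot: "desktop", bashOutput: "" };
    if (action.command.startsWith("install ")) {
      toolInstalled = true;
      return { screenshot: "terminal", bashOutput: "installed" };
    }
    return {
      screenshot: "terminal",
      bashOutput: toolInstalled ? "ok" : "imagetool: command not found",
    };
  };
}
```

Calling `runAgentLoop("edit the image", scriptedModel, makeFakeEnvironment())` walks through four actions: a failed attempt, an install, a retry, and a final "done". The real system is far richer, but the shape of the loop (act, observe, decide again) is the part I found useful to picture.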
Getting started with the functionality.
Anthropic has released a quick start repository for computer use here. I used the Bedrock option and, after first enabling Claude Sonnet 3.5 V2 in Bedrock, launched the demo with no issues at all.
In the quick start you are presented with a Linux virtual machine and a chat window.
I got started by asking it to complete a simple task.
Search up Joshua Toth at AWS and tell me about him.
Claude:
It did this relatively quickly. It also tried to visit LinkedIn but got blocked by a login screen. It then visited several other websites, including my personal website and my Medium page.
Something more complicated
Next up I asked it to add a mustache to the picture of a goat.
The chatbot then proceeded to:
Here is the image:
The mustache wasn't right so it then tried 3 times to place the mustache in the correct position, although in the end the mustache wasn't exactly where a human would have put it.
Wow.
From installing missing programs to debugging errors and pivoting its approach, I was blown away. Even if it did sort of miss the point of adding a mustache to the goat.
Developer actions
I'm a developer, and I wanted to check my "I'm in danger" radar a little. Here is my initial prompt on the next stage:
"I would like to create a basic website using the react VITE framework, have it running in developer mode. I would like to display just a simple landing page for a product called: "Prototyping demos for everyone" where users eventually will be able to share their prototype demos through url links that they submit. I only need the landing page functionality and a starting application I can build on."
Cool, a landing page running on a locally hosted developer environment:
I then added more complicated requirements:
"I would like to add an express server using typescript that acts as the API for the functionality I would like to allow for users to upload an image, a prototype description, a url and other fields you think would be relevant. I want to add an index page to view the submissions and how many times they were clicked. I want to add styling to the page to be branded like AWS"
Here is a fully working three-tiered web application. I can submit data and track clicks.
The form for submitting the data was also validated.
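I did not capture the validation code Claude wrote, so the following is only a minimal sketch of what validating a submission could look like. The `PrototypeSubmission` shape, its field names and the `validateSubmission` function are my assumptions based on the fields I asked for in the prompt, and the URL check is deliberately simple.

```typescript
// Hypothetical submission shape; field names are assumptions, not the generated code.
interface PrototypeSubmission {
  title: string;
  description: string;
  url: string;
}

type ValidationResult =
  | { ok: true; value: PrototypeSubmission }
  | { ok: false; errors: string[] };

// Returns the trimmed string for a field, or "" when the field is missing or not a string.
function readString(input: Record<string, unknown>, key: string): string {
  const raw = input[key];
  return typeof raw === "string" ? raw.trim() : "";
}

function validateSubmission(input: Record<string, unknown>): ValidationResult {
  const title = readString(input, "title");
  const description = readString(input, "description");
  const url = readString(input, "url");

  const errors: string[] = [];
  if (title === "") errors.push("title is required");
  if (description === "") errors.push("description is required");
  // Simplistic check: only accepts http(s) URLs with a non-empty host part.
  if (!/^https?:\/\/[^\s\/]+/i.test(url)) errors.push("url must start with http:// or https://");

  return errors.length === 0
    ? { ok: true, value: { title, description, url } }
    : { ok: false, errors };
}
```

Returning a list of errors rather than failing on the first one lets a form show every problem at once, which matches how the generated form behaved from a user's point of view.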
I then asked:
“now replace the components of the frontend with cloudscape components and styling. also convert the frontend to typescript“
Which it did.
“can you add a cloudscape site navigation with the submit demo and browse demo pages within”
“can you convert the browse demos "cards" to a table instead.”
Not only did it convert the cards to a table, it also added searching, filtering and pagination settings for the table.
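For readers who have not built one of these tables before, the logic behind searching, filtering and pagination is small. The sketch below is my own illustration in plain TypeScript, not the Cloudscape-based code Claude generated: `DemoRow`, `TableQuery` and `queryRows` are hypothetical names, and the "minimum clicks" filter is just an example of a filter.

```typescript
// Hypothetical row and query shapes, for illustration only.
interface DemoRow {
  title: string;
  description: string;
  clicks: number;
}

interface TableQuery {
  searchText: string; // free-text search over title and description
  minClicks: number;  // example filter: hide rows with fewer clicks than this
  page: number;       // 1-based page index
  pageSize: number;   // rows per page, assumed to be >= 1
}

interface TablePage {
  items: DemoRow[];
  totalMatches: number;
  pageCount: number;
}

function queryRows(rows: DemoRow[], query: TableQuery): TablePage {
  const needle = query.searchText.trim().toLowerCase();

  // Search and filter first, so pagination is computed over matching rows only.
  const matches = rows.filter((row) => {
    const matchesSearch =
      needle === "" ||
      row.title.toLowerCase().includes(needle) ||
      row.description.toLowerCase().includes(needle);
    return matchesSearch && row.clicks >= query.minClicks;
  });

  // Clamp the requested page into the valid range so an out-of-range page
  // (for example after narrowing a search) still returns the nearest page.
  const pageCount = Math.max(1, Math.ceil(matches.length / query.pageSize));
  const page = Math.min(Math.max(1, query.page), pageCount);
  const start = (page - 1) * query.pageSize;

  return {
    items: matches.slice(start, start + query.pageSize),
    totalMatches: matches.length,
    pageCount,
  };
}
```

The one design choice worth noting is the order: filter first, then paginate. Doing it the other way round would only search within the current page.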
“can you split the description into its own column and allow all of the text in that column to wrap so it can be read.”
In this instance, it understood I wanted a separate column for the description, although the text wrapping didn't occur.
This is the summary of all the actions it did:
YOUR REQUESTS:
ACTIONS TAKEN:
Initial Setup:
Frontend Development (First Phase):
Backend Development:
AWS Cloudscape Integration:
Table Conversion:
AWS Styling:
Final Configuration & Testing:
File Structure Created:
demo-sharing-platform/
├── src/
│ ├── components/
│ │ ├── PrototypeList.tsx
│ │ └── SubmitPrototype.tsx
│ ├── types/
│ │ └── prototype.ts
│ ├── styles/
│ │ ├── variables.css
│ │ └── App.css
│ ├── App.tsx
│ └── main.tsx
├── server/
│ ├── src/
│ │ └── index.ts
│ ├── prisma/
│ │ └── schema.prisma
│ └── uploads/
└── package.json
Key Features Implemented:
AWS Cloudscape Components Used:
The final result is a full-stack application that:
All components are properly typed with Typescript, follow AWS design patterns, and provide a consistent user experience similar to the AWS Console.
Throughout this process there were instances where Claude ran into issues such as type errors and missing files. Claude then made decisions based on those errors, such as rewriting files, adding missing directories and installing dependency packages.
Some small issues
Throughout the process there were a couple of times when the bot had issues interacting with the virtual environment.
Conclusion
The introduction of Computer Use functionality in Claude Sonnet 3.5 V2 represents a significant leap forward in AI capabilities. Through my testing, I've witnessed firsthand how Claude can navigate complex development tasks, from simple image manipulation to building a full-stack web application with AWS Cloudscape components. What's particularly impressive is not just the ability to execute commands, but Claude's capacity to problem-solve, debug, and pivot when encountering obstacles.
While there were some minor issues with the virtual environment and server management, these were largely environmental rather than limitations of Claude itself. The AI demonstrated remarkable adaptability, installing missing dependencies, handling errors, and making informed decisions throughout the development process.
For developers and technical professionals, this technology represents both an opportunity and a challenge. It's clear that AI tools like Claude's Computer Use functionality can significantly accelerate development workflows.
As this technology continues to evolve, it will be fascinating to see how it shapes the future of software development and technical problem-solving. For now, it's clear that we're witnessing a transformative moment in how we interact with and utilize AI in practical, hands-on development scenarios.