Google is reportedly developing a ‘computer-using agent’ AI system
Rajat Kapoor
MCA-AIML | Chandigarh University | Data Science | Python, C++, VBA, SQL | ML, DL, AI | OpenCV, NLP, Transformer, MLOOP | Android Development, Firebase | AWS | Flask, Django, Docker | Figma, Canvas Designs, Adobe | Badminton
Google’s Project Jarvis: Automating Web-Based Tasks with AI Innovation
In the coming months, Google may introduce “Project Jarvis,” an AI assistant designed to streamline online tasks directly within Chrome. This project, reportedly powered by the next generation of Google’s Gemini model, aims to make repetitive digital actions, such as gathering research, shopping, or booking flights, faster and more intuitive. Jarvis reportedly captures and interprets screenshots, then performs clicks and text entries to handle steps users would normally complete manually.
The AI ecosystem is racing forward with similar innovations. Microsoft’s Copilot Vision, Apple Intelligence, and Anthropic’s Claude all explore ways to enhance productivity and digital interactions. With a possible preview of Jarvis as soon as December, Google is reportedly considering a limited release to a small group of testers to find and fix bugs.
The Jarvis project highlights Google’s commitment to making AI an integral, everyday tool, capable of automating complex online tasks while integrating with Chrome’s browsing experience. For professionals, this tech could redefine productivity, particularly in routine and research-heavy roles. If Jarvis and other AI automation tools work as reported, they could change how we interact with digital environments.
Google could preview its own take on Rabbit’s large action model concept as soon as December, reports The Information. “Project Jarvis,” as it’s reportedly codenamed, would carry tasks out for users, including “gathering research, purchasing a product, or booking a flight,” according to three people the outlet spoke with who have direct knowledge of the project.
Powered by a future version of Google’s Gemini, Jarvis reportedly only works with a web browser (it’s tuned specifically for Chrome). The tool is aimed at helping people “automate everyday, web-based tasks” by taking and interpreting screenshots and then clicking buttons or entering text, The Information writes. In its current state, it apparently takes “a few seconds” between actions.
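The screenshot, interpret, act cycle described above can be pictured as a simple loop. The sketch below is purely illustrative and is not based on any knowledge of how Jarvis is built: `FakeBrowser`, `toy_policy`, and `run_agent` are hypothetical names, the "screenshot" is just a text label, and a couple of hard-coded rules stand in for the model that would interpret the screen.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Action:
    """One UI step: click a target, or type text into it."""
    kind: str        # "click" or "type"
    target: str      # hypothetical on-screen element label
    text: str = ""   # only used when kind == "type"


@dataclass
class FakeBrowser:
    """Stand-in for a real browser; it records actions instead of performing them."""
    screen: str = "search page"
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; here the "screenshot" is a text label.
        return self.screen

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.log.append(f"type {action.text!r} into {action.target}")
            self.screen = "query entered"
        elif action.kind == "click":
            self.log.append(f"click {action.target}")
            self.screen = "results page"
        else:
            raise ValueError(f"unknown action kind: {action.kind}")


def toy_policy(screenshot: str, goal: str) -> Optional[Action]:
    """Rule-based placeholder for the model that interprets the screenshot."""
    if screenshot == "search page":
        return Action("type", "search box", goal)
    if screenshot == "query entered":
        return Action("click", "search button")
    return None  # nothing left to do


def run_agent(
    browser: FakeBrowser,
    goal: str,
    policy: Callable[[str, str], Optional[Action]],
    max_steps: int = 10,
) -> List[str]:
    """Repeat: take a screenshot, ask the policy for an action, apply it."""
    for _ in range(max_steps):
        action = policy(browser.screenshot(), goal)
        if action is None:
            break
        browser.apply(action)
    return browser.log


if __name__ == "__main__":
    for line in run_agent(FakeBrowser(), "flights to Delhi", toy_policy):
        print(line)
```

Each pass through the loop needs a fresh screenshot and a decision before anything is clicked or typed, and the `max_steps` cap keeps a confused policy from looping forever.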
The biggest AI companies are all working on models that do things like what The Information is describing. Microsoft’s Copilot Vision will let you talk with it about webpages you’re viewing. Apple Intelligence is expected to be aware of what’s on your screen and do things for you across multiple apps at some point in the next year. Anthropic debuted a “cumbersome and error-prone” Claude beta update that can use a computer for you, and OpenAI is reportedly working on a version of that, too.
The Information cautions that Google’s plan to show Jarvis off in December is subject to change. The company is reportedly considering releasing it to some small number of testers to find and help the company work out bugs.