From RPA to AI: The Evolution of Computer Control and What’s Next

From RPA to AI: The Evolution of Computer Control and What’s Next

From RPA to Actionable-AI: The Evolution of Computer Control and What’s Next

Having spent years in the RPA/BPM space and now building at our startup, I’ve lived the evolution of computer automation with keen interest. When I first started exploring LLMs and agents, two companies caught my attention:

Adept and Inflection.

Each represented different possibilities in the quest for true AI-computer interaction — a journey that’s particularly relevant to what we’re building at DoozerAI.

The $4bn funded startup Inflection has now been swallowed by Microsoft — kind of an acqui-hire. Meanwhile, Adept remains independent, led by co-founders who invented the ‘Transformer’ architecture. Their goals mirrored what many of us in the automation space dreamed of: building a platform that could operate every system and application — an overlay on your computer that could take real action.

My co-founder and I both come from an RPA/BPM background (and now apply these lessons at DoozerAI), we both know that Intelligent Action is key. While content creation and data analysis are now table-stakes, the holy grail is AI that can actually execute tasks at scale. This was Adept’s vision with their ACT-1 transformer — a model designed to use any application or API across the internet, recently enhanced by their Workflow Language for composing interactions.

The Anthropic Game-Changer

But today, Anthropic has released something that could fundamentally reshape this landscape. Their announcement of computer use capabilities for Claude 3.5 Sonnet represents exactly what we’ve been working toward in the automation space — AI that can truly interact with systems the way humans do.

Why This Matters Now

The timing couldn’t be more significant. While Adept has been building specialized infrastructure for computer control, Anthropic has taken an approach that resonates with what we’ve learned at Doozer: teaching their model to use computers the way humans do. It’s genuine computer literacy for AI.

As well as improving upon models like OpenAI o1-preview, their benchmark results on OSWorld are telling — scoring 14.9% in screenshot-only tests and 22.0% with extended steps. These numbers might seem a little modest, but they’re nearly double the next best system. More importantly, they represent general-purpose capability rather than specialized tools. So you’ve got to imagine once you turn Claude to be an Insurance Underwriter using just underwriting systems, its going to be much higher.

Open AI for all

What sets this apart from previous attempts is the accessibility and implementation. While Adept built a specialized platform, Anthropic is offering this capability through standard APIs on major cloud platforms.

Early adopters are already showing what’s possible: - Replit using it for real-time app evaluation - GitLab seeing up to 10% stronger reasoning across DevSecOps tasks - The Browser Company finding it outperforms every model they’ve tested

Beyond Traditional RPA

In the past, we’ve all seen firsthand how traditional RPA is brittle — breaking when interfaces change and requiring constant maintenance. Anthropic’s new approach addresses these core challenges:

1. Adaptive Interaction: Instead of hard-coded scripts, it actually understands what it’s looking at 2. General Purpose: One system that can work across any interface 3. Learning Capability: The potential to improve through usage and feedback

What’s Next?

The real excitement isn’t in what we’re seeing today — it’s in what this enables. Think about:

  • Business automation that finally breaks free from brittle scripts — picture an AI that can handle a mortgage application one minute, process an insurance claim the next, and even navigate those ancient government portals that have frustrated automation attempts for years
  • Imagine AI developers that don’t just write code, but fire up VS Code, run tests, check GitHub issues, and deploy fixes — all while adapting to your team’s specific workflows and policies
  • Customer service agents that can actually hop between your CRM, billing system, and support tickets — fixing issues across multiple systems just like your best human agents do
  • Research assistants that don’t just search the web, but dive into your SharePoint, extract data from complex Excel models, and compile executive-ready PowerPoint decks

The Road Ahead

Yes, there are still limitations. Scrolling, dragging, and zooming are challenges. But remember where we were with language models just two years ago — progress can be exponential when the fundamental architecture is right.

Anthropic’s approach to safety is also noteworthy. They’ve developed specific classifiers to identify potential misuse, showing they’re thinking about responsible deployment from day one — something we’ve always prioritized at Doozer.

Looking Forward

While Adept pioneered the vision of AI computer control, and Inflection showed early promise before their Microsoft acquisition, Anthropic has just made it real and accessible.

The next few months will be crucial. As developers start building with these capabilities, we’ll see use cases we haven’t even imagined. The barriers between AI capabilities and actual computer usage are finally coming down.

For those of us building in this space — whether at DoozerAI where we’re focused on business process automation, or in other areas of the industry — this is the moment we’ve been waiting for. It’s not just about better AI — it’s about AI that can actually take goal-driven action in the real world.

The race for true AI computer control is far from over, but with this release, Anthropic has just changed the game entirely. As someone who’s been building automation solutions for years, I can say with confidence: the future of human-AI interaction just got a lot more interesting.

At DoozerAI we’ve built a platform to allow organizations to deploy digital co-workers at scale. Visit us at Doozer.AI to learn more about our journey, platform and Hunter our digital marketing co-worker. You can be sure we’re adding Anthropic’s capability to our platform for immediate beta testing :-)

Great article Paul, keep them coming!

Terry Larkin

Technology Leader - I help Companies leverage Technology for Business Results - DevSecOps - Platform Engineering - GenAI - Data Engineering

1 个月

Very informative Paul Chada, this development does open up many new use cases and a catalyst to a new approach to automation.

Stewart Davison MSc

Proptech Thought Provoker & Strategic Advisor | Helping Tech SMEs Navigate Social Housing | NED championing digital transformation in public & private sectors

1 个月

Interesting stuff Paul. This recalls the interview that Ezra Klein did on the Ezra Klein show with Dario Amodei where he was talking about the next phase of LLM evolution. I ended up dubbing it 'The Concierge Model' and use the analogy of getting an LLM to plan, book and execute an anniversary trip to Venice. This stuff is coming on leaps and bounds, thanks the article.

Don Voss MBA

Partner, CFO and Ops Efficiency @ SIGMA ANALYTICS

1 个月

Thanks for sharing this Paul Chada, exciting times ahead. Who could not use some additional time?

要查看或添加评论,请登录

Paul Chada的更多文章

社区洞察

其他会员也浏览了