Geek Out Time: Build Your Own Autonomous AI Agent Backed by the Top Open-Source LLM DeepSeek v3 and Browser-Use Web UI-Right in Your Browser

Geek Out Time: Build Your Own Autonomous AI Agent Backed by the Top Open-Source LLM DeepSeek v3 and Browser-Use Web UI-Right in Your Browser

(Also on Constellar tech blog https://medium.com/the-constellar-digital-technology-blog/geek-out-time-build-your-own-autonomous-ai-agent-backed-by-the-top-open-source-llm-deepseek-v3-and-9d04820f8f6d)

Anthropic’s recently unveiled Claude 3.5 Sonnet, with its innovative ‘computer use’ capability, highlights the growing potential of AI models to interact with environments in human-like ways. DeepSeek v3 (https://huggingface.co/deepseek-ai/DeepSeek-V3), released in January 2025 as the top open-source LLM, takes this a step further with its superb performance in handling complex reasoning tasks. Built for powering advanced autonomous agents, DeepSeek v3 is designed to excel in multi-step workflows and deliver exceptional results across coding, analysis, and problem-solving tasks. In this Geek Out Time, we’ll explore how to combine Browser-Use Web UI with DeepSeek v3 to build a highly capable autonomous AI agent

We want our autonomous AI agent to perform the task: “Go to constellar.co website, find the ‘Contact,’ and retrieve its Singapore office address.”

Step 1: Set Up Browser-Use Web UI

First, use uv to setup the Python environment.

uv venv --python 3.11        

and activate it with:

source .venv/bin/activate        

Install the dependencies:

uv pip install -r requirements.txt        

Then install playwright:

playwright install        

Start the Web UI:

python webui.py --ip 127.0.0.1 --port 7788Open the Web UI in your browser at:        

Open the WebUI in the browser https://127.0.0.1:7788

Step 2: Configure to use DeepSeek V3

Navigate to the LLM panel to configure using DeepSeek V3:

You will need to get the API key from the https://platform.deepseek.com/

Step 3: Create and Run the Agent

Navigate to the Web UI to configure the “Browser Settings” and “Run Agent”:

We will ask the Agent to “go to https://www.constellar.co/sg. and click “Contact”, get its Singapore address”.

The AI agent will bring up the browser and perform the task, pls refer to the video recording below,

And you will get the following…. Pretty awesome?

Final Result:

The Singapore address for Constellar is: 1 Expo Drive, #02–01, Singapore 486150.        

Model Actions:

[{'go_to_url': {'url': 'https://www.constellar.co/sg'}}, {'click_element': {'index': 4}}, {'done': {'text': 'The Singapore address for Constellar is: 1 Expo Drive, #02-01, Singapore 486150.'}}]        

Model Thoughts:

[CustomAgentBrain(prev_action_evaluation='Unknown - No previous actions to evaluate.', important_contents='', completed_contents='', thought="The task requires navigating to the Constellar website, clicking the 'Contact' link, and extracting the Singapore address. The current page is blank, so the first step is to navigate to the specified URL.", summary="Navigate to the Constellar website and proceed to click the 'Contact' link to find the Singapore address."), CustomAgentBrain(prev_action_evaluation='Success - Successfully navigated to the Constellar website.', important_contents='', completed_contents='1. Navigated to https://www.constellar.co/sg.', thought="The next step is to click the 'Contact' link to find the Singapore address.", summary="Click the 'Contact' link to proceed to the contact page and extract the Singapore address."), CustomAgentBrain(prev_action_evaluation="Success - Successfully clicked the 'Contact' link and navigated to the contact page.", important_contents='1 Expo Drive, #02-01\nSingapore 486150', completed_contents="1. Navigated to https://www.constellar.co/sg.\n2. Clicked the 'Contact' link.", thought='The Singapore address has been found on the contact page. The task is now complete.', summary='The Singapore address has been extracted from the contact page.')]        

Let’s see what happens on the DeepSeek V3 Side,

There are 32K tokens generated. DeepSeek v3, much like OpenAI’s O3 model, represents a significant leap in reasoning and problem-solving capabilities. Designed for complex tasks like coding, scientific analysis, and logical reasoning, these models excel at breaking down intricate problems into step-by-step solutions. This advanced reasoning often results in the generation of a large number of tokens, as the models explain intermediate steps, provide detailed justifications, or produce verbose outputs for clarity. For instance, in coding tasks, they may include comments, detailed explanations, and multiple iterations of code to ensure correctness. Similarly, in problem-solving scenarios, they generate comprehensive breakdowns of each logical step. These traits, while improving accuracy and adaptability, naturally lead to higher token counts during inference. DeepSeek v3 offers a cost-effective pricing model for token generation, making it an attractive choice for developers and businesses seeking advanced AI capabilities. During the promotional period (ending February 8, 2025), the cost for input tokens is $0.10 per million for cache hits and $1.00 per million for cache misses, while output tokens are priced at $2.00 per million. After the promotional period, the rates will adjust to $0.07 per million for cache hits, $0.27 per million for cache misses, and $1.10 per million for output tokens. You don’t need a deep pocket to try..

Conclusion

DeepSeek v3, with its advanced reasoning capabilities and ability to handle complex tasks, shines in the field of AI-driven automation. By integrating it with Browser-Use Web UI, we can unlock the potential of autonomous AI agents that seamlessly interact with the web, perform multi-step workflows, and generate detailed, context-aware outputs. These capabilities make it an excellent candidate for near-term applications like automated UI testing, where the agent can mimic user behavior to validate interfaces and workflows with precision and speed. Beyond that, the possibilities are vast — from data scraping and intelligent customer support to dynamic research tools and autonomous content creation.

While the generation of more tokens may increase inference time, it is a testament to the model’s depth of understanding and logical precision, ensuring accurate and reliable outcomes. As AI technology continues to evolve, powerful LLMs like DeepSeek v3 redefine the boundaries of what automation can achieve, paving the way for innovative solutions across industries. The future of autonomous AI agents is here, and it’s smarter, faster, and more adaptive than ever.

Try it yourself and automate your tasks with ease! Let me know how it works for you in the comments and have fun! ??

Richard H. Li

Partner at know.haus | Helping B2C brands and PE-backed companies scale with AI-driven marketing & personalization

1 个月

Nice write up bro! Like the beard ;)

要查看或添加评论,请登录

Nedved Yang的更多文章

社区洞察

其他会员也浏览了