登录查看更多内容

Geek Out Time: Build Your Own Autonomous AI Agent Backed by the Top Open-Source LLM DeepSeek v3 and Browser-Use Web UI-Right in Your Browser

Nedved Yang

发布日期: 2025年1月20日

(Also on Constellar tech blog https://medium.com/the-constellar-digital-technology-blog/geek-out-time-build-your-own-autonomous-ai-agent-backed-by-the-top-open-source-llm-deepseek-v3-and-9d04820f8f6d)

Anthropic’s recently unveiled Claude 3.5 Sonnet, with its innovative ‘computer use’ capability, highlights the growing potential of AI models to interact with environments in human-like ways. DeepSeek v3 (https://huggingface.co/deepseek-ai/DeepSeek-V3), released in January 2025 as the top open-source LLM, takes this a step further with its superb performance in handling complex reasoning tasks. Built for powering advanced autonomous agents, DeepSeek v3 is designed to excel in multi-step workflows and deliver exceptional results across coding, analysis, and problem-solving tasks. In this Geek Out Time, we’ll explore how to combine Browser-Use Web UI with DeepSeek v3 to build a highly capable autonomous AI agent

We want our autonomous AI agent to perform the task: “Go to constellar.co website, find the ‘Contact,’ and retrieve its Singapore office address.”

Step 1: Set Up Browser-Use Web UI

First, use uv to setup the Python environment.

uv venv --python 3.11

and activate it with:

source .venv/bin/activate

Install the dependencies:

uv pip install -r requirements.txt

Then install playwright:

playwright install

Start the Web UI:

python webui.py --ip 127.0.0.1 --port 7788Open the Web UI in your browser at:

Open the WebUI in the browser https://127.0.0.1:7788

Step 2: Configure to use DeepSeek V3

Navigate to the LLM panel to configure using DeepSeek V3:

You will need to get the API key from the https://platform.deepseek.com/

领英推荐

Google's AI Game Engine, Cursor: The AI IDE, 3 R's in…

HackerRank 6 个月前

The 10 Best Examples Of Low-Code And No-Code AI

Bernard Marr 2 年前

Yann LeCun's Future of A.I. Paper

Michael Spencer 2 年前

Step 3: Create and Run the Agent

Navigate to the Web UI to configure the “Browser Settings” and “Run Agent”:

We will ask the Agent to “go to https://www.constellar.co/sg. and click “Contact”, get its Singapore address”.

The AI agent will bring up the browser and perform the task, pls refer to the video recording below,

And you will get the following…. Pretty awesome?

Final Result:

The Singapore address for Constellar is: 1 Expo Drive, #02–01, Singapore 486150.

Model Actions:

[{'go_to_url': {'url': 'https://www.constellar.co/sg'}}, {'click_element': {'index': 4}}, {'done': {'text': 'The Singapore address for Constellar is: 1 Expo Drive, #02-01, Singapore 486150.'}}]

Model Thoughts:

[CustomAgentBrain(prev_action_evaluation='Unknown - No previous actions to evaluate.', important_contents='', completed_contents='', thought="The task requires navigating to the Constellar website, clicking the 'Contact' link, and extracting the Singapore address. The current page is blank, so the first step is to navigate to the specified URL.", summary="Navigate to the Constellar website and proceed to click the 'Contact' link to find the Singapore address."), CustomAgentBrain(prev_action_evaluation='Success - Successfully navigated to the Constellar website.', important_contents='', completed_contents='1. Navigated to https://www.constellar.co/sg.', thought="The next step is to click the 'Contact' link to find the Singapore address.", summary="Click the 'Contact' link to proceed to the contact page and extract the Singapore address."), CustomAgentBrain(prev_action_evaluation="Success - Successfully clicked the 'Contact' link and navigated to the contact page.", important_contents='1 Expo Drive, #02-01\nSingapore 486150', completed_contents="1. Navigated to https://www.constellar.co/sg.\n2. Clicked the 'Contact' link.", thought='The Singapore address has been found on the contact page. The task is now complete.', summary='The Singapore address has been extracted from the contact page.')]

Let’s see what happens on the DeepSeek V3 Side,

There are 32K tokens generated. DeepSeek v3, much like OpenAI’s O3 model, represents a significant leap in reasoning and problem-solving capabilities. Designed for complex tasks like coding, scientific analysis, and logical reasoning, these models excel at breaking down intricate problems into step-by-step solutions. This advanced reasoning often results in the generation of a large number of tokens, as the models explain intermediate steps, provide detailed justifications, or produce verbose outputs for clarity. For instance, in coding tasks, they may include comments, detailed explanations, and multiple iterations of code to ensure correctness. Similarly, in problem-solving scenarios, they generate comprehensive breakdowns of each logical step. These traits, while improving accuracy and adaptability, naturally lead to higher token counts during inference. DeepSeek v3 offers a cost-effective pricing model for token generation, making it an attractive choice for developers and businesses seeking advanced AI capabilities. During the promotional period (ending February 8, 2025), the cost for input tokens is $0.10 per million for cache hits and $1.00 per million for cache misses, while output tokens are priced at $2.00 per million. After the promotional period, the rates will adjust to $0.07 per million for cache hits, $0.27 per million for cache misses, and $1.10 per million for output tokens. You don’t need a deep pocket to try..

Conclusion

DeepSeek v3, with its advanced reasoning capabilities and ability to handle complex tasks, shines in the field of AI-driven automation. By integrating it with Browser-Use Web UI, we can unlock the potential of autonomous AI agents that seamlessly interact with the web, perform multi-step workflows, and generate detailed, context-aware outputs. These capabilities make it an excellent candidate for near-term applications like automated UI testing, where the agent can mimic user behavior to validate interfaces and workflows with precision and speed. Beyond that, the possibilities are vast — from data scraping and intelligent customer support to dynamic research tools and autonomous content creation.

While the generation of more tokens may increase inference time, it is a testament to the model’s depth of understanding and logical precision, ensuring accurate and reliable outcomes. As AI technology continues to evolve, powerful LLMs like DeepSeek v3 redefine the boundaries of what automation can achieve, paving the way for innovative solutions across industries. The future of autonomous AI agents is here, and it’s smarter, faster, and more adaptive than ever.

Try it yourself and automate your tasks with ease! Let me know how it works for you in the comments and have fun! ??

Richard H. Li

Partner at know.haus | Helping B2C brands and PE-backed companies scale with AI-driven marketing & personalization

1 个月

Nice write up bro! Like the beard ;)

1 次回应

查看更多评论

要查看或添加评论，请登录

Nedved Yang的更多文章

Geek Out Time: Trying newly released OpenAI’s Responses API with Web Search Tool in Google Colab

2025年3月17日

Geek Out Time: Trying newly released OpenAI’s Responses API with Web Search Tool in Google Colab

(Also on Constellar tech blog:…

1 条评论
Geek Out Time: Building a Multi-Agent Financial Advisor Copilot with AG2 (formerly AutoGen), OpenAI, and DeepSeek LLM

2025年3月3日

Geek Out Time: Building a Multi-Agent Financial Advisor Copilot with AG2 (formerly AutoGen), OpenAI, and DeepSeek LLM

(Also on Constellar tech blog…

2 条评论
Geek Out Time: Simulating Distributed Training on TPU & GPU in Google Colab

2025年2月24日

Geek Out Time: Simulating Distributed Training on TPU & GPU in Google Colab

(Also on Constellar tech blog…
Geek Out Time: “Vibe Coding” on Google Colab with OpenAI & DeepSeek

2025年2月17日

Geek Out Time: “Vibe Coding” on Google Colab with OpenAI & DeepSeek

(Also on Constellar tech blog…

2 条评论
Geek Out Time: Mixture of Experts(MoE) vs. CNN: A Google Colab Experiment

2025年2月10日

Geek Out Time: Mixture of Experts(MoE) vs. CNN: A Google Colab Experiment

(Also on Constellar tech blog…

4 条评论
Geek Out Time: Knowledge Distillation in TensorFlow- Smaller, Smarter Models in Google Colab

2025年2月4日

Geek Out Time: Knowledge Distillation in TensorFlow- Smaller, Smarter Models in Google Colab

(Also on Constellar tech blog…
Geek Out Time: AI Model Routing — Dynamically Choose Models Based on Question Complexity

2025年1月13日

Geek Out Time: AI Model Routing — Dynamically Choose Models Based on Question Complexity

(Also on Constellar tech blog…
Geek Out Time: AI in the Browser- Run WebLLM for Powerful, Local LLM Experiences

2024年12月23日

Geek Out Time: AI in the Browser- Run WebLLM for Powerful, Local LLM Experiences

(Also on Constellar tech blog https://nedvedyang.medium.

1 条评论
Geek Out Time: Exploring Opensource AnythingLLM — The All-in-One, Easy AI Platform for Local RAG and Intelligent Agents with Just a Click

2024年12月9日

Geek Out Time: Exploring Opensource AnythingLLM — The All-in-One, Easy AI Platform for Local RAG and Intelligent Agents with Just a Click

(Also on Constellar tech blog…

3 条评论
Geek Out Time: Exploring LoRA on Google Colab: the Challenges of Base Model Upgrades

2024年12月6日

Geek Out Time: Exploring LoRA on Google Colab: the Challenges of Base Model Upgrades

(Also on Constellar tech blog…

See all articles

Geek Out Time: Build Your Own Autonomous AI Agent Backed by the Top Open-Source LLM DeepSeek v3 and Browser-Use Web UI-Right in Your Browser

Nedved Yang

Step 1: Set Up Browser-Use Web UI

Step 2: Configure to use DeepSeek V3

领英推荐

Step 3: Create and Run the Agent

Conclusion

Nedved Yang的更多文章

社区洞察

其他会员也浏览了

Google Bard and the Age of Robots

The AI Search Engine War And Other AI News

OpenAI's New 'Swarm' Framework: A Game-Changer for AI, But a Job Killer? Analytics Insight

Monthly Tech News Digest: Search Engine Battles and the Latest from Robotics

The Artificial Investor - Issue 43: The Web AI Agent era

Daily Dose of Tech | 2024-02-16

Evolution of Prompt Engineering, AI Adoption Challenges

Choosing Best-Fit Embeddings for Your AI App (OpenAI, Mistral, Llama, etc)

OpenAI Expands AI Web Search!

OpenAI DevDay Highlights: Here's What You Missed!

Step 1: Set Up Browser-Use Web UI

Step 2: Configure to use DeepSeek V3

领英推荐

Step 3: Create and Run the Agent

Conclusion

Nedved Yang的更多文章

Geek Out Time: Trying newly released OpenAI’s Responses API with Web Search Tool in Google Colab

Geek Out Time: Building a Multi-Agent Financial Advisor Copilot with AG2 (formerly AutoGen), OpenAI, and DeepSeek LLM

Geek Out Time: Simulating Distributed Training on TPU & GPU in Google Colab

Geek Out Time: “Vibe Coding” on Google Colab with OpenAI & DeepSeek

Geek Out Time: Mixture of Experts(MoE) vs. CNN: A Google Colab Experiment

Geek Out Time: Knowledge Distillation in TensorFlow- Smaller, Smarter Models in Google Colab

Geek Out Time: AI Model Routing — Dynamically Choose Models Based on Question Complexity

Geek Out Time: AI in the Browser- Run WebLLM for Powerful, Local LLM Experiences

Geek Out Time: Exploring Opensource AnythingLLM — The All-in-One, Easy AI Platform for Local RAG and Intelligent Agents with Just a Click

Geek Out Time: Exploring LoRA on Google Colab: the Challenges of Base Model Upgrades

社区洞察

其他会员也浏览了

Google Bard and the Age of Robots

The AI Search Engine War And Other AI News

OpenAI's New 'Swarm' Framework: A Game-Changer for AI, But a Job Killer? Analytics Insight

Monthly Tech News Digest: Search Engine Battles and the Latest from Robotics

The Artificial Investor - Issue 43: The Web AI Agent era

Daily Dose of Tech | 2024-02-16

Evolution of Prompt Engineering, AI Adoption Challenges

Choosing Best-Fit Embeddings for Your AI App (OpenAI, Mistral, Llama, etc)

OpenAI Expands AI Web Search!

OpenAI DevDay Highlights: Here's What You Missed!