Geek Out Time: Trying OpenAI's Newly Released Responses API with the Web Search Tool in Google Colab

(Also on Constellar tech blog: https://medium.com/the-constellar-digital-technology-blog/geek-out-time-trying-newly-released-openais-responses-api-with-web-search-tool-in-google-colab-73b4ceab695b)

OpenAI recently introduced the Web Search Tool as part of its Responses API (https://openai.com/index/new-tools-for-building-agents/), allowing AI models to search the web, retrieve real-time information, and summarize it.

For those of us who have been using Selenium in Google Colab to automate searches and extract insights, this new approach offers a faster, more seamless experience for AI-driven research and knowledge retrieval.

So I had to try OpenAI's new Responses API with the Web Search Tool in Google Colab and compare it with the traditional Selenium approach.

Using OpenAI's Responses API with the Web Search Tool in Google Colab

Unlike Selenium, which requires browser automation, OpenAI’s Web Search Tool lets AI search the web directly via an API call.

Step 1: Install the OpenAI SDK in Google Colab

First, make sure you have the latest OpenAI package installed:

!pip install --upgrade openai        

Step 2: Set Up OpenAI API Key and Client

Since Google Colab is a shared environment, avoid hardcoding your API key in the notebook. Set it as an environment variable (the snippet below uses a placeholder), or use Colab's Secrets manager as shown afterwards:

import openai
import os
# Set up OpenAI API key (replace the placeholder; never commit a real key)
os.environ["OPENAI_API_KEY"] = "your-api-key-here"
client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))        
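
If you would rather not paste the key into the notebook at all, Colab's built-in Secrets manager is an alternative. A minimal sketch, assuming you have added a secret named OPENAI_API_KEY via the key icon in Colab's sidebar:

# Alternative: read the key from Colab's Secrets manager, assuming a
# secret named OPENAI_API_KEY has been added in the sidebar.
from google.colab import userdata

client = openai.OpenAI(api_key=userdata.get("OPENAI_API_KEY"))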

Step 3: Search the Web and Summarize Results

Now, let’s use the Responses API to perform a real-time web search and get summarized results:

def search_and_summarize(query):
    """Uses OpenAI's Responses API to search the web and summarize results."""
    try:
        response = client.responses.create(
            model="gpt-4o",
            tools=[{"type": "web_search_preview"}],  # Enables web search
            input=query
        )

        # Extract the AI-generated summary
        ai_response = response.output_text

        # Extract URL citations, which arrive as annotations on the message
        # content items inside response.output
        citations = []
        for item in response.output:
            if getattr(item, "type", None) == "message":
                for content in item.content:
                    for annotation in getattr(content, "annotations", []) or []:
                        if getattr(annotation, "type", None) == "url_citation":
                            citations.append(annotation)

        # Print results
        print("\nAI Summary:\n", ai_response)
        if citations:
            print("\nCitations:")
            for citation in citations:
                print(f"- {citation.title} ({citation.url})")
        return ai_response, citations
    except Exception as e:
        print(f"Error in AI workflow: {e}")
        return None, None

# Example usage
query = "latest AI trends in 2025"
summary, citations = search_and_summarize(query)

  • The AI searched the web for the latest information.
  • It retrieved and summarized key insights.
  • It provided citations, ensuring transparency.

This method runs entirely within the API, making it faster and more seamless for research.
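
The tool also takes optional configuration. Here is a minimal sketch, assuming the search_context_size and user_location parameters documented for the web_search_preview tool (a preview API, so these may change):

# Optional web search configuration; parameter names follow OpenAI's
# docs at the time of writing and may evolve while the tool is in preview.
response = client.responses.create(
    model="gpt-4o",
    tools=[{
        "type": "web_search_preview",
        "search_context_size": "medium",   # "low" | "medium" | "high"
        "user_location": {                 # bias results toward a locale
            "type": "approximate",
            "country": "SG",
            "city": "Singapore"
        }
    }],
    input="latest AI trends in 2025"
)
print(response.output_text)

A larger search context generally improves answer quality at higher cost and latency, so "medium" is a reasonable default.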

Trying the Selenium-Based Method in Google Colab

We can also use Selenium in Google Colab to automate Google searches, navigate pages, and extract data.

# ==============================
# STEP 1: INSTALL DEPENDENCIES
# ==============================
!apt-get update
!apt-get install -y wget
!wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
!apt install -y ./google-chrome-stable_current_amd64.deb
!apt-get install -y chromium-chromedriver
!pip install --upgrade selenium webdriver-manager beautifulsoup4 openai

# ==============================
# STEP 2: CONFIGURE CHROME & SELENIUM
# ==============================
import os
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Configure Selenium options
options = Options()
options.binary_location = "/usr/bin/google-chrome"  # Use Chrome instead of Chromium
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--disable-gpu")
options.add_argument("--remote-debugging-port=9222")
options.add_argument("--disable-blink-features=AutomationControlled")
options.add_argument("--disable-extensions")
options.add_argument("--disable-infobars")
options.add_argument("--disable-notifications")
options.add_argument("--disable-popup-blocking")
options.add_argument("--disable-software-rasterizer")
options.add_argument("--disable-web-security")
options.add_argument("--ignore-certificate-errors")
options.add_argument("--log-level=3")
options.add_argument("--start-maximized")
options.add_argument("--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36")

# Initialize WebDriver using a Service object with webdriver_manager
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service, options=options)

# Test if Chrome is working
driver.get("https://www.google.com")
print("? Chrome is working!")

# ==============================
# STEP 3: SET UP OPENAI API
# ==============================
from openai import OpenAI
import getpass

# Initialize OpenAI client
client = OpenAI(api_key=getpass.getpass("Enter your OpenAI API key: "))

# ==============================
# STEP 4: DEFINE AI-POWERED WEB TOOLS
# ==============================
import time
from bs4 import BeautifulSoup

def search_google(query):
    """Search Google and return the first result URL"""
    try:
        driver.get("https://www.google.com")
        
        # Debug: Print page source or take a screenshot
        print(driver.page_source)
        driver.save_screenshot("google_search.png")
        
        # Wait for the search box to be present and interact with it
        search_box = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.NAME, "q"))
        )
        search_box.send_keys(query)
        search_box.send_keys(Keys.RETURN)
        
        # Debug: Print page source or take a screenshot
        print(driver.page_source)
        driver.save_screenshot("google_results.png")
        
        # Wait for the first result to be present and click it
        first_result = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "h3"))
        )
        first_result.click()
        
        # Wait for the new page to load
        WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.TAG_NAME, "body"))
        )
        return driver.current_url
    except Exception as e:
        print(f" Error during Google search: {e}")
        return None

def extract_text(url):
    """Extract text from a webpage"""
    try:
        driver.get(url)
        WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.TAG_NAME, "body"))
        )
        
        soup = BeautifulSoup(driver.page_source, "html.parser")
        return soup.get_text()[:4000]  # Limit to 4000 characters
    except Exception as e:
        print(f" Error extracting text: {e}")
        return None

def summarize_text(text):
    """Summarize extracted text using GPT-4"""
    try:
        response = client.chat.completions.create(
            model="gpt-4",  # Use "gpt-4" or "gpt-3.5-turbo" if "gpt-4o" is not available
            messages=[
                {"role": "system", "content": "Summarize the given text."},
                {"role": "user", "content": text}
            ]
        )
        return response.choices[0].message.content
    except Exception as e:
        print(f" Error summarizing text: {e}")
        return None
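
# Before handing these tools to an agent, it is worth sanity-checking them
# as a plain manual pipeline. A quick sketch using the three functions above:
url = search_google("latest AI trends in 2025")
if url:
    page_text = extract_text(url)
    if page_text:
        print(summarize_text(page_text))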

# ==============================
# STEP 5: AI AGENT WITH OPENAI FUNCTION CALLING (CHAT COMPLETIONS API)
# ==============================
# Define tools (functions) that the AI agent can call dynamically
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_google",
            "description": "Search Google and return the first result URL",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query"}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "extract_text",
            "description": "Extract text from a webpage",
            "parameters": {
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "The URL of the webpage"}
                },
                "required": ["url"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "summarize_text",
            "description": "Summarize extracted text using GPT-4",
            "parameters": {
                "type": "object",
                "properties": {
                    "text": {"type": "string", "description": "The text to summarize"}
                },
                "required": ["text"]
            }
        }
    }
]

# Let the AI agent handle the workflow
import json

try:
    response = client.chat.completions.create(
        model="gpt-4",  # Swap in "gpt-4o" or "gpt-3.5-turbo" if preferred
        messages=[{"role": "user", "content": "Find the latest AI trends and summarize the top result."}],
        tools=tools,        # Provide the AI tools
        tool_choice="auto"  # Let the AI decide which tool to use
    )

    # Execute AI-selected tool calls, parsing each call's JSON arguments
    tool_calls = response.choices[0].message.tool_calls or []
    for call in tool_calls:
        args = json.loads(call.function.arguments)
        if call.function.name == "search_google":
            url = search_google(args["query"])
            if url:
                print(f"AI searched Google: {url}")
        elif call.function.name == "extract_text":
            text = extract_text(args["url"])
            if text:
                print(f"Extracted {len(text)} characters")
        elif call.function.name == "summarize_text":
            summary = summarize_text(args["text"])
            if summary:
                print(f"Summary: {summary}")

except Exception as e:
    print(f"Error during AI workflow execution: {e}")

# Close the browser session
driver.quit()
print("Browser session closed.")

Observations

  • OpenAI’s Web Search Tool is easier to use in Google Colab since it doesn’t require Chrome or Selenium.
  • The Selenium method is useful for navigating dynamic pages but requires more setup.
  • For an AI agent use case, where the goal is to search, retrieve, and summarize content, OpenAI’s Web Search Tool provides a more seamless experience.

Thoughts

Trying OpenAI’s Web Search Tool in Google Colab showed that it’s a faster and more efficient way to retrieve real-time information. Compared to Selenium, it requires less setup, runs faster, and provides structured results with citations. That said, Selenium remains valuable when AI needs to interact with websites dynamically. But for AI-driven research, automation, and content generation, OpenAI’s approach is a natural fit for AI agents.

What’s Next?

With OpenAI's Agents SDK now available, the next step could be building a fully autonomous research agent that combines web search, document retrieval, and reasoning; the sketch below hints at the direction. Happy coding, and have fun!
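
As a hint of what that could look like, the Responses API already allows mixing hosted tools in a single call. A hypothetical sketch, assuming a file_search tool backed by an existing vector store (the ID below is a placeholder):

# Hypothetical sketch: one Responses API call combining web search with
# document retrieval. "vs_xxxxxxxx" is a placeholder vector store ID.
response = client.responses.create(
    model="gpt-4o",
    tools=[
        {"type": "web_search_preview"},
        {"type": "file_search", "vector_store_ids": ["vs_xxxxxxxx"]}
    ],
    input="Compare the latest AI trends on the web with my uploaded research notes."
)
print(response.output_text)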
