22-1-1 Connecting LLMs with the Web

Large Language Models (LLMs) often struggle to stay current with world knowledge because pre-training is costly and requires specialised expertise. By the time they are released, they can already be somewhat outdated. A prime example is OpenAI's GPT-4, whose version names indicate their release dates, such as gpt-4-0314 (legacy), gpt-4-0613, and gpt-4-1106-preview (also known as GPT-4 Turbo).

Continuously updating these models is resource-intensive, making it hard for many companies to keep their models up to date. To mitigate this, there are several approaches to combining LLMs with real-time information from the web:

  1. Integration (APIs and Tools): Combining LLMs with custom applications that interface with third-party endpoints, like weather or Google Maps APIs, is a basic yet effective solution. This helps in situations where the LLM's parametric memory may not have the most recent data.
  2. Managed Web-Search Enabled LLMs: Some LLMs come with built-in web search capabilities. This is particularly useful for accessing information that isn't readily available through standard APIs.
  3. Highly Custom (Agent-based) Solutions: Employing LLMs alongside web scrapers or browsers that mimic human search behavior offers a more advanced, though complex, solution.

For instance, in my project integrating PubMed articles with an LLM to generate accurate summaries, I encountered the issue of the LLM's tendency to hallucinate due to outdated or incomplete information. I initially considered using the PubMed API to feed the LLM recent data. However, for broader searches related to substances, clinical trials, or conferences not covered by PubMed, I explored using search engine queries to supply relevant information to the LLM.

For specific tasks, like detailing a particular conference or congress, a custom scraper that fetches and parses the event's webpage is needed. This shows that, depending on the use case, the desired outcome, and the cost, a variety of methods may be necessary to leverage real-time web content so that LLMs can provide contextually relevant responses.
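As a rough illustration of the parsing half of such a scraper, here is a minimal sketch using only Python's standard library. The page content, the `HeadingExtractor` class, and the `extract_headings` helper are hypothetical names made up for this example; a real scraper would first download the page (for instance with `requests.get`) and would need to be adapted to the structure of the actual event site.

```python
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Collect the text of <h1> and <h2> elements from an HTML page."""

    def __init__(self):
        super().__init__()
        self.headings = []        # heading texts, in document order
        self._in_heading = False  # True while inside an <h1>/<h2>
        self._buffer = []         # text chunks of the current heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2"):
            self._in_heading = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_heading:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in ("h1", "h2") and self._in_heading:
            # Collapse line breaks and repeated spaces into single spaces
            text = " ".join("".join(self._buffer).split())
            if text:
                self.headings.append(text)
            self._in_heading = False


def extract_headings(html):
    """Return the list of h1/h2 heading texts found in an HTML string."""
    parser = HeadingExtractor()
    parser.feed(html)
    parser.close()
    return parser.headings


# Hypothetical page content standing in for a downloaded event page
SAMPLE_HTML = """
<html><body>
  <h1>Example Congress 2024</h1>
  <p>Welcome to the congress.</p>
  <h2>Keynote:   Advances in
      Immunotherapy</h2>
  <h2>Poster Session A</h2>
</body></html>
"""

headings = extract_headings(SAMPLE_HTML)
# Turn the headings into a short context block that could be added to an LLM prompt
context = "\n".join(f"- {h}" for h in headings)
print(context)
```

The extracted headings are only a starting point; which elements are worth pulling out depends entirely on the page being scraped.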


Let's explore in code the first approach to connecting with the web.

We will create a simple Python script that fetches the latest weather information for a given city name. To obtain an API key, go to: Members (openweathermap.org)

Sign up and generate a key:

import os
import requests
from openai import OpenAI

# Set your OpenAI and OpenWeatherMap API keys
OPENAI_API_KEY = 'YOUR_OPENAI_KEYS'
openweathermap_api_key = 'WEATHER_API_KEYS'

First import the necessary libraries and set the OpenAI and OpenWeatherMap API keys.

Then initialise the OpenAI client:

client = OpenAI(api_key=OPENAI_API_KEY)        

Let's examine how this app works:

  1. User Input: The user enters the name of a city.
  2. Fetching Coordinates (Geocoding): The application uses the OpenWeatherMap Geocoding API to convert the city name into geographical coordinates (latitude and longitude).
  3. Retrieving Weather Data: With the coordinates obtained, the application fetches current weather data from the OpenWeatherMap API.
  4. Processing with OpenAI LLM: The raw weather data is then processed by OpenAI's GPT-3.5-turbo model. The model generates a human-friendly weather report, including practical advice based on the weather conditions.

For the sake of this tutorial, I am going to assume there are no available libraries I can use to connect to weather APIs.

Key Functions:

get_location_coordinates:

Takes the city name and API key as inputs. Makes an API call to get the latitude and longitude of the city.

def get_location_coordinates(city_name, api_key, country_code=None, state_code=None):
    print(f"Fetching coordinates for {city_name}...")  # Debug line
    # The Geocoding API expects q as "city,state,country" (state code before country code)
    base_url = "https://api.openweathermap.org/geo/1.0/direct"
    location_parts = [city_name]
    if state_code:
        location_parts.append(state_code)
    if country_code:
        location_parts.append(country_code)
    params = {"q": ",".join(location_parts), "limit": 1, "appid": api_key}
    # Make API call; passing params lets requests URL-encode names such as "New York"
    response = requests.get(base_url, params=params, timeout=10)
    data = response.json()
    # A successful lookup returns a non-empty list; errors come back as a dict
    if isinstance(data, list) and data:
        print(f"Coordinates for {city_name}: {data[0]['lat']}, {data[0]['lon']}")
        return data[0]['lat'], data[0]['lon']
    else:
        print("No data found for specified location.")
        return None, None
        
get_weather_by_coords:

Accepts latitude, longitude, and API key. Fetches weather data from OpenWeatherMap using these coordinates. Returns the raw weather data.

def get_weather_by_coords(latitude, longitude, api_key):
    print(f"Fetching weather data for coordinates: {latitude}, {longitude}...")
    # Compose API request; without a units parameter, temperatures are returned in Kelvin
    base_url = "https://api.openweathermap.org/data/2.5/weather"
    params = {"lat": latitude, "lon": longitude, "appid": api_key}
    # Make API call
    response = requests.get(base_url, params=params, timeout=10)
    weather_data = response.json()
    print("Weather data received.")
    return weather_data


def generate_weather_report(city_name, api_key):
    print(f"Generating weather report for {city_name}...") 
    # Get city coordinates
    latitude, longitude = get_location_coordinates(city_name, api_key)
    if latitude is None or longitude is None:
        return "City not found."

    # Get weather data using coordinates
    weather_data = get_weather_by_coords(latitude, longitude, api_key)
    print(weather_data)

    # Check if weather data is valid
    if 'weather' in weather_data and 'main' in weather_data:
        # Extract necessary data
        weather_description = weather_data['weather'][0]['description']
        temperature = weather_data['main']['temp']
        temp_max = weather_data['main']['temp_max']
        wind_speed = weather_data['wind']['speed']
        wind_deg = weather_data['wind']['deg']

        print("Preparing data for LLM...")  
        # Create chat messages for the LLM
        messages = [
            {"role": "system", "content": "You are a weather AI assistant. Your task is to provide a concise weather report based on given data. And provide advice"},
            {"role": "user", "content": f"The current weather in {city_name} is: {weather_description}. The temperature is {temperature} K, with a high of {temp_max} K. Wind speed is {wind_speed} m/s at {wind_deg} degrees."}
        ]

        print("Requesting LLM to generate report...")  
        # Generate the weather report using OpenAI's GPT-3.5-turbo model
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=messages
        )

        response_message = response.choices[0].message.content
        return response_message
    else:
        print("Weather data not available.")  
        return "Weather data not available."        

generate_weather_report:

This function combines the functions above: it uses the city name to get the coordinates and then the weather data, formats the data, and sends it to the LLM. The LLM generates a detailed report, which is returned to the user.
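The script passes temperatures in Kelvin and wind direction in degrees, leaving the LLM to do the conversions. One possible refinement is to convert the values in Python before building the prompt, so the model only has to write the report. The sketch below is an assumption-laden illustration, not part of the original script: the helper names (`kelvin_to_celsius`, `degrees_to_compass`, `summarise_weather`) and the sample payload are hypothetical, with values chosen only to resemble a typical response.

```python
# 16-point compass rose, clockwise from north
COMPASS_POINTS = [
    "N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
    "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW",
]


def kelvin_to_celsius(kelvin):
    """Convert a temperature from Kelvin to Celsius, rounded to 2 decimals."""
    return round(kelvin - 273.15, 2)


def degrees_to_compass(degrees):
    """Map a wind direction in degrees to the nearest 16-point compass name."""
    # Each sector spans 22.5 degrees and is centred on its compass point
    index = int((degrees % 360) / 22.5 + 0.5) % 16
    return COMPASS_POINTS[index]


def summarise_weather(city_name, weather_data):
    """Build a one-line, human-readable summary from raw weather data."""
    description = weather_data["weather"][0]["description"]
    temp = kelvin_to_celsius(weather_data["main"]["temp"])
    high = kelvin_to_celsius(weather_data["main"]["temp_max"])
    summary = f"{city_name}: {description}, {temp} °C (high {high} °C)"
    wind = weather_data.get("wind", {})
    if "speed" in wind:
        summary += f", wind {wind['speed']} m/s"
        if "deg" in wind:  # direction may be absent in some responses
            summary += f" from {degrees_to_compass(wind['deg'])}"
    return summary + "."


# Hypothetical payload with the same shape the script above reads
sample_data = {
    "weather": [{"description": "broken clouds"}],
    "main": {"temp": 300.85, "temp_max": 302.63},
    "wind": {"speed": 7.2, "deg": 70},
}

summary = summarise_weather("Singapore", sample_data)
print(summary)
```

Alternatively, the OpenWeatherMap current weather endpoint accepts a `units=metric` query parameter, which returns temperatures in Celsius directly and makes the Kelvin conversion unnecessary.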

# Example usage
city_name = input("Enter city name: ")
report = generate_weather_report(city_name, openweathermap_api_key)
print("Current Weather Report:", report)  # Display the final report        

Upon entering Singapore in the text box:

Current Weather Report: The current weather in Singapore is broken clouds with a temperature of 27.70°C (81.86°F). The high today will reach 29.48°C (85.06°F). The wind is blowing at 7.2 m/s (25.92 km/h) from the east-northeast. Advice: It's a warm and slightly cloudy day in Singapore. You'll want to dress comfortably and stay hydrated as the temperature is quite high. The wind speed is moderate, so you may want to carry an umbrella or a hat to protect yourself from the sun. Enjoy your day!

We developed a mini application that leverages a Large Language Model (LLM) to analyse the latest weather data and generate easily understandable reports, complete with AI-customised advice.

Note that Python libraries such as PyOWM can streamline querying the OpenWeatherMap APIs compared with the manual approach shown above.

Let's look at a simpler approach using LangChain.

  1. First install and import langchain and pyowm

!pip install langchain
!pip install pyowm        
import os

from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI        

  2. As before, set the API keys for OpenAI and OpenWeatherMap

os.environ["OPENAI_API_KEY"] = "YOUR OPENAI KEY"
os.environ["OPENWEATHERMAP_API_KEY"] = "YOUR WEATHERAPI KEY"        

  3. Then we initialise the OpenAI LLM and load the openweathermap-api tool.

llm = OpenAI(temperature=0)

tools = load_tools(["openweathermap-api"], llm)        

  4. In essence, we are creating an agent powered by an OpenAI LLM and allowing the agent to use a tool named openweathermap-api.

agent_chain = initialize_agent(
    tools=tools, llm=llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)        

  5. Now all we have to do is ask about the city and the details we want, using natural language.

agent_chain.run("What's the weather like in Singapore")        
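To make the idea of "an agent that can use a tool" more concrete, here is a toy sketch in plain Python. It is not how LangChain is implemented; the `Tool` dataclass, the `TOOLS` registry, `describe_tools`, `run_tool`, and the stubbed weather function are all hypothetical names invented for illustration. In a real agent, the LLM reads the tool descriptions, decides which tool name and input to use, and the framework performs the dispatch step shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    name: str                    # identifier the agent uses to pick the tool
    description: str             # text shown to the LLM so it knows when to use it
    func: Callable[[str], str]   # the Python function that does the work


def fake_weather_lookup(city: str) -> str:
    # Hypothetical stand-in for a real weather API call
    return f"(stub) weather for {city}: broken clouds"


# Registry of the tools made available to the agent
TOOLS: Dict[str, Tool] = {
    "weather": Tool(
        name="weather",
        description="Look up the current weather for a city name.",
        func=fake_weather_lookup,
    ),
}


def describe_tools(tools: Dict[str, Tool]) -> str:
    """Render the tool list as text that could be placed in the LLM prompt."""
    return "\n".join(f"{t.name}: {t.description}" for t in tools.values())


def run_tool(tools: Dict[str, Tool], name: str, tool_input: str) -> str:
    """Dispatch to the tool chosen by the agent and return its observation."""
    if name not in tools:
        return f"Unknown tool: {name}"
    return tools[name].func(tool_input)


# Here the tool choice is hard-coded; in an agent, the LLM would produce it
observation = run_tool(TOOLS, "weather", "Singapore")
print(describe_tools(TOOLS))
print(observation)
```

The observation returned by the tool is what the agent feeds back to the LLM to compose the final answer.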

In just under 20 lines of code, we reached our goal by leveraging LangChain.

Typically, when integrating LLMs with real-time information, it's prudent to first explore the available options.

  • What specific real-time data is required?
  • Can this data be sourced through official or third-party APIs?
  • Are there existing libraries or wrappers to facilitate this, such as PyOWM?
  • Is the desired tool supported in LangChain?

In our upcoming article, we'll delve into the tools and techniques for accessing data beyond the scope of API calls, utilising search engines to broaden our data acquisition capabilities.

Link to code (replace Key details): https://colab.research.google.com/drive/1oorNdl1mvCP8vTeCcZxRRr_cXDp0SUea?usp=sharing










