Creating an AI data journalist with the new OpenAI Assistants API
AI data journalism art by DALL·E 3 (hence words like "inslights")

Yesterday I wrote about OpenAI's Dev Day announcements. One of the things I really wanted to try was creating a custom GPT. Unfortunately, I don't have access to that yet. (OpenAI, if you're reading, please grant it!) I do, however, have access to the new Assistants API, which allows something similar. I created my first application with it, and wanted to share the process. But first...

What is the Assistants API and why should you care?

Imagine a virtual assistant that you could task with an assignment, leave alone, and then come back to later for the output. This is different from ChatGPT, where you're engaged in a back-and-forth dialogue. Rather, it's true delegation.

Some people have tried to build tools for this, like AutoGPT, but in my experience they go off the rails fast and are unreliable.

The Assistants API is a step toward addressing this. In a nutshell, you:

  1. Create an assistant. You can do this either programmatically, in code, or via the OpenAI Playground web interface. Importantly, the assistant is persistent. OpenAI stores it for reuse, which makes it almost like having a fine-tuned model available for specific tasks.
  2. Give the assistant an identity, data, and tools. At the very least, you need to give it an identity, the system message. This tells it how you want it to act, and what you want it to do. You can also upload data, such as knowledge you want it to have available for answering questions. Finally, you give it tools, including built-in tools like Code Interpreter, and definitions for custom tools you'll write and execute yourself. (Interesting note: OpenAI's documentation says that assistants can use up to 128 tools, which is incredible.)
  3. Start a thread and execute a "run." Unlike the conversational chatbots we've gotten familiar with, the primary purpose of agents seems to be executing tasks in the background. Read on to see how this works.
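To make the "tools" part of item 2 concrete, here is a minimal sketch of what a tool list can look like: the built-in Code Interpreter plus one custom function tool described with a JSON schema. The `lookup_population` function and its data are hypothetical, invented only for this illustration. When an assistant asks for a custom tool, your own code runs it and hands back the result; `handle_tool_call` simulates that step locally, without calling the API.

```python
import json

# Hypothetical data, standing in for whatever your real tool would query.
POPULATION = {"Springfield": 120000, "Shelbyville": 95000}


def lookup_population(city: str) -> dict:
    """Hypothetical custom tool: return the population of a known city."""
    if city not in POPULATION:
        return {"city": city, "error": "unknown city"}
    return {"city": city, "population": POPULATION[city]}


# Tool definitions in the shape the Assistants API accepts:
# one built-in tool and one custom function described by a JSON schema.
TOOLS = [
    {"type": "code_interpreter"},
    {
        "type": "function",
        "function": {
            "name": "lookup_population",
            "description": "Look up the population of a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    },
]

# Map tool names to the local Python functions that implement them.
LOCAL_FUNCTIONS = {"lookup_population": lookup_population}


def handle_tool_call(name: str, arguments_json: str) -> str:
    """Run a local function with JSON-encoded arguments; return JSON text."""
    arguments = json.loads(arguments_json)
    result = LOCAL_FUNCTIONS[name](**arguments)
    return json.dumps(result)
```

The assistant only ever sees the schema. The function body stays on your side, so you control exactly what it can touch.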

Creating an AI data journalist

Okay, now, on to the steps! Here's what we're going to be creating:

Data journalist created with the OpenAI Assistants API and Streamlit

Step 1: Create an assistant

I'm going to assume that you already have an OpenAI developer account with an API key. If not, set one up first.

Once you're logged in, click "Playground" on the left, then select "Assistants" from the dropdown at the top.

Next, create a new assistant as follows:

  • Name: Data Journalist
  • System message: You are an experienced data journalist. You receive a CSV of data from a user. You write code to find interesting patterns in the data. You choose the most interesting of these patterns and write a 250-word article about them. You write the headline for the article and then the article itself. You do not ask for feedback from the user at any point. You independently look for trends, independently write the article, and then provide the article to the user to review.
  • Code interpreter: Active

Then click "Save."

That's literally it!

The AI data journalist will analyze an uploaded CSV, look for interesting information (like trends), and then write an article about the most interesting things that it finds.
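If you'd rather skip the Playground, the same assistant can be created in code. The sketch below is one way to do that, under a few assumptions: `client` is an `openai.OpenAI()` instance from version 1.x of the Python library, and the default model name is my own pick, since the Playground steps above don't specify one. Swap in whichever model your account can use.

```python
# The same system message used in the Playground steps above.
INSTRUCTIONS = (
    "You are an experienced data journalist. You receive a CSV of data "
    "from a user. You write code to find interesting patterns in the data. "
    "You choose the most interesting of these patterns and write a 250-word "
    "article about them. You write the headline for the article and then "
    "the article itself. You do not ask for feedback from the user at any "
    "point. You independently look for trends, independently write the "
    "article, and then provide the article to the user to review."
)


def create_data_journalist(client, model="gpt-4-1106-preview"):
    """Create the Data Journalist assistant through the API.

    `client` is assumed to be an `openai.OpenAI()` instance.
    The default `model` is an assumption; use any model you have access to.
    """
    return client.beta.assistants.create(
        name="Data Journalist",
        instructions=INSTRUCTIONS,
        model=model,
        tools=[{"type": "code_interpreter"}],
    )


# Example usage (requires the openai package and an API key):
# import openai
# assistant = create_data_journalist(openai.OpenAI())
# print(assistant.id)  # keep this ID; the app needs it to find the assistant
```

Because the assistant is persistent, you only need to run this once and then reuse the returned ID.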

At this point, you can even test it. Just upload a file in the preview box and ask for a data-driven story.

Your assistant should look like this:

OpenAI Playground with a Data Journalist assistant

Step 2: Create Streamlit app

Next, you'll create the app to run the assistant. The good news is that I've copied the code to a public Gist. You can simply click that link, copy the code into a Python file, input your assistant ID, and run it with streamlit run <filename>.py.

Or, you can copy the code from here, which also includes comments to explain how it works:

import os
import time

import openai
import streamlit as st

# Create an OpenAI client with your API key
openai_client = openai.Client(api_key=os.environ.get("OPENAI_API_KEY"))

# Retrieve the assistant you want to use
assistant = openai_client.beta.assistants.retrieve(
    "<assistant_id>"
)

# Create the title and subheader for the Streamlit page
st.title("Data Journalist")
st.subheader("Upload a CSV and get the story within:")

# Create a file input for the user to upload a CSV
uploaded_file = st.file_uploader(
    "Upload a CSV", type="csv", label_visibility="collapsed"
)

# If the user has uploaded a file, start the assistant process...
if uploaded_file is not None:
    # Create a status indicator to show the user the assistant is working
    with st.status("Starting work...", expanded=False) as status_box:
        # Upload the file to OpenAI
        file = openai_client.files.create(
            file=uploaded_file, purpose="assistants"
        )

        # Create a new thread with a message that has the uploaded file's ID
        thread = openai_client.beta.threads.create(
            messages=[
                {
                    "role": "user",
                    "content": "Write an article about this data.",
                    "file_ids": [file.id],
                }
            ]
        )

        # Create a run with the new thread
        run = openai_client.beta.threads.runs.create(
            thread_id=thread.id,
            assistant_id=assistant.id,
        )

        # Poll until the run reaches a terminal state, updating the status.
        # Checking only for "completed" would loop forever on a failed run.
        while run.status not in ("completed", "failed", "cancelled", "expired"):
            time.sleep(5)
            run = openai_client.beta.threads.runs.retrieve(
                thread_id=thread.id, run_id=run.id
            )
            status_box.update(label=f"{run.status}...", state="running")

        if run.status == "completed":
            # Show the newest message, which holds the assistant's article
            status_box.update(label="Complete", state="complete", expanded=True)
            messages = openai_client.beta.threads.messages.list(
                thread_id=thread.id
            )
            st.markdown(messages.data[0].content[0].text.value)
        else:
            # Surface the failure instead of spinning indefinitely
            status_box.update(label=f"Run {run.status}", state="error")

        # Delete the uploaded file from OpenAI
        openai_client.files.delete(file.id)        
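The polling step can also be pulled out into a small reusable helper. The sketch below is one possible shape for it and isn't part of the original app: `wait_for_run` is a name I made up, and it assumes `client` exposes `beta.threads.runs.retrieve` the way the OpenAI client above does. It returns on any terminal run status and gives up after a timeout, and the `sleep` parameter is injectable so it can be exercised without real waiting.

```python
import time

# Run statuses after which nothing further will happen.
TERMINAL_STATES = {"completed", "failed", "cancelled", "expired"}


def wait_for_run(
    client,
    thread_id,
    run_id,
    poll_seconds=5.0,
    timeout_seconds=600.0,
    sleep=time.sleep,
):
    """Poll a run until it reaches a terminal state or the timeout elapses.

    Returns the final run object; raises TimeoutError if the run is still
    in progress after roughly `timeout_seconds` of waiting.
    """
    waited = 0.0
    while True:
        run = client.beta.threads.runs.retrieve(
            thread_id=thread_id, run_id=run_id
        )
        if run.status in TERMINAL_STATES:
            return run
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"Run {run_id} still {run.status} after {waited:.0f}s"
            )
        sleep(poll_seconds)
        waited += poll_seconds


# Example usage inside the app:
# run = wait_for_run(openai_client, thread.id, run.id)
# if run.status != "completed":
#     st.error(f"Run ended with status: {run.status}")
```

The caller still has to check the returned status, since "failed" and "expired" count as finished too.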

Bottom line: Creating AI agents that run background tasks just got much easier

The above code creates a web interface that shows a status as the assistant executes its tasks. But the web interface isn't necessary. You could have a different tool that allows you to schedule runs and receive results by email when they're ready.
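As a rough sketch of the email idea, the function below wraps a finished article in an email message using only the Python standard library. Everything here is illustrative: `build_article_email` is a hypothetical helper, the addresses are placeholders, and it assumes the assistant puts the headline on the first line of its reply, which the system message asks for but doesn't guarantee.

```python
from email.message import EmailMessage


def build_article_email(article: str, sender: str, recipient: str) -> EmailMessage:
    """Wrap the assistant's article in an email.

    The first line of the article is used as the subject, with any
    Markdown heading or bold markers stripped from its ends.
    """
    text = article.strip()
    lines = text.splitlines()
    headline = lines[0].strip("#* ") if lines else "Data story"

    message = EmailMessage()
    message["Subject"] = headline
    message["From"] = sender
    message["To"] = recipient
    message.set_content(text)
    return message


# Example usage (assumes an SMTP server is reachable on localhost):
# import smtplib
# msg = build_article_email(article_text, "bot@example.com", "me@example.com")
# with smtplib.SMTP("localhost") as server:
#     server.send_message(msg)
```

A scheduler such as cron could then run the upload, run, and email steps on whatever cadence you need.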

What's most exciting to me is that we can now run background tasks powered by AI, with access to powerful tools. This opens up a lot more use cases beyond those for which a chat interface is ideal.


