Advanced prompting techniques: Code Interpreter
Ghost operates in an industry where turnarounds are pretty quick. Sometimes, listings can be gone in a matter of minutes! It's in our best interest to optimize our listing process so that items can be listed on the marketplace very quickly.
In theory, this should be straightforward –?just offer an intuitive flow to sellers, and voila! Well…?That doesn't work for us, and here's why:
So far, our team has resorted to a combination of traditional prompting techniques and process automation to speed up our listing process. The results were promising but we didn't significantly increase our speed. Besides, these initial attempt was heavy on the code, which didn't make our solution suitable for people with limited coding skills.
Thankfully, OpenAI introduced Code Interpreter earlier in 2023. In short, Code Interpreter allows you to define assistants that can write their own code, and execute it in a sandboxed, firewalled environment. It works beautifully in our context:
Prompting for Code Interpreter
Prompting for Code Interpreter is not different than prompting for other contexts. In fact, the only difference is in how we explicitly create an Assistant for that purpose:
file = client.files.create(
file=open('listing.csv', 'rb'),
purpose='assistants'
)
instructions="""
You are a data entry operator who is tasked to reorganize the information of a CSV file provided by the user. Do your best to reorganize the information based on the user input, but do not ask for the user's help.
"""
assistant = client.beta.assistants.create(
instructions=instructions,
model="gpt-4-1106-preview",
tools=[{"type": "code_interpreter"}],
file_ids=[file.id]
)
Here we first uploaded a file containing our listing data, so it can be referenced by the Assistant. We then created a new Assistant, and we gave it code_interpreter tool it can use. This tool comes with no argument specification; since Code Interpreter is built into the model, the Assistant will already know how to invoke it.
领英推荐
As far as prompting goes, here's how we're going to proceed
Here's a simplified version of the initial version of our prompt:
prompt = f"""I have a file I'll need to reorganize in a different format. The input file will contains product information, such as:
{columns}
The file can contain additional information that can be relevant to the task, such as:
{file_specific_info}
Give then input file, create an output file with the following columns:
{output_columns}
Do the following for each row:
- Check if each cell in the row contains the above information. If so, start extracting this information and add them in the right column. For example, if the product description contains the item's brand, move the brand in the Brand column for that item.
- Check if the product description or any other information in the row contains brand information. For example, the product description may contain a brand name. If you find a brand in that row, add it to the Brand Column.
- Check if a URL is present in the row, and check is formatted in a way that seems to be pointing to an image. If that's the case, add it to the Image URL column for that row. If not, leave blank.
- Read the SKU Title or product description carefully, then infer the value for Category Group from the information available in that same row, then choose one of these output values:
{taxonomy.category_groups}
If you do not have enough information to determine a Category Group for an item, leave its output value empty.
- For each row, infer the value for Department from the information available in that same row, then choose one of these output values:
{taxonomy.departments}
Leave Department empty if Category Group is Apparel or Footwear. Also leave empty if you do not have enough information determine its output value.
- If sizes are provided, translate each size to one of the following values, or leave empty if you cannot determine the right sizing:
{taxonomy.sizes}
Note that the input file may list multiple sizes for the same item. If that's the case, separate each size into its own row for the same item.
- If two rows end up having the same combination of SKU Title, Size, Price, and MSRP, append one or more additional information (such as Style or SKU) to the Style column so that it creates an unique value.
Some other suggestions for you:
Try to understand what type of data each input column contains, then map it to the relevant output column. When necessary, make assumptions using both conventional knowledge and specific knowledge of the retail industry.
If some values are not available, or if you cannot determine them with confidence, simply leave them blank.
The output should be an Excel file."""
Few shots, less flaws
Our basic prompt structure only provides task context, inputs, and the immediate task description. Even so, the prompt is fairly accurate at determining input and output data, as well as applying some heuristics to infer missing information like categories and style descriptions, when missing.
In our tests, a few-shots variant of the prompt performs significantly better, both in terms of accuracy and latency.
It takes time to gain time
It may sound counterintuitive, but it takes time to shave time off our processes. And that's okay: in our case, prompt engineering is a fixed time cost that pays off dividends in a short amount of time. And with the time we saved in making our processes more efficient, we were able to scale our operations, which in turn freed up more bandwidth to optimize our prompts.
Business Development & Partner Success Manager - Marketplace Strategy & Operations - Shopify Expert - Omnichannel Solutions
1 年Very Insightful thanks for sharing