OpenAI API Guide: Using JSON Mode
PROMPT: A digital illustration of a guardrail on the side of a highway in the Rocky Mountains at sunrise, focusing on the picturesque landscape.

OpenAI API Guide: Using JSON Mode

This is an advanced how-to focusing and how I built the GuardRail system using the new JSON mode for OpenAI API

OpenAI’s API now features a JSON mode, streamlining response structuring and enhancing integration capabilities. As a practical example, I’ve developed GuardRail, an open-source project utilizing this mode, showcasing how JSON-formatted outputs can significantly improve system interactions and data processing in OpenAI applications.

A Few Practical Uses for JSON Mode:

  1. Automated Data Analysis: JSON mode is ideal for applications that require automated analysis of large datasets, such as customer feedback analysis, market research, or social media monitoring.
  2. Enhanced Chatbots and Virtual Assistants: Integrating JSON mode allows for more structured and nuanced responses, improving the quality of interactions in chatbots and virtual assistants across customer service, healthcare, and e-commerce platforms.
  3. Personalized Content Recommendations: JSON mode can be used in content recommendation systems to parse user preferences and feedback efficiently, leading to more accurate and personalized content suggestions.
  4. Natural Language Processing (NLP) Tasks: For tasks like sentiment analysis, language translation, or summarization, JSON mode provides a structured way to receive and process large volumes of text data.

Enabling JSON Mode

To enable JSON mode, set the response_format parameter to { "type": "json_object" }. This configuration is crucial for receiving outputs in JSON format.

Important Notes:

  • Explicit JSON Instructions: When using JSON mode, explicitly instruct the model to output JSON in your prompts. Without this, the output may consist of endless whitespace or appear ‘stuck’.
  • Truncated Outputs: Be aware that outputs might be partially cut off if finish_reason is “length”, indicating that the generation exceeded the token limit.

Seed Parameter (Beta Feature)

  • Deterministic Sampling: The seed parameter allows for deterministic results. Repeated requests with the same seed and parameters should yield identical outcomes.
  • Not Guaranteed: Determinism is not guaranteed. Monitor changes using the system_fingerprint response parameter.

Sample Code for Data Analysis & Guidance (based on GuardRail)

Below is an example script (prompts.py) for various analysis types using the JSON mode. It includes definitions for different analysis types and corresponding JSON schemas.

# OpenAI Data Analysis & Guiderails Script
# prompts.py - by @rUv

# Analysis Types with Descriptions
# These descriptions define what each analysis type does.
ANALYSIS_TYPES = {
    "sentiment_analysis": "Analyze the sentiment of the provided text. Determine whether the sentiment is positive, negative, or neutral and provide a confidence score.",
    "text_summarization": "Summarize the provided text into a concise version, capturing the key points and main ideas."
    # Add more analysis types as needed
}

# JSON Schemas for Each Analysis Type
# These schemas define the JSON structure for each analysis type's output.
JSON_SCHEMAS = {
    "sentiment_analysis": {
        "sentiment": "string (positive, negative, neutral)",
        "confidence_score": "number (0-1)"
        # Include additional fields as required
    },
    "text_summarization": {
        "summary": "string",
        "key_points": "array of strings",
        "length": "number (number of words in summary)"
        # Include additional fields as required
    }
    # Add more JSON schemas for other analysis types
}

# Template for Generating System Prompts
STANDARD_PROMPT_TEMPLATE = "You are a data analysis assistant capable of {analysis_type} analysis. {specific_instruction} Respond with your analysis in JSON format. The JSON schema should include '{json_schema}'."

# Function to Generate System Prompts
def get_system_prompt(analysis_type: str) -> str:
    # Fetch the specific instruction and JSON schema for the given analysis type
    specific_instruction = ANALYSIS_TYPES.get(analysis_type, "Perform the analysis as per the specified type.")
    json_schema = JSON_SCHEMAS.get(analysis_type, {})

    # Format the JSON schema into a string representation
    json_schema_str = ', '.join([f"'{key}': {value}" for key, value in json_schema.items()])

    # Construct the system prompt with updated instruction
    return (f"You are a data analyst API capable of {analysis_type} analysis. "
            f"{specific_instruction} Please respond with your analysis directly in JSON format "
            f"(without using Markdown code blocks or any other formatting). "
            f"The JSON schema should include: {{{json_schema_str}}}.")        

In this script, ANALYSIS_TYPES holds descriptions for various analyses, JSON_SCHEMAS contains the structure for JSON responses, and get_system_prompt generates prompts for the AI model.

Analysis Types and JSON Schemas Samples

To illustrate how the JSON mode in the OpenAI API works, let’s delve into the ANALYSIS_TYPES and JSON_SCHEMAS, and examine the get_system_prompt function in detail.

1. ANALYSIS_TYPES Samples

ANALYSIS_TYPES is a dictionary mapping types of analysis to their descriptions. Here are a couple of examples:

  • Sentiment Analysis:Description: “Analyze the sentiment of the provided text. Determine whether the sentiment is positive, negative, or neutral and provide a confidence score.”
  • Text Summarization:Description: “Summarize the provided text into a concise version, capturing the key points and main ideas.”

2. JSON_SCHEMAS Samples

JSON_SCHEMAS outlines the expected JSON structure for each analysis type. Here are two examples corresponding to the above types:

  • Sentiment Analysis Schema:{ "sentiment": "string (positive, negative, neutral)", "confidence_score": "number (0-1)", "text_snippets": "array of strings (specific text portions contributing to sentiment)" }
  • Text Summarization Schema:{ "summary": "string", "key_points": "array of strings (main points summarized)", "length": "number (number of words in summary)" }

3. Function: get_system_prompt

The get_system_prompt function dynamically generates prompts based on the specified analysis type. It works as follows:

  1. Fetching Instructions and Schema:

  • Retrieves specific instructions and the JSON schema for the given analysis type from ANALYSIS_TYPES and JSON_SCHEMAS.

  1. Formatting JSON Schema:

  • Formats the retrieved JSON schema into a string representation.

  1. Constructing the Prompt:

  • Constructs a system prompt that includes the analysis type, specific instruction, and a request for a JSON-formatted response. It also specifies the structure the JSON should follow based on the json_schema_str.

JSON Mode: Assembling Code and Ensuring Consistency

In JSON mode, responses from the OpenAI model are structured as valid JSON objects. This mode ensures consistency in the following ways:

  • Structured Responses: Responses are in a consistent, parseable format, which is crucial for applications that process the model’s output programmatically.
  • Schema Adherence: By specifying the JSON schema in the prompt, the model’s responses adhere to a predefined structure, making it easier to integrate and use the data.
  • Clear Instructions: The prompts explicitly instruct the model to produce JSON, reducing the likelihood of receiving unstructured or irrelevant data.

This methodical approach ensures that the model’s output is not only consistent but also tailored to specific analytical needs, making it highly effective for diverse applications ranging from sentiment analysis to text summarization.

The JSON mode in OpenAI’s API offers structured and consistent output formats, beneficial for various applications, especially those requiring precise data handling and analysis. With the seed feature in beta, users can experiment with deterministic outputs, aiding in consistent application behavior.

See it in action:


Elliott A.

Senior System Reliability Engineer / Platform Engineer

10 个月

Excellent

回复
Paul S.

AI/ML Engineer | Advancing Generative AI

10 个月

Cool, I'm sure it can be useful. I'm curious to see how this performs in comparison to other models. Is there a bias (for you specific example, sentiment analysis? Accuracy, etc.

回复

要查看或添加评论,请登录

Cohen Reuven的更多文章

社区洞察

其他会员也浏览了