OpenAI DevDay: Unleashes GPT-4 Turbo. Kills traditional GenAI RAG Solutions

OpenAI hosted its inaugural developer event on Monday, and it made quite an impact. The company unveiled upgraded models and new APIs. In this article, I'll summarize the noteworthy features and explain how they relate to existing offerings, so that business and tech leaders can keep their footing in this rapidly evolving GenAI landscape.

Highlights

New GPT-4 Turbo released with impressive features:

  • Knowledge of world events up to April 2023.
  • 128k context window (About 300 pages of text in a single prompt).
  • 3x cheaper price for input tokens and a 2x cheaper price for output tokens.

New GPT-3.5 Turbo released with enhanced features:

  • 16K context window (about 24 pages).
  • Improved instruction following and parallel function calling.

Enhanced Function calls.

Reproducible outputs.

New “Assistants API” (a game changer).

Custom Models - Selected organizations with extremely large proprietary datasets (billions of tokens) can work with OpenAI researchers to train custom GPT-4 models for their specific domain and keep them private.

Copyright Shield to defend and cover legal costs for customers facing copyright infringement claims.


Some of these features deserve a closer look to understand their game-changing potential.

Let's explore a few of them in more detail:

1. Function Calls

What? Connects OpenAI models to your internal or external APIs and data sources.

Why? You can leverage the power of LLMs and your deep domain knowledge to:

  • Provide a personalized user experience.
  • Review or generate up-to-date internal content.

The Old:

  • Was only available with the Chat Completions API - the programmatic way to build your own chatbot.
  • Could only detect the need for a function call and generate the function's inputs, based on the user query and the list of available functions.
  • Did not automatically invoke a function; it only returned the input values extracted from the user query and the matching function from the provided list (see the sketch after this list).
  • Did not keep track of chat conversations. Developers were required to build this feature themselves.
  • Needed a RAG (Retrieval-Augmented Generation) architecture to act on your private knowledge base.
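For context, here is a minimal sketch of that older round trip, using the legacy functions parameter of the Chat Completions API (shown with the current openai Python SDK client). The get_order_status helper and the order-lookup scenario are hypothetical; the point is that the developer, not the model, invokes the function and maintains the message history.

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def get_order_status(order_id: str) -> str:
    # Hypothetical internal lookup; stands in for your own API or database call.
    return json.dumps({"order_id": order_id, "status": "shipped"})

# The developer keeps the conversation history; the API does not.
messages = [{"role": "user", "content": "Where is order 42?"}]

functions = [{
    "name": "get_order_status",
    "description": "Look up the shipping status of an order",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}]

# Step 1: the model only *suggests* a function call and its arguments.
first = client.chat.completions.create(
    model="gpt-3.5-turbo", messages=messages, functions=functions
)
call = first.choices[0].message.function_call

# Step 2: the developer invokes the function and feeds the result back.
if call is not None:
    result = get_order_status(**json.loads(call.arguments))
    messages.append({"role": "assistant", "content": None,
                     "function_call": {"name": call.name, "arguments": call.arguments}})
    messages.append({"role": "function", "name": call.name, "content": result})
    final = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
    print(final.choices[0].message.content)
```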

The New:

  • Now available with the new “Assistants API” - a formidable tool to unleash the full power of GPT Turbo models for your business.
  • Enhanced capability to detect the need for a function call and robust generation of function inputs.
  • Supports parallel function calls, eliminating the need for multiple round trips to the LLM. For example, you can enable customers to find the best prices for a product across various locations or enterprise systems in one go (see the sketch below).
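Here is a minimal sketch of the new tools / parallel tool-calls flow, assuming the gpt-4-1106-preview model; the get_price helper and the two-store price comparison are hypothetical.

```python
import json
from openai import OpenAI

client = OpenAI()

def get_price(product: str, store: str) -> str:
    # Hypothetical lookup against one of your pricing systems.
    return json.dumps({"product": product, "store": store, "price": 19.99})

tools = [{
    "type": "function",
    "function": {
        "name": "get_price",
        "description": "Get the price of a product at a given store",
        "parameters": {
            "type": "object",
            "properties": {
                "product": {"type": "string"},
                "store": {"type": "string"},
            },
            "required": ["product", "store"],
        },
    },
}]

messages = [{"role": "user", "content": "Compare the price of headphones at StoreA and StoreB."}]
response = client.chat.completions.create(
    model="gpt-4-1106-preview", messages=messages, tools=tools
)
msg = response.choices[0].message

# With parallel function calling, a single response can request several tool calls at once.
messages.append(msg)
for tool_call in msg.tool_calls or []:
    args = json.loads(tool_call.function.arguments)
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": get_price(**args),
    })

final = client.chat.completions.create(model="gpt-4-1106-preview", messages=messages)
print(final.choices[0].message.content)
```

One LLM round trip gathers both prices, instead of one round trip per store.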


2. Reproducible Outputs

What? Allows a higher degree of control over model behavior, ensuring (mostly) consistent completions from the LLM.

Why? Consistency and reliability are crucial for customer-facing use cases that rely on your internal knowledge. You wouldn't want your chatbot to provide different variations of the same response with each interaction.

The Old:

Model output was mostly random (non-deterministic), and dependable, repeatable (deterministic) output was a distant dream. The popular way to maximize determinism was to adjust the temperature and top-k/top-p settings, yet this falls short for many use cases that need consistent output.

The New:

  • Introduction of a “seed” parameter that controls model behavior when all other settings remain constant.

To get repeatable output, set the same integer value as the seed parameter across requests (e.g., 12345), keep all other parameters (prompt, temperature, top_p, etc.) identical across requests, and monitor the system_fingerprint field in the response to detect backend model or configuration changes that could still alter the output.
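A minimal sketch of that recipe, assuming the openai Python SDK; the prompt and model choice are illustrative only.

```python
from openai import OpenAI

client = OpenAI()

def ask(prompt: str):
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=[{"role": "user", "content": prompt}],
        seed=12345,        # same seed on every request
        temperature=0,     # keep every other sampling setting identical too
    )
    # If system_fingerprint changes between calls, OpenAI changed the backend
    # configuration and outputs may differ despite the fixed seed.
    return response.choices[0].message.content, response.system_fingerprint

answer_1, fp_1 = ask("Summarize our refund policy in one sentence.")
answer_2, fp_2 = ask("Summarize our refund policy in one sentence.")
print(answer_1 == answer_2, fp_1 == fp_2)
```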


3. Assistants API

What? An assistant is a purpose-built AI that follows specific instructions, utilizes additional knowledge (your proprietary data), and can invoke models and tools to perform tasks (send emails, post on social media, etc.).

Why? This feature enables businesses to create custom agents with distinct personalities and enhanced capabilities to meet their specific needs.

The Old:

You had to build your own agent framework to orchestrate the tasks required to complete a unit of work.

The New:

  • The Assistant API offers out-of-the-box automation for most tasks, particularly related to proprietary knowledge retrieval.
  • Introduction of “persistent and infinitely long threads”. This allows developers to delegate thread state management to OpenAI and overcome context window limitations.
  • Supports three types of tools today: Code Interpreter, Retrieval, and Function Calling.

  1. Code Interpreter: Writes and executes Python code in a sandboxed environment, enabling the generation of graphs and charts and the processing of files with diverse data and formats.
  2. Retrieval: Enhances your assistant's knowledge by incorporating external information, such as proprietary domain data and product information, provided by your users.
  3. Function Calling: This feature empowers assistants to invoke functions that you define and incorporate the function's response in their messages.

The Retrieval feature eliminates the need for complex computation and storage of document embeddings, as well as the implementation of chunking and search algorithms. The Assistants API optimizes the choice of retrieval technique based on OpenAI's experience building knowledge retrieval in ChatGPT. It requires the gpt-3.5-turbo-1106 and gpt-4-1106-preview models.
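To make this concrete, here is a minimal sketch of an assistant wired to the Retrieval tool, written against the beta Assistants API shape announced at DevDay (the methods live under client.beta and may evolve); the file name, instructions, and question are hypothetical.

```python
import time
from openai import OpenAI

client = OpenAI()

# Upload a proprietary document and attach the built-in Retrieval tool.
file = client.files.create(file=open("product_manual.pdf", "rb"), purpose="assistants")
assistant = client.beta.assistants.create(
    name="Product Support Assistant",
    instructions="Answer questions using the attached product manual.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],
    file_ids=[file.id],
)

# Threads hold conversation state on OpenAI's side.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="How do I reset the device?"
)

# A run executes the assistant against the thread; poll until it finishes.
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

for message in client.beta.threads.messages.list(thread_id=thread.id).data:
    print(message.role, ":", message.content[0].text.value)
```

Notice there is no embedding, chunking, or vector store in sight; the thread and the Retrieval tool carry that load.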

Overall, this is a formidable feature that has the potential to render the traditional RAG architecture (extraction, chunking, embedding, indexing, relevant retrieval, context-size management, conversation-state management) obsolete.


I am excited to experiment with GPT-4 Turbo.
