OpenAI recently announced several exciting updates: the Realtime API, Prompt Caching, Model Distillation, and Vision Fine-Tuning.
- Realtime API - The Realtime API enables developers to build apps with real-time, speech-to-speech interactions. It is like ChatGPT’s Advanced Voice, but for your own application. Audio capabilities in the Realtime API are powered by the new GPT-4o model gpt-4o-realtime-preview. The Realtime API costs around $0.06 per minute of audio input and $0.24 per minute of audio output.
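The Realtime API is driven by JSON events exchanged over a WebSocket. The sketch below builds two such client events for the gpt-4o-realtime-preview model named above. The field names follow OpenAI's published Realtime event format, but treat the exact schema as an assumption and check the official reference before relying on it; no network call is made here.

```python
import json

def session_update_event(voice: str = "alloy") -> dict:
    """Build a session.update event enabling speech in and out for the session."""
    return {
        "type": "session.update",
        "session": {
            "modalities": ["audio", "text"],
            "voice": voice,
            "input_audio_format": "pcm16",
            "output_audio_format": "pcm16",
        },
    }

def user_text_event(text: str) -> dict:
    """Build a conversation.item.create event carrying a user text message."""
    return {
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": text}],
        },
    }

# In a real client, these dicts are serialized and sent over the WebSocket
# connection to the Realtime endpoint; audio responses stream back as events.
event = user_text_event("Hello!")
payload = json.dumps(event)
```

A full client would open the WebSocket with an API key, send `session_update_event()` first, then stream audio or text events and handle the server's response events.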
- Prompt Caching - Prompt Caching reduces cost and latency by reusing recently seen input tokens, speeding up prompt processing. Cached input tokens are offered at a discount compared to uncached ones. Prompt Caching is applied automatically on the latest versions of GPT-4o, GPT-4o mini, o1-preview, and o1-mini, as well as fine-tuned versions of those models.
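Because caching is automatic but prefix-based, the main thing a developer controls is prompt structure: keep the large, static instructions first and the per-request content last, so successive calls share an identical cacheable prefix. The helper below is illustrative, not an OpenAI API.

```python
# A long, unchanging system prompt: the ideal candidate for a cached prefix.
STATIC_SYSTEM_PROMPT = (
    "You are a support assistant. "
    "Always answer concisely and cite the relevant policy section."
)

def build_messages(user_question: str) -> list[dict]:
    """Keep the stable prefix identical across calls; vary only the tail."""
    return [
        {"role": "system", "content": STATIC_SYSTEM_PROMPT},
        {"role": "user", "content": user_question},
    ]

a = build_messages("How do I reset my password?")
b = build_messages("How do I close my account?")
# The two requests share an identical leading message, so the cached
# prefix can be reused; only the final user message differs.
assert a[0] == b[0]
```

Conversely, putting variable content (a timestamp, a user ID) at the top of the prompt changes the prefix on every call and prevents cache hits.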
- Model Distillation - This feature allows developers to fine-tune a cost-efficient model on the outputs of a large frontier model, all on the OpenAI platform. OpenAI offers 2M free training tokens per day on GPT-4o mini and 1M free training tokens per day on GPT-4o through October 31, 2024 to help developers get started with distillation.
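The workflow has two halves: capture the frontier model's completions as stored data, then use them to fine-tune a smaller model. The sketch below shows the shape of a teacher request as a plain dict rather than a live API call; the `store` and `metadata` fields follow OpenAI's stored-completions feature, but verify the exact parameter names against the current API reference.

```python
# Step 1: request a completion from the frontier "teacher" model and ask
# the platform to persist it for later use as distillation training data.
teacher_request = {
    "model": "gpt-4o",                   # frontier "teacher" model
    "store": True,                       # persist the completion on the platform
    "metadata": {"task": "summarize"},   # tag for filtering stored completions
    "messages": [
        {"role": "user", "content": "Summarize the following article: ..."},
    ],
}

# Step 2 (done in the OpenAI platform): select the stored completions,
# optionally filter by metadata, and launch a fine-tune of a cheaper
# "student" model on them.
student_base_model = "gpt-4o-mini"
```

The free daily training-token allowance mentioned above applies to the fine-tuning half of this loop.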
- Vision Fine-Tuning - This feature allows developers to fine-tune GPT-4o with images and text to improve vision capabilities. OpenAI offers 1M training tokens per day for free through October 31, 2024 to fine-tune GPT-4o with images.
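Vision fine-tuning uses the same chat-format JSONL as text fine-tuning, with image content embedded in the user message. The content schema below (the `type`/`image_url` fields) is my assumption from the public chat message format; confirm it against OpenAI's fine-tuning guide before uploading a training file.

```python
import json

def training_example(image_url: str, label: str) -> str:
    """Serialize one JSONL training line pairing an image with a target answer."""
    example = {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What is shown in this image?"},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
            {"role": "assistant", "content": label},
        ]
    }
    return json.dumps(example)

# One line of a training file; a real dataset would have many such lines
# written to a .jsonl file and uploaded for the fine-tuning job.
line = training_example("https://example.com/cat.png", "A cat sitting on a mat.")
```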
Liquid AI announced Liquid Foundation Models (LFMs) – a new generation of generative AI models that achieve state-of-the-art performance at every scale while maintaining a smaller memory footprint and more efficient inference. According to Liquid AI, this is the first time a non-GPT architecture significantly outperforms transformer-based models.
- LFMs come in three sizes: a 1.3B model ideal for highly resource-constrained environments, a 3.1B model optimized for edge deployment, and a 40.3B Mixture of Experts (MoE) model for tackling more complex tasks.
- LFM-1B achieves the highest scores across various benchmarks in the 1B category, making it the new state-of-the-art model at this size.
- LFM-3.1B is on par with Phi-3.5-mini on multiple benchmarks, while being 18.4% smaller.
- LFM-40B leverages 12B activated parameters at inference and is comparable to models larger than itself, while its MoE architecture enables higher throughput and deployment on more cost-effective hardware.
Black Forest Labs recently announced FLUX1.1 [pro] which provides six times faster generation than its predecessor FLUX.1 [pro] while also improving image quality, prompt adherence, and diversity.
- Superior Speed and Efficiency: FLUX1.1 [pro] provides an ideal tradeoff between image quality and inference speed. It is three times faster than the currently available FLUX.1 [pro], which was itself updated to run twice as fast, accounting for the six-fold speedup over the original.
- Improved Performance: FLUX1.1 [pro] surpasses all other models on the Artificial Analysis image arena leaderboard, achieving the highest overall Elo score.
OpenAI's Canvas is a new way of working with ChatGPT to write and code.
- Canvas opens in a separate window, allowing you and ChatGPT to collaborate on a project.
- Canvas was built with GPT-4o and can be manually selected in the model picker while in beta.
- Access to the canvas feature is rolling out to ChatGPT Plus and Team users globally; Enterprise and Edu users will get access next week.