Mind-blowing new OpenAI API Features: OpenAI Developer Day

Mind-blowing new OpenAI API Features: OpenAI Developer Day

Let me start by saying Sam Altman has just announced the end of the road of maybe 90% of AI startups out there, good news for some of us but the end for others. ChatGPT Plugins are finished, those Chat2PDF or Build ChatGPT with your own data apps are dead in the water. Many more in computer vision, text to speech apps etc.

However, I am still very excited and would like to share with you my reaction to the new updates. I was still working around 00:01—when a notification pinged from OpenAI. It was an email brimming with updates about the enhanced API. I nearly fell off the edge of my seat as I scrolled through the announcement, each bullet point more thrilling than the last.

Let us go through the updates.

Cost cost cost

Its wonderful news that they cost of the API, specifically GPT4-Turbo (New version) is going to be considerably cheaper.

Longer Content Length

GPT4-Turbo can now handle 128k token length wow - this is about 300 pages of a book and will go a long way in improving the model interaction and user experience.

JSON Mode and Reproducible Outputs

Yesss, finally - this was such a problem and usually meant we as developers had to do some prompt gymnastics to get consistent output. When building apps, you need consistency and predictability to avoid errors.

Multimodal Capabilities

Visual Input - the API can now ingest visual inputs. So imagine, now uploading a picture of a receipt and we add that to your accounting software as an expense or uploading a picture of a plant and get all the information you need about it.

TTS - Finally, we have a TTS API. We have had Whisper for a while, but no way to connect back with text to voice. I mean an OpenAI API text-to-speech with affordable cost per use for top of the industry quality model.

Image Output - this has been around, and I even switched to StableDiffusion on my application due to the bad quality of DALLE2 - keen to check out DALLE3, what I love already is the ability to get different image dimensions with a prompt.

Agents aka Assistants

I tried to manually build an agent, in MobileGPT there is the Learning Assistant that tries to simulate a learning agent. We got close enough, but with these outputs and improved function calling, json setting, reproducible outputs - we will reduce error rates considerably and create even better performing assistants.

That is not all, assistants now include built in:

  • Code interpreter
  • Retrieval
  • Function Calling (Improved)

This means you can ingest data (documents, text, code, etc) - run code in the call, return visualisations, more documents and even code. The possibilities are endless. I cannot wait to start playing with this in the API.

Conversation State

With the new Assistant functionality comes Threads that are stored within OpenAI, meaning you can manage conversation state within OpenAI itself instead of on your application.

This is great if you do not want to manage your own database, and useful for Assistant functionalities - meaning you do not need to resend the entire conversation with each API call, which will lead to lower overall costs due to the reduced tokens per call .

GPTs - Custom GPTs and GPTs store

The ChatGPT plugins functionality needed an upgrade, this change makes perfect sense. A GPT store for your favourite custom GPT. I will definitely have a few GPTs in there, loving the spirit from OpenAI team and Sam Altman.


OpenAI has propelled AI advancement light-years forward by embracing one fundamental principle: the open sharing of knowledge. Consider the realm of Computer Vision; how many fledgling companies could realistically aspire to tackle such a complex field? And what about the immense resources required, both in terms of cutting-edge GPUs and significant financial investment each month?

Now anyone can access state of the art computer vision model with a single API call for a few cents, this is amazing and a big kudos to the OpenAI team.

Follow my YouTube Channel for actual code and testing of these new features to be uploaded soon: https://www.youtube.com/c/skoloonline

要查看或添加评论,请登录

社区洞察

其他会员也浏览了