From a Whisper to a Chat(GPT)

From a Whisper to a Chat(GPT)

Hey there, it's a great day for tech enthusiasts! OpenAI just dropped two new APIs that are set to change the game when it comes to natural language processing (NLP). The ChatGPT and Whisper APIs are the latest developments from OpenAI, and they're bound to make integrating NLP into software a breeze.

First off, let's talk about the ChatGPT API. This language model is designed to engage in realistic conversations with humans. With its in-depth understanding of human language, ChatGPT can be integrated into chatbots, virtual assistants, and other conversational interfaces. What's exciting about this API is that it's trained on vast amounts of human-generated text, making it highly accurate and adaptable. It's expected to help developers create chatbots that can offer excellent customer support, answer user queries, and even crack jokes!

Ok. As it turns out, the above two paragraphs were generated by ChatGPT. They were taken verbatim, no revisions were made:

No alt text provided for this image

When you read the start of this newsletter you may or may not have thought it sounded like me. It does, mostly. ChatGPT took some of my prompts that I've written over the last two months, and some of the text I had it rewrite, and it got a "feel" for what I sound like.

Now this is not really anything new if you have been keeping an eye on ChatGPT and all that is going around about it. Pros and cons abound on this stuff. But no matter what you think about it, you know in your heart of hearts this is cool stuff. Scary cool or just super cool. Fascinating stuff.

ChatGPT API? GPT-3.5 API?

So what is the big deal about an API coming out for ChatGPT if we have the tool at our fingertips? Well, an API is an Application Programming Interface, a way to connect to a program without screens. You write code in your app that connects to the API to get data, to push data to a database, etc.

"But I thought there was already an API for this? What is the big deal?" Great questions! What we devs had was the GPT-3.5 API. This is the text completion engine that sits inside the ChatGPT application.

So ChatGPT is a chat application built on GPT-3.5. Get it?

Chat...GPT engine...ChatGPT.

GPT-3.5 is version 3.5 of the GPT engine. Blah blah blah...what you need to know is that the API for it is not chat-based. GPT is a text completion engine. You give it something, it completes the thought. You say "Why is the sky blue?" and it will give you an answer, thus completing the thought.

Guru Example

Look at it like taking your question up the mountain to the guru meditating there. You ask your question, you get an answer. "How many licks does it take to get to the center of a Tootsie Pop?" You are old if you know not only the question, but the answer.

You don't get to ask more than one question. So you go back down the mountain. If you want him/her to reword their answer, you have to go back up and ask it again with more specifics in the question about what you want for an answer.

You can't say "Hi guru, it is me again. Can you reword your answer?" They will look at you and go "Who are you and what answer are you talking about?"

It is very Q&A oriented. Microsoft Research built ChatGPT to be a conversational web app. It remembers the questions you ask. It knows what you've talked about previously. Thus, the concept of a "chat". You speak, I listen. Then I reply, you absorb and ask another question. That sort of thing.

You can ask the guru a question, get an answer and then say only "Can you rephrase that?" and it knows what you are talking about. If you go down the mountain and come back 2 days later and ask the guru to rephrase the answer, they will know who you are, what you asked, what you talked about, etc.

Until March 1, 2023, apps we created were using the one-off method. Every question stands alone.

Now, using the ChatGPT API we can basically have ChatGPT running inside our apps. Instead of needing to have ALL the context in every prompt, you can just add the new part like you would talking to the guru. (Geeks reading this - I know this is a broad brush explanation that isn't 100% accurate. I do it for ease of understanding by non-geeks. Bear with me.)

This is huge for devs who can now program real conversational apps using AI, without having to reword the whole question somehow using the old single-use Q&A methods.

Now we can do real AI chat. And generate lead magnets, and then tell it to regenerate just page 7 (and it will know what you are talking about). Everything will be in context as the conversation goes on, so the answers will be smarter, better, faster and cheaper. As compared to GPT-3.5, ChatGPT API calls are 1/10 of the cost!

Whisper

So what is Whisper? Whisper is an AI-based speech-to-text engine made by OpenAI. Check this out - you can upload a 25MB audio file, and it will do a transcription or a translation of the audio for you. Sure, speech-to-text has been around a while, but you've all seen the fun our phones and softwares have had understanding us. It is far from perfect.

But here is where OpenAI does its magic of scouring tons of real data to use in its AI models to be more accurate than anything that has gone before. Here is what they say, far better than what I ever could:

No alt text provided for this image
https://openai.com/research/whisper

680 THOUSAND hours of data collected from the web. With all our ums, ahhs, y'alls, ay yuhs...and that is just English. This is a cool API that will let us upload files and get a transcription on the fly for it.

Ok...now for some meat for you to chew on. This newsletter IS about innovation, right? These APIs are tools, that's it. What YOU build is the innovation.

1) You have an app that takes in your voice, asking for something. Sort of like Alexa or hey Google or whatever. But instead of it being some big deal like those, it is YOUR app. Yours. You wrote it. It takes in the speech, calls to the Whisper API and gets an immediate transcription of it. That transcription is put inside a ChatGPT prompt that you then use in a call to the ChatGPT API. The result you get back is then shown on the screen. The user then uses their voice and says "Now make it, you know, more like...um...more like Snoop Dogg would say it." And that goes over to Whisper, then to ChatGPT who returns back the text just like Snoop would say it, fo shizzle!

2) You are in a foreign country and do not understand what someone is saying. You hold up your app, get their voice in their own language, send it to Whisper which then gives you back a transcription in English.

3) There are apps that take in several pictures of you from different angles and with different facial expressions to generate a 3D realistic avatar of you. Now put on a VR headset, and speak some question. It goes to Whisper, then to ChatGPT and then the result is put into speech synthesis software and is spoken back to you by your avatar of you. With near perfect deep fake movements. ChatGPT becomes the answer bot. And then you say "Can you rephrase that, you handsome devil?" and it will.

These are the beginnings of what these APIs can do for us. Now I know, you have concerns and big scary monsters under the bed. Welcome to change and innovation of historic proportions! Don't let fear stop the GOOD this can do in the world.

Tackle the scary things as they come up, not sooner, and know that people much smarter than I will be working on those things.

For now, dream. This is the whisper...the beginnings. Fo shizzle.

Mark Douglas, AIC, SCLA

???? Online Marketing | ?? Digital Marketing. ?? Affiliate Marketing | I'm a Sales And Marketing Geek And I Can Help You Grow And Monetize Your Brand Using Social Media

2 年

Loved your article and explanation... I love seeing you lead the conversation around ChatGpt and it's practical uses... Just one thing... About that first section... I think ChatGpt is funnier than you ??...

Nicholas Soldo

General Contractor & Fatherpreneur?? ??

2 年

Awesome Greg Howe! Love the way you came out of the gate. Fooled me in beginning! I need to learn more about this API integration and how to prompt it correctly.

要查看或添加评论,请登录

Greg Howe的更多文章

  • Understanding the Microsoft and OpenAI Relationship

    Understanding the Microsoft and OpenAI Relationship

    Disclaimer - this is all educated opinion based on observations and decades of experience working with Microsoft and…

    1 条评论
  • "The Puck is Teleporting"

    "The Puck is Teleporting"

    Ever feel like AI is moving too fast for you to keep up? (If you say "Naaa, I got this" I will call you a liar..

  • Functionality Out of a Data Tapestry

    Functionality Out of a Data Tapestry

    What comes next in automations and software is crafting more AI-based functionality in a simpler model. Think LEGOs.

  • Build It and AI Will Come

    Build It and AI Will Come

    Truly fascinating AI news came up with all the push Microsoft does annually around its Microsoft Build conference. As a…

  • It's Not "Us vs Them"

    It's Not "Us vs Them"

    Today I replied to a post in the "ChatHeads" Facebook group, that was talking about Eliezer Yudkowsky's time.com…

  • Innovation Through Connectivity

    Innovation Through Connectivity

    You Build The Context One of the cool places I see ChatGPT and other AI tools heading, that will catapult us into even…

  • Spending All Day In Your (Em)bed

    Spending All Day In Your (Em)bed

    Spending All Day In Bed Kind of gives you a nice cozy feeling, doesn't it? Remember when you could spend all day in bed…

  • Is AI Replacing Me?

    Is AI Replacing Me?

    ChatGPT ChatGPT ChatGPT ChatGPT ChatGPT ChatGPT ChatGPT ChatGPT and AI ChatGPT Sucks! It's ridiculous. It's cool.

  • Onshore Winds From the Southwest

    Onshore Winds From the Southwest

    It is worse than you think. Southwest Airlines has just had a meldown of epic proportion that affected thousands of…

    2 条评论
  • Conveyor Belt of Innovation

    Conveyor Belt of Innovation

    When I talk to people about innovating in software, they often jump to the "end" and ask how I will run a big company…

社区洞察

其他会员也浏览了