ChatGPT is blind. Hear me out.
Have you ever compared ChatGPT's input mechanisms with the human senses? Or maybe the other way around?
The idea is simple - just like human senses give our brain inputs to process, ChatGPT is given inputs from us. We are ChatGPT's senses to the world.
While it does not match a human brain's capacity to correlate massive amounts of diverse data and process it quickly, it has specialized in one particular type of activity - reading. As of now, ChatGPT can read a lot, listen a lot, and also speak a lot. It has essentially gained one of the most important senses humans possess - the ears. I don't know about you, but I have gotten addicted to ChatGPT's voice-to-text feature, and it genuinely feels like having a conversation with a real assistant.
Now, talking is not exactly a sense, but it can also be added to ChatGPT by simply converting its text responses into voice responses. Billions of dollars' worth of work is already in progress to make it extremely accurate. So, we have also (almost) given ChatGPT a tongue. Though it still lacks the ability to taste.
Or smell.
Or touch.
Or... See.
A lot of work is underway (I believe) to give robots these senses at an individual product level. However, our focus here is on the soft, intellectual outputs (because they scale much faster), not the hard, physical ones.
There are already machines that can detect the taste of a food item. Similarly, there are tools that can analyze the particles in the air - in other words, smell it. Sensors that can detect pressure and sense touch have been around for a while - the touchscreen you might be holding right now is built on them. And lastly, it has been more than a century since we gave the artificial world an eye - the camera.
So, why am I focusing specifically on ChatGPT being blind? Because 1. it is revolutionary (from a practical point of view, sight matters more to an AI than any other sense for getting real tasks done), and 2. it is the sense we are closest to achieving - processing images and merging that understanding with generative responses.
ChatGPT is currently blind. But it is slowly gaining its eyes through an age-old invention. Soon, it will be able to see the charts and graphs in the Excel sheets you upload, take a look at your photos to give you fashion tips, or scan the image of a crime scene to highlight clues in a few seconds!
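Purely as an illustration of where this is heading, here is a minimal sketch of what asking a vision-capable model about an uploaded chart could look like through an API. The model name, image URL, and question are placeholder assumptions, and the exact request shape will vary by provider and version - this is not a description of what ChatGPT can do today.

```python
# Minimal sketch: asking a multimodal (text + image) model about a chart.
# Assumptions: the OpenAI Python SDK, an OPENAI_API_KEY set in the environment,
# and access to a vision-capable model; the model name and URL are placeholders.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder for any vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What trend do you see in this sales chart?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/sales-chart.png"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)  # the model's reading of the chart
```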
And ChatGPT is only one player. We have many more in this race to greatness! (I am just as excited about Bard and Adobe Firefly!)
**
We are underestimating how massively we have grown as a species over the last two years. From developing a vaccine in record time - 5x faster than ever before - to revolutionizing the world through blockchain, AR/VR and web3! We are underestimating the era we have just entered!
The next big phase is not far off. Just wait till Large Language Models (LLMs) are merged with AI-based image processing / computer vision. It is going to change the consumer economy (and humanity) forever!