It’s time to be humAIn
Amit Adarkar
CEO @ Ipsos in India | Author of Amazon Bestseller 'Nonlinear' | Blogger
Every profession comes with its own occupational hazard. If you are a surgeon or a gynaecologist, you can be called to your hospital or clinic at any time of the day. If you are in a sales or field job, you have to be out on the road irrespective of the weather. And if you are in a customer service function, you may have to listen to irate customers all day. What about market research? It is difficult to imagine any significant occupational hazard associated with it. I beg to differ. As a market researcher, I feel compelled to answer every survey I get: from airlines, restaurants, hotels, eCommerce sites, you name it. Firstly, to celebrate my profession (if I don't take surveys, how can I expect other people to?). Secondly, to assess the questionnaires used by various companies and the quality of the interviewing (sometimes a tele caller masquerading as a market researcher will ask, "On a 10-point scale, would you rate us 9 or 10?" Arrrgh…). As a market researcher, I also fall prey to listening to tele callers, just so that I get to learn about new products and offers.
I was disappointed when companies started replacing human tele callers with IVR- or voice-recognition-based voice bots. These bots speak in a monotonous, emotionless way. Boring! For the same reason, I was never fascinated by Alexa or Siri. These bots just don't sound human.
What do I mean by 'don't sound human'? Here are some thoughts. Imagine the last time you spoke to your spouse, a friend or an office colleague. Won't you agree that your conversation was peppered with interruptions, emotion and shifts in tone?
Until recently, no chat bot, voice bot or even LLM-based generative AI could do this. When you use ChatGPT, you get text as output. When you use Midjourney, you get images as output. And when you use MusicLM, you get music as output.
This changed just last week, when OpenAI announced a new model called GPT-4o. The 'o' stands for omni. Unlike any other Gen AI model, GPT-4o understands voice, text and images, and it can respond with voice, text and images. It is primarily meant to be voice-based, unlike earlier models, which depended on text that you have to type. You can interrupt GPT-4o midway and change the direction of the discussion. You can also ask the model to change its tonality ("Hi GPT, can you speak with more emotion?"). The model understands and responds in more than 50 languages (real-time translation is now possible). It can read data charts as well as code, and it sounds very much human.
Personally, I am much more excited about GPT-4o than I was when ChatGPT was launched a year and a half ago. Here is a model that is almost human-like when you talk to it. In my opinion, until GPT-4o came along, AI was something external to humans, and "Human + AI" would have been the best way to describe how AI could help us.
With GPT-4o, AI has come much closer to humans, and "HumAIn" would be a better way to describe just how close AI now is to us.
In my recently published book #Nonlinear, I talked about an upcoming Digital Renaissance. With GPT-4o, it feels as if we are almost there.
Something to think about…