Beyond the chatbot
VentureBeat
The generative AI era is approaching its second anniversary. November 2024 will mark two years since OpenAI first launched ChatGPT on the web, ushering in a hype cycle that has scarcely abated since then.
But in that time, ChatGPT's interface has remained more or less the same: a long rectangle at the bottom where the user can enter text prompts and a pane above it where they can see their messages to the AI model and its responses in return, similar to a texting application.
Sure, OpenAI has added suggested prompts and lots of cool features tucked away in various menus, but the overarching experience for users is more or less as it was: type and get responses above.
That has started changing though. After a fracas with actor Scarlett Johansson and a long delay, OpenAI finally shipped its ChatGPT Advanced Voice Mode to users at the end of last month, allowing them to speak with the model, interrupt it, and carry on long back-and-forth conversations. It's a perfect companion on your phone when you're walking around outside and want help identifying the type of clouds above you, or simply answering your child's questions about whether Santa Claus and his elves are real.
Others report that it works well for ideating, dictation, learning new languages, and much more.
Over the summer of 2024, OpenAI also launched its first standalone search product, SearchGPT, which provides more information from the web in its responses and interactive elements such as weather forecasts and imagery from source sites. It remains available only to those on a (since closed) waitlist, but clearly shows that OpenAI is thinking beyond its standard chatbot interface.
And just this week, the company launched Canvas, a new sidebar-style interface that users can activate within ChatGPT itself. It displays content such as a text document or software project on the right side of the screen, with the underlying GPT-4o large language model (LLM) updating only selected portions of that output. The user can also highlight content and apply a variety of dynamic tools, beyond text prompts, from a pop-over menu.
While clearly analogous to Anthropic's Claude Artifacts view, OpenAI's implementation of the idea is unique and shows that the company is continuing to iterate and experiment within its signature product. It's unafraid to change things up to offer a potentially better experience for users.
Which raises the larger question: what is the proper interface for users to interact with LLMs? Is it a chatbot, one with a workspace attached, or something else, like search, that changes depending on context?
That last possibility is the idea of Karina Nguyen, an AI engineer and researcher at OpenAI who worked on Canvas. She took to X to note it was "the first time we are fundamentally changing how humans can collaborate with ChatGPT since it launched two years ago," and in another post explained where it might be headed next:
"My vision for the ultimate AGI interface is a blank canvas. The one that evolves, self-morphs over time with human preferences and invents novel ways of interacting with humans, redefining our relationship with AI technology and the entire Internet."
Two years into the gen AI era, the question remains open-ended. What do you think? Let us know in the replies.
Thanks for reading, writing in, subscribing, sharing, liking, and just being you. Happy weekend and shana tova to those celebrating Rosh Hashanah.