Multimodal AI Goes Mainstream
Renato Azevedo Sant Anna
Product Marketing - Digital Innovation & Insights Specialist | GenAI Strategy & Digital Transformation | Strategic Positioning, Tech Writing & Foresight for B2B Marketing | Mentor at FasterCapital | AI Blogger | Speaker
Today we got the great announcement of GPT-4. It is very different from previous versions because it accepts more types of input: not only text, but also images. Future multimodal AIs may also accept sources such as video and audio. It is this ability to work with different varieties of data as input that defines a "multimodal" approach, as the sketch below illustrates.
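To make the idea concrete, here is a minimal sketch of what a multimodal (text + image) request could look like with the OpenAI Python client. The model name, image URL, and prompt are illustrative assumptions, not details taken from the announcement.

```python
# Minimal sketch of a multimodal (text + image) request.
# Assumes the official `openai` Python package and a vision-capable
# model; the model name and image URL below are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any vision-capable model would work here
    messages=[
        {
            "role": "user",
            "content": [
                # A single user message can mix text and image parts:
                {"type": "text", "text": "Describe what is in this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```

The key point is that the text prompt and the image travel in the same message, so the model can reason over both together rather than handling each input type separately.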
It also incorporates a much larger dataset and more recent data, which allows for greater accuracy and the ability to handle more complex requests.
As stated on their website, OpenAI spent six months refining this new version in order to generate better answers to users' requests.
In practice, in my view, it represents an "aha moment": a new paradigm of technology adoption is incorporated into our daily lives, and AI becomes both a virtual assistant that enhances our work and a very different way to explore our curiosity and imagination.
In my view, we are going to go through an adaptation process with this new technology, and it will serve as a way to leverage our skills in what I usually call a "Human + Tech" approach. In this approach, we will find different and curious ways to let our imagination generate new and singular combinations of ideas. These ideas will evolve through a feed-forward process, and the feedback from experience will help shape the innovation process in organizations worldwide in the years ahead.