Multimodal AI Goes Mainstream

Today we got the big announcement of GPT-4. It differs markedly from earlier versions because it accepts more types of input: not only text, but also images. Future multimodal AIs may also take input from sources such as video and audio. It is this ability to work with different varieties of data as input that defines a "multimodal" approach.
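To make "multimodal input" concrete, here is a minimal sketch of what a text-plus-image request can look like in code. It assumes the OpenAI Python SDK's chat completions interface; the model name and image URL are illustrative placeholders, not details from the announcement.

    # Minimal sketch of a multimodal (text + image) request.
    # Assumes the OpenAI Python SDK (1.x); model name and URL are illustrative.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder for a vision-capable model
        messages=[{
            "role": "user",
            "content": [
                # A single message can mix input types ("modalities"):
                {"type": "text", "text": "What is shown in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }],
    )
    print(response.choices[0].message.content)

The key point the sketch illustrates is that text and image inputs travel together in a single request, rather than the model being limited to one kind of data.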

It also incorporates a much larger and more recent training dataset, which allows for greater accuracy and the handling of more complex requests.

As stated on OpenAI's website, the team spent six months refining this new version to generate better answers to users' requests.

In practice, in my view, it represents an "aha moment": a new paradigm of technology adoption enters our daily lives, and AI becomes a virtual assistant that enhances our work and lets us explore our curiosity and imagination in a very different way.

I believe we will go through an adaptation process with this new technology, and it will serve as a way to leverage our skills in what I usually call the "Human + Tech" approach: we will find different and curious ways to let our imagination generate new and singular combinations of ideas, ideas that will evolve through feed-forward processes and feedback from experience, helping to shape the innovation process in organizations worldwide in the years ahead.
