The Multimodal AI Tsunami: Is Your Network Ready?
The world of artificial intelligence (AI) is constantly evolving, and one of the most exciting recent developments is the rise of multimodal AI. Multimodal AI systems can understand and generate multiple types of data, such as text, images, audio, and video. This new frontier in AI holds great promise for improving the way we interact with technology.
Last week was an interesting one in the world of AI. There was enough suspense built in the prior weeks, and the much-anticipated announcement from OpenAI did not disappoint. On Monday, May 13th, OpenAI announced the release of GPT-4o, an AI personal assistant that can generate and interact in text, images, audio, and video. As demonstrated in the GPT-4o launch, this is a significant step forward in the world of AI and how it interacts with humans.
The very next day, May 14th, Google announced a whole range of features and products that leverage multimodality at its annual developer conference, Google I/O. One of the demos that stood out as a competitor to OpenAI's GPT-4o was Project Astra, another personal assistant capable of doing pretty much everything that OpenAI claimed on Monday. Google has been experimenting with multimodality for quite some time through its Gemini models.
Another key player in the AI space, Inflection AI, introduced multimodality into Pi, their AI personal companion, by adding an audio interface that can engage in human-like conversations with the user, providing a voice call-like experience with the AI.
These are just a few examples, and as time progresses, we will continue to see more evolution in this area. We caught a glimpse of it through Microsoft Build, their annual developer conference, which detailed some high-impact use cases with multimodal AI models.
How does this key development affect service providers?
With large AI vendors promoting advanced use cases, more applications are starting to utilize these cutting-edge capabilities. As AI moves beyond the realm of text-based interactions, networks across the globe are bound to experience an increase in high-volume traffic, posing new demands for better-performing networks. Some key factors that will drive the adoption of this new era of applications are:
These interactions will inevitably place more demands on both the upstream and downstream channels of internet service provider networks. To visualize user behavior, imagine users of AI assistants being on all-day-long voice and video calls with AI as they go about their day-to-day activities. Unlike human participants in a WhatsApp call, who have limited time and patience, AI assistants have an infinite capacity to stay connected to the user. In my personal experience, most interactions with AI over audio or video have lasted at least an hour, largely because of how engaging some of these AIs can be. Pi stands out in this respect due to its focus on the emotional quotient. These interactions will likely increase the demand for higher network performance, especially lower latency and packet loss, along with the ability to handle higher traffic volumes.
The first step for service providers preparing for this new era of applications, which will soon emerge among their users, is to be equipped with the ability to identify AI traffic, classify the type of traffic content (audio/files/video/text) being exchanged with various AI systems, and evaluate network KPIs that provide insights into network performance under these conditions.
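As a rough illustration of those three steps (identify, classify, evaluate), here is a minimal Python sketch. It is not how AppLogic or any production classifier works. The `Flow` record, the hostname list, and the bitrate thresholds are all assumptions made up for this example, and real classification would rely on far richer signals than a hostname and an average bitrate.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

# Illustrative hostname-to-service mapping (an assumption, not a complete list).
AI_SERVICE_DOMAINS = {
    "chatgpt.com": "ChatGPT",
    "gemini.google.com": "Gemini",
    "pi.ai": "Pi",
}


@dataclass
class Flow:
    """Hypothetical per-flow record; field names are assumptions."""
    hostname: str        # e.g. observed server name for the flow
    bytes_up: int
    bytes_down: int
    duration_s: float
    rtt_ms: float
    packets_sent: int
    packets_lost: int


def identify_ai_service(hostname):
    """Step 1: return the AI service name for a hostname, or None."""
    host = hostname.lower().rstrip(".")
    for domain, service in AI_SERVICE_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return service
    return None


def classify_content(flow):
    """Step 2: guess the content type from average bitrate.

    The thresholds are arbitrary placeholders, not measured values.
    """
    if flow.duration_s <= 0:
        return "text"
    kbps = (flow.bytes_up + flow.bytes_down) * 8 / 1000 / flow.duration_s
    if kbps >= 500:
        return "video"
    if kbps >= 20:
        return "audio"
    return "text"


def summarize(flows):
    """Step 3: aggregate simple KPIs per (service, content type)."""
    groups = defaultdict(list)
    for flow in flows:
        service = identify_ai_service(flow.hostname)
        if service is None:
            continue  # not AI traffic under this sketch's definition
        groups[(service, classify_content(flow))].append(flow)

    report = {}
    for key, group in groups.items():
        sent = sum(f.packets_sent for f in group)
        lost = sum(f.packets_lost for f in group)
        report[key] = {
            "flows": len(group),
            "bytes_up": sum(f.bytes_up for f in group),
            "bytes_down": sum(f.bytes_down for f in group),
            "avg_rtt_ms": mean(f.rtt_ms for f in group),
            "loss_pct": 100.0 * lost / sent if sent else 0.0,
        }
    return report
```

Calling `summarize()` on a list of `Flow` records returns one entry per service and content type, with upstream and downstream volume, average round-trip time, and packet loss percentage, which are the kinds of KPIs the paragraph above refers to.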
It’s not just exciting to utilize the immense potential of multimodal LLMs, but also to have the opportunity to work on innovative solutions that cater to the needs of the evolving ecosystem supporting seamless interactions with AI. At Sandvine, my team's key focus on classification solutions of the future allows us to build and support this ecosystem. AppLogic, Sandvine's AI-powered classification engine, will play a crucial role in providing networks across the globe with the visibility and data necessary to be ready for the new Coming Wave. This will help kickstart and accelerate the journey for service providers.