Will Multimodal GenAI Be a Gamechanger for Industry?

Google’s launch of Gemini can be seen as the latest advancement in generative AI, highlighting a shift toward multimodality.

At launch, ChatGPT (GPT-3.5) revolutionized content production, and subsequent large multimodal models (LMMs) like GPT-4 and Gemini have the potential to revolutionize sectors such as manufacturing, e-commerce, and agriculture.

These new LMMs are trained on images and code, rather than on text alone. Gemini adds audio and video, allowing the AI to directly perceive the physical world.

The race is on among tech companies and open-source communities to add new modalities that enhance LMMs’ industrial applications.

The So What

Such multimodal capability will be transformational for industry, says Leonid Zhukov, Ph.D., director of the BCG Global AI Institute.

Traditional AI is constrained by preset rules: users decide what they want the AI to do and train it for that task. While GenAI models break free from this constraint, LMMs go even further. They can take in so many forms of data that they could respond to seemingly unlimited situations in the physical world, including those that users can’t predict, Zhukov explains.

Companies’ current 10% to 20% efficiency gains from GenAI bots could expand into new domains with LMMs, he says.

And this is just the beginning. “Today’s LMMs can see and hear the world. Tomorrow they could also be trained on digital signals from equipment, IoT sensors, or customer transaction data, to create a complete picture of your enterprise’s health on its own, without explicit instruction,” Zhukov says.

Here are just a few potential industrial applications:

  • Predictive maintenance and plant optimization. Instead of simply flagging known fault points, LMMs could take in video, sounds, and vibrations throughout the production line, independently monitoring for subtle changes and identifying unexpected signs of deterioration.
  • Digesting visual data to drive understanding. At a sorting plant, algorithms can already be tasked with detecting individual items, such as plastic bottles for recycling. LMMs could independently see and analyze all waste, filter large mixes of objects, and identify unpredicted items.
  • Medical advances. LMMs could improve the accuracy of AI models that analyze scans such as MRI, CT, and X-rays by layering in sound data such as heartbeats, and then use natural language to engage with doctors on personalized treatment plans.
  • Accessible shopping experiences. LMMs could convert data from a retailer’s physical and digital presence into the best source of real-time information for a customer’s needs (for instance, visual or auditory support), providing a more inclusive shopping experience.

Now What

Firms need to prepare to integrate multimodal models. According to Zhukov, leaders should:

  • Drastically revisit their data strategy and operations. LMMs promise to deliver enormous value from underutilized (or uncollected) data. This is significant because, according to a study by Seagate, companies are currently underutilizing up to 70% of the data they collect. Companies also need to make sure the data has the right features, for example, timestamps, to be fed into the models.
  • Decide whether to build or partner. AI services will likely evolve from a few large models toward many smaller industrial ones. And unlike pure text models, multimodal models are unlikely to offer out-of-the-box solutions right away, because industrial data is not publicly available. Some large industry players may choose to build their own models and offer them as a service to others; smaller firms will need to find the right partners. That choice will determine the type of training and hiring needed to support and integrate the models.
  • Monitor GenAI’s jagged frontier. LMMs have the potential to become the brains of autonomous agents, which don’t just sense but also act on their environment, in the next 3 to 5 years. This could pave the way for fully automated workflows, Zhukov believes.
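
As a small illustration of the data-readiness point in the first item above, a team might start by checking whether collected records carry usable timestamps at all. The sketch below is a hypothetical example, not part of Zhukov’s recommendations: the record layout, the `timestamp` field name, and the sensor names are all assumptions, and it treats only ISO 8601 strings as valid.

```python
from datetime import datetime


def missing_timestamp_rows(records, field="timestamp"):
    """Return the indexes of records whose timestamp is absent or not ISO 8601.

    `records` is a list of dicts; `field` is a hypothetical column name.
    """
    bad = []
    for i, rec in enumerate(records):
        value = rec.get(field)
        try:
            # Raises TypeError for a missing value, ValueError for a malformed one.
            datetime.fromisoformat(value)
        except (TypeError, ValueError):
            bad.append(i)
    return bad


# Made-up sample readings, for illustration only.
readings = [
    {"sensor": "pump-1", "vibration_mm_s": 2.1, "timestamp": "2024-01-05T08:00:00"},
    {"sensor": "pump-1", "vibration_mm_s": 2.4},
    {"sensor": "pump-2", "vibration_mm_s": 7.9, "timestamp": "yesterday"},
]

print(missing_timestamp_rows(readings))
```

A check like this only flags rows that need attention; deciding whether to repair, re-collect, or discard them depends on the data strategy each company chooses.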


For Further Reading:

How People Can Create—and Destroy—Value with Generative AI

Turning GenAI Magic into Business Process

GPT Was Just the Beginning. Here Come Autonomous Agents

Or visit BCG X, the tech build and design unit of BCG.

Eugene B.

Program Leader: Critical Data Security | Digital Transformation | AI Data Readiness | Ethical AI Governance

10 months ago

As leaders take these new capabilities to market, it will be very important to carefully and narrowly tailor the use cases, the feature-function-benefit mix, and the duration of projects to deliver value in a predictable time frame. Even though the iterative nature of the learning curve along the "jagged edge" of this technology makes the true value of these solutions unpredictable for now, enterprise CXOs and their leadership will need very careful guidance that is attuned to their operational and financial constraints.
