AI Observability | Monitoring Generative AI's New Capabilities
Generative AI models, capable of creating novel content in image, video, and text, are reshaping the landscape of technology and creativity. As these models rapidly evolve, monitoring their output presents challenges. This article identifies common concerns and offers inventive approaches, grounded in MLOps, for addressing the challenges of monitoring Generative AI.
New Capabilities and the MLOps Foundation
Generative AI presents exciting new opportunities for development and innovation. When and how we choose to apply these new capabilities is influenced by factors like accuracy, fluency, and risk - as highlighted by Barak Turovsky in his framework for evaluating Generative AI use cases. As the Generative AI application space develops, we must mature our monitoring and observability practices to incorporate generative models. Establishing a comprehensive AI Observability strategy - for both Predictive and Generative AI - becomes essential for any organization wishing to take advantage of these new capabilities.
MLOps emerged to streamline the machine learning model lifecycle, including the monitoring of models in production. It initially addressed model deployment and performance metrics, and has since expanded to cover model ethics, governance, and other concerns in operating production machine learning models.
When building a strategy for Generative AI, organizations need not discard existing MLOps investments. Rather, they should update relevant MLOps methodologies, adopt new technology where appropriate, and continuously apply state-of-the-art model evaluation methods by leveraging custom metrics as part of their generative model monitoring approach.
Leveraging and Updating Traditional MLOps for Generative AI
Generative AI presents challenges similar to those we've encountered working with Predictive AI – for example, concerns around bias and fairness. As with Predictive AI, a biased Generative AI model can produce outputs that perpetuate stereotypes or misconceptions.
Practical tools and techniques for addressing AI bias and fairness remain relevant. Established frameworks we’ve used for years can be modified to evaluate new AI data types, including prompts and LLM outputs.
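To make that concrete, here is a minimal sketch of how a long-standing fairness check - a "four-fifths"-style ratio comparison across groups - might be repurposed for LLM monitoring. It assumes each response has already been scored by some quality or sentiment model and tagged with a group label; the scores, group labels, and threshold below are illustrative assumptions, not part of any specific framework.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_disparity(records: List[Tuple[str, float]], threshold: float = 0.8) -> Dict[str, object]:
    """Classic 'four-fifths'-style disparity check, repurposed for LLM output scores.

    Each record is (group_label, score), where the score comes from whatever
    evaluation model you already trust (sentiment, toxicity, quality, etc.).
    """
    per_group: Dict[str, List[float]] = defaultdict(list)
    for group, score in records:
        per_group[group].append(score)

    # Mean score per group, then each group's ratio against the best-scoring group.
    means = {group: sum(scores) / len(scores) for group, scores in per_group.items()}
    best = max(means.values())
    ratios = {group: mean / best for group, mean in means.items()}

    return {
        "group_means": means,
        "ratios_vs_best": ratios,
        # Groups falling below the threshold warrant a closer look.
        "flagged_groups": [group for group, ratio in ratios.items() if ratio < threshold],
    }


if __name__ == "__main__":
    # Hypothetical per-response scores grouped by a demographic attribute in the prompt.
    sample = [("group_a", 0.92), ("group_a", 0.88), ("group_b", 0.61), ("group_b", 0.58)]
    print(group_disparity(sample))
```

The check itself is unchanged from how it has long been applied to tabular predictions; only the inputs differ, with scored prompt/response pairs standing in for model predictions.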
The New Frontier: Technologies Tailored for Generative AI
Beyond adapting established MLOps strategies, Generative AI invites the creation of bespoke tools built for this new technology. For example, with photorealistic AI-generated images virtually indistinguishable from actual photos, Google DeepMind and Google Cloud collaborated on SynthID, a digital watermarking tool for AI-generated images that embeds an imperceptible watermark within the pixels of an image.
SynthID not only provides attribution for content but also arms users against misinformation by clearly demarcating AI-generated media. It's worth noting that, while promising, SynthID only works with images generated by Imagen. However, SynthID sets a precedent for how similar tools might evolve for other AI models, including audio, video, and text modalities.
Custom Metrics: The Key to Evolving AI Observability
Monitoring Generative AI outputs isn't just a matter of accuracy. Enterprise-grade observability tools now emphasize Generative-AI-specific metrics like toxicity, truthfulness, and relevance. This advanced monitoring helps ensure that applications align with set parameters and remain within ethical boundaries. Ensuring the efficacy of Generative AI requires metrics that go beyond traditional data drift and accuracy, capturing the nuances of generated outputs and safeguarding against potential risks.
The state of the art for evaluating Generative AI is rapidly evolving. New, purpose-built LLMs are actively being designed, trained, and refined to monitor Generative AI. It's for this reason that open monitoring frameworks - ones that can accommodate custom metrics - are essential. Organizations need access to the latest evaluation models to effectively monitor their Generative AI models and develop trust in the quality of their outputs.
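As one illustration, the sketch below shows what an open, pluggable custom-metric setup might look like: metric functions are registered by name and averaged over batches of prompt/response pairs. The toxicity and relevance scorers here are deliberately crude placeholders (keyword matching and lexical overlap) standing in for purpose-built evaluation models or LLM judges; all names and values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class GenerationRecord:
    prompt: str
    response: str


# Placeholder term list; a real deployment would call a toxicity classifier or LLM judge.
TOXIC_TERMS = {"hate", "stupid", "idiot"}


def toxicity_score(record: GenerationRecord) -> float:
    """Fraction of response tokens that match the placeholder toxic-term list."""
    tokens = record.response.lower().split()
    if not tokens:
        return 0.0
    return sum(token in TOXIC_TERMS for token in tokens) / len(tokens)


def relevance_score(record: GenerationRecord) -> float:
    """Crude lexical-overlap proxy for relevance; embeddings or an evaluation LLM
    would replace this in practice."""
    prompt_tokens = set(record.prompt.lower().split())
    response_tokens = set(record.response.lower().split())
    if not prompt_tokens or not response_tokens:
        return 0.0
    return len(prompt_tokens & response_tokens) / len(prompt_tokens)


# An "open" metric registry: new evaluation techniques plug in without
# changing the monitoring loop below.
CUSTOM_METRICS: Dict[str, Callable[[GenerationRecord], float]] = {
    "toxicity": toxicity_score,
    "relevance": relevance_score,
}


def evaluate_batch(records: List[GenerationRecord]) -> Dict[str, float]:
    """Average each registered metric over a batch of prompt/response pairs."""
    return {
        name: sum(metric(record) for record in records) / len(records)
        for name, metric in CUSTOM_METRICS.items()
    }


if __name__ == "__main__":
    batch = [
        GenerationRecord("Summarize our refund policy.", "Refunds are issued within 30 days."),
        GenerationRecord("Draft a polite reply to a complaint.", "We are sorry and will make it right."),
    ]
    print(evaluate_batch(batch))
```

The point of the registry pattern is that when a better evaluation model appears, swapping it in means replacing one scoring function rather than rebuilding the monitoring pipeline.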
DataRobot’s custom metrics can host state-of-the-art model evaluation techniques alongside deployed models for continuous AI Observability.
As Generative AI models assume a more prominent role in technology and creativity, the imperative to monitor and manage their outputs grows stronger. Extended MLOps strategies, innovations like SynthID, and an emphasis on custom metrics pave the way for a more accountable and transparent future in AI.
Addressing the unique challenges posed by Generative AI ensures not only a surge in creative potential but also the responsible use of this groundbreaking technology. As AI Observability matures, the goal remains clear: ushering in an era where AI is both revolutionary and reliable.
Scott Munson is VP of Data Science at Evolutio, leading a team of Data Scientists and Machine Learning Engineers focused on accelerating AI initiatives to deliver business value for Evolutio’s enterprise clients.
As a product leader, Scott led the delivery of Evolutio's AI Observability module integrating DataRobot onto Cisco’s FSO platform, an AI integration between DataRobot and SAP, and Evolutio’s Analytics Relay data observability product.
Scott presents on the impact of monitoring and combining Predictive and Generative AI at webinars, hands-on GenAI workshops, and conferences including Nashville Analytics Summit, Google Cloud Next, and AI Summit New York.