Stop Babysitting ML Models
Travis Rehl
CTO & Head of Product // Pushing boundaries for SaaS and Startups // Cloud, Generative AI and more
Let's be honest. Labor is expensive, and business budgets are tightening.
Especially when it comes to managing, training, and quite frankly babysitting an ML model.
Every business wants clean and accurate ML models, so they employ staff to review model results and check their accuracy. If a result is accurate, they click "yes", and that result continues to train the model. If it's not accurate, they click "no" and the result is scrubbed.
Despite the importance of this process, it's time-consuming, mundane, and expensive.
For an enterprise, one of the greatest challenges in leveraging machine learning is the bottleneck caused by data labeling.
This cumbersome but vital task can be slow and expensive, which translates into sluggish deployment timelines that affect business deliverables and can impact the end customer experience.
Let's consider a security product designed to spot anomalous activities in an organization's system. The product’s effectiveness hinges on its ability to adapt to a specific customer environment. It needs to be trained on unique telemetry data pertaining to that customer’s operation over a specific period, an approach that involves extensive manual labeling of data points as 'normal' or 'anomalous.' This process could span weeks, if not months, before a custom model ready for deployment emerges, creating a significant lag that hampers customer satisfaction.
Alternatively, you can let a robot do it.
Instead of requiring staff to toil away at data labeling (classification), a #generativeai LLM from a provider such as OpenAI or Anthropic can be tuned to label anomaly data.
LLMs are not limited to natural language understanding tasks. They can also be leveraged to enhance the efficiency and effectiveness of anomaly detection in machine learning models.
Here's how.
The methodology involves using an LLM to examine a mix of 'good' data points, anomalies, and the new data point in question. The model then determines whether the new data point should be labeled as an anomaly or not, based on its comparison and understanding of the 'good' data and anomalies.
User
You are an anomaly detection expert. Analyze and classify the following logs,
providing a "Normal" or "Anomaly" answer.
<Application context>
The following web traffic and load balancer logs are generated by an eCommerce solution hosted on AWS
</Application context>
Step 1. Review the following potential logs to be classified as an anomaly
<YOUR ANOMALY LOGS GO HERE>
Step 2. Compare the potential anomaly logs to these known Normal logs.
<YOUR KNOWN GOOD LOGS GO HERE>
Step 3. Knowing the context of the application and its purpose, classify
potential logs as "Normal" or "Anomaly"
Assistant:
<Normal or Anomaly>:
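The prompt above can be assembled and its answer parsed in a few lines of code. Here is a minimal Python sketch, not a definitive implementation. The function names (`build_prompt`, `parse_label`, `classify_logs`) are hypothetical, and the `complete` argument is an assumption: it stands in for whatever call sends a prompt to your chosen LLM provider and returns the reply text, so no specific vendor API is shown.

```python
import re
from typing import Callable, List, Optional

# Template mirroring the prompt above; the three placeholders are filled per request.
PROMPT_TEMPLATE = """You are an anomaly detection expert. Analyze and classify the following logs,
providing a "Normal" or "Anomaly" answer.
<Application context>
{context}
</Application context>
Step 1. Review the following potential logs to be classified as an anomaly
{candidate_logs}
Step 2. Compare the potential anomaly logs to these known Normal logs.
{known_good_logs}
Step 3. Knowing the context of the application and its purpose, classify
potential logs as "Normal" or "Anomaly"
"""


def build_prompt(context: str, candidate_logs: List[str], known_good_logs: List[str]) -> str:
    """Fill the template with the application context and both sets of logs."""
    return PROMPT_TEMPLATE.format(
        context=context,
        candidate_logs="\n".join(candidate_logs),
        known_good_logs="\n".join(known_good_logs),
    )


def parse_label(reply: str) -> Optional[str]:
    """Return "Normal" or "Anomaly", or None when the reply is ambiguous."""
    has_normal = re.search(r"\bnormal\b", reply, re.IGNORECASE) is not None
    has_anomaly = re.search(r"\banomaly\b", reply, re.IGNORECASE) is not None
    if has_anomaly and not has_normal:
        return "Anomaly"
    if has_normal and not has_anomaly:
        return "Normal"
    return None  # neither or both labels present: do not guess


def classify_logs(
    complete: Callable[[str], str],  # assumption: wraps your LLM provider's API
    context: str,
    candidate_logs: List[str],
    known_good_logs: List[str],
) -> Optional[str]:
    """Build the prompt, send it through `complete`, and parse the label."""
    return parse_label(complete(build_prompt(context, candidate_logs, known_good_logs)))


if __name__ == "__main__":
    # Stand-in for a real model call, used only to show the flow end to end.
    def fake_complete(prompt: str) -> str:
        return "Anomaly" if "DROP TABLE" in prompt else "Normal"

    label = classify_logs(
        fake_complete,
        "Web traffic logs from an eCommerce solution hosted on AWS",
        ["GET /search?q=1;DROP TABLE users"],
        ["GET /cart", "GET /checkout"],
    )
    print(label)
```

Returning `None` for an ambiguous reply, rather than forcing a label, keeps uncertain answers out of the training data.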
This approach nearly eliminates the need for human supervision in data labeling, which can now run 24/7/365.
Additionally, it can rapidly and cost-efficiently identify anomalies in data, streamlining the training process and making it possible to create custom models more quickly than traditional methods allow.
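To make the labeling loop concrete, here is a rough Python sketch of running a classifier over a batch of records and collecting training labels. It is an illustration under assumptions, not a prescribed pipeline: `label_batch` is a hypothetical name, and `classify` stands in for any function that returns "Normal", "Anomaly", or `None` when the model's answer is unclear.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def label_batch(
    records: Iterable[str],
    classify: Callable[[str], Optional[str]],  # assumption: your LLM-backed classifier
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split records into machine-labeled training rows and a human review queue."""
    labeled: List[Tuple[str, str]] = []
    needs_review: List[str] = []
    for record in records:
        label = classify(record)
        if label in ("Normal", "Anomaly"):
            labeled.append((record, label))  # usable as a training example
        else:
            needs_review.append(record)  # unclear answer: leave for a person
    return labeled, needs_review
```

Only the records the classifier could not label cleanly end up in front of a person, which is where the reduction in manual effort comes from.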
The implications of leveraging #generativeai for operational model management are wide-ranging, from boosting customer satisfaction to driving down operational costs.
As we continue to unlock GenAI potential, we are paving the way for more efficient, customer-centric, and cost-effective solutions.