AI at scale: Managing ML models over time & across use cases

Algolia

The leading provider of AI search solutions, serving over 17,000 businesses and 500,000 developers globally.

Published: September 7, 2023

Just a few years ago, building a new AI service from scratch required considerable resources. That has changed. Yet building the service is only a small first step: the actual challenge of running AI at scale is sustaining quality over time and across varied use cases.

Managing the lifecycle of ML models over time and across use cases is essential to the long-term success of investments in AI. For specific tasks such as translating languages or answering questions, minimal knowledge of Python is all it takes to interact with powerful pre-trained ML models, which are easily found in repositories such as Hugging Face.

Integrating such a model at the heart of an API is also relatively easy. Running AI-powered services in production does not differ much from running “conventional” services: it may be more CPU-intensive than a typical CRUD application, but serving a large number of requests with acceptable latency similarly boils down to how many machines you use, and hence how much money you spend.

However, while it may be easy to get started, it is much harder to maintain, optimize, and scale AI over time. That is why lifecycle management, rather than the initial integration, determines long-term success.

The challenges of AI over time

There are scores of new AI models, each more capable than the last, with more hidden layers, more parameters, and different architectures. Game-changing ML models appear regularly, and because adjusting an architecture is trivial, new variants appear almost constantly. Not all of them are efficient or even relevant to every business use case, but some can significantly improve results. How can you know whether a new model is better than a previous one? Deploying ML models and comparing their performance is crucial.

As an additional complication, the performance of a given ML model is known to change over time: its predictive ability or classification power decays. The causes of this decay, known as concept drift, are beyond the scope of this article. It can be understood as a consequence of changes in the “global context”: new habits appearing, usage of words evolving, seasons changing, people’s preoccupations shifting. To adapt, existing ML models must be monitored over time and retrained, either manually or continuously, before being redeployed and compared.
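One common way to monitor for this kind of decay is to compare the distribution of a model's output scores in a recent window against a reference window. The sketch below uses the Population Stability Index (PSI) for that comparison. It is an illustration only, not a description of Algolia's monitoring: the function name `psi`, the assumption that scores lie in [0, 1], the bin count, and the 0.25 alert threshold (a widely quoted rule of thumb) are all choices made for this example.

```python
import math
from typing import Sequence


def psi(reference: Sequence[float], recent: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of scores in [0, 1]."""

    def proportions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so that a score of exactly 1.0 lands in the last bin.
            counts[min(int(v * bins), bins - 1)] += 1
        # Add a tiny constant so that empty bins do not produce log(0).
        eps = 1e-6
        total = len(values) + eps * bins
        return [(c + eps) / total for c in counts]

    ref = proportions(reference)
    rec = proportions(recent)
    return sum((r - q) * math.log(r / q) for q, r in zip(ref, rec))


# Hypothetical score samples: last month's scores versus this week's.
reference_scores = [i / 100 for i in range(100)]
recent_scores = [0.5 + i / 200 for i in range(100)]

# Assumed threshold: PSI above 0.25 is often treated as a significant shift.
if psi(reference_scores, recent_scores) > 0.25:
    print("score distribution shifted: consider retraining")
```

A shift in output scores does not prove that relevance has degraded, so a signal like this would normally trigger a closer look at business metrics rather than an automatic rollback.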

Note, too, that these considerations apply to every single “intent”, and an application contains many such intents. For example, in the world of search:

Properly trained, some ML models, like RetinaNet or YOLO, can label items of interest in images, therefore enabling textual search over a set of images;
Others, like BART for NLI, can measure the probability that a text relates to specific labels, therefore enabling content categorization.

Last, business key performance indicators are far from unique, and their importance varies depending on the concrete use case. Continuing the example of search:

For some businesses, the conversion rate is the most important metric to increase;
For other businesses, the generated revenue is the one to optimize.
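To make the point concrete, the minimal sketch below shows how the "better" model can depend on which KPI a business cares about. All names (`VariantStats`, `KPIS`, `best_variant`) and all numbers are invented for illustration; they do not come from real traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantStats:
    """Aggregated outcomes for one model variant (hypothetical figures)."""
    searches: int
    conversions: int
    revenue: float


# Each KPI is a function from aggregated stats to a score (higher is better).
KPIS = {
    "conversion_rate": lambda s: s.conversions / s.searches,
    "revenue_per_search": lambda s: s.revenue / s.searches,
}


def best_variant(stats: dict[str, VariantStats], kpi: str) -> str:
    """Return the name of the variant that scores highest on the chosen KPI."""
    score = KPIS[kpi]
    return max(stats, key=lambda name: score(stats[name]))


stats = {
    "model_a": VariantStats(searches=10_000, conversions=420, revenue=12_600.0),
    "model_b": VariantStats(searches=10_000, conversions=380, revenue=15_200.0),
}

print(best_variant(stats, "conversion_rate"))     # model_a
print(best_variant(stats, "revenue_per_search"))  # model_b
```

With these made-up figures, `model_a` converts more often while `model_b` generates more revenue per search, so the choice of model depends on the customer's preferred KPI.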

Running AI at scale is accepting all these variables and navigating a multi-dimensional landscape.

Operating AI at scale at Algolia

At Algolia, we handle all of this complexity on behalf of our customers, so that they can focus on their core business and get meaningful outcomes. Each customer is unique: their audience, their content, and their preferred business KPIs all vary from one customer to another. Running AI at scale means supporting this variability while continuing to introduce new ML models and refine existing ones.

We have been developing proprietary models for years to solve precise problems such as search personalization, query understanding and matching, and ranking. We also augment our pipelines with existing pre-trained models: for example, we started our semantic search efforts with the Universal Sentence Encoder suite. Today, Algolia NeuralSearch uses a combination of several ML models to address very specific search intents across very different use cases, and we will continue to introduce new models to increase the power of our search.

Much as software versions are tracked in production, we keep extensive track of the ML models in use over time. This lets us see which instances run which combination of models, and therefore which customers are using which versions. Because we use these models to build dedicated data structures, this tracking is also key to triggering updates of the derived data (e.g. indices).
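The idea of tracking model versions per instance, and using that record to decide when derived data needs rebuilding, can be sketched as follows. This is a hypothetical, in-memory illustration and not Algolia's actual system: the class `ModelRegistry`, its methods, and the instance and model names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """Hypothetical record of which model versions each instance runs."""
    # instance id -> {model role -> deployed version}
    deployed: dict[str, dict[str, str]] = field(default_factory=dict)
    # instance id -> model versions its current index was built with
    indexed_with: dict[str, dict[str, str]] = field(default_factory=dict)

    def deploy(self, instance: str, role: str, version: str) -> None:
        self.deployed.setdefault(instance, {})[role] = version

    def mark_indexed(self, instance: str) -> None:
        # Snapshot the versions that were live when the index was rebuilt.
        self.indexed_with[instance] = dict(self.deployed.get(instance, {}))

    def instances_using(self, role: str, version: str) -> list[str]:
        return sorted(i for i, m in self.deployed.items() if m.get(role) == version)

    def stale_instances(self) -> list[str]:
        # An index is stale when it was built with a different model combination.
        return sorted(
            i for i, m in self.deployed.items() if self.indexed_with.get(i) != m
        )


registry = ModelRegistry()
registry.deploy("instance-1", "encoder", "v1")
registry.deploy("instance-2", "encoder", "v1")
registry.mark_indexed("instance-1")
registry.mark_indexed("instance-2")

registry.deploy("instance-2", "encoder", "v2")  # upgrade a single instance
print(registry.stale_instances())               # ['instance-2']
```

Keeping the "indexed with" snapshot separate from the "deployed" record is what lets the sketch answer both questions from the paragraph above: who runs which version, and whose derived data is out of date.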

Perhaps the most important aspect of improving ML models over time is tracking how well they help customers achieve their business KPIs. Algolia customers configure their search and recommendation pipelines with events (clicks, conversions, purchases, ratings, add-to-cart actions, and so forth), and these events are key to the success of an implementation.

When deploying new ML models, we first monitor their impact on these KPIs for a small but significant part of customers’ traffic, and for a significant amount of time. Depending on the customer, it may take a couple of weeks to confirm that a particular model is improving the relevance of their search experience.
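One standard way to judge whether a new model has moved a KPI on a slice of traffic is a two-proportion z-test on conversion rates. The sketch below is a generic statistical illustration under stated assumptions, not a description of how Algolia evaluates models: the function name, the traffic figures, and the 0.05 significance level are all assumed for the example.

```python
import math


def two_proportion_z_test(
    conv_a: int, n_a: int, conv_b: int, n_b: int
) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variance: all sessions converted or none did")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical traffic: current model (A) versus candidate model (B).
z, p = two_proportion_z_test(conv_a=400, n_a=10_000, conv_b=480, n_b=10_000)
if p < 0.05 and z > 0:
    print("candidate model shows a statistically significant lift")
```

The need for enough observations to shrink the standard error is one reason a trial can take weeks: with little traffic, even a real improvement may not be distinguishable from noise.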

What does it mean to monitor ML? Developers are familiar with how conventional software is monitored to find errors. Inputs and outputs are generally clear and deterministic, and many errors can be detected and captured as test cases. ML models, on the other hand, are non-deterministic by nature: they are expected to answer in ways that cannot be predicted. Identifying incorrect behavior and alerting accordingly is an extremely complex problem, which only AI experts who know their models can solve appropriately.

NeuralSearch with Algolia

With Algolia NeuralSearch, customers get state-of-the-art AI-based search along with Algolia’s renowned performance, reliability, and quality. All of the complexity, from the selection of ML models to their deployment, monitoring, and management over time, is handled by Algolia.

Learn more about the tradeoffs of buying vs building AI search from scratch, or sign up today to see how NeuralSearch can work for your use case.
