MLOps for Data Scientists

A few of my colleagues in data science are hesitant about embracing MLOps. Why should it matter to them? Actually, quite a lot!

This article presents a comprehensive overview of MLOps, especially from a data scientist's perspective. Essentially, MLOps aims to address common issues of reliability and clarity that frequently arise during the development and deployment of machine learning models.

AI productization

MLOps encompasses a suite of tools that facilitate the lifecycle of data-centric AI. This includes training models, performing error analysis to pinpoint data types where the algorithm underperforms, expanding the dataset through data augmentation, resolving discrepancies in data label definitions, and leveraging production data for ongoing model enhancement.

MLOps aims to streamline and automate the training and validation of machine learning models, enhancing their quality and ensuring they meet business and regulatory standards. It merges the roles of data engineering, data science, and dev-ops into a cohesive and predictable process across the following domains:

  • Deployment and automation
  • Reproducibility of models and predictions
  • Diagnostics
  • Governance and regulatory compliance (SOC 2, HIPAA)
  • Scalability and latency
  • Collaboration
  • Business use cases & metrics
  • Monitoring and management
  • Technical support

Predictable ML lifecycle

MLOps outlines the management of the entire machine learning lifecycle. This includes integrating model generation with software development processes (e.g., Jira, GitHub), ensuring continuous testing and delivery, orchestrating and deploying models, as well as monitoring their health, diagnostics, performance, and governance, and aligning with business metrics. From a data science standpoint, MLOps involves a consistent and cyclical process of gathering and preprocessing data, training and assessing models, and deploying them in a production environment.

Data-centric AI

Andrew Ng pioneered the idea of data-centric AI, advocating for AI professionals to prioritize the quality of their training data rather than concentrating mainly on model or algorithm development. Unlike the conventional model-centric AI approach, where data is gathered with minimal focus on its quality to train and validate a model, data-centric AI emphasizes improving data quality. This approach enhances the likelihood of success for AI projects and machine learning models in practical applications.


Fig 1. Overview of continuous development in data-centric AI - courtesy Andrew Ng

There are several differences between the traditional model-centric and data-centric AI approaches.

Model Centric:

  • The goal is to collect as much data as possible and develop a model robust enough to cope with the noise without overfitting.
  • Hold the data fixed and iteratively improve the model and code.

Data Centric:

  • The goal is to select the subset of the training data with the highest consistency and reliability, so that multiple models perform well.
  • Hold the model and code fixed and iteratively improve the data.
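The data-centric loop can be sketched with a short, stdlib-only example: hold the model and code fixed, and improve the data by keeping only examples whose annotators agree on the label. The helper `select_consistent_subset` and the input format are hypothetical illustrations, not from any particular library.

```python
from collections import Counter

def select_consistent_subset(examples, min_agreement=1.0):
    """Keep only examples whose annotators agree on the label.

    `examples` maps an example id to the list of labels assigned
    by independent annotators (hypothetical input format).
    """
    subset = {}
    for example_id, labels in examples.items():
        most_common_label, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= min_agreement:
            subset[example_id] = most_common_label
    return subset

# Three annotators labeled each image; 'img2' has inconsistent labels.
raw = {
    "img1": ["cat", "cat", "cat"],
    "img2": ["cat", "dog", "cat"],
    "img3": ["dog", "dog", "dog"],
}
clean = select_consistent_subset(raw)
# 'img2' is dropped: the model and code stay fixed while the data improves.
```

In practice the "agreement" criterion would be replaced by whatever consistency signal the team trusts — annotator consensus, label-definition audits, or error analysis on underperforming slices.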

Repeatable processes

The objective is to implement established and reliable software development management techniques (such as Scrum, Kanban, etc.) and DevOps best practices in the training and validation of machine learning models. By operationalizing the training, tuning, and validation processes, the automation of data pipelines becomes more manageable and predictable.
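The transition from ad-hoc tasks to operational pipelines can be sketched as a composition of plain functions, one per stage. The stage names and the trivial "model" below are illustrative placeholders, not a real framework API.

```python
from functools import reduce

# Hypothetical stages of an operationalized training pipeline; each is a
# plain function from state dict to state dict, so a run is repeatable.
def acquire(state):
    state["data"] = [1.0, 2.0, 3.0, 4.0]
    return state

def preprocess(state):
    mean = sum(state["data"]) / len(state["data"])
    state["data"] = [x - mean for x in state["data"]]  # center the data
    return state

def train(state):
    # Stand-in for model fitting: record a trivial "model" parameter.
    state["model"] = {"bias": sum(state["data"])}
    return state

def validate(state):
    state["validated"] = abs(state["model"]["bias"]) < 1e-9
    return state

def make_pipeline(*stages):
    """Compose stages into one repeatable, automatable step."""
    return lambda state: reduce(lambda s, stage: stage(s), stages, state)

pipeline = make_pipeline(acquire, preprocess, train, validate)
result = pipeline({})
```

Because every stage has the same signature, the same composition can be re-run on a schedule or wired into an orchestrator, which is exactly what makes automation predictable.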

The diagram below showcases how data acquisition, analysis, training, and validation tasks transition into operational data pipelines:

Fig 2. Productization in Model-centric AI

As shown in Figure 2, the deployment procedure in a model-centric AI framework offers limited scope for integrating model training and validation with fresh data.

Fig 3. Productization in Data-centric AI

Conversely, in a data-centric AI approach (Figure 3), the model is put into action early in the development cycle. This early deployment facilitates ongoing integration and updates to the model(s), utilizing feedback and newly acquired data.

AI lifecycle management tools

While the development tools traditionally used by software engineers are largely applicable to MLOps, there has been an introduction of specialized tools for the ML lifecycle in recent years. Several open-source tools have emerged in the past three years to facilitate the adoption and implementation of MLOps across engineering teams.

  • DVC (Data Version Control) is tailored for version control in ML projects.
  • Polyaxon offers lifecycle automation for data scientists within a collaborative workspace.
  • MLflow oversees the complete ML lifecycle, from experimentation to deployment, and features a model registry for managing different model versions.
  • Kubeflow streamlines workflow automation and deployment in Kubernetes containers.
  • Metaflow focuses on automating the pipeline and deployment processes.

Additionally, AutoML frameworks are increasingly popular for swift ML development, offering a user experience akin to GUI development.
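As a rough illustration of what these tools take off a data scientist's plate, here is a minimal, stdlib-only stand-in for experiment tracking; a real setup would use, for example, MLflow's tracking API instead. The `RunTracker` class and the `experiments/` directory are illustrative assumptions.

```python
import json
import time
import uuid
from pathlib import Path

class RunTracker:
    """Minimal stand-in for an experiment tracker such as MLflow:
    each run records its parameters and metrics to a JSON file, so
    results stay reproducible and comparable across experiments."""

    def __init__(self, experiment_dir="experiments"):
        self.dir = Path(experiment_dir)
        self.dir.mkdir(exist_ok=True)

    def log_run(self, params, metrics):
        run = {
            "run_id": uuid.uuid4().hex,
            "timestamp": time.time(),
            "params": params,
            "metrics": metrics,
        }
        # One JSON file per run keeps the history queryable later.
        (self.dir / f"{run['run_id']}.json").write_text(json.dumps(run, indent=2))
        return run["run_id"]

tracker = RunTracker()
run_id = tracker.log_run(
    params={"learning_rate": 0.01, "epochs": 20},
    metrics={"val_accuracy": 0.93},
)
```

The dedicated tools add what this sketch lacks: a UI for comparing runs, artifact storage, and a model registry for promoting versions to production.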

Canary, frictionless release

A strong testing and deployment strategy is essential for the success of any AI initiative. Implementing a canary release smooths the transition of a model from a development or staging environment to production. This method involves directing a percentage of user requests to a new version or a sandbox environment based on criteria set by the product manager (such as modality, customer type, metrics, etc.).

This strategy minimizes the risk of deployment failures: there is no need for a conventional rollback, since recovering from an issue is simply a matter of routing traffic away from the new version.
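A hash-based traffic split is one common way to implement such a canary. The sketch below is stdlib-only, with a hypothetical `canary_route` helper: it routes a fixed percentage of users to the new version deterministically, so "rolling back" is just setting the percentage to zero.

```python
import hashlib

def canary_route(user_id, canary_percent):
    """Deterministically route a user to 'canary' or 'stable' based on a
    hash of the user id, so a given user always sees the same version."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "canary" if bucket < canary_percent else "stable"

# Start by sending roughly 5% of traffic to the new model version.
routes = [canary_route(f"user-{i}", canary_percent=5) for i in range(10_000)]
canary_share = routes.count("canary") / len(routes)

# Recovering from a bad release is just setting the percentage to zero.
assert canary_route("user-42", canary_percent=0) == "stable"
```

Hashing (rather than random sampling per request) keeps the user experience consistent, and the same bucket value can later drive a gradual ramp-up from 5% to 100%.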

---------------------------

Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design, and end-to-end deployment and support, with extensive knowledge in machine learning. He has been director of data engineering at Aideo Technologies since 2017 and is the author of "Scala for Machine Learning", Packt Publishing, ISBN 978-1-78712-238-3.


#mlops #devops #MLFlow #datacentricAI #machinelearning #lifecycle
