MLOps is an engineering discipline that aims to unify ML systems development (dev) and ML systems deployment (ops) in order to standardize and streamline the continuous delivery of high-performing models in production.
Until recently, most teams worked with manageable amounts of data and a small number of models.
That is changing: decision automation is now being embedded in a wide range of applications, which creates many technical challenges in building and deploying ML-based systems.
To understand MLOps, we must first understand the ML systems lifecycle, which involves several teams in a data-driven organization.
From start to finish, the following teams contribute:
- Business development or Product team: defining business objective(s) with KPIs.
- Data Engineering: data acquisition and preparation.
- Data Science: architecting ML solutions and developing models.
- IT or DevOps: complete deployment setup and monitoring, alongside the data scientists.

What Problems Does MLOps Solve?

Managing such systems at scale is not easy, and there are numerous bottlenecks to address. The major challenges teams face are:
- A shortage of data scientists who are also good at developing and deploying scalable web applications. A new profile, the ML Engineer, has emerged to serve this need; it sits at the intersection of Data Science and DevOps.
- Changing business objectives: the data changes continuously, the model's performance standards must be maintained, and AI governance must be ensured. With so many dependencies, it is hard to keep up with continuous model training and evolving business objectives.
- Communication gaps between technical and business teams, who struggle to find a common language for collaboration. Often this gap is the reason that big projects fail.
- Risk assessment: there is a lot of debate around the black-box nature of ML/DL systems. Models often drift away from what they were initially intended to do, so assessing the risk and cost of such failures is an important and meticulous step. For example, an inaccurate video recommendation on YouTube costs far less than flagging an innocent person for fraud, blocking their account, and declining their loan applications.

What Skills Do You Need for MLOps?

1. Framing ML problems from business objectives
2. Architecting ML and data solutions for the problem
3. Data preparation and processing (part of data engineering)
4. Model training and experimentation (data science)
5. Building and automating ML pipelines
6. Deploying models to the production system
7. Monitoring, optimizing, and maintaining models
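
To make steps 3 through 7 more concrete, here is a minimal sketch of how they might be chained into one automated pipeline. It is an illustration under stated assumptions, not a production design: every function name (`prepare`, `train`, `evaluate`, `should_deploy`, `has_drifted`), the toy threshold "model", and the threshold values are hypothetical and chosen only to keep the example short.

```python
from statistics import mean

# All names, data, and thresholds here are hypothetical, for illustration only.

def prepare(rows):
    """Data preparation: drop records with a missing feature or label."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def train(data):
    """Model training: a toy classifier whose decision threshold is the
    midpoint between the mean feature value of each class."""
    pos = [x for x, y in data if y == 1]
    neg = [x for x, y in data if y == 0]
    return (mean(pos) + mean(neg)) / 2

def predict(threshold, x):
    """Predict class 1 when the feature reaches the learned threshold."""
    return 1 if x >= threshold else 0

def evaluate(threshold, data):
    """Experimentation: accuracy on a labelled held-out set."""
    correct = sum(predict(threshold, x) == y for x, y in data)
    return correct / len(data)

def should_deploy(accuracy, min_accuracy=0.9):
    """Deployment gate: promote only models that meet the accuracy bar."""
    return accuracy >= min_accuracy

def has_drifted(train_inputs, live_inputs, tolerance=1.0):
    """Monitoring: flag drift when the mean of live inputs moves away from
    the mean of training inputs by more than `tolerance`."""
    return abs(mean(live_inputs) - mean(train_inputs)) > tolerance

# Run the pipeline end to end on made-up data.
raw = [(1.0, 0), (2.0, 0), (None, 1), (8.0, 1), (9.0, 1)]
holdout = [(0.5, 0), (9.5, 1)]

data = prepare(raw)
model = train(data)
accuracy = evaluate(model, holdout)
deploy = should_deploy(accuracy)
drift = has_drifted([x for x, _ in data], [12.0, 13.0, 14.0])

print(f"accuracy={accuracy:.2f} deploy={deploy} drift={drift}")
```

Keeping each stage a separate function means the deployment gate and the drift check can be run automatically on every retraining cycle. A real system would replace the toy model with an actual training job and compare full input distributions, not just means.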