MLflow Part 3 - Logging Models to a Tracking Server!
David Hundley
Staff Machine Learning Engineer at State Farm | AI/ML Blogger | Livecoding Streamer
Hey there, friends, and welcome back to another post in our series on MLflow. If this is the first post you’ve seen and would like to catch up, be sure to check out the previous posts here:
As always, if you would like to see the code mentioned in this post, please be sure to check out my GitHub repo here.
This latest post is going to build right on top of part 2, so please do check that out if you missed it. Just to quickly recap what we did in that post, we deployed an MLflow tracking server to Kubernetes with Minikube on our local machines. Behind the scenes, the MLflow tracking server is supported by a Postgres metadata store and an AWS S3-like artifact store called Minio. That post was quite meaty, so I’m happy to share that this one is much simpler by comparison. Phew!
I personally like pictures much better than a static explanation, so to tie together what we covered in the last paragraph, the architectural image below summarizes how we’ll be interacting with MLflow in this particular post. (In case you’re not familiar with the icons, the elephant is PostgreSQL, and the red flamingo-like icon is Minio.)
Of course, we already did everything on the right side of the image in our last post, so this post is all about what you need to include in the Python file on your client machine. Everything coincidentally happens to be hosted on the local machine since we’re making use of Minikube, but if you ever use a legit Kubernetes environment, your client will most likely be separate from the MLflow tracking server.
This code is actually pretty simple, and a lot of it is bound to look familiar to you: partially because a lot of it is basic machine learning stuff, and partially because you probably already saw much of it in Part 1 of this series. Because the code is so simple, I’m going to paste the whole thing here and walk through the special stuff that might be new to you.
# Importing in necessary libraries
import os

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.linear_model import ElasticNet

import mlflow
import mlflow.sklearn

# PROJECT SETUP
# ------------------------------------------------------------------------------
# Setting the MLflow tracking server
mlflow.set_tracking_uri('https://mlflow-server.local')

# Setting the required environment variables
os.environ['MLFLOW_S3_ENDPOINT_URL'] = 'https://mlflow-minio.local/'
os.environ['AWS_ACCESS_KEY_ID'] = 'minio'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'minio123'

# Loading data from a CSV file
df_wine = pd.read_csv('../data/wine/train.csv')

# Separating the target class ('quality') from the remainder of the training data
X = df_wine.drop(columns = 'quality')
y = df_wine[['quality']]

# Splitting the data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state = 42)

# MODEL TRAINING AND LOGGING
# ------------------------------------------------------------------------------
# Defining model parameters
alpha = 1
l1_ratio = 1

# Running MLflow script
with mlflow.start_run():

    # Instantiating model with model parameters
    model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio)

    # Fitting training data to the model
    model.fit(X_train, y_train)

    # Running prediction on validation dataset
    preds = model.predict(X_val)

    # Getting metrics on the validation dataset
    # (Note: sklearn metrics take y_true first, and RMSE is the square root of MSE)
    rmse = np.sqrt(mean_squared_error(y_val, preds))
    abs_error = mean_absolute_error(y_val, preds)
    r2 = r2_score(y_val, preds)

    # Logging params and metrics to MLflow
    mlflow.log_param('alpha', alpha)
    mlflow.log_param('l1_ratio', l1_ratio)
    mlflow.log_metric('rmse', rmse)
    mlflow.log_metric('abs_error', abs_error)
    mlflow.log_metric('r2', r2)

    # Logging training data
    mlflow.log_artifact(local_path = '../data/wine/train.csv')

    # Logging training code
    mlflow.log_artifact(local_path = './mlflow-wine.py')

    # Logging model to MLflow
    mlflow.sklearn.log_model(sk_model = model,
                             artifact_path = 'wine-pyfile-model',
                             registered_model_name = 'wine-pyfile-model')
Right at the top, you’ll notice that we first have to make sure that the script is pointing to the proper tracking server. When we deployed our tracking server in the last post, you might recall we had the tracking server itself behind the URI mlflow-server.local and the artifact store (Minio) served out behind mlflow-minio.local. As a reminder, Minio intentionally emulates AWS’s S3, so in case you’re wondering why we’re setting AWS-like environment variables, that is why.
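(Quick side note: if you’d rather not hardcode the mlflow.set_tracking_uri() call in your script, MLflow also picks the server up from an MLFLOW_TRACKING_URI environment variable. Here’s a minimal sketch of that approach, assuming the same Part 2 deployment and the default minio / minio123 credentials:)

# Alternative setup: configuring the MLflow client entirely through
# environment variables instead of calling mlflow.set_tracking_uri()
import os

os.environ['MLFLOW_TRACKING_URI'] = 'https://mlflow-server.local'
os.environ['MLFLOW_S3_ENDPOINT_URL'] = 'https://mlflow-minio.local/'
os.environ['AWS_ACCESS_KEY_ID'] = 'minio'          # Minio credentials from Part 2
os.environ['AWS_SECRET_ACCESS_KEY'] = 'minio123'

import mlflow
print(mlflow.get_tracking_uri())  # should print https://mlflow-server.local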
After loading in the data and doing some basic modeling, we come down to all the MLflow tracking goodness. MLflow is pretty flexible here, so you’ll notice we’re logging/uploading all this great stuff, including…
- Parameters
- Metrics
- The code we used to run this model
- The training data itself
- The model itself
Speaking of that last one, you’ll notice some special syntax around model naming. This is because in addition to storing the model artifacts in the artifact store, MLflow will also create a formal model in its Model Registry. We’ll briefly touch on that below, but we’ll explore it further in a future post. (Stay tuned!)
Alright, if everything is set up properly, all you need to do is run the following command:
python mlflow-wine.py
I’ve run this particular file multiple times now, so here is the output my terminal is showing me. If you’re running this for the first time, you’ll see something similar but obviously just a tiny bit different:
Registered model 'wine-pyfile-model' already exists. Creating a new version of this model...
Created version '3' of model 'wine-pyfile-model'.
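By the way, once a version like this exists in the Model Registry, you can pull the model right back down by referencing the registry URI instead of a specific run. Here’s a minimal sketch of that, assuming the version number from the output above and the same client configuration as the training script:

# Loading a registered model back down from the Model Registry
import os
import mlflow
import mlflow.sklearn
import pandas as pd

# Same client configuration as the training script above
mlflow.set_tracking_uri('https://mlflow-server.local')
os.environ['MLFLOW_S3_ENDPOINT_URL'] = 'https://mlflow-minio.local/'
os.environ['AWS_ACCESS_KEY_ID'] = 'minio'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'minio123'

# 'models:/<name>/<version>' is MLflow's Model Registry URI scheme;
# version 3 here assumes the run output shown above
model = mlflow.sklearn.load_model('models:/wine-pyfile-model/3')

# Scoring a few rows, just to prove the round trip worked
df_wine = pd.read_csv('../data/wine/train.csv')
print(model.predict(df_wine.drop(columns = 'quality').head()))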
And friends, that’s really it! But before we wrap up this post, let’s jump into the UI just to see that everything worked properly. Fire up your browser and jump on over to mlflow-server.local. You should be greeted with a screen that looks like this:
If you followed along with Part 1 of this series, this is going to look really familiar. Go ahead and open one of those runs by clicking on the proper hyperlink. If all is well, you should see all the proper information you just logged, including the model artifacts we just created. Nice!
One other thing we couldn’t cover in Part 1 was the new Models tab located to the top left of the UI. Click on that, and you should see something like this:
This UI is pretty cool. Not only can you provide a thorough description of the model, but you can also move specific versions of the model between stages. For example, you can set “Version 2” to Staging, “Version 3” to Production, and “Version 1” to Archived. This functionality is really awesome, and it will come very much in handy when we explore more things in future posts.
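If you’re curious how those same stage transitions look in code, here’s a minimal sketch using MlflowClient’s transition_model_version_stage API. (The version numbers are just the hypothetical ones from the example above.)

# Transitioning registered model versions between stages in code
from mlflow.tracking import MlflowClient

client = MlflowClient(tracking_uri = 'https://mlflow-server.local')

# Promote version 3 to Production and retire version 1 to Archived
client.transition_model_version_stage(name = 'wine-pyfile-model', version = 3, stage = 'Production')
client.transition_model_version_stage(name = 'wine-pyfile-model', version = 1, stage = 'Archived')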
Of course, as great as the UI is, we really should be doing everything programmatically, and MLflow has our back in that space, too. But I think we’re at a good stopping place for today, so we’ll go ahead and wrap up. In the next post, we’ll look at how to interact with this UI from a more programmatic perspective, and then two posts from now, we’ll start really cooking with gas by showing how we might deploy a model from this Model Registry into a real production environment. Until then, thanks for reading! Stay safe, friends!