Tracking models with MLflow

Are you looking for an introduction to tracking models with MLflow? Then this is the article for you! In this article we discuss what MLflow model tracking is, why you should use MLflow model tracking, and how to use MLflow to track your models. After that we walk through an example of how to track a model with MLflow and register it in the MLflow model registry. 

This article was created as part of a larger case study on developing production-ready machine learning models. That being said, it also serves as a great standalone resource on how to use MLflow model tracking to track your machine learning models. 

What is MLflow model tracking?

What is MLflow? MLflow is an open source framework that makes it easy to track the machine learning models that you train as well as the parameters, data, and metrics associated with those models. 

When you use MLflow model tracking, you can train a variety of different machine learning models then make predictions with them interchangeably using the standardized model prediction interface. You can also register your models in the MLflow model registry and keep track of which model is being used in production so that this information is easily accessible to everyone you are working with. 

In addition to its model tracking capabilities, MLflow also offers general data tracking capabilities that allow you to track the inputs and outputs of your scripts. We will not touch on these capabilities much in this article, but if you want to hear more then you should check out our article on tracking data with MLflow.

When is MLflow model tracking useful?

When is MLflow model tracking useful? Here are some situations where MLflow model tracking would be useful.

  • When you want to compare models from different frameworks. Do you want to be able to use multiple different types of models built on top of different frameworks in the same script? The standardized MLflow model prediction interface makes it easy to do this with minimal code changes.
  • When your model has complex dependencies. Do you have models with complex dependencies that you want to track alongside your models? MLflow model tracking makes it easy to track your model dependencies alongside your model in plain text. 
  • When you work collaboratively with other data professionals. Are you working collaboratively with other data professionals that may be able to make use of your models? Then the MLflow model registry makes it easy to see what models your coworkers are working on.
  • When you plan to deploy new versions of your model periodically. Are you working in a setting where new versions of the same model are being tracked and deployed over time? Then the MLflow model registry can help you keep track of which model is being used in production. 

Why use MLflow model tracking?

Why is MLflow model tracking useful? Here are some reasons why MLflow model tracking is useful for data practitioners who are building machine learning models.

  • Make predictions for different models with a standard interface. MLflow defines a standard prediction interface that can be used by a wide variety of different models built on different frameworks. That means that you can use multiple different models, such as a TensorFlow neural network and a Scikit Learn Random Forest, interchangeably in the same script with minimal code changes. This makes it easy to test models built on top of different frameworks against each other to see which model performs the best. 
  • Keep model dependencies with your model. MLflow model tracking makes it easy to track all of the dependencies that your model needs in order to run alongside your model. You can define a conda environment that contains all of the required packages, and MLflow will store that environment definition with your model so the same dependencies can be restored wherever the model is loaded or served. This further enables you to plug different models into the same script without having to modify your import statements.
  • Discover models your coworkers are using. The MLflow model registry is a standardized location where you and your collaborators can stay up to date on what models have been built. When you are tasked with building a new model, you can browse the model registry to see whether anyone else has already built a similar model. 
  • Keep track of which model version is used in production. The MLflow model registry also makes it easy to track which version of your model is being used in production. This information will also be available to your collaborators, making it easier for them to find the answers to their questions on their own. 

What model metadata gets tracked

What model metadata gets tracked when you use MLflow model tracking? Here are some pieces of information that get tracked alongside the model file. 

  • Model training run. The ID associated with the run that your MLflow model was saved under is always tracked alongside the model. This makes it easy to go back and look at parameters, metrics, and data from the same model training run. This might include things like the parameters associated with the saved model and the performance metrics of the saved model. 
  • Model dependencies. MLflow model tracking will also keep track of the dependencies that are required to run your model. For example, if you save a SciKit Learn model, MLflow will keep track of what version of SciKit Learn was used to train your model as well as any other dependencies that are required to run the model. 
  • Time created. The time that your model was created will also be tracked alongside your serialized model. This makes it easy to see how old your model is and determine whether the data it was trained on might be stale.  
  • Schema of inputs and outputs (optional). You can also track the schema of the inputs and outputs that are expected for your model. For example, you can specify the column names and types that are expected for the input data. This is not something that gets tracked by default, but rather an extra option that you can add if you want. 
  • Example input (optional). You can also track an example of what a model input should look like alongside your model. As before, this is not something that gets tracked by default.

What kind of models can you track in MLflow?

What kinds of models can you track with MLflow? MLflow model tracking provides a general model tracking interface that allows you to define custom models, so technically you can track any kind of model with MLflow. That being said, defining custom models does take a little bit of work. Many common modeling frameworks are natively supported in MLflow, meaning that you can track them with minimal code changes. 

Here are the frameworks that are natively supported in MLflow:

  • SciKit Learn
  • PyTorch
  • Keras 
  • Spark MLlib
  • TensorFlow
  • H2O
  • ONNX
  • MLeap
  • XGBoost
  • LightGBM
  • CatBoost
  • spaCy
  • FastAI
  • Statsmodels
  • MXNet Gluon

Basic MLflow model tracking concepts

What are the basic MLflow concepts you need to understand before using MLflow model tracking? First there is the concept of an experiment. An experiment is defined by a collection of scripts and data that are logically grouped and should be tracked in the same location. Generally, you should create a new MLflow experiment for each project you are working on. 

Within an experiment, you can have multiple runs. Each time you run a script that has MLflow tracking incorporated, a new run will be created. Your runs can all be associated with the same script or you can have multiple different scripts that create runs for an experiment. When you save an MLflow model, that model will be associated with a specific run. 

Once you have a model that is associated with a specific run, you can promote that model to the model registry. The model registry will keep track of what version of the model is being used as well as the run where that model was originally saved. The model registry can also keep track of model metadata such as the version of the model that is deployed to your production environment. 

A diagram of what an MLflow experiment with multiple different runs might look like.

MLflow model tracking in Python

How do you use MLflow model tracking in Python? If you are using MLflow to track a model built on a framework that is natively supported by MLflow, then it is trivial to track your model. All you have to do to log a model to MLflow is find the module for your framework and call its log_model function. For example, for a Scikit Learn model you will call the mlflow.sklearn.log_model function. 

Once you are ready to load your model, you just need to call the appropriate load_model function. For example, for a Scikit Learn model you would call mlflow.sklearn.load_model. After you load your model you can use the model’s predict function to make predictions on a data frame. The data frame should have the same columns as the data frame you used to train your model. 

Check out this documentation to see all of the natively supported model frameworks and find out how to call their log_model and load_model functions. 

MLflow model tracking example

Now we will walk through an example of how to track a model using MLflow. We will be continuing on with the model training script that we used in the previous step of our case study where we set up MLflow tracking to keep track of the parameters and metrics associated with our model training runs. 

If you were just looking for a high level explanation of what MLflow model tracking is, you can drop off now. If you want to learn more about our case study, you should check out our case study overview for more details. 

0. Create your model training script

Before we can track a model in MLflow, we need to create a script that trains a model. If you tuned in for the previous step of our case study where we learned how to track basic data for our model training runs in MLflow, we will use the same script as we used for that step of the case study. This script reads some basic model parameters from a configuration file, reads the training data, and trains a random forest classifier on that data. This is what that model training script looks like. 

import os
import yaml

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

from bank_deposit_classifier.sample import upsample_minority_class
from bank_deposit_classifier.prep_data import DATA_DIR, CONFIG_DIR

config_path = 'train_model.yaml'
test_path = 'intermediate/test.csv'
train_path = 'intermediate/train.csv'

# get parameters from config
config_path_full = os.path.join(CONFIG_DIR, config_path)
with open(config_path_full, 'r') as file:
    config = yaml.load(file, Loader=yaml.FullLoader)
outcome = config.get('outcome')
n_estimators = config.get('n_estimators')
max_features = config.get('max_features')

# get test and train data
test_path_full = os.path.join(DATA_DIR, test_path)
train_path_full = os.path.join(DATA_DIR, train_path)
test = pd.read_csv(test_path_full)
train = pd.read_csv(train_path_full)
train_resampled = upsample_minority_class(train, outcome, 0.5)

# train model on the upsampled training data
rf = RandomForestClassifier(
    n_estimators = n_estimators,
    max_features = max_features,
    random_state = 123
    ).fit(train_resampled.drop(outcome, axis=1), train_resampled[outcome])

# evaluate model
train_predictions = rf.predict(train_resampled.drop(outcome, axis=1))
test_predictions = rf.predict(test.drop(outcome, axis=1))
train_auc = roc_auc_score(train_resampled[outcome], train_predictions)
test_auc = roc_auc_score(test[outcome], test_predictions)

1. Start a local MLflow tracking server

After you create your model training script, it is time to start up the MLflow tracking server. The tracking server is where your MLflow data is saved and where you go to view it. To start the server, type the following command into your terminal. 

mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./mlflow-artifact

If you already completed the previous step of our case study, you have already seen the default way that MLflow creates tracking servers when you use the “mlflow ui” command. We started out using the default tracking server where all of the metadata associated with your experiments is stored in individual files because it was the simplest way to display how MLflow tracking works. 

That being said, for this portion of the case study we will need to create a new tracking server where the data is stored in a SQL database rather than in individual files. This is because the MLflow model registry does not work unless your metadata is stored in a SQL database. We recommend using this SQL-backed tracking server from this point forward. 

In order to tell MLflow how to set up our tracking server, we will use the “mlflow server” command and provide two arguments. backend-store-uri represents the location and type of database we want to use to store high level metadata associated with our runs. default-artifact-root specifies a separate path where artifacts should be stored. A separate path is provided for artifacts because artifacts can be very large and therefore may need to be stored in a cloud-based data store such as S3 for some projects. 

After you start up the tracking server, you can view it in your browser. Simply type http://localhost:5000 into the address bar. This is what the MLflow tracking server should look like. 

An example of what the MLflow tracking UI looks like.

2. Create an experiment

After you spin up the MLflow tracking server, you should click the plus sign at the top left of the screen to create a new experiment. You will be asked to provide a name for your experiment and optionally a location where you want your data to be tracked. We called our experiment case-study-one. You can leave the artifact location field empty for now. 

Note that if you have been following along with our case study and you have already created an MLflow experiment to track data with MLflow, you should still create a new experiment to ensure that your experiment uses the SQL-based tracking server. If you want your experiment to have the same name, you can simply delete your previous experiment and create a new experiment with the same name.

A picture demonstrating how to create an MLflow experiment. There is a text box on the screen that asks the user to put in a name for their MLflow experiment.

3. Start an MLflow run

Now that you have created an experiment to store your data, it is time to add some code to your model training script. The first thing you will need to do is import MLflow and tell MLflow where your model tracking server is and what the name of your experiment is. After that, you should start a new MLflow run. You can use the following code to do that. 

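import mlflow

# point MLflow at the local tracking server and select the experiment
mlflow.set_tracking_uri('http://localhost:5000')
mlflow.set_experiment('case-study-one')

# start a new run
mlflow.start_run()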


4. Add the model tracking code

Now we will add the code to track our Scikit Learn random forest model. MLflow model tracking natively supports Scikit Learn models, so this is going to be easy. All we have to do is use the log_model function and provide our model and the name of the folder where our model should be saved as arguments. We will save our model in a directory called “model”. 

mlflow.sklearn.log_model(rf, 'model')

5. End your MLflow run

After you log your model, the only thing that is left is to end your MLflow run. Simply add a line of code that calls MLflow’s end_run function. 

mlflow.end_run()

6. Promote your model to the model registry

Now that we are done adding code to our model training script, we will run the script and spin up the MLflow tracking UI. We can look at our most recent run in our case-study-one experiment and see our logged model. 

An example of a SciKit Learn MLflow model that has been logged to an MLflow run.

In addition to confirming that the MLflow model has been logged appropriately, we can also add the model to the MLflow model registry using the MLflow tracking UI. All you need to do to add your model to the model registry is press the “Register Model” button that appears in the upper left when you view the model artifact. 

You will be prompted to enter the name of the model. If this is the first time you are adding a model to the model registry, you will need to create a new model and give it a name. You should use a name that makes it easy for others to identify what the model does. We will call our model bank-deposit-classifier. 

The screen you will see when you register an MLflow model. There is a box that asks if you want to create a new model and then a box that asks for the name of your new model.

If you click over to the models tab at the top of the MLflow UI, you will see that you now have a new model in your model registry. If you click on the name of your model, you will see a list of the different versions of the model. Click into a specific version to see the metadata associated with it, such as the run where it was trained. You will also be able to mark it as the version that is being staged for production or the version that is being used in production. 

An example of what the MLflow model registry looks like. There is one model registered that is called bank-deposit-classifier.

7. Load your model

Now that you have saved your MLflow model, you can load the model in any script or notebook. Since we saved our model using the built-in Scikit Learn utilities, we will load our model using the same built-in utilities. There are two ways we can load our model. The first way loads our model using an MLflow run and the second way uses the model registry.

The first way to load your model is to reference the run that the model was saved under. This is the only way to load a model that has not been saved in the model registry. You can load a model this way by passing a path of the form “runs:/<mlflow_run_id>/<path_to_model>” to the load_model function, where mlflow_run_id is the ID of the run your model was saved under and path_to_model is the relative path to that model within the run’s artifacts directory. When you load a model from a run, MLflow also needs to know where the tracking server is, so set the tracking URI in your script before you call load_model. 

We stored our model in a directory named “model” so our relative path would be “model”. The code we would use to load our model would look something like this. 

path = "runs:/427e70ff5e6e48aeb741f98e7dba42b4/model"
model = mlflow.sklearn.load_model(path)

The other way to load your model is through the MLflow model registry. The function you will use will be the same, but the path you will provide will be different. To load a model from the model registry, the path will be “models:/<model_name>/<version>” where model_name is the name of your model in the model registry and version is the version of the model in the model registry. 

path = "models:/bank-deposit-classifier/1"
model = mlflow.sklearn.load_model(path)
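
Since the two path formats are easy to mix up, a pair of small helpers can keep them in one place. These helpers are hypothetical conveniences written for this article, not part of MLflow. They only build the path strings described above, which you would then pass to load_model.

```python
def run_model_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build the path for a model logged under a run."""
    if not run_id:
        raise ValueError("run_id must be a non-empty string")
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


def registry_model_uri(model_name: str, version: int) -> str:
    """Build the path for a specific version of a registered model."""
    if version < 1:
        raise ValueError("registry versions start at 1")
    return f"models:/{model_name}/{version}"


# Paths matching the two examples above.
run_uri = run_model_uri("427e70ff5e6e48aeb741f98e7dba42b4")
registry_uri = registry_model_uri("bank-deposit-classifier", 1)
```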
