What is MLflow model logging and when do I use which logging method?

If you’ve trained a machine learning model and thought “Wow, this one is actually good!”, only to realize a few hours (or days) later that you have no idea what parameters or data you used - then congrats: you’re a real data scientist now.

I’ve been there too. You remember the model looked great. Maybe it was even the best one you’ve trained in days. But now? You’re scrolling through dozens of notebook runs, and… nothing. You didn’t save the seed. You forgot which feature set you used. And worst of all, you definitely didn’t save the model.

So, like most of us, you Google “how to track machine learning experiments,” and one name keeps popping up: MLflow.

So… What Is MLflow?

MLflow is an open-source framework that helps you manage the full machine learning lifecycle. It has tools for experiment tracking, model logging, model versioning, and even deployment.

But if you’re anything like me, you’re not deploying models every week. Sometimes you just want to:

  • Train a model,
  • Save it somewhere safe,
  • Track how well it performed,
  • And load it later without retraining everything from scratch.

That’s where model logging comes in.

What Is Model Logging?

The simplest explanation? Model logging means saving your trained model along with everything you need to understand what it is and how it was built.

Most ML libraries already let you save models as files. If you’ve worked with scikit-learn or XGBoost, you’ve probably used joblib or pickle to save a .pkl file. If you're using PyTorch, it's often a .pth file with the weights.

These are machine-readable files. You can’t open them like a text file, but your code can load them and continue right where you left off.
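
If that sounds abstract, here's roughly what saving and loading looks like with scikit-learn and joblib (a small sketch using a toy dataset):

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train a small model on synthetic data
X, y = make_classification(n_samples=100, random_state=42)
model = LogisticRegression().fit(X, y)

# Save the fitted model to disk ...
joblib.dump(model, "model.pkl")

# ... and load it back later without retraining
loaded = joblib.load("model.pkl")
print(loaded.predict(X[:5]))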

So what's the difference between saving and logging?

🔁 Saving a model means writing a file to disk.
📚 Logging a model means saving it and linking it to the run that produced it - along with metrics, parameters, tags, and more.

This might seem like a small difference, but if you've ever tried to find the right model file in a sea of .pkl files, you know how much that connection matters.

Three Ways to Log a Model with MLflow

Let’s look at your options, depending on the library or setup you’re using.

1. The Convenient Way: Built-in Loggers

If you’re using scikit-learn, PyTorch, XGBoost, LightGBM, or other popular libraries, MLflow has native support for logging your models with a single line:

mlflow.sklearn.log_model(model, artifact_path="model")

This is by far the easiest and most complete method. It saves the model file and records important metadata like:

  • Which library and version were used
  • What environment is needed to load it again
  • The signature, if you provide example inputs/outputs (model signatures could be a whole post of their own, so check the docs)

You can even load the model again later with:

mlflow.sklearn.load_model("runs:/<run_id>/model")

If this approach is available to you and you're just starting out, I recommend it. It's implemented by experts and covers many details you wouldn't think of when logging manually.
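
Putting it all together, a typical run could look something like this (a sketch with a toy dataset; the experiment name and hyperparameters are just placeholders):

import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy data so the example runs end to end
X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("my-experiment")  # placeholder name

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    # Log parameters and metrics next to the model
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))

    # Providing an input_example lets MLflow infer the model signature
    mlflow.sklearn.log_model(model, artifact_path="model", input_example=X_test[:5])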

2. Manual Logging: Logging Artifacts Yourself

Sometimes, though, you're working with a library that MLflow doesn’t support out of the box. This happened to me recently with Darts, a time series library I used for forecasting.

In that case, the easiest workaround is:

# Save your model manually 
model.save("my_model.pkl")  
# Log it as an artifact 
mlflow.log_artifact("my_model.pkl")

This won't store much metadata, but it’s a quick fix. Just make sure you:

  • Add a README or text artifact explaining how to load the model.
  • Maybe tag your Git commit or script version so you can trace it back.
  • And definitely write down which version of the library you used (see the sketch after this list).
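
Building on the snippet above, here's a rough sketch of all three points inside one run (the commit hash and version string are placeholders):

import mlflow

with mlflow.start_run():
    # Assumes the model was already saved to my_model.pkl as shown above
    mlflow.log_artifact("my_model.pkl")

    # A short note for whoever needs to load it again (including future you)
    mlflow.log_text(
        "Saved with the library's own save(); load it with the matching load(). "
        "Trained with darts==<version>.",
        artifact_file="README.txt",
    )

    # Tags that let you trace the run back to the code that produced it
    mlflow.set_tag("git_commit", "abc1234")                 # placeholder
    mlflow.set_tag("library_version", "darts==<version>")   # placeholder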

3. The Flexible Way: Custom PyFunc Wrappers

If your model pipeline is more complex or if you want full control, MLflow lets you wrap your model using a custom Python class.

This means subclassing mlflow.pyfunc.PythonModel and implementing methods like:

import mlflow

class MyModel(mlflow.pyfunc.PythonModel):
    # A minimal example: predict simply echoes its input back
    def predict(self, model_input: list[str], params=None) -> list[str]:
        return model_input

The full interface and specification are covered in the official documentation: https://mlflow.org/docs/latest/model/python_model
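
Once you have a wrapper like this, you log and load it through the pyfunc flavor. A minimal sketch that builds on the class above (the input example is just a placeholder):

with mlflow.start_run():
    # input_example lets MLflow infer a signature for the wrapper
    info = mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=MyModel(),
        input_example=["hello", "world"],
    )

# Later, load it back as a generic pyfunc model
loaded = mlflow.pyfunc.load_model(info.model_uri)
print(loaded.predict(["does this work?"]))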

This method is great when:

  • You also want to log custom preprocessing that isn’t part of a standard pipeline.
  • You want to have all details about your custom model properly logged.
  • You care about making your logged model reproducible by others.

Yes, it’s more work. But if you’re building something for long-term use - or for a team - this is the cleanest and most maintainable solution.

When Should You Log a Model?

A common misconception is that model logging is only useful when you want to deploy.

In my experience, that’s not true.

Log your model when:

  • It was a really good run and you want to load it later.
  • The model took a long time (or money) to train.
  • You want to compare it against future experiments.
  • You’re collaborating with others and want to share it.

You don’t need to deploy every model. But logging lets you reuse and trace it, which is useful in most projects.

Final Thoughts: Save Yourself from Future You

We all want to write clean code and keep our files organized. But real projects get messy. Logging models is one of the easiest ways to stay sane when the experiment count grows.

So don’t just save the model file. Log it.

It might just save your future self (or your teammates) hours of digging through directories. No, I don't want to describe how I know this... Well, maybe I will some day.