You trained a model. It’s good. Now what? You need a place to store it, track which version is running in production, and swap versions without redeploying your entire stack. MLflow Model Registry handles all of this – versioning, aliasing, metadata tagging, and serving – through a Python API and CLI that work with any ML framework.
This guide covers the full workflow: logging a model, registering it, assigning aliases like champion and challenger, loading specific versions for inference, and spinning up a REST endpoint with one command.
Set Up MLflow Tracking
Before you register anything, you need an MLflow tracking server. For local development, a SQLite-backed server works fine.
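A minimal sketch, assuming a local SQLite file and the default port; the artifact root path is a placeholder:

```bash
# Registry and run metadata go to SQLite; model files go to the artifact root
mlflow server \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root ./mlartifacts \
  --host 127.0.0.1 \
  --port 5000
```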
Point your client at it:
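The sketch below assumes the local server above is listening on port 5000:

```python
import mlflow

# All subsequent logging and registry calls go through this server
mlflow.set_tracking_uri("http://127.0.0.1:5000")
```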
If you skip this, MLflow writes to a local mlruns/ directory. That works for experimentation but breaks down fast when multiple people need access to the same models.
Train and Register a Model
The fastest path is to register the model in the same call that logs it. Pass registered_model_name to log_model() and MLflow creates the registered model (if it doesn’t exist) and adds a new version automatically.
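A sketch with scikit-learn; the dataset, experiment name, and the registered model name "iris-classifier" are placeholders for your own.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris(as_frame=True)
X, y = iris.data, iris.target
clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

mlflow.set_experiment("iris-demo")
with mlflow.start_run():
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("train_accuracy", clf.score(X, y))
    # Creates the registered model "iris-classifier" if it doesn't exist,
    # then adds this run's model as a new version
    mlflow.sklearn.log_model(
        clf,
        artifact_path="model",
        registered_model_name="iris-classifier",
    )
```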
Every subsequent call with the same registered_model_name increments the version number. Version 1, version 2, version 3 – each tied to a specific run with its own parameters, metrics, and artifacts.
If you already logged a model and want to register it after the fact, use mlflow.register_model() with the run URI:
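A sketch; substitute the run ID that logged the model and the artifact_path you used (here "model"):

```python
import mlflow

# "runs:/<run_id>/model" points at the model artifact inside that run
result = mlflow.register_model(
    model_uri="runs:/<run_id>/model",
    name="iris-classifier",
)
print(result.name, result.version)
```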
Assign Aliases Instead of Stages
MLflow used to have a stage system – Staging, Production, Archived. That’s deprecated since MLflow 2.9. Use aliases instead. They’re more flexible: you can assign multiple aliases to a single version, name them whatever you want, and reassign them without any state machine constraints.
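A sketch; the model name and version numbers are placeholders:

```python
from mlflow import MlflowClient

client = MlflowClient()

# Any number of aliases, named however you like
client.set_registered_model_alias("iris-classifier", "champion", 2)
client.set_registered_model_alias("iris-classifier", "challenger", 3)

# Resolve an alias back to a concrete version
mv = client.get_model_version_by_alias("iris-classifier", "champion")
print(mv.version)

# Reassignment is just another call; removal is explicit
client.delete_registered_model_alias("iris-classifier", "challenger")
```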
The @alias syntax works in model URIs, so your deployment code never needs to change version numbers:
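For example, assuming the aliases set above:

```python
import mlflow

# Resolves to whichever version the alias currently points to
champion = mlflow.pyfunc.load_model("models:/iris-classifier@champion")

# Pinning an exact version is still possible when you need reproducibility
pinned = mlflow.pyfunc.load_model("models:/iris-classifier/2")
```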
This is the key advantage over hardcoded version numbers. Your serving infrastructure points at @champion. When you promote a new version, you update the alias and the next request picks up the new model. No redeploy needed.
Tag Versions with Metadata
Tags let you attach arbitrary key-value metadata to registered models and individual versions. Use them to track validation status, training datasets, or approval workflows.
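A sketch; the keys and values are examples, not a required schema:

```python
from mlflow import MlflowClient

client = MlflowClient()

# Tag the registered model as a whole
client.set_registered_model_tag("iris-classifier", "task", "classification")

# Tag an individual version, e.g. for an approval workflow
client.set_model_version_tag("iris-classifier", "2", "validation_status", "passed")
client.set_model_version_tag("iris-classifier", "2", "training_dataset", "iris-2024-06")
```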
Serve a Model as a REST API
MLflow can serve any registered model as a REST endpoint with a single CLI command. It builds a virtual environment with pinned dependencies and exposes a /invocations endpoint.
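A sketch, assuming the tracking server and alias from earlier; the serving port is arbitrary:

```bash
# The serving process needs to know where the registry lives
export MLFLOW_TRACKING_URI=http://127.0.0.1:5000

# Serves whatever version the champion alias points to
mlflow models serve \
  -m "models:/iris-classifier@champion" \
  --port 5001 \
  --no-conda
```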
The --no-conda flag skips creating a conda environment and uses your current Python environment; on newer MLflow releases the equivalent is --env-manager local. Drop it if you want full environment isolation.
Test it with curl:
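Assuming the iris feature columns from the training sketch above:

```bash
curl -X POST http://127.0.0.1:5001/invocations \
  -H "Content-Type: application/json" \
  -d '{
        "dataframe_split": {
          "columns": ["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"],
          "data": [[5.1, 3.5, 1.4, 0.2]]
        }
      }'
```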
The response comes back as a JSON array of predictions. Use dataframe_split format – it preserves column ordering, unlike dataframe_records.
Search and Compare Versions
When you have dozens of versions, you need to search and filter them programmatically.
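A sketch of the common calls; the model names are placeholders:

```python
from mlflow import MlflowClient

client = MlflowClient()

# Every version of one registered model, newest first
versions = client.search_model_versions("name = 'iris-classifier'")
for mv in sorted(versions, key=lambda v: int(v.version), reverse=True):
    print(mv.version, mv.run_id, mv.aliases, mv.tags.get("validation_status"))

# Copy a version (artifact + metadata) under a different registered model name
client.copy_model_version(
    src_model_uri="models:/iris-classifier/2",
    dst_name="iris-classifier-prod",
)
```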
The copy_model_version call is useful when you run separate MLflow instances for staging and production. It copies the model artifact and metadata to a different registered model name, which can live on a different tracking server.
Common Errors and Fixes
RESOURCE_DOES_NOT_EXIST when loading a model:
This means the model name doesn’t exist in the registry. Check for typos. Model names are case-sensitive. You can verify with client.search_registered_models().
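A quick way to see what the registry actually contains:

```python
from mlflow import MlflowClient

# Print every registered model name to spot typos or case mismatches
for rm in MlflowClient().search_registered_models():
    print(rm.name)
```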
Model registry functionality is unavailable:
You’re running MLflow without a proper backend store. The file-based default (mlruns/) supports the registry, but you need to specify it explicitly. Start the server with --backend-store-uri sqlite:///mlflow.db or point to a PostgreSQL/MySQL instance.
RESOURCE_ALREADY_EXISTS on experiment creation:
This happens when you call mlflow.create_experiment() for an experiment that already exists. Use mlflow.set_experiment() instead – it creates the experiment if it’s missing and makes it the active experiment either way.
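For example, reusing the experiment name from the training sketch above:

```python
import mlflow

# Creates "iris-demo" if it doesn't exist, otherwise reuses it; either way it becomes active
mlflow.set_experiment("iris-demo")
```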
Version not found after registering:
If register_model() returns successfully but get_model_version() fails immediately after, it’s a race condition with the backend store. Add a short retry loop or check the version status:
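A minimal retry sketch; the helper name and timeout are arbitrary:

```python
import time
from mlflow import MlflowClient

client = MlflowClient()

def wait_until_ready(name: str, version: str, timeout_s: float = 30.0):
    """Poll the registry until the version reports READY (hypothetical helper)."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        mv = client.get_model_version(name=name, version=version)
        if mv.status == "READY":
            return mv
        time.sleep(1.0)
    raise TimeoutError(f"{name} v{version} not READY after {timeout_s}s")

wait_until_ready("iris-classifier", "3")
```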
Putting It All Together
A realistic workflow looks like this: train a model in an experiment, register it, tag the version with metadata, run validation, assign the champion alias, and serve it. When a better model comes along, register a new version, alias it as challenger, run an A/B test, and swap the champion alias over. Your serving code never changes – it always loads @champion. The registry keeps a complete history of every version, who promoted it, and what metrics it achieved.
That’s the core MLflow Model Registry loop. It’s not fancy, but it solves the “which model is running in production right now?” problem without building custom infrastructure.
Related Guides
- How to Build a Model Metadata Store with SQLite and FastAPI
- How to Implement Canary Deployments for ML Models
- How to Build a Model Configuration Management Pipeline with Hydra
- How to Build a Model Dependency Scanner and Vulnerability Checker
- How to Build a Model Feature Store Pipeline with Redis and FastAPI
- How to Build a Model Versioning Pipeline with DVC and S3
- How to Build a Model Input Validation Pipeline with Pydantic and FastAPI
- How to Set Up CI/CD for Machine Learning Models with GitHub Actions
- How to Build a Model Rollback Pipeline with Health Checks
- How to Build a Model Serving Pipeline with Ray Serve and FastAPI