Bad inputs cause silent model failures. A NaN sneaks into a feature vector, your model returns garbage predictions, and nobody notices until a downstream system breaks. Or someone sends a 50,000-character string to your text classifier and your GPU runs out of memory mid-inference.

The fix is simple: validate before inference, not after. Pydantic v2 gives you fast, declarative schemas with custom validators. FastAPI uses those schemas as request bodies automatically. Together, they form a validation layer that rejects bad data before it ever touches your model.

pip install fastapi uvicorn pydantic scikit-learn numpy

Basic Input Validation with Pydantic

Start with a Pydantic model that describes exactly what your ML endpoint expects. Say you’re serving a tabular model that predicts house prices from four numeric features.

from pydantic import BaseModel, Field, ConfigDict
from typing import List


class HousePredictionRequest(BaseModel):
    model_config = ConfigDict(strict=True)

    square_feet: float = Field(gt=0, le=100_000, description="Living area in sq ft")
    bedrooms: int = Field(ge=0, le=20)
    bathrooms: float = Field(ge=0, le=15)
    year_built: int = Field(ge=1800, le=2030)


class BatchPredictionRequest(BaseModel):
    instances: List[HousePredictionRequest] = Field(
        min_length=1, max_length=100, description="Batch of 1-100 instances"
    )


# This passes
valid = HousePredictionRequest(
    square_feet=1500.0, bedrooms=3, bathrooms=2.0, year_built=1995
)

# This fails: square_feet must be > 0
# HousePredictionRequest(square_feet=-100, bedrooms=3, bathrooms=2.0, year_built=1995)

Field(gt=0, le=100_000) enforces that square footage is positive and within a sane range. ConfigDict(strict=True) disables type coercion, so sending a string where a float is expected raises an error instead of silently converting it. The batch model caps requests at 100 instances to protect your server from huge payloads.
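
A quick check of those constraints in action – the class is redefined here so the snippet runs standalone:

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class HousePredictionRequest(BaseModel):
    model_config = ConfigDict(strict=True)

    square_feet: float = Field(gt=0, le=100_000)
    bedrooms: int = Field(ge=0, le=20)
    bathrooms: float = Field(ge=0, le=15)
    year_built: int = Field(ge=1800, le=2030)


# A valid request parses cleanly
ok = HousePredictionRequest(square_feet=1500.0, bedrooms=3, bathrooms=2.0, year_built=1995)
assert ok.square_feet == 1500.0

# An out-of-range value is rejected with a machine-readable error
try:
    HousePredictionRequest(square_feet=-100.0, bedrooms=3, bathrooms=2.0, year_built=1995)
except ValidationError as e:
    err = e.errors()[0]
    print(err["loc"], err["type"])  # ('square_feet',) greater_than
```

Each entry in errors() carries the offending field, an error type, and the raw input, which is what FastAPI later turns into its 422 payload.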

Custom Validators for ML-Specific Checks

Built-in constraints handle simple ranges. For ML-specific checks – detecting NaN values, validating embedding dimensions, enforcing text length limits – you need custom validators.

import math
from pydantic import BaseModel, Field, field_validator, model_validator
from typing import List


class EmbeddingRequest(BaseModel):
    text: str = Field(min_length=1, max_length=5000)
    embedding: List[float] = Field(min_length=384, max_length=384)

    @field_validator("embedding")
    @classmethod
    def no_nan_or_inf(cls, v: List[float]) -> List[float]:
        for i, val in enumerate(v):
            if math.isnan(val) or math.isinf(val):
                raise ValueError(f"embedding[{i}] is NaN or Inf, got {val}")
        return v

    @field_validator("text")
    @classmethod
    def text_not_whitespace(cls, v: str) -> str:
        stripped = v.strip()
        if not stripped:
            raise ValueError("text cannot be only whitespace")
        return stripped


class FeatureVectorRequest(BaseModel):
    features: List[float] = Field(min_length=4, max_length=4)
    model_version: str = Field(pattern=r"^v\d+\.\d+$")

    @model_validator(mode="after")
    def features_in_valid_range(self) -> "FeatureVectorRequest":
        for i, val in enumerate(self.features):
            if math.isnan(val) or math.isinf(val):
                raise ValueError(f"features[{i}] contains NaN or Inf")
            if not (-1e6 <= val <= 1e6):
                raise ValueError(
                    f"features[{i}] = {val} is outside allowed range [-1e6, 1e6]"
                )
        return self

The @field_validator decorator runs on a single field. Use it when the check only involves that one value. The @model_validator(mode="after") decorator runs after all fields are parsed, so you can do cross-field checks or validate the full feature vector as a unit. The mode="after" keyword means Pydantic has already done type coercion and basic validation – you’re working with the final typed values.

One thing to watch: @field_validator methods are class methods, with @classmethod placed directly beneath @field_validator. Pydantic v2 enforces this – write the validator as an instance method taking self and you get an error at import time, before the app ever starts.
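
To see both rejection paths fire, here is a trimmed-down FeatureVectorRequest (only the NaN check is kept, so the snippet stays short and self-contained):

```python
import math
from typing import List

from pydantic import BaseModel, Field, ValidationError, model_validator


class FeatureVectorRequest(BaseModel):
    features: List[float] = Field(min_length=4, max_length=4)
    model_version: str = Field(pattern=r"^v\d+\.\d+$")

    @model_validator(mode="after")
    def features_in_valid_range(self) -> "FeatureVectorRequest":
        # Runs after field validation, so self.features is already a typed list
        for i, val in enumerate(self.features):
            if math.isnan(val) or math.isinf(val):
                raise ValueError(f"features[{i}] contains NaN or Inf")
        return self


# Clean input passes both field and model validation
ok = FeatureVectorRequest(features=[1.0, 2.0, 3.0, 4.0], model_version="v1.0")

# NaN slips past the List[float] field check (pydantic allows NaN floats by
# default) but is caught by the model validator
try:
    FeatureVectorRequest(features=[1.0, float("nan"), 3.0, 4.0], model_version="v1.0")
except ValidationError as e:
    print(e.errors()[0]["msg"])
```

This is exactly why the model-level NaN check matters: without it, the NaN would reach your model untouched.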

FastAPI Integration with Model Serving

Now wire these schemas into a FastAPI app. Use the lifespan context manager to load your model once at startup and clean it up on shutdown.

import math
import pickle
from contextlib import asynccontextmanager

import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel, ConfigDict, Field, field_validator
from sklearn.linear_model import LinearRegression


# --- Train and save a demo model (run once) ---
def create_demo_model():
    X = np.array([[1500, 3, 2, 1995], [2000, 4, 3, 2005], [1200, 2, 1, 1980]])
    y = np.array([300000, 450000, 200000])
    model = LinearRegression().fit(X, y)
    with open("house_model.pkl", "wb") as f:
        pickle.dump(model, f)


# --- Pydantic schemas ---
class HouseInput(BaseModel):
    model_config = ConfigDict(strict=True)

    square_feet: float = Field(gt=0, le=100_000)
    bedrooms: int = Field(ge=0, le=20)
    bathrooms: float = Field(ge=0, le=15)
    year_built: int = Field(ge=1800, le=2030)

    @field_validator("square_feet", "bathrooms")
    @classmethod
    def no_nan_or_inf(cls, v: float) -> float:
        if math.isnan(v) or math.isinf(v):
            raise ValueError(f"Value must be a finite number, got {v}")
        return v


class PredictionResponse(BaseModel):
    predicted_price: float
    model_version: str


# --- App with lifespan ---
ml_models = {}


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load model on startup
    with open("house_model.pkl", "rb") as f:
        ml_models["house"] = pickle.load(f)
    ml_models["version"] = "v1.0"
    yield
    # Cleanup on shutdown
    ml_models.clear()


app = FastAPI(title="House Price API", lifespan=lifespan)


@app.post("/predict", response_model=PredictionResponse)
async def predict(request: HouseInput):
    model = ml_models["house"]
    features = np.array(
        [[request.square_feet, request.bedrooms, request.bathrooms, request.year_built]]
    )
    prediction = model.predict(features)[0]
    return PredictionResponse(
        predicted_price=round(float(prediction), 2),
        model_version=ml_models["version"],
    )


@app.get("/health")
async def health():
    return {"status": "ok", "model_loaded": "house" in ml_models}

Call create_demo_model() once to generate house_model.pkl, then run with uvicorn main:app --host 0.0.0.0 --port 8000. The lifespan context manager replaced the deprecated @app.on_event("startup") pattern. Everything before yield runs at startup, everything after runs at shutdown. The model loads once into the ml_models dict and every request reuses it.

FastAPI validates the request body against HouseInput automatically. If validation fails, it returns a 422 response with detailed error info. If it passes, your endpoint gets a fully typed, validated object.

Handling Validation Errors Gracefully

FastAPI’s default 422 responses are verbose. For ML clients, you often want cleaner error messages that tell the caller exactly what to fix.

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from fastapi.exceptions import RequestValidationError


@app.exception_handler(RequestValidationError)
async def ml_validation_error_handler(
    request: Request, exc: RequestValidationError
):
    errors = []
    for error in exc.errors():
        field = " -> ".join(str(loc) for loc in error["loc"] if loc != "body")
        errors.append(
            {
                "field": field,
                "message": error["msg"],
                "your_input": error.get("input"),
            }
        )
    return JSONResponse(
        status_code=422,
        content={
            "error": "input_validation_failed",
            "detail": errors,
            "hint": "Check field constraints and types. All numeric features must be finite numbers within their allowed ranges.",
        },
    )

Now when a client sends {"square_feet": -50, "bedrooms": 3, "bathrooms": 2.0, "year_built": 1995}, they get back:

{
  "error": "input_validation_failed",
  "detail": [
    {
      "field": "square_feet",
      "message": "Input should be greater than 0",
      "your_input": -50
    }
  ],
  "hint": "Check field constraints and types. All numeric features must be finite numbers within their allowed ranges."
}

That’s actionable. The client knows which field failed, why, and what they sent. Much better than a raw traceback or a generic 400.

Common Errors and Fixes

Numpy arrays in Pydantic models. Pydantic doesn’t know how to serialize np.ndarray. If your model returns numpy types, convert them first.

# This breaks:
# return {"prediction": model.predict(features)}

# This works:
prediction = model.predict(features)[0]
return {"prediction": float(prediction)}

Always cast numpy scalars with float(), int(), or .tolist() before returning them from FastAPI endpoints.
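
A minimal illustration of the failure and the fix, assuming numpy is installed (the value here is a stand-in for whatever model.predict returns):

```python
import json

import numpy as np

raw = np.int64(312500)  # numpy scalar, e.g. from model.predict(...)[0]

# A numpy scalar is not a native Python type, so JSON serialization fails
try:
    json.dumps({"prediction": raw})
except TypeError as e:
    print(e)

# Casting to native types fixes it
print(json.dumps({"prediction": int(raw)}))

# For arrays, .tolist() converts every element recursively
print(json.dumps({"scores": np.array([0.1, 0.9]).tolist()}))
```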

Strict mode rejects valid inputs. If you enable ConfigDict(strict=True), Pydantic won’t coerce "3" to 3. That’s usually what you want for ML inputs – implicit type coercion hides bugs. But if your clients send JSON with string numbers, you’ll need to either drop strict mode or tell them to fix their serialization.
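
The difference in one snippet – two throwaway models defined purely for the comparison:

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LaxBedrooms(BaseModel):
    bedrooms: int


class StrictBedrooms(BaseModel):
    model_config = ConfigDict(strict=True)
    bedrooms: int


# Default (lax) mode silently coerces the string to an int
assert LaxBedrooms(bedrooms="3").bedrooms == 3

# Strict mode refuses the same payload
try:
    StrictBedrooms(bedrooms="3")
except ValidationError as e:
    print(e.errors()[0]["msg"])
```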

Async model loading with large files. Loading a multi-gigabyte model file in the lifespan function blocks the event loop. For large models, offload the loading to a thread:

import asyncio
import pickle
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=1)


def load_heavy_model(path: str):
    with open(path, "rb") as f:
        return pickle.load(f)


@asynccontextmanager
async def lifespan(app: FastAPI):
    # get_running_loop() is the modern API; get_event_loop() is deprecated
    # inside coroutines
    loop = asyncio.get_running_loop()
    ml_models["house"] = await loop.run_in_executor(
        executor, load_heavy_model, "house_model.pkl"
    )
    yield
    ml_models.clear()

Validation error messages are hard to parse. Pydantic v2 error messages follow a specific format. The loc field in each error is a tuple like ("body", "square_feet"). When you flatten it for your custom error handler, skip the "body" prefix since it’s noise for API clients.
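
What that looks like on a raw pydantic error – note that a plain ValidationError has no "body" element in loc; FastAPI prepends it for request bodies, so the flattening below is a no-op here but matters inside the exception handler:

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class FeatureVectorRequest(BaseModel):
    features: List[float] = Field(min_length=4, max_length=4)


try:
    FeatureVectorRequest(features=[1.0, 2.0])  # too short: needs exactly 4
except ValidationError as e:
    err = e.errors()[0]
    print(err["loc"])  # ('features',)
    # Same flattening as the custom handler: join loc parts, skipping "body"
    field = " -> ".join(str(loc) for loc in err["loc"] if loc != "body")
    print(field)  # features
```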

Field order matters for cross-field validation. In a @field_validator, info.data only contains fields that were already validated (fields defined above the current one in the class). If your validator depends on another field, make sure that field is defined first in the model class.
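
A sketch of a cross-field check using info.data. The bathrooms-vs-bedrooms rule is hypothetical, purely to illustrate ordering: bedrooms must be declared above bathrooms or info.data won't contain it yet.

```python
from pydantic import BaseModel, ValidationError, ValidationInfo, field_validator


class HouseRooms(BaseModel):
    bedrooms: int  # validated first: declared above bathrooms
    bathrooms: float

    @field_validator("bathrooms")
    @classmethod
    def plausible_bathroom_count(cls, v: float, info: ValidationInfo) -> float:
        # info.data holds only already-validated fields; .get guards against
        # the case where bedrooms itself failed validation
        bedrooms = info.data.get("bedrooms")
        if bedrooms is not None and v > bedrooms + 2:
            raise ValueError(f"{v} bathrooms is implausible for {bedrooms} bedrooms")
        return v


ok = HouseRooms(bedrooms=3, bathrooms=2.0)

try:
    HouseRooms(bedrooms=1, bathrooms=6.0)
except ValidationError as e:
    print(e.errors()[0]["msg"])
```

If the field order were flipped, info.data.get("bedrooms") would return None and the check would silently pass – which is exactly the bug this ordering rule prevents.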