Validate LLM Outputs Before They Ship

Your LLM will eventually return garbage – malformed JSON, leaked PII, toxic language, or just wrong answers. Guardrails AI gives you a validation layer that sits between your LLM call and your users. You define validators, attach them to a Guard, and the library handles checking (and optionally fixing) every response.

Install it and grab a couple of hub validators:

pip install guardrails-ai==0.9.0
guardrails hub install hub://guardrails/regex_match
guardrails hub install hub://guardrails/toxic_language
guardrails hub install hub://guardrails/detect_pii
guardrails hub install hub://guardrails/valid_range

Here’s the simplest possible guard – validate that an LLM response matches a regex pattern:

from guardrails import Guard, OnFailAction
from guardrails.hub import RegexMatch

guard = Guard().use(
    RegexMatch(regex=r"^[A-Z].*\.$", on_fail=OnFailAction.EXCEPTION)
)

result = guard.validate("This is a valid sentence.")
print(result.validated_output)
# "This is a valid sentence."

If the output doesn’t start with a capital letter and end with a period, the guard raises an exception. That’s the core pattern: create a Guard, attach validators with .use(), and call .validate() on any string.
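
Before wiring a pattern into a Guard, it can help to sanity-check the regex in plain Python. A quick sketch using the same pattern as above:

```python
import re

# Same rule the guard enforces: starts with a capital, ends with a period.
PATTERN = r"^[A-Z].*\.$"

print(bool(re.match(PATTERN, "This is a valid sentence.")))  # True
print(bool(re.match(PATTERN, "no capital, no period")))      # False
```

Iterating on the pattern this way is much faster than round-tripping through the guard.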

Wrap OpenAI Calls with a Guard

The real power shows up when you wrap your LLM calls directly. Instead of calling OpenAI and then validating separately, the Guard handles both steps – and can automatically re-ask the LLM if validation fails.

from guardrails import Guard, OnFailAction
from guardrails.hub import ToxicLanguage, DetectPII

guard = Guard().use_many(
    ToxicLanguage(
        threshold=0.5,
        validation_method="sentence",
        on_fail=OnFailAction.FIX
    ),
    DetectPII(
        pii_entities=["EMAIL_ADDRESS", "PHONE_NUMBER", "SSN"],
        on_fail=OnFailAction.FIX
    ),
)

result = guard(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful customer service agent."},
        {"role": "user", "content": "Summarize my account details."},
    ],
    temperature=0.3,
)

print(result.validated_output)
print(f"Validation passed: {result.validation_passed}")

The guard() call sends the messages to OpenAI, gets the response, runs it through both validators, and returns a ValidationOutcome. With on_fail=OnFailAction.FIX, the toxic language validator scrubs flagged sentences and the PII detector redacts emails, phone numbers, and SSNs instead of raising an error.
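
To get a feel for what a FIX action does, here is a rough plain-Python sketch of email redaction. The regex and the `<EMAIL_ADDRESS>` placeholder are illustrative assumptions; the real DetectPII validator uses NER models, not a single pattern:

```python
import re

# Illustrative pattern only -- real PII detection is NER-based.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_emails(text: str) -> str:
    # A FIX action swaps the failing span for a safe replacement
    # instead of raising an error.
    return EMAIL_RE.sub("<EMAIL_ADDRESS>", text)

print(redact_emails("Reach me at jane.doe@example.com anytime."))
# Reach me at <EMAIL_ADDRESS> anytime.
```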

Structured Output with Pydantic Models

When your LLM needs to return structured data, combine Guardrails with Pydantic to enforce both schema and content rules:

import json
from pydantic import BaseModel, Field
from guardrails import Guard, OnFailAction
from guardrails.hub import ValidRange, RegexMatch

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(
        description="Rating from 1 to 5",
        validators=[ValidRange(min=1, max=5, on_fail=OnFailAction.FIX)]
    )
    summary: str = Field(
        description="One-sentence summary of the review",
        validators=[RegexMatch(
            regex=r"^[A-Z].*\.$",
            on_fail=OnFailAction.REASK
        )]
    )
    recommended: bool

guard = Guard.from_pydantic(ProductReview)

result = guard(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": "Review the Sony WH-1000XM5 headphones. "
                       "Return JSON with product_name, rating (1-5), "
                       "summary (one sentence), and recommended (bool).",
        }
    ],
    num_reasks=2,
)

print(type(result.validated_output))
# <class 'dict'>
print(json.dumps(result.validated_output, indent=2))

Guard.from_pydantic() reads the model’s fields and validators. When the LLM returns a rating of 7, ValidRange with on_fail=OnFailAction.FIX clamps it to the valid range. When the summary doesn’t match the regex, OnFailAction.REASK sends the validation error back to the LLM and asks it to try again – up to num_reasks times.
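
Conceptually, the reask mechanics are just a bounded retry loop. A hypothetical sketch, where generate and validate stand in for the LLM call and the validator (this is not Guardrails' actual internal code):

```python
def reask_loop(generate, validate, num_reasks=2):
    # generate(feedback) -> str; validate(output) -> (ok, error_message)
    feedback = ""
    for _ in range(num_reasks + 1):
        output = generate(feedback)
        ok, error = validate(output)
        if ok:
            return output, True
        # Feed the validation error back into the next attempt.
        feedback = f"Your previous answer failed validation: {error}"
    return output, False  # out of reasks; the caller decides what to do

# Fake LLM that only succeeds once it sees feedback:
attempts = []
def fake_llm(feedback):
    attempts.append(feedback)
    return "Fixed." if feedback else "bad output"

def ends_with_period(text):
    return (text.endswith("."), "must end with a period")

result, ok = reask_loop(fake_llm, ends_with_period)
print(result, ok)  # Fixed. True
```

Each pass through the loop is a full LLM round-trip, which is why num_reasks matters for latency.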

Build Custom Validators

Hub validators cover common cases, but you’ll eventually need custom logic. Guardrails supports both function-based and class-based validators.

Function-based, for simple checks:

from typing import Dict
from guardrails.validators import (
    FailResult,
    PassResult,
    register_validator,
    ValidationResult,
)

@register_validator(name="no-competitor-mentions", data_type="string")
def no_competitor_mentions(value: str, metadata: Dict) -> ValidationResult:
    competitors = ["acme corp", "widgetco", "megastore"]
    value_lower = value.lower()
    found = [c for c in competitors if c in value_lower]
    if found:
        scrubbed = value
        for name in found:
            scrubbed = scrubbed.replace(name, "[COMPETITOR]")
            scrubbed = scrubbed.replace(name.title(), "[COMPETITOR]")
        return FailResult(
            error_message=f"Mentions competitors: {', '.join(found)}",
            fix_value=scrubbed,
        )
    return PassResult()

Class-based, for validators that need configuration:

from typing import Callable, Dict, Optional
from guardrails.validators import (
    FailResult,
    PassResult,
    register_validator,
    ValidationResult,
    Validator,
)

@register_validator(name="max-sentence-count", data_type="string")
class MaxSentenceCount(Validator):
    def __init__(
        self,
        max_sentences: int = 5,
        on_fail: Optional[Callable] = None,
    ):
        super().__init__(on_fail=on_fail, max_sentences=max_sentences)
        self.max_sentences = max_sentences

    def _validate(self, value: str, metadata: Dict) -> ValidationResult:
        sentences = [s.strip() for s in value.split(".") if s.strip()]
        if len(sentences) > self.max_sentences:
            truncated = ". ".join(sentences[: self.max_sentences]) + "."
            return FailResult(
                error_message=(
                    f"Output has {len(sentences)} sentences, "
                    f"max allowed is {self.max_sentences}."
                ),
                fix_value=truncated,
            )
        return PassResult()
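
Because _validate is plain string handling, the truncation logic can be unit-tested on its own, with no Guard in the loop. A standalone version of the same split-and-join:

```python
def truncate_sentences(value: str, max_sentences: int = 5) -> str:
    # Same naive period-based split MaxSentenceCount uses.
    sentences = [s.strip() for s in value.split(".") if s.strip()]
    if len(sentences) <= max_sentences:
        return value
    return ". ".join(sentences[:max_sentences]) + "."

print(truncate_sentences("One. Two. Three. Four.", max_sentences=2))
# One. Two.
```

Note that a naive period split will also break on abbreviations like "Dr." or "e.g."; swap in a real sentence tokenizer if that matters for your outputs.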

Use them the same way as hub validators:

from guardrails import Guard, OnFailAction

guard = Guard().use_many(
    no_competitor_mentions,
    MaxSentenceCount(max_sentences=3, on_fail=OnFailAction.FIX),
)

result = guard.validate(
    "Acme Corp makes great widgets. Their products are top-notch. "
    "We recommend them highly. They beat all competitors. "
    "Nothing else compares."
)
print(result.validated_output)
# "[COMPETITOR] makes great widgets. Their products are top-notch. We recommend them highly."

The response gets both competitor names scrubbed and excess sentences trimmed in a single pass.

On-Fail Actions Explained

Every validator takes an on_fail parameter that controls what happens when validation fails. Pick the right one for each use case:

| Action | Behavior | Best For |
| --- | --- | --- |
| OnFailAction.EXCEPTION | Raises ValidationError | Hard requirements where bad output is unacceptable |
| OnFailAction.REASK | Sends error back to LLM for retry | Format issues the LLM can self-correct |
| OnFailAction.FIX | Uses the validator's fix_value | PII redaction, competitor scrubbing |
| OnFailAction.FIX_REASK | Tries fix first, reasks if fix still fails | Complex corrections |
| OnFailAction.NOOP | Logs the failure but passes output through | Monitoring without blocking |
| OnFailAction.FILTER | Removes the failing value entirely | List items that fail validation |
| OnFailAction.REFRAIN | Returns None instead of output | When no output is better than bad output |

You can also pass a custom function:

from guardrails import Guard
from guardrails.hub import RegexMatch

def log_and_fix(value, fail_result):
    print(f"VALIDATION FAILED: {fail_result.error_message}")
    return value.upper()  # custom fix logic

guard = Guard().use(
    RegexMatch(regex="^[A-Z ]+$", on_fail=log_and_fix)
)

Common Errors and Fixes

guardrails hub install fails with “validator not found”. You need to authenticate first. Run guardrails configure and enter your API key from the Guardrails Hub dashboard. Some validators are community-contributed and may have different names than expected – search at hub.guardrailsai.com.

ValidationError raised unexpectedly in production. Don’t use OnFailAction.EXCEPTION unless you wrap the guard call in try/except. For production services, OnFailAction.FIX or OnFailAction.NOOP are safer defaults – they degrade gracefully instead of crashing your API.
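
If you do need EXCEPTION, the wrapping pattern is straightforward. A hedged sketch with a generic except clause (in Guardrails the raised type is a ValidationError; catching Exception keeps this sketch dependency-free):

```python
def safe_validate(validate_fn, text, fallback="We couldn't process that request."):
    # Degrade to a canned fallback instead of a 500 when validation raises.
    try:
        return validate_fn(text)
    except Exception as exc:
        print(f"validation failed: {exc}")  # log it, then move on
        return fallback

# Stand-in for guard.validate with on_fail=EXCEPTION:
def strict(text):
    if not text.endswith("."):
        raise ValueError("must end with a period")
    return text

print(safe_validate(strict, "All good."))  # All good.
print(safe_validate(strict, "no period"))  # logs, then returns the fallback
```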

num_reasks causes slow responses. Each reask is a full LLM round-trip. Set num_reasks=1 for latency-sensitive endpoints. For batch processing, num_reasks=3 is reasonable. Never set it higher than 5 – if the LLM can’t get it right in 5 tries, your prompt needs work.

Pydantic validation errors on nested models. Guard.from_pydantic() works with nested Pydantic models, but validators on nested fields need to be specified in the Field definition, not via .use(). Use .use() for top-level string validators and Field validators for property-level checks.

DetectPII misses domain-specific PII. The hub PII validator uses standard NER patterns. For custom identifiers like internal account numbers or proprietary IDs, write a custom validator with regex patterns specific to your data format.
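
The redaction itself is the easy part. The pattern below assumes a hypothetical internal-ID format ("ACCT-" plus eight digits) that you would swap for your own; inside a custom validator, this substitution would be your fix_value logic:

```python
import re

# Hypothetical format -- replace with your real account-number pattern.
ACCOUNT_RE = re.compile(r"\bACCT-\d{8}\b")

def redact_account_ids(text: str) -> str:
    return ACCOUNT_RE.sub("[ACCOUNT_ID]", text)

print(redact_account_ids("Your account ACCT-12345678 is past due."))
# Your account [ACCOUNT_ID] is past due.
```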