LLMs return strings. Your application needs typed data – integers, lists, nested objects with specific fields. The gap between “here’s some JSON-ish text” and “here’s a validated Python object” is where most LLM integration bugs live. Pydantic closes that gap. You define what the data should look like, and Pydantic either gives you a clean object or tells you exactly what went wrong.
There are two paths here: use OpenAI’s structured output mode (which constrains token generation at the model level), or parse and validate the raw text yourself. You’ll probably need both.
## Using OpenAI's Structured Output Mode
The cleanest approach is client.beta.chat.completions.parse(). You pass a Pydantic model as the response_format, and the SDK handles schema conversion, API communication, and response parsing in one call. The model is constrained to produce valid output matching your schema – it cannot generate tokens that would violate it.
Here’s a real example: extracting structured fields from a product review.
```python
from openai import OpenAI
from pydantic import BaseModel, field_validator
from typing import Literal


class ProductReview(BaseModel):
    sentiment: Literal["positive", "negative", "mixed"]
    rating: int
    pros: list[str]
    cons: list[str]
    one_line_summary: str

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be 1-5, got {v}")
        return v

    @field_validator("pros", "cons")
    @classmethod
    def at_least_one_item(cls, v: list[str]) -> list[str]:
        if len(v) < 1:
            raise ValueError("Must include at least one item")
        return v


client = OpenAI()

review_text = """
The Sony WH-1000XM5 headphones have incredible noise cancellation
and the sound quality is warm and detailed. Battery lasts about 30 hours.
However, they don't fold flat anymore which is annoying for travel,
and the touch controls are overly sensitive -- I keep pausing music
by accident when adjusting the fit.
"""

completion = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "Extract a structured review from the user's text. Be specific in pros and cons.",
        },
        {"role": "user", "content": review_text},
    ],
    response_format=ProductReview,
)

review = completion.choices[0].message.parsed
print(f"Sentiment: {review.sentiment}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.one_line_summary}")
```
The `parsed` attribute gives you an actual `ProductReview` instance, not a dict. Your field validators still run, so you get both schema enforcement from the API side and business-logic validation from Pydantic. If the model returns a rating of 7, the validator catches it.
One thing to watch: structured output mode requires every field to appear in the schema's `required` list. The SDK converts your model to a JSON schema with `additionalProperties: false` and all fields marked required. Fields you want to be optional must be typed `Optional[...]` with a `None` default; they stay required in the schema but become nullable.
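For instance, here is how an optional field is declared so the model can legitimately omit a value (a minimal sketch; `Invoice` and its fields are hypothetical):

```python
from typing import Optional
from pydantic import BaseModel

class Invoice(BaseModel):
    vendor: str
    total: float
    po_number: Optional[str] = None  # optional: None default, nullable value

# If the value is null (or the key is absent in plain validation),
# you still get a clean Invoice with po_number set to None.
inv = Invoice.model_validate({"vendor": "Acme Corp", "total": 129.99})
print(inv.po_number)  # None
```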
## Manual Extraction and Validation

Structured output mode is great when you have it. But Anthropic's Claude, local models via Ollama, and older OpenAI models don't support it. For those, you need to pull JSON out of the response text yourself.
The typical failure modes: the LLM wraps the JSON in markdown code fences, adds a chatty preamble like "Here's the extracted data:", or pads the object with surrounding prose. This function handles all of those cases.
````python
import json
import re
from pydantic import BaseModel, ValidationError
from typing import TypeVar, Type

T = TypeVar("T", bound=BaseModel)


def extract_model(text: str, model_class: Type[T], max_retries: int = 2) -> T:
    """Extract and validate a Pydantic model from LLM response text.

    Handles markdown code fences, extra text around JSON, and retries
    with progressively aggressive extraction.
    """
    json_str = _find_json(text)
    for attempt in range(max_retries + 1):
        try:
            data = json.loads(json_str)
            return model_class.model_validate(data)
        except json.JSONDecodeError:
            # Try stripping more aggressively on each retry
            json_str = _aggressive_extract(text, attempt)
        except ValidationError:
            # The JSON parsed but fields are wrong. Re-raise on the
            # final attempt, otherwise try re-extraction.
            if attempt == max_retries:
                raise
            json_str = _aggressive_extract(text, attempt)
    raise ValueError(f"Could not extract valid {model_class.__name__} from response")


def _find_json(text: str) -> str:
    """Pull JSON from markdown code fences or raw text."""
    # Match ```json ... ``` or ``` ... ```
    fence_match = re.search(r"```(?:json)?\s*\n?(.*?)\n?\s*```", text, re.DOTALL)
    if fence_match:
        return fence_match.group(1).strip()
    # Try to find a raw JSON object
    brace_start = text.find("{")
    brace_end = text.rfind("}")
    if brace_start != -1 and brace_end != -1:
        return text[brace_start : brace_end + 1]
    return text.strip()


def _aggressive_extract(text: str, attempt: int) -> str:
    """Progressively more aggressive JSON extraction."""
    if attempt == 0:
        # Strip everything before the first { and after the last }
        brace_start = text.find("{")
        brace_end = text.rfind("}")
        if brace_start != -1 and brace_end != -1:
            return text[brace_start : brace_end + 1]
    # Last resort: find any balanced JSON-like structure
    match = re.search(r"\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}", text, re.DOTALL)
    if match:
        return match.group(0)
    return text


# Usage with a real model call
class MovieRecommendation(BaseModel):
    title: str
    year: int
    genre: str
    reason: str
    confidence: float = 0.5  # default if the LLM omits it


raw_response = """
Sure! Here's a movie recommendation for you:

```json
{
  "title": "Arrival",
  "year": 2016,
  "genre": "sci-fi",
  "reason": "Linguistics-based first contact story with a non-linear narrative"
}
```

Hope you enjoy it!
"""

movie = extract_model(raw_response, MovieRecommendation)
print(f"{movie.title} ({movie.year}) - {movie.reason}")
print(f"Confidence: {movie.confidence}")  # Uses the default 0.5
````
The key insight: Pydantic's `model_validate` accepts a dict and applies all validators and defaults. So even if the LLM skips your `confidence` field, Pydantic fills it in. This is much more forgiving than structured output mode, where schema mismatches cause API errors.
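A tiny sketch of that behavior in isolation (the `Rec` model here is an illustrative stand-in):

```python
import json
from pydantic import BaseModel, ValidationError

class Rec(BaseModel):  # illustrative minimal model
    title: str
    confidence: float = 0.5

# Missing optional field: the default fills in silently
rec = Rec.model_validate(json.loads('{"title": "Arrival"}'))
print(rec.confidence)  # 0.5

# Missing required field: a precise, inspectable error
try:
    Rec.model_validate({"confidence": 0.9})
except ValidationError as e:
    print(e.errors()[0]["loc"])  # ('title',)
```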
## Nested Models and Complex Schemas
Real-world extraction often involves nested structures. A resume has a list of education entries, each with its own fields. Pydantic handles this naturally with nested models.
The trick is using `model_json_schema()` to generate a JSON schema you can paste into your prompt. This tells the LLM exactly what structure you expect, without relying on structured output mode.
```python
import json
from typing import Optional
from openai import OpenAI
from pydantic import BaseModel


class Education(BaseModel):
    institution: str
    degree: str
    field_of_study: str
    start_year: int
    end_year: Optional[int] = None  # None if still enrolled


class Skill(BaseModel):
    name: str
    level: str  # "beginner", "intermediate", "advanced", "expert"
    years_experience: int


class ResumeData(BaseModel):
    name: str
    email: str
    current_title: str
    education: list[Education]
    skills: list[Skill]
    total_years_experience: int


# Generate the schema to include in your prompt
schema = ResumeData.model_json_schema()
schema_text = json.dumps(schema, indent=2)

prompt = f"""Extract structured data from the following resume text.
Return ONLY valid JSON matching this exact schema:

{schema_text}

Resume text:
---
Jane Park
[email protected]
Senior ML Engineer

Education:
- MS Computer Science, Stanford University, 2018-2020
- BS Mathematics, UC Berkeley, 2014-2018

Skills: Python (expert, 8 years), PyTorch (advanced, 5 years),
Kubernetes (intermediate, 3 years), SQL (advanced, 6 years)

Total experience: 6 years
---"""

# Works with any LLM -- OpenAI, Anthropic, local models
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
    response_format={"type": "json_object"},
)

raw_json = json.loads(response.choices[0].message.content)
resume = ResumeData.model_validate(raw_json)

print(f"Name: {resume.name}")
print(f"Education entries: {len(resume.education)}")
for edu in resume.education:
    end = edu.end_year or "present"
    print(f"  - {edu.degree} in {edu.field_of_study}, {edu.institution} ({edu.start_year}-{end})")
print(f"Skills: {[s.name for s in resume.skills]}")
```
Sending the full JSON schema in the prompt works surprisingly well even with smaller models. The schema gives the LLM a concrete target instead of vague instructions like "return the data as JSON." And because `model_json_schema()` includes type information, descriptions, and constraints, the LLM knows that `end_year` is optional and that `years_experience` should be an integer, not a string.
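Descriptions and numeric constraints can be attached with `Field`; they carry through `model_json_schema()` into the prompt. A small sketch (the description wording here is illustrative):

```python
import json
from pydantic import BaseModel, Field

class Skill(BaseModel):
    name: str = Field(description="Name of the tool or language")
    level: str = Field(description="One of: beginner, intermediate, advanced, expert")
    years_experience: int = Field(ge=0, description="Whole years of hands-on use")

schema = Skill.model_json_schema()
# The descriptions and the ge=0 constraint (rendered as "minimum": 0)
# appear in the schema text the LLM sees
print(json.dumps(schema["properties"]["years_experience"], indent=2))
```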
## Common Errors and Fixes
These are the errors you’ll actually hit in production.
Pydantic `ValidationError` for wrong types – The LLM returns `"rating": "four"` instead of `"rating": 4`. This is the most common failure.
```
pydantic_core._pydantic_core.ValidationError: 1 validation error for ProductReview
rating
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='four', input_type=str]
```
Fix: add a `BeforeValidator` that coerces string and word numbers before Pydantic's type check runs.
```python
from typing import Annotated
from pydantic import BaseModel, BeforeValidator


def coerce_to_int(v):
    """Convert string numbers and word numbers to int."""
    word_map = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
    if isinstance(v, str):
        v = v.strip().lower()
        if v in word_map:
            return word_map[v]
        return int(v)
    return v


class FlexibleReview(BaseModel):
    rating: Annotated[int, BeforeValidator(coerce_to_int)]
    comment: str


# Now both work:
review1 = FlexibleReview.model_validate({"rating": 4, "comment": "Great"})
review2 = FlexibleReview.model_validate({"rating": "four", "comment": "Great"})
print(review1.rating, review2.rating)  # 4 4
```
`json.JSONDecodeError` from partial responses – The LLM hit its token limit and returned truncated JSON. You get something like `{"name": "Jane", "skills": ["Py`. There's no way to recover the missing data from a truncated response. Your options: increase `max_tokens`, reduce the schema size, or catch the error and retry.
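A sketch of the catch-and-retry option, with the chat call injected so the loop stays provider-agnostic (`complete_json` and its `call` signature are illustrative, not a library API):

```python
import json

def complete_json(call, prompt: str, max_tokens: int = 256, retries: int = 2) -> dict:
    """Retry with a doubled token budget whenever the response was cut off.

    `call(prompt, max_tokens)` wraps your chat API and returns a
    (finish_reason, text) pair taken from the provider's response.
    """
    for _ in range(retries + 1):
        finish_reason, text = call(prompt, max_tokens)
        # OpenAI reports finish_reason == "length" when output hit max_tokens
        if finish_reason != "length":
            return json.loads(text)
        max_tokens *= 2  # double the budget and try again
    raise ValueError("Response still truncated after retries")
```

With the OpenAI client, `call` would return `(response.choices[0].finish_reason, response.choices[0].message.content)`.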
LLM adds text before or after JSON – “Here is the extracted data: {...}”. The _find_json function from the manual extraction section handles this. If you’re using response_format={"type": "json_object"}, OpenAI won’t add extra text, but other providers might.
Nested validation errors – When a field inside a list item fails, the error path looks like `education -> 0 -> start_year`. Read the full validation error path to find which item caused the problem. Pydantic v2 error messages include the exact path and input value, so you can feed the whole error string back into a retry prompt and the LLM usually fixes it on the next attempt.
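That feedback loop can be sketched as follows (`validate_with_repair` and the `ask` callable are illustrative; `ask` stands in for sending one more message on your existing conversation and returning the reply text):

```python
import json
from pydantic import BaseModel, ValidationError

def validate_with_repair(ask, text: str, model_class, max_rounds: int = 2):
    """Validate LLM JSON; on failure, send the full Pydantic error back.

    The error string already contains the exact path (e.g.
    education -> 0 -> start_year) and the offending input value,
    which is usually enough for the model to correct itself.
    """
    for _ in range(max_rounds):
        try:
            return model_class.model_validate(json.loads(text))
        except ValidationError as e:
            text = ask(f"Your JSON failed validation:\n{e}\nReturn corrected JSON only.")
    # Final attempt: let the ValidationError propagate if it still fails
    return model_class.model_validate(json.loads(text))
```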