Going Deeper with Pydantic: Nested Models and Data Structures

In Post 1, we explored the basics of Pydantic: creating models, enforcing type validation, and ensuring data integrity with minimal boilerplate. But real-world applications often involve more complex, structured data—like API payloads, configuration files, or nested JSON. How do we handle a blog post with comments, an order with multiple items, or a user profile with nested addresses? This post dives into Pydantic’s powerful support for nested models and smart data structures, showing how to model, validate, and access complex data with ease.
We’ll cover practical examples, including a blog system with authors and comments, and touch on use cases like user profiles or e-commerce orders. Let’s get started!
Nested BaseModels
Pydantic allows you to define models within models, enabling clean, hierarchical data structures. Let’s model a blog system with an Author, Comment, and Blog model.
from datetime import datetime

# EmailStr requires the optional email-validator package
from pydantic import BaseModel, EmailStr

class Author(BaseModel):
    name: str
    email: EmailStr  # rejects strings that are not valid email addresses

class Comment(BaseModel):
    content: str
    author: Author
    created_at: datetime

class Blog(BaseModel):
    title: str
    content: str
    author: Author
    comments: list[Comment] = []

# Example usage
blog_data = {
    "title": "Nested Models in Pydantic",
    "content": "This is a blog post about Pydantic...",
    "author": {"name": "Jane Doe", "email": "jane@example.com"},
    "comments": [
        {
            "content": "Great post!",
            "author": {"name": "John Smith", "email": "john@example.com"},
            "created_at": "2025-05-04T10:00:00"
        }
    ]
}
blog = Blog(**blog_data)
print(blog.author.name)  # Jane Doe
print(blog.comments[0].author.email)  # john@example.com
Here, Comment and Blog embed the Author model, and Pydantic automatically validates the nested data. If author.email is invalid (for example, missing or of the wrong type), Pydantic raises a ValidationError and no Blog instance is created. This cascading validation checks every layer of your data against its declared types.
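To make the cascading behaviour concrete, here is a minimal sketch that feeds a Comment a nested author with no email and collects where the error was reported. The variable names bad_comment and error_locations are illustrative, and email is kept as a plain str so the sketch needs no extra package.

```python
from datetime import datetime

from pydantic import BaseModel, ValidationError


class Author(BaseModel):
    name: str
    email: str


class Comment(BaseModel):
    content: str
    author: Author
    created_at: datetime


# The nested author is missing its required "email" field.
bad_comment = {
    "content": "Great post!",
    "author": {"name": "John Smith"},
    "created_at": "2025-05-04T10:00:00",
}

try:
    Comment(**bad_comment)
    error_locations = []
except ValidationError as exc:
    # Each error carries a "loc" tuple that walks down to the failing field.
    error_locations = [err["loc"] for err in exc.errors()]

print(error_locations)  # [('author', 'email')]
```

The error is raised by the outer Comment constructor, yet its location points into the nested Author, which is what makes nested validation practical to debug.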
Lists, Tuples, and Sets of Models
Nested models often involve collections, like a list of comments on a blog. Pydantic supports List[T], Tuple[T, ...], and Set[T] (or the built-in list[T], tuple[T, ...], and set[T] on Python 3.9+) for collections of models or other types.
Using our Blog model, notice the comments: list[Comment] = [] field. Pydantic validates each Comment in the list:
invalid_comment_data = {
    "title": "Invalid Comment Example",
    "content": "This blog has a bad comment...",
    "author": {"name": "Jane Doe", "email": "jane@example.com"},
    "comments": [
        {
            "content": "This is fine",
            "author": {"name": "John Smith", "email": "john@example.com"},
            "created_at": "2025-05-04T10:00:00"
        },
        {
            "content": "This is bad",
            "author": {"name": "Bad Author", "email": "not-an-email"},  # Invalid email
            "created_at": "2025-05-04T10:01:00"
        }
    ]
}
try:
    blog = Blog(**invalid_comment_data)
except ValueError as e:
    print(e)
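Pydantic will raise a ValidationError pinpointing the invalid email in the second comment, provided Author.email is annotated with EmailStr (imported from pydantic and backed by the email-validator package); a field annotated as plain str accepts any string, including "not-an-email". You can also use Tuple[Comment, ...] for immutable sequences or Set[Comment] for unique items, and validation works the same way. Note that set members must be hashable, so a model stored in a set has to be declared frozen.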
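The following sketch shows tuple and set fields in action. It assumes Pydantic V2 (ConfigDict and frozen=True), and the contributors and revisions fields are hypothetical additions to the blog example rather than part of the models above.

```python
from typing import Set, Tuple

from pydantic import BaseModel, ConfigDict


class Author(BaseModel):
    # frozen=True makes instances immutable and hashable; set members must be hashable.
    model_config = ConfigDict(frozen=True)

    name: str
    email: str


class Blog(BaseModel):
    title: str
    contributors: Set[Author] = set()  # hypothetical field
    revisions: Tuple[str, ...] = ()    # hypothetical field


blog = Blog(
    title="Collections",
    contributors=[
        {"name": "Jane Doe", "email": "jane@example.com"},
        {"name": "Jane Doe", "email": "jane@example.com"},  # duplicate entry
        {"name": "John Smith", "email": "john@example.com"},
    ],
    revisions=["draft", "final"],
)

print(len(blog.contributors))  # 2
print(blog.revisions)          # ('draft', 'final')
```

Each dict is validated into an Author before it goes into the set, so the duplicate entry collapses into one member, and the list passed for revisions is converted to a tuple.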
Optional Fields and Defaults
Real-world data often includes optional fields or defaults. Pydantic supports Optional[T] from typing and allows default values.
from typing import Optional

class Author(BaseModel):
    name: str
    email: Optional[str] = None  # Email is optional
    bio: str = "No bio provided"  # Default value

class Blog(BaseModel):
    title: str
    content: str
    author: Author

# Example with missing email
blog_data = {
    "title": "Optional Fields",
    "content": "This blog has an author with no email.",
    "author": {"name": "Jane Doe"}
}
blog = Blog(**blog_data)
print(blog.author.email)  # None
print(blog.author.bio)  # No bio provided
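Optional[str] means the value may be either a string or None; it is the = None default that lets callers omit the field. Writing email: str = None also supplies that default, but the annotation no longer says None is allowed (Pydantic V1 silently treated it as Optional[str]; V2 does not), so the explicit Optional[str] = None form is the safer choice. Pydantic also tracks which fields were actually present in the input, so a missing field can be told apart from one explicitly set to None.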
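As a small illustration of the missing-versus-None distinction, the sketch below compares two Author instances. It assumes Pydantic V2, where the set of supplied fields is exposed as model_fields_set (V1 exposes the same information as __fields_set__); the variable names omitted and explicit are illustrative.

```python
from typing import Optional

from pydantic import BaseModel


class Author(BaseModel):
    name: str
    email: Optional[str] = None


omitted = Author(name="Jane Doe")               # email not supplied at all
explicit = Author(name="Jane Doe", email=None)  # email supplied as None

# Both instances hold the same value...
print(omitted.email is None, explicit.email is None)  # True True

# ...but only one of them records email as having been provided.
print("email" in omitted.model_fields_set)   # False
print("email" in explicit.model_fields_set)  # True
```

This is the same bookkeeping that exclude_unset relies on when serializing.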
Dict and Map-Like Structures
Pydantic supports Dict[K, V] for key-value structures, perfect for feature flags, localized content, or other mappings.
from typing import Dict

class Blog(BaseModel):
    title: str
    content: str
    translations: Dict[str, str]  # Language code -> translated title

blog_data = {
    "title": "Pydantic Power",
    "content": "This is a blog post...",
    "translations": {
        "es": "El poder de Pydantic",
        "fr": "La puissance de Pydantic"
    }
}
blog = Blog(**blog_data)
print(blog.translations["es"])  # El poder de Pydantic
You can also nest models in dictionaries, like Dict[str, Author], for more complex mappings. Pydantic validates both keys and values according to their types.
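A Dict[str, Author] mapping might be sketched as follows. The Team model and its members field are hypothetical names introduced only for this illustration.

```python
from typing import Dict

from pydantic import BaseModel, ValidationError


class Author(BaseModel):
    name: str
    email: str


class Team(BaseModel):
    # Hypothetical model: role name -> Author
    members: Dict[str, Author]


team = Team(members={"editor": {"name": "Jane Doe", "email": "jane@example.com"}})
print(team.members["editor"].name)  # Jane Doe

try:
    # The value under "editor" is missing its required email.
    Team(members={"editor": {"name": "Jane Doe"}})
    bad_loc = None
except ValidationError as exc:
    bad_loc = exc.errors()[0]["loc"]

print(bad_loc)  # ('members', 'editor', 'email')
```

The dictionary key shows up in the error location, so a bad entry in a large mapping can be traced to the exact key that caused it.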
Accessing Nested Data Safely
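Once validated, a Pydantic model exposes its data as typed attributes. Every declared field exists on the instance, so blog.author.name or blog.comments[0].content will not raise the KeyError or AttributeError you risk when walking raw dictionaries. (Indexing an empty comments list still raises IndexError, and optional fields may be None.)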
For serialization, use .dict() in Pydantic V1 or .model_dump() in Pydantic V2 (where .dict() still works but is deprecated), with options like exclude_unset, include, or exclude:
# Uses the Author and Blog models from "Optional Fields and Defaults"
blog = Blog(
    title="Test",
    content="Content",
    author=Author(name="Jane")
)

# Serialize only specific fields; selecting nested fields requires a dict, not a set
print(blog.dict(include={"title": True, "author": {"name"}}))
# Output: {'title': 'Test', 'author': {'name': 'Jane'}}

# Exclude unset fields
print(blog.dict(exclude_unset=True))
# Only includes fields explicitly set, skips defaults like author.bio
This makes it easy to control what data is serialized for APIs or storage.
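For readers on Pydantic V2, the same serialization options can be sketched with model_dump. The models repeat the ones from the optional-fields section so the block runs on its own; the variable names selected and only_set are illustrative.

```python
from typing import Optional

from pydantic import BaseModel


class Author(BaseModel):
    name: str
    email: Optional[str] = None
    bio: str = "No bio provided"


class Blog(BaseModel):
    title: str
    content: str
    author: Author


blog = Blog(title="Test", content="Content", author=Author(name="Jane"))

# Pick the top-level title plus only the name of the nested author.
selected = blog.model_dump(include={"title": True, "author": {"name"}})
print(selected)  # {'title': 'Test', 'author': {'name': 'Jane'}}

# Drop every field that was never supplied, at every nesting level.
only_set = blog.model_dump(exclude_unset=True)
print(only_set)  # {'title': 'Test', 'content': 'Content', 'author': {'name': 'Jane'}}
```

Note that exclude_unset applies to the nested Author as well, which is why email and bio are absent from the second result.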
Validation and Error Reporting in Nested Structures
Pydantic’s error reporting is precise, even for nested data. Let’s revisit the invalid comment example:
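from pydantic import ValidationError

# Uses the Author, Comment, and Blog models from the first section
try:
    blog = Blog(**invalid_comment_data)
except ValidationError as e:
    print(e.errors())  # a list of dicts, one per failed field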
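With email declared as EmailStr, the output might look like this in Pydantic V1 (V2 reports the same loc, but with different msg and type strings and extra keys such as input):
[
    {
        'loc': ('comments', 1, 'author', 'email'),
        'msg': 'value is not a valid email address',
        'type': 'value_error.email'
    }
]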
The loc field shows the exact path to the error (comments[1].author.email), making it easy to debug complex structures. This granularity is invaluable for APIs or user-facing validation.
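If you want to show that path to users in the comments[1].author.email form, a small helper can convert the loc tuple. The function below is a hypothetical utility written for this post, not part of Pydantic, and it only assumes that loc contains strings and integers.

```python
from typing import Sequence, Union


def format_loc(loc: Sequence[Union[str, int]]) -> str:
    """Turn an error location tuple into a dotted/indexed path string."""
    path = ""
    for part in loc:
        if isinstance(part, int):
            path += f"[{part}]"   # list index
        elif path:
            path += f".{part}"    # nested field
        else:
            path = str(part)      # first field, no leading dot
    return path


print(format_loc(("comments", 1, "author", "email")))  # comments[1].author.email
```

Applied to each entry of e.errors(), this gives one readable path per failed field.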
Recap and Takeaways
Nested models in Pydantic make it easy to handle complex, structured data with robust validation. Key techniques:
- Use BaseModel for nested structures like Author in Blog.
- Leverage List[T], Dict[K, V], and Optional[T] for flexible data shapes.
- Access nested data safely with dot notation or serialize with .dict().
- Rely on Pydantic’s detailed error reporting for debugging.
These tools are perfect for APIs, configuration files, or any scenario with hierarchical data.