Django under the hood: How does Form validation work?

In this article we will try to understand what happens inside Django 5.2 when you try to validate a Django Form.
Warning: we are going deep into Django's internals. You'd better have the Django 5.2 source code at hand. At the end I'll provide a summary for the big picture.
Let's say we have this contact form:
from django import forms

class ContactForm(forms.Form):
    subject = forms.CharField(max_length=100)
    message = forms.CharField(widget=forms.Textarea)
    sender = forms.EmailField()
    cc_myself = forms.BooleanField(required=False)
Part of Django's magic is validating a form's raw input data.
The Django form validation process is a multi-step process that ensures the submitted data is clean and meets all defined validation rules. Here's a detailed, step-by-step breakdown:
Form Initialization
When a form instance is created with data, Django begins preparing it for validation:
form = ContactForm(request.POST) # Pass submitted data into the form
request.POST contains the submitted data, which is stored in form.data for validation.
request.POST is a QueryDict holding the POST data:
<QueryDict: {
    'csrfmiddlewaretoken': ['L8vOV2Dj4qJZD...'],
    'subject': ['This is the subject'],
    'message': ['This is the message'],
    'sender': ['email@email.com'],
    'cc_myself': ['False']
}>
Where does the form.data attribute come from?
It's implemented in the BaseForm class:
# django/forms/forms.py
class BaseForm(RenderableFormMixin):
    """
    The main implementation of all the Form logic...
    """

    # Some lines are not shown for simplification purposes.
    def __init__(
        self,
        data=None,
        files=None,
        empty_permitted=False,
    ):
        self.is_bound = data is not None or files is not None
        self.data = MultiValueDict() if data is None else data
        self._errors = None  # Stores the errors after clean() has been called.
        self.fields = copy.deepcopy(self.base_fields)
When you do:
form = ContactForm(request.POST) # populating self.data
It's the same as:
form = ContactForm(data=request.POST) # populating self.data
That's how form.data is populated:
self.data = MultiValueDict() if data is None else data
MultiValueDict and QueryDict (a MultiValueDict subclass) are specialized dictionaries in Django used for handling request data, such as form submissions (e.g., request.POST) and query parameters (e.g., request.GET). Unlike regular dictionaries that map each key to a single value, these structures allow multiple values to be associated with a single key. Plain indexing (data["key"]) returns only the last value, while getlist() returns all of them, or an empty list if the key is missing. That is especially useful for form fields with multiple selections (e.g., checkboxes or multi-select dropdowns).
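To make the multi-value behaviour concrete, here is a minimal sketch in plain Python. TinyMultiValueDict is a hypothetical stand-in written for this article, not Django's implementation. It only mimics two behaviours of the real class as I understand them: indexing returns the last value stored for a key, and getlist() returns every value (or an empty list for a missing key).

```python
class TinyMultiValueDict(dict):
    """Toy stand-in (hypothetical): every key maps to a list of values."""

    def __getitem__(self, key):
        # Plain indexing returns the *last* value stored for the key.
        values = super().__getitem__(key)  # raises KeyError if the key is missing
        return values[-1] if values else []

    def getlist(self, key):
        # Returns every value, or an empty list when the key is missing.
        return list(super().get(key, []))


data = TinyMultiValueDict({"subject": ["Hello"], "topics": ["django", "forms"]})
last_topic = data["topics"]
all_topics = data.getlist("topics")
missing = data.getlist("attachments")
print(last_topic, all_topics, missing)
```

This is why a field backed by a multi-select widget needs getlist(): indexing alone would silently drop all but one of the submitted values.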
So self.data holds either the request data (here, the QueryDict from request.POST) or an empty MultiValueDict if no data is passed in.
But __init__() also does two other important things.
A) Is the form bound?
If the form has data (or files), is_bound is True. You'll need this later in the validation process.
self.is_bound = data is not None or files is not None
B) How to get the fields of the form?
Here is the key line:
self.fields = copy.deepcopy(self.base_fields)
base_fields is populated thanks to a metaclass. Django implements a metaclass that collects all Field instances declared in a class.
It's a little bit complicated, but let's go step by step.
- You have the metaclass in django/forms/forms.py:
# django/forms/forms.py
class DeclarativeFieldsMetaclass(MediaDefiningClass):
    """Collect Fields declared on the base classes."""

    # Some lines are not shown for simplification purposes.
    def __new__(mcs, name, bases, attrs):
        # Collect fields from current class and remove them from attrs.
        attrs["declared_fields"] = {
            key: attrs.pop(key)
            for key, value in list(attrs.items())
            if isinstance(value, Field)
        }
        new_class = super().__new__(mcs, name, bases, attrs)
        # Not shown: declared_fields is built here by walking the MRO and
        # merging the declared_fields of every base class.
        new_class.base_fields = declared_fields
        return new_class
- Every time a new Form subclass is defined (not instantiated), the metaclass runs, because the Form class uses it as its metaclass.
# django/forms/forms.py
class Form(BaseForm, metaclass=DeclarativeFieldsMetaclass):
    "A collection of Fields, plus their associated data."
    # This is a separate class from BaseForm in order to abstract the way
    # self.fields is specified.
So when the metaclass runs, you get:
new_class.base_fields = declared_fields
Which translates later to:
self.fields = copy.deepcopy(self.base_fields)
What is inside fields?
A dictionary containing the form's field names as keys and field objects as values:
{
    'subject': <django.forms.fields.CharField object at 0x007...>,
    'message': <django.forms.fields.CharField object at 0x005...>,
    'sender': <django.forms.fields.EmailField object at 0x002...>,
    'cc_myself': <django.forms.fields.BooleanField object at 0x007...>
}
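If the metaclass part feels abstract, the following toy version may help. It is a sketch in plain Python, not Django's code: Field, CollectFields and TinyForm are hypothetical names, and the sketch ignores inheritance of fields from base classes. It shows two things: base_fields exists as soon as the class is defined, and each instance gets its own deep copy in fields.

```python
import copy


class Field:
    """Hypothetical minimal field; only here so isinstance() has a target."""

    def __init__(self, required=True):
        self.required = required


class CollectFields(type):
    def __new__(mcs, name, bases, attrs):
        # Pull Field instances out of the class body into a dictionary.
        declared = {
            key: attrs.pop(key)
            for key, value in list(attrs.items())
            if isinstance(value, Field)
        }
        new_class = super().__new__(mcs, name, bases, attrs)
        new_class.base_fields = declared
        return new_class


class TinyForm(metaclass=CollectFields):
    def __init__(self):
        # Each instance works on its own copy of the class-level fields.
        self.fields = copy.deepcopy(self.base_fields)


class ContactForm(TinyForm):
    subject = Field()
    cc_myself = Field(required=False)


# base_fields is available before any instance has been created.
field_names = list(ContactForm.base_fields)

form = ContactForm()
form.fields["subject"].required = False  # changes this instance only
```

The deep copy is the reason you can tweak self.fields inside one form instance without affecting other instances of the same form class.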
So in summary, after initialization, you get:
- is_bound, which is True if there is data or files
- data, with the raw input data from the form
- fields, with the fields of the form
We'll see _errors later on.
Validation Process
The validation process starts when you call is_valid() on the form. You typically instantiate the form within your view:
# views.py
def form_view(request):
    if request.method == "POST":
        form = ContactForm(request.POST)  # Instantiate the form with the submitted data
        if form.is_valid():  # the validation process starts
            subject = form.cleaned_data["subject"]
            ...
            return render(request, "form.html", context)
    else:
        ...
    return render(request, "form.html", context)
There are two things to validate:
- Each field
- The form itself
As stated before, it all starts with is_valid():
# django/forms/forms.py
def is_valid(self):
    """Return True if the form has no errors, or False otherwise."""
    return self.is_bound and not self.errors
It checks that the form is bound (data was submitted) and that it has no validation errors.
Both conditions must hold: is_bound must be True and errors must be empty. Only then can you continue working with the form.
Is there something familiar? Yes!!!
We have already seen is_bound. If there is data or files, it is True.
Now we have to check whether there are any validation errors by accessing self.errors:
# django/forms/forms.py
@property
def errors(self):
    """Return an ErrorDict for the data provided for the form."""
    if self._errors is None:
        self.full_clean()
    return self._errors
It ensures the form is fully validated before returning the collected errors. If self._errors is still None (validation has not run yet), full_clean() is called first; otherwise the property simply returns self._errors, which is populated by add_error() whenever a ValidationError is raised.
add_error() adds validation errors to self._errors, associating them with a specific field or storing them as non-field errors.
# django/forms/forms.py
def add_error(self, field, error):
    """
    Update the content of `self._errors`.
    """
    # Some lines are not shown for simplification purposes.
    for field, error_list in error.items():
        if field not in self.errors:
            if field != NON_FIELD_ERRORS and field not in self.fields:
                raise ValueError(...)
            if field == NON_FIELD_ERRORS:
                self._errors[field] = self.error_class(...)
            else:
                self._errors[field] = self.error_class()
        self._errors[field].extend(error_list)
        if field in self.cleaned_data:
            del self.cleaned_data[field]
These are the specific lines where:
- The error is added to _errors.
- The field with an error is deleted from cleaned_data.
self._errors[field].extend(error_list)
if field in self.cleaned_data:
    del self.cleaned_data[field]
Going back to self.errors: if _errors is still None, full_clean() is called.
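Here is a small sketch of that lazy behaviour in plain Python. LazyForm is a hypothetical class invented for illustration, with a single made-up "subject is required" rule and a counter that does not exist in Django. It shows that validation runs once, on first access to errors, and that an unbound form is never valid even though its errors are empty.

```python
class LazyForm:
    """Toy model (hypothetical) of is_valid() and the lazy errors property."""

    def __init__(self, data=None):
        self.is_bound = data is not None
        self.data = {} if data is None else data
        self._errors = None  # None means "validation has not run yet"
        self.full_clean_calls = 0  # counter added only for this demo

    @property
    def errors(self):
        if self._errors is None:
            self.full_clean()
        return self._errors

    def full_clean(self):
        self.full_clean_calls += 1
        self._errors = {}
        if not self.is_bound:  # nothing to validate
            return
        if not self.data.get("subject"):
            self._errors["subject"] = ["This field is required."]

    def is_valid(self):
        return self.is_bound and not self.errors


bound = LazyForm({"subject": ""})
first = bound.is_valid()
second = bound.is_valid()  # reuses the cached _errors
unbound = LazyForm()
unbound_valid = unbound.is_valid()
```

Calling is_valid() twice on the same form therefore does not validate twice, which is worth remembering if you mutate form.data after the first call.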
Full Clean
Here is where the validation actually starts. It will clean all of data and populate _errors and cleaned_data.
# django/forms/forms.py
def full_clean(self):
    """
    Clean all of self.data and populate self._errors and self.cleaned_data.
    """
    self._errors = ErrorDict(renderer=self.renderer)
    if not self.is_bound:  # Stop further processing.
        return
    self.cleaned_data = {}
    # If the form is permitted to be empty, and none of the form data
    # has changed from the initial data, short circuit any validation.
    if self.empty_permitted and not self.has_changed():
        return
    self._clean_fields()
    self._clean_form()
    self._post_clean()
Inside, three methods do the cleaning: _clean_fields(), _clean_form() and _post_clean().
A) full_clean(self) → _clean_fields()
It processes and validates each form field, storing cleaned data and handling errors.
# django/forms/forms.py
def _clean_fields(self):
    for name, bf in self._bound_items():
        field = bf.field  # This is the actual field (e.g. a CharField instance)
        try:
            self.cleaned_data[name] = field._clean_bound_field(bf)
            if hasattr(self, "clean_%s" % name):
                value = getattr(self, "clean_%s" % name)()
                self.cleaned_data[name] = value
        except ValidationError as e:
            self.add_error(name, e)
- It loops through _bound_items()
This method adds an abstraction through BoundField objects that helps in rendering the field, among other things.
# django/forms/forms.py
def _bound_items(self):
    """Yield (name, bf) pairs, where bf is a BoundField object."""
    for name in self.fields:
        yield name, self[name]
A BoundField links a form field to its data, widget, errors, and metadata for rendering and validation. It is a wrapper around a form field that has been bound to a form, i.e., it is connected to the form's data and ready for rendering. The BoundField is what allows you to access field data, validation, and rendering.
_bound_items() is a generator: it iterates through fields (a dictionary containing the form's field names as keys and field objects as values) and yields the tuple (name, self[name]).
If you iterate through _bound_items(), you get this:
('subject', <django.forms.boundfield.BoundField object at 0x00...>)
('message', <django.forms.boundfield.BoundField object at 0x00...>)
('sender', <django.forms.boundfield.BoundField object at 0x00...>)
('cc_myself', <django.forms.boundfield.BoundField object at 0x00...>)
A BoundField is created dynamically when a form field is accessed for the first time through self[name].
Remember this: self is a Django Form instance, which means self[name] triggers __getitem__() in Django's BaseForm class. Inside BaseForm, self[name] creates a BoundField on demand. If form is a form instance, form[name] gives you a BoundField object, which is rendered like this:
<input type="text" name="subject" value="the subject" maxlength="100" required id="id_subject">
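The "on demand" part can be sketched in a few lines of plain Python. TinyBoundField and TinyForm below are hypothetical stand-ins, not Django classes, and the cache attribute name is only modelled on what I recall from BaseForm, so treat it as an assumption. The point is that indexing the form builds the wrapper the first time and returns the same object afterwards.

```python
class TinyBoundField:
    """Hypothetical wrapper tying a field to its form and its name."""

    def __init__(self, form, field, name):
        self.form = form
        self.field = field
        self.name = name

    @property
    def data(self):
        # The raw submitted value for this field, or None if absent.
        return self.form.data.get(self.name)


class TinyForm:
    def __init__(self, data, fields):
        self.data = data
        self.fields = fields
        self._bound_fields_cache = {}  # assumed name, see the note above

    def __getitem__(self, name):
        # Build the wrapper on first access, then reuse it.
        if name not in self._bound_fields_cache:
            self._bound_fields_cache[name] = TinyBoundField(
                self, self.fields[name], name
            )
        return self._bound_fields_cache[name]

    def _bound_items(self):
        for name in self.fields:
            yield name, self[name]


form = TinyForm({"subject": "Hi"}, {"subject": object(), "sender": object()})
names = [name for name, bf in form._bound_items()]
same_object = form["subject"] is form["subject"]
```

Because the wrapper carries a reference to both the form and the field, it can answer questions neither could answer alone, such as "what did the user submit for this field?".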
Let's go back to _clean_fields(). The field is stored in the field variable: field = bf.field
field is the field of the BoundField object: <django.forms.fields.CharField object at 0x007...> in the case of a CharField.
- The next step is calling _clean_bound_field()
_clean_bound_field() is a field method. It receives bf as an argument, which is a BoundField instance (it's validating a field, not the form).
# django/forms/fields.py
def _clean_bound_field(self, bf):
    value = bf.initial if self.disabled else bf.data
    return self.clean(value)
It determines the value to validate (the initial value if the field is disabled, the submitted data otherwise), stores it in value, and then validates and cleans it using clean(value).
Remember that in this context self is a field instance and not a form instance: _clean_bound_field() calls the clean() method of the field instance and returns its result.
⫸ clean
# django/forms/fields.py
def clean(self, value):
    """
    Validate the given value and return its "cleaned" value as an
    appropriate Python object. Raise ValidationError for any errors.
    """
    value = self.to_python(value)
    self.validate(value)
    self.run_validators(value)
    return value
→ to_python(value)
This is the first step inside clean(), and it is where Django converts the raw value (the field's raw input data) into the appropriate Python data type.
The implementation of this method depends on the field type. For instance, let's review subject = forms.CharField(max_length=100).
# django/forms/fields.py
class CharField(Field):
    # Some lines are not shown for simplification purposes.
    def __init__(
        self, *, max_length=None, min_length=None, empty_value="", **kwargs
    ):
        self.max_length = max_length
        self.min_length = min_length
        self.empty_value = empty_value
        super().__init__(**kwargs)
        if min_length is not None:
            self.validators.append(validators.MinLengthValidator(int(min_length)))
        if max_length is not None:
            self.validators.append(validators.MaxLengthValidator(int(max_length)))

    def to_python(self, value):
        """Return a string."""
        if value not in self.empty_values:
            value = str(value)
        return value
When the CharField is instantiated, it registers its validators; they run later, in run_validators().
For instance, max_length is equal to 100, so a MaxLengthValidator is appended by:
if max_length is not None:
    self.validators.append(validators.MaxLengthValidator(int(max_length)))
And to_python() checks whether the value is in empty_values, which is (None, "", [], (), {}). If it is not, it converts the raw value to a Python string with value = str(value).
The implementation is different between field types.
Just to have a different perspective, this is the implementation of IntegerField:
# django/forms/fields.py
class IntegerField(Field):
    ...

    def to_python(self, value):
        """
        Validate that int() can be called on the input. Return the result
        of int() or None for empty values.
        """
        value = super().to_python(value)
        if value in self.empty_values:
            return None
        if self.localize:
            value = formats.sanitize_separators(value)
        # Strip trailing decimal and zeros.
        try:
            value = int(self.re_decimal.sub("", str(value)))
        except (ValueError, TypeError):
            raise ValidationError(self.error_messages["invalid"], code="invalid")
        return value
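To see what "strip trailing decimal and zeros" means in practice, here is a standalone sketch of the same conversion idea. to_int() is a hypothetical helper, and the regular expression is my assumption of what re_decimal looks like; check the Django source before relying on it. Localization is left out, and the sketch raises a plain ValueError instead of a ValidationError.

```python
import re

EMPTY_VALUES = (None, "", [], (), {})
# Assumed pattern: a decimal point followed only by zeros (and optional
# whitespace) at the very end of the string.
TRAILING_ZEROS = re.compile(r"\.0*\s*$")


def to_int(value):
    """Return an int, None for empty input, or raise ValueError."""
    if value in EMPTY_VALUES:
        return None
    return int(TRAILING_ZEROS.sub("", str(value)))


whole = to_int("42")
with_zeros = to_int("42.0")
empty = to_int("")
```

A value such as "42.5" is not touched by the pattern, so int() rejects it. In the real field that failure is what gets turned into the "invalid" ValidationError.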
→ validate(value)
Checks if value is empty and the field is required; if so, it raises a ValidationError.
# django/forms/fields.py
def validate(self, value):
    if value in self.empty_values and self.required:
        raise ValidationError(self.error_messages["required"], code="required")
→ run_validators(value)
Runs all field validators on the value and raises a ValidationError if any of them fails. If the value is in empty_values, it returns early without running any validator.
# django/forms/fields.py
def run_validators(self, value):
    if value in self.empty_values:
        return
    errors = []
    for v in self.validators:
        try:
            v(value)
        except ValidationError as e:
            if hasattr(e, "code") and e.code in self.error_messages:
                e.message = self.error_messages[e.code]
            errors.extend(e.error_list)
    if errors:
        raise ValidationError(errors)
After clean() we have a clean and validated value in the form of an appropriate Python data type.
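The three steps of clean() can be put together in a small, self-contained model. Everything below is hypothetical (TinyCharField, max_length_100 and the local ValidationError are stand-ins written for this article), and unlike Django this sketch stops at the first failing validator instead of collecting all errors.

```python
class ValidationError(Exception):
    """Local stand-in for django.core.exceptions.ValidationError."""


def max_length_100(value):
    # Hypothetical validator: a callable that raises on bad input.
    if len(value) > 100:
        raise ValidationError("Ensure this value has at most 100 characters.")


class TinyCharField:
    empty_values = (None, "", [], (), {})

    def __init__(self, required=True, validators=()):
        self.required = required
        self.validators = list(validators)

    def to_python(self, value):
        # Step 1: coerce the raw input to the target Python type.
        return value if value in self.empty_values else str(value)

    def validate(self, value):
        # Step 2: field-level checks, here only "required".
        if value in self.empty_values and self.required:
            raise ValidationError("This field is required.")

    def run_validators(self, value):
        # Step 3: reusable validators, skipped for empty values.
        if value in self.empty_values:
            return
        for validator in self.validators:
            validator(value)

    def clean(self, value):
        value = self.to_python(value)
        self.validate(value)
        self.run_validators(value)
        return value


subject = TinyCharField(validators=[max_length_100])
optional = TinyCharField(required=False, validators=[max_length_100])
cleaned = subject.clean(123)
```

Note how an optional field with empty input passes untouched: validate() does not complain because the field is not required, and run_validators() returns early.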
Let's go back to _clean_fields(): the clean value is stored in cleaned_data[name]. cleaned_data is a dictionary, which is why you can access validated data with form.cleaned_data.get("field_name"). If there is an error, it will be added to _errors by calling add_error().
But we still have one extra validation to check inside the if statement.
# django/forms/forms.py
def _clean_fields(self):
    for name, bf in self._bound_items():
        field = bf.field
        try:
            self.cleaned_data[name] = field._clean_bound_field(bf)
            if hasattr(self, "clean_%s" % name):
                value = getattr(self, "clean_%s" % name)()
                self.cleaned_data[name] = value
        except ValidationError as e:
            self.add_error(name, e)
The code inside the if statement dynamically checks if a method named clean_<field_name> exists in the form class and calls it if found. This method adds custom field validation.
For instance, let's say that in this contact form, we can only accept senders whose email address ends with @example.com (don't ask me why I chose this example).
We need to add a form method with a name that matches "clean_%s" % name, which means it should start with clean_ followed by the field's name. In this case, clean_sender:
from django import forms

class ContactForm(forms.Form):
    subject = forms.CharField(max_length=100)
    message = forms.CharField(widget=forms.Textarea)
    sender = forms.EmailField()
    cc_myself = forms.BooleanField(required=False)

    def clean_sender(self):  # method to validate the sender's email domain
        sender = self.cleaned_data.get("sender")
        allowed_domain = "example.com"
        if sender and not sender.endswith(f"@{allowed_domain}"):
            raise forms.ValidationError(
                f"Only '{allowed_domain}' email addresses are allowed."
            )
        return sender
The hasattr(self, "clean_%s" % name) call checks whether the form instance (self) has a method named clean_sender. Remember that, from Python's point of view, a method is an attribute of the class.
If such a method exists, getattr(self, "clean_%s" % name) retrieves it and the trailing parentheses call it. The returned value is then stored in self.cleaned_data[name], ensuring that any transformations or validations performed by the method are reflected in the cleaned data.
This mechanism allows Django forms to incorporate custom validation logic for specific fields.
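The name-based dispatch is ordinary Python and can be tried without Django. In the sketch below, HookForm and its local ValidationError are hypothetical, and the per-field "cleaning" is reduced to str.strip() just to have something to do. It shows that the hook's return value replaces the entry in cleaned_data, and that a failing hook removes the entry and records an error instead.

```python
class ValidationError(Exception):
    """Local stand-in for Django's ValidationError."""


class HookForm:
    field_names = ("subject", "sender")

    def __init__(self, data):
        self.data = data
        self.cleaned_data = {}
        self.errors = {}

    def clean_fields(self):
        for name in self.field_names:
            try:
                # Stand-in for the field's own clean(): just strip whitespace.
                self.cleaned_data[name] = self.data.get(name, "").strip()
                # Look up an optional per-field hook by its name.
                hook = getattr(self, "clean_%s" % name, None)
                if hook is not None:
                    self.cleaned_data[name] = hook()
            except ValidationError as exc:
                self.errors[name] = [str(exc)]
                self.cleaned_data.pop(name, None)

    def clean_sender(self):
        sender = self.cleaned_data["sender"].lower()
        if not sender.endswith("@example.com"):
            raise ValidationError("Only 'example.com' email addresses are allowed.")
        return sender  # the returned value replaces cleaned_data["sender"]


ok = HookForm({"subject": " Hi ", "sender": "Ann@Example.com"})
ok.clean_fields()
bad = HookForm({"subject": "Hi", "sender": "ann@other.org"})
bad.clean_fields()
```

A common mistake with the real hook is forgetting the return statement: the method then returns None, and None is what ends up in cleaned_data.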
With this, we finish the _clean_fields() method, which is responsible for cleaning the fields.
Up to this moment:
- All individual fields have been validated and their values stored in cleaned_data.
- Or, if an error was raised, it was added to _errors.
The next step is to validate the form. That is what happens next in full_clean(), which calls _clean_form().
B) full_clean(self) → _clean_form()
What does it mean to validate the form?
It's when you perform any extra form-wide cleaning after Field.clean() has been called on every field.
full_clean(self) calls _clean_form().
# django/forms/forms.py
def _clean_form(self):
    try:
        cleaned_data = self.clean()
    except ValidationError as e:
        self.add_error(None, e)
    else:
        if cleaned_data is not None:
            self.cleaned_data = cleaned_data
This method is part of Django's internal form validation process. It is responsible for calling the form's clean() method and handling any validation errors that arise.
The line cleaned_data = self.clean() calls the clean() method of the form (which can be overridden by the user; this is very important and we'll see it in a moment).
clean() is meant for form-wide validation, where you validate multiple fields together. If clean() raises a ValidationError, execution jumps to the except block, where the error is added through add_error(None, e), i.e., as a non-field error.
Let's look inside clean():
# django/forms/forms.py
def clean(self):
    # Hook for doing any extra form-wide cleaning after Field.clean()
    # has been called on every field. Any ValidationError raised by
    # this method will not be associated with a particular field; it
    # will have a special-case association with the field named '__all__'
    return self.cleaned_data
It just returns cleaned_data. What you can do is add custom form-wide validation across the fields of the form by overriding clean().
For instance, if the user wants a copy of the email, they must provide a valid sender email.
class ContactForm(forms.Form):
    subject = forms.CharField(max_length=100)
    message = forms.CharField(widget=forms.Textarea)
    sender = forms.EmailField(initial="joe@email.com", required=False)
    cc_myself = forms.BooleanField(required=False)

    def clean(self):
        cleaned_data = super().clean()
        sender = cleaned_data.get("sender")
        cc_myself = cleaned_data.get("cc_myself")
        if cc_myself and not sender:
            raise forms.ValidationError(
                "You must provide a sender email if you want to CC yourself."
            )
        return cleaned_data
By overriding the clean() method you can define and implement your own form-wide custom validation.
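To see where a form-wide error ends up, here is a toy version of the _clean_form() step in plain Python. TinyForm and the local ValidationError are hypothetical, and the sketch assumes the per-field step has already filled cleaned_data. The "__all__" key matches the special field name mentioned in the comment of Django's clean().

```python
NON_FIELD_ERRORS = "__all__"


class ValidationError(Exception):
    """Local stand-in for Django's ValidationError."""


class TinyForm:
    def __init__(self, cleaned_data):
        # Assumption: field-level cleaning has already happened.
        self.cleaned_data = dict(cleaned_data)
        self.errors = {}

    def clean(self):
        # Default hook: nothing to check, hand the data back.
        return self.cleaned_data

    def _clean_form(self):
        try:
            cleaned_data = self.clean()
        except ValidationError as exc:
            # Not tied to a field, so it is filed under "__all__".
            self.errors.setdefault(NON_FIELD_ERRORS, []).append(str(exc))
        else:
            if cleaned_data is not None:
                self.cleaned_data = cleaned_data


class ContactForm(TinyForm):
    def clean(self):
        cleaned_data = super().clean()
        if cleaned_data.get("cc_myself") and not cleaned_data.get("sender"):
            raise ValidationError(
                "You must provide a sender email if you want to CC yourself."
            )
        return cleaned_data


form = ContactForm({"sender": "", "cc_myself": True})
form._clean_form()
valid = ContactForm({"sender": "joe@email.com", "cc_myself": True})
valid._clean_form()
```

Because the error is not attached to sender or cc_myself, a template that only prints per-field errors would not show it; non-field errors have to be rendered separately.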
C) full_clean(self) → _post_clean()
And the last step in full_clean() is to call _post_clean().
# django/forms/forms.py
def _post_clean(self):
    # An internal hook for performing additional cleaning after form
    # cleaning is complete. Used for model validation in model forms.
    pass
In a plain Form, _post_clean() is just a placeholder method (it does nothing).
However, in ModelForm, _post_clean() is overridden to handle additional model-specific validation, such as:
- Running model field validators
- Checking database constraints (e.g., unique fields)
The _post_clean() method in BaseModelForm is responsible for performing model-level validation after the form's fields have been cleaned. This is particularly relevant in ModelForm, where the form is directly linked to a Django model. It ensures that a ModelForm handles both form validation and model validation correctly.
But we're not going to dive into that now.
As a general summary, the Django validation path is something like this:
├── is_valid()
│ ├── self.is_bound
│ ├── self.errors
│ │ ├── full_clean()
│ │ │ ├── _clean_fields()
│ │ │ │ ├── _bound_items()
│ │ │ │ ├── _clean_bound_field()
│ │ │ │ │ ├── clean()
│ │ │ │ │ │ ├── to_python()
│ │ │ │ │ │ ├── validate()
│ │ │ │ │ │ ├── run_validators()
│ │ │ │ ├── clean_<field_name>
│ │ │ ├── _clean_form()
│ │ │ │ ├── clean()
│ │ │ ├── _post_clean()
Any comments, suggestions or questions?