Django under the hood: How Form validation works?


Mar 28, 2025 - 23:15

In this article we will try to understand what happens inside Django 5.2 when you try to validate a Django Form.

Warning: we are going deep into Django's internals. You'd better have the Django 5.2 source code at hand. At the end I'll provide a summary of the big picture.

Let's say we have this contact form:

from django import forms

class ContactForm(forms.Form):
    subject = forms.CharField(max_length=100)
    message = forms.CharField(widget=forms.Textarea)
    sender = forms.EmailField()
    cc_myself = forms.BooleanField(required=False)

Part of Django's magic is validating a form's raw input data.

The Django form validation process is a multi-step process that ensures the submitted data is clean and meets all defined validation rules. Here's a detailed, step-by-step breakdown:

Form Initialization

When a form instance is created with data, Django begins preparing it for validation:

form = ContactForm(request.POST)  # Pass submitted data into the form

request.POST contains the submitted data, which is temporarily stored in form.data for validation.

request.POST is a QueryDict containing the POST data:

<QueryDict: {
'csrfmiddlewaretoken': ['L8vOV2Dj4qJZD...'],
'subject': ['This is the subject'],
'message': ['This is the message'],
'sender': ['email@email.com'], 
'cc_myself': ['False']
}>

Where does the `form.data` attribute come from?

It's implemented in the BaseForm class:

# django/forms/forms.py
class BaseForm(RenderableFormMixin):
    """
    The main implementation of all the Form logic... 
    """

    # Some lines are not shown for simplification purposes.

    def __init__(
        self,
        data=None,
        files=None,
        empty_permitted=False,
        # Other parameters are not shown for simplification purposes.
    ):
        self.is_bound = data is not None or files is not None
        self.data = MultiValueDict() if data is None else data
        self._errors = None  # Stores the errors after clean() has been called.
        self.fields = copy.deepcopy(self.base_fields)

When you do:

form = ContactForm(request.POST)  # populating self.data

It's the same as:

form = ContactForm(data=request.POST)  # populating self.data

That's how form.data is populated:

self.data = MultiValueDict() if data is None else data

MultiValueDict and QueryDict are specialized dictionaries in Django used for handling request data, such as form submissions (e.g., request.POST) and query parameters (e.g., request.GET). QueryDict is a subclass of MultiValueDict. Unlike regular dictionaries that map each key to a single value, these structures allow multiple values to be associated with a single key. They provide convenient methods like getlist() to retrieve all the values for a key, which is especially useful for form fields with multiple selections (e.g., checkboxes or multi-select dropdowns). getlist() returns an empty list for a missing key instead of raising an error.

So self.data holds either the data you passed in (here, the QueryDict from request.POST) or an empty MultiValueDict if no data was passed in.
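
To make the multi-value behaviour concrete, here is a minimal sketch. TinyMultiValueDict is a hypothetical, stripped-down stand-in written for this article, not Django's implementation. It only mimics two behaviours of the real class: plain indexing returns the last value stored for a key, and getlist() returns every value (or an empty list for a missing key).

```python
class TinyMultiValueDict(dict):
    """Hypothetical stand-in: every key maps to a list of values."""

    def __getitem__(self, key):
        # Plain indexing returns the last value submitted for the key.
        values = super().__getitem__(key)
        return values[-1] if values else []

    def getlist(self, key):
        # All values for the key; an empty list if the key is missing.
        return list(dict.get(self, key, []))


data = TinyMultiValueDict({"subject": ["Hello"], "tags": ["django", "forms"]})
print(data["tags"])             # last value only
print(data.getlist("tags"))     # every value
print(data.getlist("missing"))  # empty list, no KeyError
```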

But __init__() also does two other very important things.

A) Is the form bound?

If the form was given data (or files), is_bound is True. You'll need this later in the validation process.

self.is_bound = data is not None or files is not None
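
One consequence of testing with "is not None" is easy to miss: an empty dictionary still binds the form. The sketch below isolates that expression in a hypothetical helper (is_bound_sketch is a name made up for this article, not a Django function).

```python
def is_bound_sketch(data=None, files=None):
    """Hypothetical helper mirroring the is_bound expression in __init__()."""
    return data is not None or files is not None


print(is_bound_sketch())                     # no data, no files: unbound
print(is_bound_sketch(data={}))              # an empty dict is still data: bound
print(is_bound_sketch(files={"doc": b""}))   # files alone also bind the form
```

So a form created with an empty dictionary is bound and will be validated (and will most likely report "required" errors), while a form created with no arguments is not.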

B) How to get the fields of the form?

Here is the key line:

self.fields = copy.deepcopy(self.base_fields)

base_fields is populated thanks to a metaclass. Django implements a metaclass that collects all Field instances declared in a class.

It's a little bit complicated, but let's go step by step.

You have the metaclass in django/forms/forms.py:

# django/forms/forms.py
class DeclarativeFieldsMetaclass(MediaDefiningClass):
    """Collect Fields declared on the base classes."""

    # Some lines are not shown for simplification purposes.

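    def __new__(mcs, name, bases, attrs):
        # Collect fields from current class and remove them from attrs.
        attrs["declared_fields"] = {
            key: attrs.pop(key)
            for key, value in list(attrs.items())
            if isinstance(value, Field)
        }

        new_class = super().__new__(mcs, name, bases, attrs)

        # Walk through the MRO and merge the fields declared on every base class.
        declared_fields = {}
        for base in reversed(new_class.__mro__):
            if hasattr(base, "declared_fields"):
                declared_fields.update(base.declared_fields)

        new_class.base_fields = declared_fields

        return new_class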

The metaclass runs once for each Form subclass, at the moment the class is defined (not every time a form is instantiated), because the Form class declares it as its metaclass.

# django/forms/forms.py
class Form(BaseForm, metaclass=DeclarativeFieldsMetaclass):
    "A collection of Fields, plus their associated data."
    # This is a separate class from BaseForm in order to abstract the way
    # self.fields is specified.

So when the metaclass runs, you get:

new_class.base_fields = declared_fields

Later, in __init__(), every form instance gets its own copy:

self.fields = copy.deepcopy(self.base_fields)

What is inside fields?

A dictionary containing the form's field names as keys and field objects as values:

{
    'subject': <django.forms.fields.CharField object at 0x007...>,
    'message': <django.forms.fields.CharField object at 0x005...>,
    'sender': <django.forms.fields.EmailField object at 0x002...>,
    'cc_myself': <django.forms.fields.BooleanField object at 0x007...>
}
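
The whole mechanism (collect the fields when the class is defined, copy them for each instance) can be reproduced in a few lines of plain Python. This is only a sketch under simplifying assumptions: Field, FieldCollector, TinyForm and ContactSketch are hypothetical names, and the real metaclass handles more cases (such as field shadowing).

```python
import copy


class Field:
    """Hypothetical minimal field; it only stores a label."""

    def __init__(self, label=""):
        self.label = label


class FieldCollector(type):
    def __new__(mcs, name, bases, attrs):
        # Pull Field instances out of the class body.
        attrs["declared_fields"] = {
            key: attrs.pop(key)
            for key, value in list(attrs.items())
            if isinstance(value, Field)
        }
        new_class = super().__new__(mcs, name, bases, attrs)
        # Merge the fields of parent classes first, then the class's own.
        merged = {}
        for base in reversed(new_class.__mro__):
            merged.update(getattr(base, "declared_fields", {}))
        new_class.base_fields = merged
        return new_class


class TinyForm(metaclass=FieldCollector):
    def __init__(self):
        # Each instance works on its own copy of the class-level fields.
        self.fields = copy.deepcopy(self.base_fields)


class ContactSketch(TinyForm):
    subject = Field("Subject")
    sender = Field("Sender")


first, second = ContactSketch(), ContactSketch()
first.fields["subject"].label = "Changed"
print(list(ContactSketch.base_fields))   # field names, in declaration order
print(second.fields["subject"].label)    # untouched by the change on `first`
```

The deep copy is the design choice that matters here: changing a field on one instance does not leak into other instances or into the class.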

So in summary, after initialization, you get:

is_bound, which is True if there is data or a file
data with raw input data from the form
fields with the fields of the form

We'll see _errors later on.

Validation Process

The validation process starts when you call is_valid() on the form. You typically instantiate the form within your view:

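# views.py
from django.shortcuts import render

from .forms import ContactForm  # assuming the form lives in forms.py


def form_view(request):
    if request.method == "POST":
        form = ContactForm(request.POST)  # Instantiate the form with the submitted data
        if form.is_valid():  # The validation process starts here
            subject = form.cleaned_data["subject"]
            ...
            return render(request, "form.html", {"form": form})
        else:
            ...
    else:
        form = ContactForm()  # Unbound form for any other request method
    return render(request, "form.html", {"form": form})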

There are two things to validate:

Each field
The form itself

As stated before, it all starts with is_valid():

# django/forms/forms.py
def is_valid(self):
    """Return True if the form has no errors, or False otherwise."""
    return self.is_bound and not self.errors

It checks if the form has been submitted and has no validation errors.

Both conditions must be True for you to continue working with the form.

Is there something familiar? Yes!!!
We've already seen is_bound. If there is data or a file, it will be True.

Now we have to check whether there are any validation errors by accessing self.errors:

# django/forms/forms.py
@property
def errors(self):
    """Return an ErrorDict for the data provided for the form."""
    if self._errors is None:
        self.full_clean()
    return self._errors

It ensures the form is fully validated before returning the collected errors. If self._errors is still None (validation has not run yet), full_clean() is called; otherwise the property returns the existing self._errors, which is populated by add_error() whenever a validation error is raised.
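
This "validate on first access, then reuse" behaviour can be sketched in plain Python. LazyForm is a hypothetical toy class, not Django code: its full_clean() applies a single made-up rule, and the full_clean_calls counter exists only to show how often validation runs.

```python
class LazyForm:
    def __init__(self, data=None):
        self.is_bound = data is not None
        self.data = {} if data is None else data
        self._errors = None          # None means "not validated yet"
        self.full_clean_calls = 0    # instrumentation for this sketch only

    @property
    def errors(self):
        if self._errors is None:
            self.full_clean()
        return self._errors

    def full_clean(self):
        self.full_clean_calls += 1
        self._errors = {}
        if not self.is_bound:
            return
        # Made-up rule standing in for the real cleaning steps.
        if not self.data.get("subject"):
            self._errors["subject"] = ["This field is required."]

    def is_valid(self):
        return self.is_bound and not self.errors


form = LazyForm({"subject": ""})
print(form.is_valid(), form.errors, form.full_clean_calls)

unbound = LazyForm()
print(unbound.is_valid(), unbound.full_clean_calls)
```

Note that for an unbound form, is_valid() stops at self.is_bound, so self.errors is never evaluated and validation does not run at all.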

add_error() adds validation errors to self._errors, associating them with a specific field or as non-field errors.

# django/forms/forms.py
def add_error(self, field, error):
    """
    Update the content of `self._errors`.
    """

    # Some lines are not shown for simplification purposes.

    for field, error_list in error.items():
        if field not in self.errors:
            if field != NON_FIELD_ERRORS and field not in self.fields:
                raise ValueError(...)

            if field == NON_FIELD_ERRORS:
                self._errors[field] = self.error_class(...)
            else:
                self._errors[field] = self.error_class()

        self._errors[field].extend(error_list)

        if field in self.cleaned_data:
            del self.cleaned_data[field]

These are the specific lines where:

The error is added to _errors.
The field with an error is deleted from cleaned_data

self._errors[field].extend(error_list)
if field in self.cleaned_data:
    del self.cleaned_data[field]
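
Here is a dictionary-based sketch of those two steps. add_error_sketch is a hypothetical function that works on plain dicts and lists; only the "__all__" key is taken from Django, where it is the value of NON_FIELD_ERRORS.

```python
NON_FIELD_ERRORS = "__all__"  # the key Django uses for non-field errors


def add_error_sketch(errors, cleaned_data, field, messages):
    """Hypothetical, dict-based imitation of the two steps above."""
    key = field if field is not None else NON_FIELD_ERRORS
    # Step 1: record the messages under the field (or the non-field key).
    errors.setdefault(key, []).extend(messages)
    # Step 2: a field that failed validation must not stay in cleaned_data.
    cleaned_data.pop(key, None)


errors = {}
cleaned_data = {"subject": "Hi", "sender": "joe@invalid"}
add_error_sketch(errors, cleaned_data, "sender", ["Enter a valid email address."])
add_error_sketch(errors, cleaned_data, None, ["Something is wrong with the form."])
print(errors)
print(cleaned_data)
```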

Going back to self.errors: if self._errors is still None, full_clean() is called.

Full Clean

This is where the validation actually starts. full_clean() cleans everything in self.data and populates _errors and cleaned_data.

# django/forms/forms.py
def full_clean(self):
    """
    Clean all of self.data and populate self._errors and self.cleaned_data.
    """
    self._errors = ErrorDict(renderer=self.renderer)
    if not self.is_bound:  # Stop further processing.
        return
    self.cleaned_data = {}
    # If the form is permitted to be empty, and none of the form data
    # has changed from the initial data, short circuit any validation.
    if self.empty_permitted and not self.has_changed():
        return

    self._clean_fields()
    self._clean_form()
    self._post_clean()

Inside, three methods do the actual cleaning: _clean_fields(), _clean_form() and _post_clean().

A) full_clean(self) → _clean_fields()

It processes and validates each form field, storing cleaned data and handling errors.

# django/forms/forms.py
def _clean_fields(self):
    for name, bf in self._bound_items():
        field = bf.field  # The actual field (e.g. a CharField instance)
        try:
            self.cleaned_data[name] = field._clean_bound_field(bf)
            if hasattr(self, "clean_%s" % name):
                value = getattr(self, "clean_%s" % name)()
                self.cleaned_data[name] = value
        except ValidationError as e:
            self.add_error(name, e)

It loops through _bound_items()

This method adds an abstraction through BoundField objects that helps in rendering the field, among other things.

# django/forms/forms.py
def _bound_items(self):
    """Yield (name, bf) pairs, where bf is a BoundField object."""
    for name in self.fields:
        yield name, self[name]

A BoundField links a form field to its data, widget, errors, and metadata for rendering and validation. It is a wrapper around a form field that binds the field to the data of one particular form instance, and it is what lets you access the field's data, its errors, and its rendered HTML.

_bound_items() is a generator: it iterates through fields (a dictionary containing the form's field names as keys and field objects as values) and yields the tuple (name, self[name]).

If you iterate through _bound_items(), you get this:

('subject', <django.forms.boundfield.BoundField object at 0x00...>)
('message', <django.forms.boundfield.BoundField object at 0x00...>)
('sender', <django.forms.boundfield.BoundField object at 0x00...>)
('cc_myself', <django.forms.boundfield.BoundField object at 0x00...>)

A BoundField is created on demand, the first time a form field is accessed through self[name].

Remember this:
self is a Django Form instance, which means self[name] triggers __getitem__() in Django's BaseForm class. Inside BaseForm, self[name] creates the BoundField on first access and caches it for later lookups. If form is a form instance, form[name] gives you a BoundField object, which is rendered like this:

<input type="text" name="subject" value="the subject" maxlength="100" required id="id_subject">
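
The on-demand creation can be sketched without Django. BoundFieldSketch and FormSketch are hypothetical classes written for this article; they keep only the parts discussed here: __getitem__() building a wrapper the first time a name is requested, and _bound_items() yielding (name, wrapper) pairs.

```python
class BoundFieldSketch:
    """Hypothetical wrapper tying one field to one form's data."""

    def __init__(self, form, field, name):
        self.form = form
        self.field = field
        self.name = name

    @property
    def data(self):
        # The raw submitted value for this field, if any.
        return self.form.data.get(self.name)


class FormSketch:
    def __init__(self, fields, data):
        self.fields = fields
        self.data = data
        self._bound_fields_cache = {}

    def __getitem__(self, name):
        # Build the wrapper on first access, then reuse it.
        if name not in self._bound_fields_cache:
            self._bound_fields_cache[name] = BoundFieldSketch(
                self, self.fields[name], name
            )
        return self._bound_fields_cache[name]

    def _bound_items(self):
        for name in self.fields:
            yield name, self[name]


form = FormSketch(
    fields={"subject": "a CharField", "sender": "an EmailField"},
    data={"subject": "the subject"},
)
for name, bf in form._bound_items():
    print(name, bf.field, bf.data)
```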

Let's go back to _clean_fields(). The field is stored in the field variable: field = bf.field

field is the Field instance wrapped by the BoundField object: a CharField in the case of subject.

The next step is calling _clean_bound_field()

_clean_bound_field() is a field method. It receives bf as an argument, which is a BoundField instance object (it's validating a field, not the form).

# django/forms/fields.py
def _clean_bound_field(self, bf):
    value = bf.initial if self.disabled else bf.data
    return self.clean(value)

It determines the value to validate (the initial value if the field is disabled, the submitted data otherwise), stores it in value, and then validates and cleans it using clean(value).

Remember that in this context self is a field instance and not a form instance: _clean_bound_field() calls the clean() method of the field instance and returns its result.
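
The practical effect of the disabled check is that submitted data is ignored for disabled fields. The sketch below illustrates this with hypothetical stand-ins: FakeBoundField only carries the two attributes used here, and SketchField.clean() is a placeholder that strips whitespace.

```python
from collections import namedtuple

# Hypothetical stand-in for BoundField: only the two attributes used here.
FakeBoundField = namedtuple("FakeBoundField", ["initial", "data"])


class SketchField:
    def __init__(self, disabled=False):
        self.disabled = disabled

    def clean(self, value):
        # Placeholder cleaning step: just strip whitespace.
        return value.strip()

    def _clean_bound_field(self, bf):
        value = bf.initial if self.disabled else bf.data
        return self.clean(value)


bf = FakeBoundField(initial="joe@example.com", data=" someone@else.test ")
print(SketchField(disabled=False)._clean_bound_field(bf))  # uses the submitted data
print(SketchField(disabled=True)._clean_bound_field(bf))   # uses the initial value
```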

⫸ clean

# django/forms/fields.py
def clean(self, value):
    """
    Validate the given value and return its "cleaned" value as an
    appropriate Python object. Raise ValidationError for any errors.
    """
    value = self.to_python(value)
    self.validate(value)
    self.run_validators(value)
    return value

→ to_python(value)

This is the first step inside clean(), and it is where Django converts the raw value (the field's raw input data, usually a string) into the appropriate Python data type.

The implementation of this method will depend on the data type. For instance, let's review subject = forms.CharField(max_length=100).

# django/forms/fields.py
class CharField(Field):
    def __init__(
        self, *, max_length=None, min_length=None, empty_value="", **kwargs
    ):
        self.max_length = max_length
        self.min_length = min_length
        self.empty_value = empty_value
        super().__init__(**kwargs)  # Field.__init__() sets up self.validators
        if min_length is not None:
            self.validators.append(validators.MinLengthValidator(int(min_length)))
        if max_length is not None:
            self.validators.append(validators.MaxLengthValidator(int(max_length)))

    def to_python(self, value):
        """Return a string."""
        if value not in self.empty_values:
            value = str(value)
        return value

When the CharField is instantiated, it registers validators based on its arguments; nothing is validated yet.
For instance, max_length=100 registers a MaxLengthValidator, which runs later in run_validators():

if max_length is not None:
    self.validators.append(validators.MaxLengthValidator(int(max_length)))

to_python() then checks whether the value is in empty_values, which is (None, "", [], (), {}). If it is not, the raw value is converted to a Python string with value = str(value).

The implementation is different between field types.
Just to have a different perspective, this is the implementation of IntegerField:

# django/forms/fields.py
class IntegerField(Field):
    ...
    def to_python(self, value):
        """
        Validate that int() can be called on the input. Return the result
        of int() or None for empty values.
        """
        value = super().to_python(value)
        if value in self.empty_values:
            return None
        if self.localize:
            value = formats.sanitize_separators(value)
        # Strip trailing decimal and zeros.
        try:
            value = int(self.re_decimal.sub("", str(value)))
        except (ValueError, TypeError):
            raise ValidationError(self.error_messages["invalid"], code="invalid")
        return value

→ validate(value)

Checks if value is empty and the field is required; if so, it raises a ValidationError.

# django/forms/fields.py
def validate(self, value):
    if value in self.empty_values and self.required:
        raise ValidationError(self.error_messages["required"], code="required")

→ run_validators(value)

Runs all of the field's validators on the value and raises a single ValidationError that collects every failure. If the value is in empty_values, it returns early without running any validator.

# django/forms/fields.py
def run_validators(self, value):
    if value in self.empty_values:
        return
    errors = []
    for v in self.validators:
        try:
            v(value)
        except ValidationError as e:
            if hasattr(e, "code") and e.code in self.error_messages:
                e.message = self.error_messages[e.code]
            errors.extend(e.error_list)
    if errors:
        raise ValidationError(errors)

After clean() we have a clean and validated value in the form of an appropriate Python data type.
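
The three steps of clean() can be put together in a small, Django-free sketch. Everything below is hypothetical and simplified: SketchValidationError, SketchIntegerField, max_value and even are names invented for this article, and the error messages are made up. Only the order of the steps and the "collect all validator failures" behaviour follow the code shown above.

```python
EMPTY_VALUES = (None, "", [], (), {})


class SketchValidationError(Exception):
    def __init__(self, messages):
        self.messages = [messages] if isinstance(messages, str) else list(messages)
        super().__init__(self.messages)


def max_value(limit):
    """Return a validator that rejects values above `limit`."""
    def check(value):
        if value > limit:
            raise SketchValidationError(f"Must be at most {limit}.")
    return check


def even(value):
    if value % 2:
        raise SketchValidationError("Must be even.")


class SketchIntegerField:
    def __init__(self, required=True, validators=()):
        self.required = required
        self.validators = list(validators)

    def to_python(self, value):
        # Step 1: convert the raw input into a Python object.
        if value in EMPTY_VALUES:
            return None
        try:
            return int(str(value).strip())
        except ValueError:
            raise SketchValidationError("Enter a whole number.")

    def validate(self, value):
        # Step 2: only the "required" check.
        if value in EMPTY_VALUES and self.required:
            raise SketchValidationError("This field is required.")

    def run_validators(self, value):
        # Step 3: run every validator and collect all failures.
        if value in EMPTY_VALUES:
            return
        errors = []
        for validator in self.validators:
            try:
                validator(value)
            except SketchValidationError as e:
                errors.extend(e.messages)
        if errors:
            raise SketchValidationError(errors)

    def clean(self, value):
        value = self.to_python(value)
        self.validate(value)
        self.run_validators(value)
        return value


field = SketchIntegerField(validators=[max_value(10), even])
print(field.clean(" 8 "))
try:
    field.clean("13")
except SketchValidationError as e:
    print(e.messages)  # both validators failed, both messages are kept
```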

Let's go back to _clean_fields(): the clean value is stored in cleaned_data[name]. cleaned_data is a dictionary, which is why you can access validated data with form.cleaned_data.get("field_name"). If there is an error, it is added to _errors by calling add_error().

But there is still one extra validation step, inside the if statement.

# django/forms/forms.py
def _clean_fields(self):
    for name, bf in self._bound_items():
        field = bf.field
        try:
            self.cleaned_data[name] = field._clean_bound_field(bf)
            if hasattr(self, "clean_%s" % name):
                value = getattr(self, "clean_%s" % name)()
                self.cleaned_data[name] = value
        except ValidationError as e:
            self.add_error(name, e)

The code inside the if statement dynamically checks whether a method named clean_<field_name> exists on the form and calls it if found. This method lets you add custom validation for that field.

For instance, let's say that in this contact form, we can only accept senders whose email domain ends with example.com (don’t ask me why I chose this example).

We need to add a form method whose name matches "clean_%s" % name, which means it should start with clean_ followed by the field's name. In this case, clean_sender:

from django import forms

class ContactForm(forms.Form):
    subject = forms.CharField(max_length=100)
    message = forms.CharField(widget=forms.Textarea)
    sender = forms.EmailField()
    cc_myself = forms.BooleanField(required=False)

    def clean_sender(self):  # Method to validate the sender's email domain
        sender = self.cleaned_data.get("sender")
        allowed_domain = "example.com"

        if sender and not sender.endswith(f"@{allowed_domain}"):
            raise forms.ValidationError(
                f"Only '{allowed_domain}' email addresses are allowed."
            )
        return sender

hasattr(self, "clean_%s" % name) checks whether the form instance (self) has a method named clean_sender. Remember that, from Python's point of view, a method is an attribute of the class.

If such a method exists, getattr(self, "clean_%s" % name) retrieves it and the trailing () calls it. The returned value is then stored in self.cleaned_data[name], ensuring that any transformations or validations performed by the method are reflected in the cleaned data.

This mechanism allows Django forms to incorporate custom validation logic for specific fields.
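
The hasattr()/getattr() dispatch works the same way outside Django. In this sketch HookedForm is a hypothetical class, ValueError stands in for ValidationError, and stripping whitespace stands in for the field's own cleaning. It shows that the hook both validates and transforms the value, and that a failing field ends up in errors and not in cleaned_data.

```python
class HookedForm:
    field_names = ("subject", "sender")

    def __init__(self, data):
        self.data = data
        self.cleaned_data = {}
        self.errors = {}

    def _clean_fields(self):
        for name in self.field_names:
            try:
                # Stand-in for field._clean_bound_field(bf).
                self.cleaned_data[name] = self.data.get(name, "").strip()
                if hasattr(self, "clean_%s" % name):
                    value = getattr(self, "clean_%s" % name)()
                    self.cleaned_data[name] = value
            except ValueError as e:  # stand-in for ValidationError
                self.errors[name] = [str(e)]
                self.cleaned_data.pop(name, None)

    def clean_sender(self):
        # The hook may transform the value (lowercase) as well as reject it.
        sender = self.cleaned_data["sender"].lower()
        if not sender.endswith("@example.com"):
            raise ValueError("Only 'example.com' email addresses are allowed.")
        return sender


good = HookedForm({"subject": " Hi ", "sender": "Joe@Example.com"})
good._clean_fields()
print(good.cleaned_data, good.errors)

bad = HookedForm({"subject": "Hi", "sender": "joe@other.org"})
bad._clean_fields()
print(bad.cleaned_data, bad.errors)
```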

With this, we finish the _clean_fields() method, which is responsible for cleaning the fields.

Up to this moment:

All individual fields have been validated and their value stored in cleaned_data.
Or if an error was raised, it was added to _errors.

The next step is to validate the form as a whole. That is what full_clean() does next by calling _clean_form().

B) full_clean(self) → _clean_form()

What does it mean to validate the form?
It's when you perform any extra form-wide cleaning after Field.clean() has been called on every field.

full_clean(self) calls _clean_form().

# django/forms/forms.py
def _clean_form(self):
    try:
        cleaned_data = self.clean()
    except ValidationError as e:
        self.add_error(None, e)
    else:
        if cleaned_data is not None:
            self.cleaned_data = cleaned_data

This method is part of Django’s internal form validation process. It is responsible for calling the form’s clean() method and handling any validation errors that arise.

The line cleaned_data = self.clean() calls the clean() method of the form. You can override this method, which is very important, as we'll see in a moment.

clean() is meant for form-wide validation, where you validate multiple fields together. If clean() raises a ValidationError, execution jumps to the except block, where an error is added through add_error().

Let's look inside clean():

# django/forms/forms.py
def clean(self):
    # Hook for doing any extra form-wide cleaning after Field.clean()
    # has been called on every field. Any ValidationError raised by
    # this method will not be associated with a particular field; it
    # will have a special-case association with the field named '__all__'

    return self.cleaned_data

It just returns cleaned_data. By overriding clean(), you can add custom form-wide validation that involves several fields of the form.

For instance, if the user wants a copy of the email, they must provide a valid sender email.

class ContactForm(forms.Form):
    subject = forms.CharField(max_length=100)
    message = forms.CharField(widget=forms.Textarea)
    sender = forms.EmailField(initial="joe@email.com", required=False)
    cc_myself = forms.BooleanField(required=False)

    def clean(self):
        cleaned_data = super().clean()
        sender = cleaned_data.get("sender")
        cc_myself = cleaned_data.get("cc_myself")

        if cc_myself and not sender:
            raise forms.ValidationError(
                "You must provide a sender email if you want to CC yourself."
            )

        return cleaned_data

By overriding the clean() method you can define and implement your own form-wide custom validation.
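
To see where such an error ends up, here is a plain-Python sketch of the _clean_form() logic applied to the rule above. SketchFormError, clean and clean_form_sketch are hypothetical names; only the "__all__" key comes from Django (it is the value of NON_FIELD_ERRORS).

```python
NON_FIELD_ERRORS = "__all__"


class SketchFormError(Exception):
    """Stand-in for ValidationError."""


def clean(cleaned_data):
    # Cross-field rule: asking for a copy requires a sender address.
    if cleaned_data.get("cc_myself") and not cleaned_data.get("sender"):
        raise SketchFormError(
            "You must provide a sender email if you want to CC yourself."
        )
    return cleaned_data


def clean_form_sketch(cleaned_data, errors):
    try:
        result = clean(cleaned_data)
    except SketchFormError as e:
        # Form-wide errors are not tied to a field.
        errors.setdefault(NON_FIELD_ERRORS, []).append(str(e))
    else:
        if result is not None:
            cleaned_data = result
    return cleaned_data


errors = {}
clean_form_sketch({"cc_myself": True, "sender": ""}, errors)
print(errors)
```

Using .get() inside clean() matters: a field that failed its own validation has already been removed from cleaned_data, so indexing it directly would raise a KeyError.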

C) full_clean(self) → _post_clean()

And the last step in full_clean() is to call _post_clean().

# django/forms/forms.py
def _post_clean(self):
    # An internal hook for performing additional cleaning after form
    # cleaning is complete. Used for model validation in model forms.
    pass

_post_clean() is just a placeholder method (it does nothing).

However, in ModelForm, _post_clean() is overridden to handle additional model-specific validation, such as:

Running model field validators
Checking database constraints (e.g., unique fields)

The _post_clean() method in BaseModelForm is responsible for performing model-level validation after the form's fields have been cleaned. This is particularly relevant in ModelForm, where the form is directly linked to a Django model. It ensures that ModelForms handle both form validation and model validation correctly.

But we’re not going to dive into that now.

As a general summary, the Django validation path is something like this:

├── is_valid()
│   ├── self.is_bound
│   ├── self.errors
│   │   ├── full_clean()
│   │   │   ├── _clean_fields()
│   │   │   │   ├── _bound_items()
│   │   │   │   ├── _clean_bound_field()
│   │   │   │   │   ├── clean()             
│   │   │   │   │   │   ├── to_python()
│   │   │   │   │   │   ├── validate()
│   │   │   │   │   │   ├── run_validators()
│   │   │   │   ├── clean_<field_name>
│   │   │   ├── _clean_form()
│   │   │   │   ├── clean()
│   │   │   ├── _post_clean()
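
The order in this tree can be checked with a small tracing sketch. TracingForm is a hypothetical toy class, not Django code: each method only records its own name, and the field-level cleaning is reduced to copying the raw value.

```python
class TracingForm:
    def __init__(self, data=None):
        self.is_bound = data is not None
        self.data = {} if data is None else data
        self._errors = None
        self.trace = []  # records the order in which methods are called

    def is_valid(self):
        self.trace.append("is_valid")
        return self.is_bound and not self.errors

    @property
    def errors(self):
        if self._errors is None:
            self.full_clean()
        return self._errors

    def full_clean(self):
        self.trace.append("full_clean")
        self._errors = {}
        if not self.is_bound:
            return
        self.cleaned_data = {}
        self._clean_fields()
        self._clean_form()
        self._post_clean()

    def _clean_fields(self):
        self.trace.append("_clean_fields")
        for name in ("subject", "sender"):
            self.cleaned_data[name] = self.data.get(name)
            if hasattr(self, "clean_%s" % name):
                self.cleaned_data[name] = getattr(self, "clean_%s" % name)()

    def clean_sender(self):
        self.trace.append("clean_sender")
        return self.cleaned_data["sender"]

    def _clean_form(self):
        self.trace.append("_clean_form")

    def _post_clean(self):
        self.trace.append("_post_clean")


form = TracingForm({"subject": "Hi", "sender": "joe@example.com"})
form.is_valid()
print(form.trace)
```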

Any comments, suggestions or questions?