Python - Data Validation using Pydantic

Data validation is one of the most important parts of modern software development. Applications receive data from users, APIs, databases, forms, and external services. If the incoming data is incorrect, incomplete, or in the wrong format, the application may crash or produce unexpected results. Pydantic is a Python library designed to solve this problem by validating data automatically and converting it into proper Python data types.

Pydantic is widely used in web development, APIs, machine learning projects, automation systems, and backend applications. It is especially popular in frameworks like FastAPI because it simplifies data handling and improves reliability.

What is Pydantic?

Pydantic is a Python library that uses Python type hints to validate and parse data. It checks whether the provided data matches the expected type and structure.

For example, if a field is declared as an integer but receives the string "42", Pydantic converts it to the integer 42. If it receives a string such as "abc", which cannot be converted, Pydantic raises a validation error.

Pydantic models are created using classes that inherit from BaseModel.

Installing Pydantic

Pydantic can be installed using pip:

pip install pydantic

After installation, it can be imported into Python programs.

from pydantic import BaseModel

Basic Example of Pydantic

from pydantic import BaseModel

class Student(BaseModel):
    name: str
    age: int
    marks: float

student = Student(
    name="Rahul",
    age=20,
    marks=88.5
)

print(student)

Output:

name='Rahul' age=20 marks=88.5

In this example:

  • name must be a string

  • age must be an integer

  • marks must be a floating-point number

Pydantic validates the values automatically when the model is created.

Automatic Type Conversion

One major advantage of Pydantic is automatic type conversion.

from pydantic import BaseModel

class Employee(BaseModel):
    id: int
    salary: float

employee = Employee(
    id="101",
    salary="45000.75"
)

print(employee)

Output:

id=101 salary=45000.75

Although the values were provided as strings, Pydantic converted them into the correct types.
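Conversion is the default behaviour, not a requirement. As a minimal sketch, assuming the strict types that Pydantic exports (StrictInt is used here), a single field can be told to reject values that are not already of the declared type. The model name StrictEmployee is hypothetical.

```python
from pydantic import BaseModel, StrictInt, ValidationError

class StrictEmployee(BaseModel):
    id: StrictInt      # no conversion: only real integers are accepted
    salary: float      # default behaviour: numeric strings are converted

# A real integer passes, and the salary string is still converted.
ok = StrictEmployee(id=101, salary="45000.75")
print(ok.id, ok.salary)

# The same id given as a string is rejected by the strict field.
try:
    StrictEmployee(id="101", salary="45000.75")
    strict_rejected = False
except ValidationError:
    strict_rejected = True

print(strict_rejected)
```

Strict fields are worth considering where a silently converted value would hide a bug upstream, for example an identifier that should never arrive as text.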

Validation Errors

If the provided data is invalid, Pydantic raises a detailed error.

from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float

product = Product(
    name="Laptop",
    price="abc"
)

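Output (abbreviated, as reported by Pydantic v2):

ValidationError: 1 validation error for Product
price
  Input should be a valid number, unable to parse string as a number

Pydantic v1 reports the same failure as "value is not a valid float".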

This helps developers identify issues quickly.
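In practice the exception is usually caught rather than left to stop the program. The sketch below assumes only the ValidationError class and its errors() method; it shows how the failing field and message can be read programmatically.

```python
from pydantic import BaseModel, ValidationError

class Product(BaseModel):
    name: str
    price: float

try:
    Product(name="Laptop", price="abc")
    errors = []
except ValidationError as exc:
    # errors() returns one dictionary per failed field.
    errors = exc.errors()

for err in errors:
    # "loc" is a tuple path to the field, "msg" is the readable message.
    print(err["loc"], err["msg"])
```

The exact wording of "msg" differs between Pydantic versions, so code that reacts to errors is better keyed on "loc" than on the message text.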

Using Default Values

Pydantic allows fields to have default values.

from pydantic import BaseModel

class User(BaseModel):
    username: str
    active: bool = True

user = User(username="admin")

print(user)

Output:

username='admin' active=True

If active is not provided, the default value is used.
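A fixed default is not always enough, for instance when every object needs its own generated value. As a sketch, assuming Field's default_factory parameter, the default can be produced by a function that runs for each new instance. The Session model and its token field are hypothetical names used only for illustration.

```python
from uuid import uuid4
from pydantic import BaseModel, Field

class Session(BaseModel):
    username: str
    active: bool = True
    # default_factory is called for each instance to produce the default.
    token: str = Field(default_factory=lambda: uuid4().hex)

first = Session(username="admin")
second = Session(username="guest", active=False)  # explicit value overrides the default

print(first.active, second.active)
print(first.token != second.token)
```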

Optional Fields

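Some fields may not always be required. Annotating a field as Optional[str] allows its value to be None, and giving it a default of None lets callers omit it altogether. In Pydantic v2 it is the default that makes the field omittable; an Optional field without a default must still be supplied.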

from typing import Optional
from pydantic import BaseModel

class Customer(BaseModel):
    name: str
    phone: Optional[str] = None

customer = Customer(name="Anita")

print(customer)

Output:

name='Anita' phone=None
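To make the difference between required and optional fields concrete, the following sketch passes the optional field in each allowed form and then leaves out the required one. The phone number is an illustrative value.

```python
from typing import Optional
from pydantic import BaseModel, ValidationError

class Customer(BaseModel):
    name: str                     # required: no default
    phone: Optional[str] = None   # may be omitted, or passed as None or a string

with_phone = Customer(name="Anita", phone="9845012345")
explicit_none = Customer(name="Anita", phone=None)
print(with_phone.phone, explicit_none.phone)

# Leaving out the required field is an error; the optional one can be skipped.
try:
    Customer(phone="9845012345")
    missing = []
except ValidationError as exc:
    missing = [err["loc"] for err in exc.errors()]

print(missing)
```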

Field Constraints

Pydantic supports constraints such as minimum length, maximum length, and numeric limits.

from pydantic import BaseModel, Field

class Account(BaseModel):
    username: str = Field(min_length=3, max_length=20)
    age: int = Field(gt=18)

account = Account(
    username="rohan",
    age=25
)

print(account)

Explanation:

  • min_length=3 means username must contain at least 3 characters

  • max_length=20 means username cannot exceed 20 characters

  • gt=18 means age must be strictly greater than 18, so 18 itself is rejected (ge=18 would allow it)
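The constraints are easiest to understand by violating them. This sketch reuses the Account model from above and collects the names of the fields that failed; both problems are reported in a single exception.

```python
from pydantic import BaseModel, Field, ValidationError

class Account(BaseModel):
    username: str = Field(min_length=3, max_length=20)
    age: int = Field(gt=18)

try:
    # "ab" is too short, and 18 is not strictly greater than 18.
    Account(username="ab", age=18)
    failed_fields = set()
except ValidationError as exc:
    failed_fields = {err["loc"][0] for err in exc.errors()}

print(sorted(failed_fields))
```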

String Validation

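Pydantic can validate string formats such as emails and URLs. The EmailStr type depends on the optional email-validator package, which can be installed with pip install "pydantic[email]".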

from pydantic import BaseModel, EmailStr

class User(BaseModel):
    name: str
    email: EmailStr

user = User(
    name="Kiran",
    email="kiran@example.com"  # placeholder address
)

print(user)

If the email format is incorrect, validation fails.
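URLs are handled in the same way. As a sketch, assuming the HttpUrl type that Pydantic exports (it needs no extra package), a field can be restricted to well-formed http or https addresses. The Bookmark model is a hypothetical example.

```python
from pydantic import BaseModel, HttpUrl, ValidationError

class Bookmark(BaseModel):
    title: str
    url: HttpUrl   # must be a well-formed http or https URL

good = Bookmark(title="Docs", url="https://example.com/docs")
print(good.url)

try:
    Bookmark(title="Broken", url="not a url")
    url_rejected = False
except ValidationError:
    url_rejected = True

print(url_rejected)
```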

Nested Models

Pydantic supports nested data structures.

from pydantic import BaseModel

class Address(BaseModel):
    city: str
    pincode: int

class Person(BaseModel):
    name: str
    address: Address

person = Person(
    name="Ravi",
    address={
        "city": "Mysore",
        "pincode": 570001
    }
)

print(person)

Nested models are useful when handling complex JSON data.
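Two details of nested models are worth seeing in code: the inner dictionary is turned into a real model instance, and an error inside it is reported with the full path to the field. A minimal sketch using the same Address and Person models:

```python
from pydantic import BaseModel, ValidationError

class Address(BaseModel):
    city: str
    pincode: int

class Person(BaseModel):
    name: str
    address: Address

person = Person(name="Ravi", address={"city": "Mysore", "pincode": "570001"})
# The inner dictionary became an Address instance, with pincode converted to int.
print(type(person.address).__name__, person.address.pincode)

try:
    Person(name="Ravi", address={"city": "Mysore", "pincode": "unknown"})
    error_paths = []
except ValidationError as exc:
    error_paths = [err["loc"] for err in exc.errors()]

print(error_paths)
```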

Working with Lists

Pydantic can validate lists and collections.

from typing import List
from pydantic import BaseModel

class Course(BaseModel):
    subjects: List[str]

course = Course(
    subjects=["Python", "Java", "C++"]
)

print(course)

If any list item cannot be validated as the declared item type, validation fails, and the error identifies the position of the offending item.
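The per-item behaviour can be sketched with a list of integers. The Scores model is a hypothetical example; the point is that each item is checked separately and the error location includes the item's index.

```python
from typing import List
from pydantic import BaseModel, ValidationError

class Scores(BaseModel):
    marks: List[int]

# Each item is validated on its own; "90" is converted, "abc" is not.
try:
    Scores(marks=[85, "90", "abc"])
    bad_items = []
except ValidationError as exc:
    bad_items = [err["loc"] for err in exc.errors()]

print(bad_items)
```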

Custom Validators

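Developers can create custom validation logic by decorating a class method with field_validator. The method receives the field's value and must either return it (possibly modified) or raise ValueError.

from pydantic import BaseModel, field_validator

class Employee(BaseModel):
    name: str

    @field_validator('name')
    @classmethod
    def name_must_be_uppercase(cls, value):
        if value.upper() != value:
            raise ValueError('Name must be uppercase')
        return value

employee = Employee(name="RAHUL")

This validator checks whether the name is uppercase. The decorator shown is the Pydantic v2 API; Pydantic v1 used @validator('name') for the same purpose, and that name is deprecated in v2.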
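A validator can also clean up a value instead of only rejecting it. The sketch below assumes Pydantic v2, where the decorator is named field_validator; the Tag model and its rule are hypothetical.

```python
from pydantic import BaseModel, ValidationError, field_validator

class Tag(BaseModel):
    label: str

    @field_validator("label")
    @classmethod
    def normalise_label(cls, value: str) -> str:
        cleaned = value.strip().lower()
        if not cleaned:
            raise ValueError("label must not be blank")
        return cleaned   # the returned value is what gets stored

tag = Tag(label="  Python ")
print(tag.label)

try:
    Tag(label="   ")
    messages = []
except ValidationError as exc:
    messages = [err["msg"] for err in exc.errors()]

print(messages)
```

Returning the cleaned value keeps normalisation next to the rule that depends on it, so the rest of the program only ever sees the stored form.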

Parsing JSON Data

Pydantic easily converts JSON data into Python objects.

from pydantic import BaseModel

class Book(BaseModel):
    title: str
    price: float

json_data = '''
{
    "title": "Python Basics",
    "price": 299.50
}
'''

book = Book.model_validate_json(json_data)  # Pydantic v2 name; v1 used Book.parse_raw()

print(book)

This is useful in API development, where request and response bodies usually arrive as JSON text.
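JSON that parses correctly can still contain the wrong kind of value. This sketch assumes the Pydantic v2 method model_validate_json and shows that such a payload raises the same ValidationError as any other invalid input.

```python
from pydantic import BaseModel, ValidationError

class Book(BaseModel):
    title: str
    price: float

# Well-formed JSON, but the price is not a number.
payload = '{"title": "Python Basics", "price": "free"}'

try:
    Book.model_validate_json(payload)
    problem_fields = []
except ValidationError as exc:
    problem_fields = [err["loc"] for err in exc.errors()]

print(problem_fields)
```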

Exporting Data

Pydantic models can be converted into dictionaries or JSON.

from pydantic import BaseModel

class Car(BaseModel):
    brand: str
    price: int

car = Car(
    brand="Toyota",
    price=1500000
)

print(car.model_dump())       # dictionary (Pydantic v1: car.dict())
print(car.model_dump_json())  # JSON string (Pydantic v1: car.json())

Output:

{'brand': 'Toyota', 'price': 1500000}
{"brand":"Toyota","price":1500000}
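Exporting and parsing are inverse operations. As a sketch, assuming the Pydantic v2 method names model_dump, model_dump_json, and model_validate_json, the example below exports a partial dictionary and then sends a model through JSON and back.

```python
from pydantic import BaseModel

class Car(BaseModel):
    brand: str
    price: int

car = Car(brand="Toyota", price=1500000)

# exclude drops the named fields from the exported dictionary.
public_view = car.model_dump(exclude={"price"})
print(public_view)

# Serialising to JSON and validating it again yields an equal model.
restored = Car.model_validate_json(car.model_dump_json())
print(restored == car)
```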

Pydantic in API Development

Pydantic is heavily used in API frameworks such as FastAPI.

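When users send requests to an API, Pydantic validates the incoming request data automatically. In FastAPI, a request body that fails validation is rejected with a 422 response that lists the errors, so the endpoint function only runs with valid data.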

Example:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class User(BaseModel):
    name: str
    age: int

@app.post("/users/")
def create_user(user: User):
    return user

Benefits include:

  • Automatic validation

  • Cleaner code

  • Better error handling

  • Improved API documentation

Advantages of Pydantic

Easy to Use

Pydantic uses standard Python type hints, making the syntax simple and readable.

Automatic Validation

Validation happens automatically without writing extra code.

Detailed Error Messages

Each error names the field that failed and the rule it broke, including the path to fields inside nested models and lists.

JSON Support

Pydantic works naturally with JSON data.

Better Code Quality

Validated data reduces runtime errors and improves application stability.

Faster Development

Developers spend less time writing manual validation logic.

Limitations of Pydantic

Learning Curve

Advanced features such as custom validators and nested models require practice.

Performance Overhead

Validation adds some processing time, especially for very large datasets.

Dependency on Type Hints

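Validation is only as strict as the annotations. A field annotated too broadly, for example as Any, is accepted without meaningful checks, so developers must define type annotations carefully.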

Real-World Applications

Pydantic is used in many real-world scenarios:

  • API request validation

  • Configuration management

  • Data pipelines

  • Machine learning input validation

  • Form processing systems

  • Database data verification

  • Microservices communication

Difference Between Manual Validation and Pydantic

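Manual Validation

def check_age(age):
    if not isinstance(age, int):
        raise ValueError("Age must be an integer")
    return age

Pydantic Validation

from pydantic import BaseModel

class User(BaseModel):
    age: int

Pydantic removes repetitive validation code and makes programs cleaner. The two are not strictly equivalent: the isinstance check rejects the string "20", whereas the model converts it to the integer 20 by default.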
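The difference grows with the number of rules. The following sketch validates the same input both ways; the function validate_user_manually and the sample data are hypothetical, written only to put the two styles side by side.

```python
from pydantic import BaseModel, Field, ValidationError

def validate_user_manually(data: dict) -> dict:
    # Hand-written checks: every rule is a separate branch.
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    if not isinstance(data.get("age"), int):
        raise ValueError("age must be an integer")
    if data["age"] < 0:
        raise ValueError("age must not be negative")
    return data

class User(BaseModel):
    # The same rules, declared once on the fields.
    name: str
    age: int = Field(ge=0)

bad_input = {"name": "Meera", "age": -5}

try:
    validate_user_manually(bad_input)
    manual_ok = True
except ValueError:
    manual_ok = False

try:
    User(**bad_input)
    model_ok = True
except ValidationError:
    model_ok = False

print(manual_ok, model_ok)
```

Both approaches reject the input, but the manual version stops at the first failed check, while the model reports every failing field in one exception.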

Conclusion

Pydantic is a powerful Python library for validating and managing data efficiently. It simplifies handling complex data structures, ensures data correctness, and reduces programming errors. By combining Python type hints with automatic validation, Pydantic allows developers to write cleaner, safer, and more maintainable applications.

As modern applications increasingly depend on APIs and structured data, Pydantic has become an essential tool in Python development. It is especially valuable for backend systems, web services, and applications where data accuracy and consistency are critical.