Python - Data Validation using Pydantic
Data validation is one of the most important parts of modern software development. Applications receive data from users, APIs, databases, forms, and external services. If the incoming data is incorrect, incomplete, or in the wrong format, the application may crash or produce unexpected results. Pydantic is a Python library designed to solve this problem by validating data automatically and converting it into proper Python data types.
Pydantic is widely used in web development, APIs, machine learning projects, automation systems, and backend applications. It is especially popular in frameworks like FastAPI because it simplifies data handling and improves reliability.
What is Pydantic?
Pydantic is a Python library that uses Python type hints to validate and parse data. It checks whether the provided data matches the expected type and structure.
For example, if a program expects an integer but receives a string, Pydantic can automatically convert it if possible or raise an error if the conversion fails.
Pydantic models are created using classes that inherit from BaseModel.
Installing Pydantic
Pydantic can be installed using pip:
pip install pydantic
After installation, it can be imported into Python programs.
from pydantic import BaseModel
Basic Example of Pydantic
from pydantic import BaseModel
class Student(BaseModel):
    name: str
    age: int
    marks: float
# Each keyword argument is checked against the annotated type
student = Student(
    name="Rahul",
    age=20,
    marks=88.5
)
print(student)
Output:
name='Rahul' age=20 marks=88.5
In this example:
- name must be a string
- age must be an integer
- marks must be a floating-point number
Pydantic validates the values automatically.
Automatic Type Conversion
One major advantage of Pydantic is automatic type conversion.
from pydantic import BaseModel
class Employee(BaseModel):
    id: int
    salary: float
# Both values are passed as strings on purpose
employee = Employee(
    id="101",
    salary="45000.75"
)
print(employee)
Output:
id=101 salary=45000.75
Although the values were provided as strings, Pydantic converted them into the correct types.
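As a quick check of the conversion described above, the sketch below inspects the resulting types and also tries a string that cannot be turned into an integer. The Employee model is the one from this section; the "3.7" input is an assumption chosen for illustration.

```python
from pydantic import BaseModel, ValidationError

class Employee(BaseModel):
    id: int
    salary: float

employee = Employee(id="101", salary="45000.75")
# The stored values are real Python numbers, not strings
id_type = type(employee.id)
salary_type = type(employee.salary)
print(id_type.__name__, salary_type.__name__)

# A string with a fractional part is not a valid integer, so it is refused
try:
    Employee(id="3.7", salary="45000.75")
    conversion_failed = False
except ValidationError:
    conversion_failed = True
print(conversion_failed)
```

Conversion is attempted only when it can be done without losing information; anything else is reported as a validation error.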
Validation Errors
If the provided data is invalid, Pydantic raises a detailed error.
from pydantic import BaseModel
class Product(BaseModel):
    name: str
    price: float
# "abc" cannot be converted to a float, so this call raises ValidationError
product = Product(
    name="Laptop",
    price="abc"
)
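Output (abridged; the exact wording depends on the Pydantic version):
ValidationError: 1 validation error for Product
price
  Input should be a valid number, unable to parse string as a number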
This helps developers identify issues quickly.
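In practice the error is usually caught rather than left to stop the program. The following is a minimal sketch of one way to do that, reusing the Product model from above; it relies on the errors() method of ValidationError, which returns one dictionary per failed field.

```python
from pydantic import BaseModel, ValidationError

class Product(BaseModel):
    name: str
    price: float

try:
    Product(name="Laptop", price="abc")
    errors = []
except ValidationError as exc:
    # errors() returns a list with one dict per failed field
    errors = exc.errors()

for err in errors:
    # "loc" is the path to the field, "msg" is the human-readable reason
    print(err["loc"], err["msg"])
```

The "loc" entry makes it possible to map each problem back to the field that caused it, which is handy when reporting errors to an API client or a form.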
Using Default Values
Pydantic allows fields to have default values.
from pydantic import BaseModel
class User(BaseModel):
    username: str
    active: bool = True  # used when the caller omits "active"
user = User(username="admin")
print(user)
Output:
username='admin' active=True
If active is not provided, the default value is used.
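Defaults can also be overridden, and mutable defaults such as lists are usually declared with Field(default_factory=...). The sketch below extends the User model with a hypothetical roles field (not part of the original example) to show both cases.

```python
from typing import List
from pydantic import BaseModel, Field

class User(BaseModel):
    username: str
    active: bool = True
    # "roles" is a hypothetical field; default_factory builds a new list per instance
    roles: List[str] = Field(default_factory=list)

admin = User(username="admin")
guest = User(username="guest", active=False)  # explicit value replaces the default

# Changing one instance's list does not affect the other instance
admin.roles.append("superuser")
print(admin)
print(guest)
```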
Optional Fields
Some fields may not always be required. A field becomes optional when it is annotated with Optional and given a default value such as None.
from typing import Optional
from pydantic import BaseModel
class Customer(BaseModel):
    name: str
    phone: Optional[str] = None  # may be omitted or set to None
customer = Customer(name="Anita")
print(customer)
Output:
name='Anita' phone=None
Field Constraints
Pydantic supports constraints such as minimum length, maximum length, and numeric limits.
from pydantic import BaseModel, Field
class Account(BaseModel):
    username: str = Field(min_length=3, max_length=20)
    age: int = Field(gt=18)
account = Account(
    username="rohan",
    age=25
)
print(account)
Explanation:
- min_length=3 means username must contain at least 3 characters
- max_length=20 means username cannot exceed 20 characters
- gt=18 means age must be greater than 18
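To see the constraints in action, the sketch below passes values that break two of them at once. The inputs "ab" and 18 are assumptions picked to sit just outside the allowed ranges.

```python
from pydantic import BaseModel, Field, ValidationError

class Account(BaseModel):
    username: str = Field(min_length=3, max_length=20)
    age: int = Field(gt=18)

try:
    # "ab" is shorter than 3 characters, and 18 is not greater than 18
    Account(username="ab", age=18)
    failed_fields = set()
except ValidationError as exc:
    # Collect the name of every field that failed
    failed_fields = {err["loc"][0] for err in exc.errors()}

print(sorted(failed_fields))
```

Both problems are reported in a single ValidationError, so the caller does not have to fix one field just to discover the next failure.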
String Validation
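Pydantic can validate string formats such as emails and URLs. The EmailStr type relies on the optional email-validator package, which can be installed with pip install "pydantic[email]".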
from pydantic import BaseModel, EmailStr
class User(BaseModel):
    name: str
    email: EmailStr
user = User(
    name="Kiran",
    email="kiran@example.com"  # placeholder address
)
print(user)
If the email format is incorrect, validation fails.
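URLs are handled in a similar way. The sketch below uses Pydantic's HttpUrl type, which needs no extra package; the Website model and both URL values are assumptions for illustration.

```python
from pydantic import BaseModel, HttpUrl, ValidationError

class Website(BaseModel):  # hypothetical model for illustration
    name: str
    url: HttpUrl

site = Website(name="Docs", url="https://example.com/docs")
print(site.url)

try:
    # No scheme or host, so this is not an HTTP URL
    Website(name="Broken", url="not a url")
    url_rejected = False
except ValidationError:
    url_rejected = True
print(url_rejected)
```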
Nested Models
Pydantic supports nested data structures.
from pydantic import BaseModel
class Address(BaseModel):
    city: str
    pincode: int
class Person(BaseModel):
    name: str
    address: Address  # a nested model used as a field type
person = Person(
    name="Ravi",
    address={
        "city": "Mysore",
        "pincode": 570001
    }
)
print(person)
Nested models are useful when handling complex JSON data.
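The sketch below illustrates two consequences of nesting: the inner dictionary becomes an Address instance, and an error inside it is reported with its full path. The invalid pincode value is an assumption for illustration.

```python
from pydantic import BaseModel, ValidationError

class Address(BaseModel):
    city: str
    pincode: int

class Person(BaseModel):
    name: str
    address: Address

person = Person(name="Ravi", address={"city": "Mysore", "pincode": 570001})
# The dict was converted into an Address instance
print(type(person.address).__name__, person.address.city)

try:
    Person(name="Ravi", address={"city": "Mysore", "pincode": "unknown"})
    error_path = None
except ValidationError as exc:
    # The location is a path through the nested structure
    error_path = exc.errors()[0]["loc"]
print(error_path)
```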
Working with Lists
Pydantic can validate lists and collections.
from typing import List
from pydantic import BaseModel
class Course(BaseModel):
    subjects: List[str]  # every item must be a string
course = Course(
    subjects=["Python", "Java", "C++"]
)
print(course)
If any list item has an invalid type, validation fails.
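A short sketch of item-level validation follows. It uses a hypothetical Scores model with a list of integers, since an integer list makes the failing item easy to spot; the error location includes the index of that item.

```python
from typing import List
from pydantic import BaseModel, ValidationError

class Scores(BaseModel):  # hypothetical model for illustration
    marks: List[int]

# Each item is validated separately, so "85" is converted to 85
ok = Scores(marks=[90, "85", 77])
print(ok.marks)

try:
    Scores(marks=[90, "abc", 77])
    bad_location = None
except ValidationError as exc:
    # The location contains the field name and the index of the bad item
    bad_location = exc.errors()[0]["loc"]
print(bad_location)
```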
Custom Validators
Developers can create custom validation logic.
from pydantic import BaseModel, field_validator
class Employee(BaseModel):
    name: str
    # Pydantic v2 API; Pydantic v1 used the @validator decorator instead
    @field_validator('name')
    @classmethod
    def name_must_be_uppercase(cls, value):
        if value.upper() != value:
            raise ValueError('Name must be uppercase')
        return value
employee = Employee(name="RAHUL")
This validator checks whether the name is uppercase.
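The sketch below shows what happens when the check fails. It assumes Pydantic v2, where the decorator is field_validator; the ValueError raised inside the validator is wrapped in a ValidationError like any built-in check.

```python
from pydantic import BaseModel, ValidationError, field_validator

class Employee(BaseModel):
    name: str

    @field_validator("name")
    @classmethod
    def name_must_be_uppercase(cls, value: str) -> str:
        if value.upper() != value:
            raise ValueError("Name must be uppercase")
        return value

try:
    Employee(name="Rahul")  # mixed case, so the validator rejects it
    message = None
except ValidationError as exc:
    # The text passed to ValueError appears in the error message
    message = exc.errors()[0]["msg"]
print(message)
```

Because the validator returns the value, it could also be used to normalize input (for example, returning value.upper()) instead of rejecting it.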
Parsing JSON Data
Pydantic easily converts JSON data into Python objects.
from pydantic import BaseModel
class Book(BaseModel):
    title: str
    price: float
json_data = '''
{
    "title": "Python Basics",
    "price": 299.50
}
'''
# Pydantic v2 API; Pydantic v1 used Book.parse_raw(json_data)
book = Book.model_validate_json(json_data)
print(book)
This is extremely useful in API development.
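Incoming JSON is not always well formed, so a parser usually needs a failure path. The sketch below assumes Pydantic v2 and its model_validate_json method; the two bad payloads (a missing field and truncated JSON) are assumptions for illustration.

```python
from pydantic import BaseModel, ValidationError

class Book(BaseModel):
    title: str
    price: float

good = Book.model_validate_json('{"title": "Python Basics", "price": 299.50}')
print(good.price)

# First payload lacks "price"; second is cut off and is not valid JSON
payloads = ['{"title": "No price"}', '{"title": "Broken", "price": ']
rejected = 0
for raw in payloads:
    try:
        Book.model_validate_json(raw)
    except ValidationError:
        # Both schema problems and malformed JSON end up here
        rejected += 1
print(rejected)
```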
Exporting Data
Pydantic models can be converted into dictionaries or JSON.
from pydantic import BaseModel
class Car(BaseModel):
    brand: str
    price: int
car = Car(
    brand="Toyota",
    price=1500000
)
# Pydantic v2 API; Pydantic v1 used car.dict() and car.json()
print(car.model_dump())
print(car.model_dump_json())
Output:
{'brand': 'Toyota', 'price': 1500000}
{"brand":"Toyota","price":1500000}
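Export can also be selective, and exported JSON can be read back into a model. The sketch below assumes Pydantic v2 method names (model_dump, model_dump_json, model_validate_json); hiding price is just an illustrative choice.

```python
from pydantic import BaseModel

class Car(BaseModel):
    brand: str
    price: int

car = Car(brand="Toyota", price=1500000)

# Leave a field out of the exported dictionary
public_view = car.model_dump(exclude={"price"})
print(public_view)

# Round trip: serialize to JSON, then parse it back into a model
restored = Car.model_validate_json(car.model_dump_json())
print(restored == car)
```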
Pydantic in API Development
Pydantic is heavily used in API frameworks such as FastAPI.
When users send requests to an API, Pydantic validates the incoming request data automatically.
Example:
from fastapi import FastAPI
from pydantic import BaseModel
app = FastAPI()
class User(BaseModel):
    name: str
    age: int
@app.post("/users/")
def create_user(user: User):
    # FastAPI validates the request body against User before calling this function
    return user
Benefits include:
- Automatic validation
- Cleaner code
- Better error handling
- Improved API documentation
Advantages of Pydantic
Easy to Use
Pydantic uses standard Python type hints, making the syntax simple and readable.
Automatic Validation
Validation happens automatically without writing extra code.
Detailed Error Messages
Errors clearly describe what went wrong.
JSON Support
Pydantic works naturally with JSON data.
Better Code Quality
Validated data reduces runtime errors and improves application stability.
Faster Development
Developers spend less time writing manual validation logic.
Limitations of Pydantic
Learning Curve
Advanced features such as custom validators and nested models require practice.
Performance Overhead
Validation adds some processing time, especially for very large datasets.
Dependency on Type Hints
Developers must properly define type annotations.
Real-World Applications
Pydantic is used in many real-world scenarios:
- API request validation
- Configuration management
- Data pipelines
- Machine learning input validation
- Form processing systems
- Database data verification
- Microservices communication
Difference Between Manual Validation and Pydantic
Manual Validation
if not isinstance(age, int):
    raise ValueError("Age must be integer")
Pydantic Validation
from pydantic import BaseModel
class User(BaseModel):
    age: int
Pydantic removes repetitive validation code and makes programs cleaner.
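The difference becomes more visible with more than one field. The sketch below puts a hypothetical hand-written checker next to an equivalent model; the function name validate_user_manually and the sample input are assumptions for illustration.

```python
from pydantic import BaseModel, ValidationError

def validate_user_manually(data: dict) -> dict:
    """Hypothetical hand-written check: one isinstance test per field."""
    if not isinstance(data.get("name"), str):
        raise ValueError("Name must be a string")
    if not isinstance(data.get("age"), int):
        raise ValueError("Age must be integer")
    return data

class User(BaseModel):
    name: str
    age: int

bad_input = {"name": "Asha", "age": "twenty"}

try:
    validate_user_manually(bad_input)
    manual_ok = True
except ValueError:
    manual_ok = False

try:
    User(**bad_input)
    model_ok = True
except ValidationError:
    model_ok = False

# Both approaches reject the input; the model needs no per-field code
print(manual_ok, model_ok)
```

Adding a field to the manual version means writing another check, while the model only needs another annotated attribute.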
Conclusion
Pydantic is a powerful Python library for validating and managing data efficiently. It simplifies handling complex data structures, ensures data correctness, and reduces programming errors. By combining Python type hints with automatic validation, Pydantic allows developers to write cleaner, safer, and more maintainable applications.
As modern applications increasingly depend on APIs and structured data, Pydantic has become an essential tool in Python development. It is especially valuable for backend systems, web services, and applications where data accuracy and consistency are critical.