Pydantic & Rust
Why and how Pydantic uses Rust
$whoami
- Python and Rust developer
- Created Pydantic in 2017
- Started a Company around Pydantic last year
What is Pydantic?
- Data validation & more using Python type hints
- Top 30 package on PyPI, >280M downloads / month
from datetime import datetime
from pydantic import BaseModel
class Delivery(BaseModel):
timestamp: datetime
dimensions: tuple[int, int]
m = Delivery(timestamp='2020-01-02T03:04:05Z', dimensions=['10', '20'])
print(repr(m.timestamp))
#> datetime.datetime(2020, 1, 2, 3, 4, 5, tzinfo=TzInfo(UTC))
print(m.dimensions)
#> (10, 20)
Pydantic V2
- Complete rewrite of Pydantic, with the core written in Rust
- Released in June 2023
- 5 - 50x faster than Pydantic V1
- More correct, more extensible
from datetime import datetime
from pydantic import BaseModel
class Delivery(BaseModel):
timestamp: datetime
dimensions: tuple[int, int]
m = Delivery(timestamp='2020-01-02T03:04:05Z', dimensions=['10', '20'])
print(repr(m.timestamp))
#> datetime.datetime(2020, 1, 2, 3, 4, 5, tzinfo=TzInfo(UTC))
print(m.dimensions)
#> (10, 20)
Rust Advantages
The obvious...
- Performance
- Reusing high quality rust libraries
- More explicit error handling
(maybe) Less obviously advantages:
- Virtually zero cost customisation, even in hot code
- Arguably easier to maintain - the compiler picks up more of mistake
- Private means private
Rust Advantages
Nested modular structures
from pydantic import BaseModel
class Qualification(BaseModel):
name: str
description: str
required: bool
value: int
class Student(BaseModel):
id: int
name: str
qualifications: list[Qualification]
friends: list[int]

[
...,
...,
...,
...,
...,
...,
...,
...,
...,
...,
...,
...,
]
continued...
What does that tree look like?
class Talk(BaseModel):
title: Annotated[
str,
Maxlen(100)
]
attendance: PosInt
when: datetime | None = None
mistakes: list[
tuple[timedelta, str]
]
ModelValidator {
cls: Talk,
validator: TypeDictValidator [
Field {
key: "title",
validator: StrValidator { max_len: 100 },
},
Field {
key: "attendance",
validator: IntValidator { min: 0 },
},
Field {
key: "when",
validator: UnionValidator [
DateTimeValidator {},
NoneValidator {},
],
default: None,
},
Field {
key: "mistakes",
validator: ListValidator {
item_validator: TupleValidator [
TimedeltaValidator {},
StrValidator {},
],
},
},
],
}
Rust Disdvantages
Disadvantages:
- Slower to develop
- Fewer people can help you
- Have to distribute binaries, or leave users to compile it
- Refactoring hell!
Rust Disdvantages
RecursionError is bad, but no RecursionError is worse!
Also no multiple ownership.
continued...
fn main() {
main();
}
from __future__ import annotations
from pydantic import BaseModel
class Foo(BaseModel):
a: int
f: list[Foo]
f = {'a': 1, 'f': []}
f['f'].append(f)
Foo(**f)
Pydantic V2 Architecture
Read type hints
construct a "core schema"
pydantic
(pure python)
pydantic-core
(binary + stubs + core-schema)
process core schema
return SchemaValidator
Receive input data
call .validate_python(data)
run validators
return the result of validation
Again for SchemaSerializer
Python Interface to Rust
class Talk(BaseModel):
title: Annotated[
str,
Maxlen(100)
]
attendance: PosInt
when: datetime | None = None
mistakes: list[
tuple[timedelta, str]
]
from pydantic_core import SchemaValidator
class Talk:
...
talk_validator = SchemaValidator({
'type': 'model',
'cls': Talk,
'schema': {
'type': 'model-fields',
'fields': {
'title': {'schema': {'type': 'str', 'max_length': 100}},
'attendance': {'schema': {'type': 'int', 'ge': 0}},
'when': {
'schema': {
'type': 'default',
'schema': {'type': 'nullable', 'schema': {'type': 'datetime'}},
'default': None,
}
},
'mistakes': {
'schema': {
'type': 'list',
'items_schema': {
'type': 'tuple',
'mode': 'positional',
'items_schema': [{'type': 'timedelta'}, {'type': 'str'}]
}
}
},
},
}
})
some_data = {
'title': "How Pydantic V2 leverages Rust's Superpowers",
'attendance': '100',
'when': '2024-10-22T19:15:00',
'mistakes': [
('00:00:00', 'Screen mirroring confusion'),
('00:00:30', 'Forgot to turn on the mic'),
('00:25:00', 'Too short'),
('00:40:00', 'Too long!'),
],
}
talk = talk_validator.validate_python(some_data)
print(talk.mistakes)
"""
[
(datetime.timedelta(0), 'Screen mirroring confusion'),
(datetime.timedelta(seconds=30), 'Forgot to turn on the mic'),
(datetime.timedelta(seconds=1500), 'Too short'),
(datetime.timedelta(seconds=2400), 'Too long!')
]
"""
Performance
import timeit
from pydantic import BaseModel, __version__
class Model(BaseModel):
name: str
age: int
friends: list[int]
settings: dict[str, float]
data = {
'name': 'John',
'age': 42,
'friends': list(range(200)),
'settings': {f'v_{i}': i / 2.0 for i in range(50)}
}
t = timeit.timeit(
'Model(**data)',
globals={'data': data, 'Model': Model},
number=10_000,
)
print(f'version={__version__} time taken {t * 100:.2f}us')
version=1.10.18 time taken 195.8us
version=2.9.2 time taken 4.08us
48.0x speedup
Not Rust vs. Python
But rather: Python as the user* interface for Rust.
(* by user, I mean "application developer")
I'd love to see a generation of libraries for Python (and other high level languages) built in Rust.
TLS
Routing
HTTP parsing
Validation
DB query
Serializing
Rust/C
Python
Application Logic
HTTPS request lifecycle:
100% of Developer time
=
1% of CPU cycles
...
Thank you
Alert!
We've launched Pydantic Logfire - pydantic.dev/logfire

Pydantic & Rust
By samuelcolvin-pydantic
Pydantic & Rust
Why and how Pydantic uses Rust
- 71