Intro to Pydantic

 

Run-Time Type Checking For Your Dataclasses

Alexander Hultnér, PyCon US 2021

@ahultner

Alexander Hultnér

Founder of Hultnér Technologies and Papero.io

 

 

Outline

  • Quick refresher on Python data classes
  • Pydantic introduction
    • Prior art
    • Minimal example from @dataclass
    • Runtime type-checking
    • JSON (de)serialization, JSONSchema
    • Validators
    • Constrained types
    • Framework Integration, Flask, FastAPI, Django & more
      • OpenAPI Specifications
      • Autogenerated tests
  • Cool features worth mentioning
  • Conclusion

Dataclasses

Let's start with a quick @dataclass-refresher.

I love waffles, doesn't everyone?
For our examples we'll use our imaginary café, "The Waffle Bistro" 🧇 🌟

from dataclasses import dataclass
from typing import Tuple

@dataclass
class Waffle:
    style: str
    toppings: Tuple[str, ...]

And now let's try it out:

>>> Waffle("Swedish", ("chocolate sauce", "ham"))
Waffle(style='Swedish',
       toppings=('chocolate sauce', 'ham'))

Dataclasses

Now we may want to constrain the toppings and styles we offer!
🥛 🍓 🟠  🍫

We offer a couple of cream-based toppings and a couple of "dessert sauces".

from typing import Union
from enum import Enum

class Cream(str, Enum):
    whipped_cream = "whipped cream"
    ice_cream = "icecream"

class DessertSauce(str, Enum):
    cloudberry_jam = "cloudberry jam"
    raspberry_jam = "raspberry jam"
    choclate_sauce = "chocolate sauce"

Topping = Union[DessertSauce, Cream]

class WaffleStyle(str, Enum):
    swedish = "Swedish"
    belgian = "Belgian"


@dataclass
class Waffle:
    style: WaffleStyle
    toppings: Tuple[Topping, ...]

Let's see what happens if we try to create a waffle with ham topping.

>>> Waffle("Swedish", ("chocolate sauce", "ham"))
Waffle(style='Swedish',
       toppings=('chocolate sauce', 'ham'))
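
The standard-library dataclass accepts the ham without complaint, because the annotations are only hints. A minimal sketch that makes this visible, with the class repeated so the snippet runs on its own and the toppings kept as plain strings for brevity:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class WaffleStyle(str, Enum):
    swedish = "Swedish"
    belgian = "Belgian"


@dataclass
class Waffle:
    style: WaffleStyle
    toppings: Tuple[str, ...]


# No exception is raised: the stdlib dataclass never inspects the annotations.
waffle = Waffle("Swedish", ("chocolate sauce", "ham"))

# The style is stored as the plain string we passed in, not as an enum member.
style_is_enum = isinstance(waffle.style, WaffleStyle)
print(style_is_enum)    # False
print(waffle.toppings)  # ('chocolate sauce', 'ham')
```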

Pydantic

Quick introduction

  • Python library 🐍
  • Great documentation 📖
  • Data validation using Python type annotations 🧹🧐
  • Runtime type enforcement 💯
  • User-friendly errors 👨‍💻
  • No convoluted syntax, pure Pythonic classes
  • (De)serialisation
  • Predecessors 🏛
    • Dataclasses, attrs, marshmallow, valideer, ORM-libraries, etc.

With dataclasses the types aren't enforced.

To fix that, we'll stand on the shoulders of a giant, pydantic 🧹🐍🧐

from pydantic import ValidationError
from pydantic.dataclasses import dataclass

@dataclass
class Waffle:
    style: WaffleStyle
    toppings: Tuple[Topping, ...]

try:
    Waffle("Swedish", ("chocolate sauce", "ham"))
except ValidationError as err:
    print(err)

With that simple change we can see that our new instance of an unsupported waffle actually raises errors 🚫🚨

Additionally these errors are very readable!

2 validation errors for Waffle

toppings -> 1
  value is not a valid enumeration member; permitted: 'cloudberry jam', 'raspberry jam', 'chocolate sauce' (type=type_error.enum; enum_values=[<DessertSauce.cloudberry_jam: 'cloudberry jam'>, <DessertSauce.raspberry_jam: 'raspberry jam'>, <DessertSauce.choclate_sauce: 'chocolate sauce'>])

toppings -> 1
  value is not a valid enumeration member; permitted: 'whipped cream', 'icecream' (type=type_error.enum; enum_values=[<Cream.whipped_cream: 'whipped cream'>, <Cream.ice_cream: 'icecream'>])
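
To appreciate what the one-line import change buys us, here is a rough sketch of a similar check written by hand with a plain dataclass. `CheckedWaffle` is a hypothetical name, and toppings are limited to `Cream` to keep it short. Note that this version stops at the first problem and does not say which field or index failed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Cream(str, Enum):
    whipped_cream = "whipped cream"
    ice_cream = "icecream"


class WaffleStyle(str, Enum):
    swedish = "Swedish"
    belgian = "Belgian"


@dataclass
class CheckedWaffle:  # hypothetical name, not part of the talk
    style: WaffleStyle
    toppings: Tuple[Cream, ...]

    def __post_init__(self):
        # Enum lookup by value raises ValueError for unknown members.
        self.style = WaffleStyle(self.style)
        self.toppings = tuple(Cream(t) for t in self.toppings)


try:
    CheckedWaffle("Swedish", ("whipped cream", "ham"))
except ValueError as err:
    message = str(err)
    print(message)
```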

Pydantic

Runtime type-checking, data class drop-in replacement


 

So let's try to create a valid waffle 🧇 ✅

>>> Waffle(
...   "Swedish",
...   (Cream.whipped_cream, "cloudberry jam")
... )
Waffle(
  style=<WaffleStyle.swedish: 'Swedish'>,
  toppings=(
    <Cream.whipped_cream: 'whipped cream'>,
    <DessertSauce.cloudberry_jam: 'cloudberry jam'>
  )
)

Pydantic

Runtime type-checking


 

Note that the string "cloudberry jam" 🟠 is automatically parsed as a DessertSauce.
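
How does a plain string end up as a `DessertSauce`? Conceptually, each member of the `Union` is tried in turn until one accepts the value. The sketch below imitates that idea with the standard library only. `coerce_topping` is a hypothetical helper, the enums are trimmed, and this is an illustration rather than pydantic's actual implementation.

```python
from enum import Enum
from typing import Union, get_args


class Cream(str, Enum):
    whipped_cream = "whipped cream"
    ice_cream = "icecream"


class DessertSauce(str, Enum):
    cloudberry_jam = "cloudberry jam"
    raspberry_jam = "raspberry jam"


Topping = Union[DessertSauce, Cream]


def coerce_topping(value):
    """Return the first enum member in Topping whose value matches."""
    for enum_type in get_args(Topping):
        try:
            return enum_type(value)
        except ValueError:
            continue  # not a member of this enum, try the next one
    raise ValueError(f"{value!r} is not a valid topping")


parsed = coerce_topping("cloudberry jam")
print(repr(parsed))  # <DessertSauce.cloudberry_jam: 'cloudberry jam'>
```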

Pydantic

BaseModel, JSON

So what about JSON? 🧑‍💻  

from pydantic import BaseModel


class Waffle(BaseModel):
    style: WaffleStyle
    toppings: Tuple[Topping, ...]

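The dataclass drop-in replacement is great for compatibility, but pydantic's BaseModel does more:

  • (De)serialisation
  • First-class JSON support

Disclaimer: Pydantic is primarily a parsing library. It guarantees the types and constraints of the resulting model rather than checking the input exactly as given.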

Pydantic

JSON (de)serialisation

With BaseModel, fields must be passed as keyword arguments.

>>> Waffle(
...   style="Swedish",
...   toppings=(Cream.whipped_cream, "cloudberry jam")
... )
Waffle(
  style=<WaffleStyle.swedish: 'Swedish'>,
  toppings=(
    <Cream.whipped_cream: 'whipped cream'>,
    <DessertSauce.cloudberry_jam: 'cloudberry jam'>
  )
)
>>> _.json()
'{"style": "Swedish", "toppings": ["whipped cream", "cloudberry jam"]}'

We can now easily encode this object as JSON 👾

There's also built-in support for dict, pickle, immutable copy(). Pydantic will also (de)serialise subclasses. 🥒

Let's reconstruct our object from the JSON output 🏗

>>> Waffle.parse_raw('{"style": "Swedish", "toppings": ["whipped cream", "cloudberry jam"]}')
Waffle(
  style=<WaffleStyle.swedish: 'Swedish'>, 
  toppings=(
    <Cream.whipped_cream: 'whipped cream'>,
    <DessertSauce.cloudberry_jam: 'cloudberry jam'>
  )
)

We use the parse_raw(…) class method for this.
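
Putting encoding and decoding together gives a round trip. The sketch below assumes the pydantic 1.x API used throughout this talk (`.json()` and `parse_raw`); later major versions rename these methods. The model and enums are repeated so the snippet is self-contained.

```python
from enum import Enum
from typing import Tuple, Union

from pydantic import BaseModel


class Cream(str, Enum):
    whipped_cream = "whipped cream"
    ice_cream = "icecream"


class DessertSauce(str, Enum):
    cloudberry_jam = "cloudberry jam"
    raspberry_jam = "raspberry jam"


class WaffleStyle(str, Enum):
    swedish = "Swedish"
    belgian = "Belgian"


class Waffle(BaseModel):
    style: WaffleStyle
    toppings: Tuple[Union[DessertSauce, Cream], ...]


original = Waffle(style="Swedish", toppings=(Cream.whipped_cream, "cloudberry jam"))

# Serialise to a JSON string, then parse it back into a model instance.
restored = Waffle.parse_raw(original.json())

# The restored model compares equal and holds enum members, not raw strings.
print(restored == original)
```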

Invalid input raises a ValidationError, which can also be represented as JSON 🚨🚫🚧

try:
    Waffle(
      style=42, 
      toppings=(
        Cream.whipped_cream, "cloudberry jam"
      )
    )
except ValidationError as err:
    print(err.json())

[
  {
    "loc": [
      "style"
    ],
    "msg": "value is not a valid enumeration member; permitted: 'Swedish', 'Belgian'",
    "type": "type_error.enum",
    "ctx": {
      "enum_values": [
        "Swedish",
        "Belgian"
      ]
    }
  }
]
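
Besides the JSON string, the same information is available as Python data via `err.errors()`, which is handy when an application wants to react per field. A small sketch, using a trimmed model with only the `style` field; `failed_fields` is a hypothetical variable name.

```python
from enum import Enum

from pydantic import BaseModel, ValidationError


class WaffleStyle(str, Enum):
    swedish = "Swedish"
    belgian = "Belgian"


class Waffle(BaseModel):
    style: WaffleStyle


try:
    Waffle(style=42)
except ValidationError as err:
    # errors() returns a list of dicts that include "loc", "msg" and "type".
    failed_fields = [
        ".".join(str(part) for part in error["loc"]) for error in err.errors()
    ]

print(failed_fields)  # ['style']
```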

Pydantic

JSONSchema

JSONSchema can be exported directly from the model

 

Useful for external clients or to feed a Swagger/OpenAPI-spec 📜✅

>>> Waffle.schema()
{'title': 'Waffle',
 'type': 'object',
 'properties': {'style': {'$ref': '#/definitions/WaffleStyle'},
  'toppings': {'title': 'Toppings',
   'type': 'array',
   'items': {'anyOf': [{'$ref': '#/definitions/DessertSauce'},
     {'$ref': '#/definitions/Cream'}]}}},
 'required': ['style', 'toppings'],
 'definitions': {'WaffleStyle': {'title': 'WaffleStyle',
   'description': 'An enumeration.',
   'enum': ['Swedish', 'Belgian'],
   'type': 'string'},
  'DessertSauce': {'title': 'DessertSauce',
   'description': 'An enumeration.',
   'enum': ['cloudberry jam', 'raspberry jam', 'chocolate sauce'],
   'type': 'string'},
  'Cream': {'title': 'Cream',
   'description': 'An enumeration.',
   'enum': ['whipped cream', 'icecream'],
   'type': 'string'}}}

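Caution: Pydantic generates draft 7 of JSONSchema. The just released OpenAPI 3.1 spec aligns with a newer JSON Schema draft (2020-12).

The still common 3.0.x spec uses an extended subset of an older draft (Wright Draft 00, often called draft 5), so the exported schema may need adapting.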
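
To hand the schema to an external client, the dictionary can be dumped as JSON text. A minimal sketch, assuming the pydantic 1.x `.schema()` method shown above and a trimmed model with string toppings:

```python
import json
from enum import Enum
from typing import Tuple

from pydantic import BaseModel


class WaffleStyle(str, Enum):
    swedish = "Swedish"
    belgian = "Belgian"


class Waffle(BaseModel):
    style: WaffleStyle
    toppings: Tuple[str, ...]


schema = Waffle.schema()

# Plain dict in, JSON text out: ready to publish or feed into other tooling.
schema_text = json.dumps(schema, indent=2)
print(schema_text)
```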

Pydantic

Validators


 

Those were the built-in validators.

But what about custom ones?

from pydantic import validator, root_validator

swedish_toppings = (
    DessertSauce.raspberry_jam,
    DessertSauce.cloudberry_jam,
)
belgian_toppings = (DessertSauce.choclate_sauce,)


class WaffleOrder(Waffle):

    # Root validators check the entire model
    @root_validator(pre=False)
    def check_style_topping(cls, values):
        style, toppings = values.get("style"), values.get("toppings")

        # Check swedish style
        if style == WaffleStyle.swedish and all(
            t in swedish_toppings for t in toppings if type(t) is DessertSauce
        ):
            return values

        # Check belgian style
        if style == WaffleStyle.belgian and all(
            t in belgian_toppings for t in toppings if type(t) is DessertSauce
        ):
            return values

        # Doesn't match any of our allowed styles
        raise ValueError("The Waffle Bistro doesn't sell this waffle.")

    # A validator looking at a single property
    @validator("toppings")
    def check_cream(cls, toppings):
        creams = [t for t in toppings if type(t) is Cream]
        if len(creams) > 1:
            raise ValueError(f"One cream allowed, given: {creams}")
        return toppings

We now want to add some custom business logic specific to "The Waffle Bistro":

  • Jam for Swedish waffles 🇸🇪 🟠 🔴
  • Chocolate for Belgian waffles 🇧🇪 🍫
  • Either ice cream or whipped cream, not both 🍦 ⊕🥛

Now let's see what happens if we create some invalid waffle orders 🧇 ⚠️🚨

try:
    WaffleOrder(
        style="Swedish",
        toppings=["icecream", "whipped cream", "cloudberry jam"]
    )
except ValidationError as err:
    print(err)

2 validation errors for WaffleOrder
toppings
  One cream allowed, given: [<Cream.ice_cream: 'icecream'>, <Cream.whipped_cream: 'whipped cream'>] (type=value_error)
__root__
  'NoneType' object is not iterable (type=type_error)

try:
    WaffleOrder(
        style="Swedish",
        toppings=["icecream", "cloudberry jam", "chocolate sauce"]
    )
except ValidationError as err:
    print(err)

1 validation error for WaffleOrder
__root__
  The Waffle Bistro doesn't sell this waffle. (type=value_error)
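
The `'NoneType' object is not iterable` entry in the first output comes from the root validator itself: once the `toppings` validator has failed, `toppings` is missing from `values`, so `values.get("toppings")` returns `None`. One possible way to avoid this, sketched here under the assumption of pydantic 1.x, is to pass `skip_on_failure=True` so the root validator only runs when all fields validated. `ALLOWED_SAUCES` is a hypothetical helper, and the model is written standalone rather than as a `Waffle` subclass.

```python
from enum import Enum
from typing import Tuple, Union

from pydantic import BaseModel, ValidationError, root_validator, validator


class Cream(str, Enum):
    whipped_cream = "whipped cream"
    ice_cream = "icecream"


class DessertSauce(str, Enum):
    cloudberry_jam = "cloudberry jam"
    raspberry_jam = "raspberry jam"
    choclate_sauce = "chocolate sauce"  # identifier spelled as in the talk


class WaffleStyle(str, Enum):
    swedish = "Swedish"
    belgian = "Belgian"


# Hypothetical lookup table: which sauces each style may carry.
ALLOWED_SAUCES = {
    WaffleStyle.swedish: {DessertSauce.raspberry_jam, DessertSauce.cloudberry_jam},
    WaffleStyle.belgian: {DessertSauce.choclate_sauce},
}


class WaffleOrder(BaseModel):
    style: WaffleStyle
    toppings: Tuple[Union[DessertSauce, Cream], ...]

    @validator("toppings")
    def check_cream(cls, toppings):
        creams = [t for t in toppings if isinstance(t, Cream)]
        if len(creams) > 1:
            raise ValueError(f"One cream allowed, given: {creams}")
        return toppings

    # Skipped entirely when any field failed, so both keys are always present.
    @root_validator(skip_on_failure=True)
    def check_style_topping(cls, values):
        allowed = ALLOWED_SAUCES[values["style"]]
        sauces = [t for t in values["toppings"] if isinstance(t, DessertSauce)]
        if not all(sauce in allowed for sauce in sauces):
            raise ValueError("The Waffle Bistro doesn't sell this waffle.")
        return values


try:
    WaffleOrder(
        style="Swedish",
        toppings=["icecream", "whipped cream", "cloudberry jam"],
    )
except ValidationError as err:
    errors = err.errors()

print(len(errors))  # 1
```

With this change only the cream error is reported for the first order, and the style check still rejects a Swedish waffle with chocolate sauce.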

Pydantic

Validators, functions(…)


 

Gosh, these runtime type checkers are rather useful, but what about functions?

Pydantic has you covered with @validate_arguments.

Still in beta and the API may change; released 2020-04-18 in version 1.5.

from pydantic import validate_arguments

# Validator on function
# Ensure valid waffles when making orders
@validate_arguments
def make_order(waffle: WaffleOrder):
    ...

try:
    make_order({
        "style": "Breakfast",
        "toppings": ("whipped cream", "raspberry jam")
    })
except ValidationError as err:
    print(err)

2 validation errors for MakeOrder
waffle -> style
  value is not a valid enumeration member; permitted: 'Swedish', 'Belgian' (type=type_error.enum; enum_values=[<WaffleStyle.swedish: 'Swedish'>, <WaffleStyle.belgian: 'Belgian'>])
waffle -> __root__
  The Waffle Bistro doesn't sell this waffle. (type=value_error)
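
For completeness, here is the happy path: a valid dictionary is converted into a model instance before the function body runs. This is a small sketch assuming the pydantic 1.x `validate_arguments` decorator, with a trimmed `Waffle` model (string toppings, no custom validators) so it runs on its own.

```python
from enum import Enum
from typing import Tuple

from pydantic import BaseModel, validate_arguments


class WaffleStyle(str, Enum):
    swedish = "Swedish"
    belgian = "Belgian"


class Waffle(BaseModel):
    style: WaffleStyle
    toppings: Tuple[str, ...] = ()


@validate_arguments
def make_order(waffle: Waffle) -> str:
    # Inside the function, waffle is a validated Waffle instance, not a dict.
    return f"{waffle.style.value} waffle with {len(waffle.toppings)} topping(s)"


receipt = make_order({"style": "Belgian", "toppings": ["chocolate sauce"]})
print(receipt)  # Belgian waffle with 1 topping(s)
```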

Framework integration

Pydantic-driven APIs: a puzzle piece that seems to fit everywhere

  • Automatic OpenAPI-specs
  • Request/response validation
  • Automatic testing

FastAPI

Sufficiently advanced technology is indistinguishable from magic

  • Lean micro framework similar to Flask
  • Automatic OpenAPI-specs
  • Tight integration with pydantic
  • Async ASGI

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

def make_order(waffle: WaffleOrder):
    # Business logic for making an order
    pass

def dispatch_order(waffle: WaffleOrder):
    # Hand over waffle to customer
    pass

# Deliver a waffle
@app.post("/delivery/waffle")
async def deliver_waffle_order(waffle: WaffleOrder):
    dispatch = dispatch_order(waffle)
    return dispatch

@app.post("/order/waffle")
async def order_waffle(waffle: WaffleOrder):
    order = make_order(waffle)
    return order

This is everything we need to create a small API around our models.
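
What the framework does with the type annotation can be approximated in a few lines of pydantic alone: parse the request body into the model, and turn a `ValidationError` into an error response (FastAPI answers with status 422 in that case). The sketch below is only an approximation under the pydantic 1.x API; `handle_order` is a hypothetical stand-in, not FastAPI code, and the model is trimmed to string toppings.

```python
import json
from enum import Enum
from typing import Tuple

from pydantic import BaseModel, ValidationError


class WaffleStyle(str, Enum):
    swedish = "Swedish"
    belgian = "Belgian"


class Waffle(BaseModel):
    style: WaffleStyle
    toppings: Tuple[str, ...]


def handle_order(body: str):
    """Hypothetical stand-in for a framework's request handling."""
    try:
        waffle = Waffle.parse_raw(body)
    except ValidationError as err:
        # err.json() is already a JSON document describing each failure.
        return 422, json.loads(err.json())
    return 200, json.loads(waffle.json())


ok_status, ok_body = handle_order('{"style": "Swedish", "toppings": ["raspberry jam"]}')
bad_status, bad_body = handle_order('{"style": "Breakfast", "toppings": []}')
print(ok_status, bad_status)  # 200 422
```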

That's the beginning

But there is more…

 

That's it, a quick introduction to pydantic!


But this is just the tip of the iceberg 🗻 and I want to give you a hint about what more can be done.  


I'm not going to go into detail on any of this, but feel free to ask me about it in the chat, on Twitter/LinkedIn or via email 💬📨

And more!

Cool features worth mentioning

 

  • Post 1.0, reached this milestone in 2019

  • Support for standard library types
  • Offers useful extra types for everyday use
    • Email
    • HttpUrl (and more, stricturl for custom validation)
    • PostgresDsn
    • IPvAnyAddress (as well as IPv4Address and IPv6Address from ipaddress)
    • PositiveInt
    • PaymentCardNumber (checks Luhn, string of digits and BIN-based length)
    • PaymentCardBrand.[amex, mastercard, visa, other]
    • Constrained types (subtypes or conlist, conint, etc.)
    • Strict types, no coercion
    • and more…

Conclusion

  • Pure python syntax
  • Better validation
  • Very useful JSON-tools for APIs
  • Easy to migrate from dataclasses
  • Lots of useful features
  • More things coming
    • Very active development
    • Working on strict mode
  • Try it out!

Questions

Contact me if you have any further questions.

 

 

Want to learn more?

Available for training, workshops and freelance consulting.

 

Sign up for Hypothesis course

Papero.io/beta

Links

PyCon US 2021: Intro to Pydantic – Run-Time Type Checking For Your Dataclasses

By Alexander Hultnér

A talk about the fantastic pydantic library and how it makes your dataclasses better!
