Handling Relationships in RESTful APIs
Django Edition
Me
- Grew up in Connecticut
- PhD in Math at University of Connecticut
- Have been in Salt Lake City working on Django-based APIs since 2013
- Currently Director of Engineering at Teem.com

So you want to write an App
Goal:
- Fast, modern, web application
- You are thinking API driven Single Page App
Modern web options:
- REST or GraphQL?
- Python, Node, Golang, or Java?
- Angular, React, Ember, etc, etc
My defaults:
- Python, Django and Django-Rest-Framework, Angular or React
- If you are able to use Python 3, check out apistar and pydantic
This talk focuses on the API design and making the App fast.
REST is Great
- Easy to understand
- Easy to explore
- Easy to expand
- Scales well
BUT ...
REST is not so Great
Get many resources in a single request
GraphQL queries access not just the properties of one resource but also smoothly follow references between them. While typical REST APIs require loading from multiple URLs, GraphQL APIs get all the data your app needs in a single request. Apps using GraphQL can be quick even on slow mobile network connections.
- Round Trip and Repeat Trip Times
- Over/Under Fetching
Spoiler: we don't need to give up REST to resolve these issues
Quick Blog Example
// /api/v1/authors
{
  "id": 1,
  "created_at": "2017-01-01T00:00:00Z",
  "updated_at": "2017-03-11T13:04:00Z",
  "displayName": "Jane Smith",
  "email": "jane.smith@example.com",
  "image_url": "https://example.com/jane.smith.jpg",
  "recent_posts": [
    90,
    27,
    23
  ]
}
// /api/v1/posts
{
  "id": 90,
  "title": "Fullstack doesn't scale!",
  "content": "lorem ipsum ...",
  "created_at": "2017-05-01T00:00:00Z",
  "updated_at": "2017-05-01T00:00:00Z",
  "published_at": "2017-05-02T00:00:00Z",
  "author_id": 1
}
Results in multiple requests!
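To make the cost concrete, here is a minimal sketch that counts requests when a client has to follow ids one at a time. `FakeApi`, `load_author_page`, and the in-memory data are hypothetical stand-ins for the two endpoints above, not a real client; the title of post 23 is a placeholder.

```python
# Hypothetical in-memory stand-in for /api/v1/authors and /api/v1/posts.
AUTHORS = {1: {"id": 1, "displayName": "Jane Smith", "recent_posts": [90, 27, 23]}}
POSTS = {
    90: {"id": 90, "title": "Fullstack doesn't scale!", "author_id": 1},
    27: {"id": 27, "title": "Best micro-brews in LA", "author_id": 1},
    23: {"id": 23, "title": "Untitled", "author_id": 1},  # placeholder title
}


class FakeApi:
    """Counts how many 'HTTP requests' a client makes."""

    def __init__(self):
        self.request_count = 0

    def get_author(self, author_id):
        self.request_count += 1
        return AUTHORS[author_id]

    def get_post(self, post_id):
        self.request_count += 1
        return POSTS[post_id]


def load_author_page(api, author_id):
    """Fetch an author, then each recent post: 1 + N round trips."""
    author = api.get_author(author_id)
    posts = [api.get_post(post_id) for post_id in author["recent_posts"]]
    return author, posts


api = FakeApi()
author, posts = load_author_page(api, 1)
print(api.request_count)  # 1 author request + 3 post requests
```

The request count grows with the number of related ids, which is the round-trip problem described earlier.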
Quick Blog Example 2
{
  "id": 1,
  "created_at": "2017-01-01T00:00:00Z",
  "updated_at": "2017-03-11T13:04:00Z",
  "displayName": "Jane Smith",
  "email": "jane.smith@example.com",
  "image_url": "https://example.com/jane.smith.jpg",
  "recent_posts": [
    {
      "id": 90,
      "title": "Fullstack doesn't scale!",
      "content": "lorem ipsum ...",
      "created_at": "2017-05-01T00:00:00Z",
      "updated_at": "2017-05-01T00:00:00Z",
      "published_at": "2017-05-02T00:00:00Z",
      "author_id": 1
    },
    {
      "id": 27,
      "title": "Best micro-brews in LA",
      "content": "lorem ipsum ...",
      "created_at": "2017-03-11T00:00:00Z",
      "updated_at": "2017-03-11T00:00:00Z",
      "published_at": "2017-03-13T00:00:00Z",
      "author_id": 1
    }
  ]
} // /api/v1/authors
Quick Blog Example 3
[
  {
    "id": 90,
    "title": "Fullstack doesn't scale!",
    "content": "lorem ipsum ...",
    "created_at": "2017-05-01T00:00:00Z",
    "updated_at": "2017-05-01T00:00:00Z",
    "published_at": "2017-05-02T00:00:00Z",
    "author": {
      "id": 1,
      "created_at": "2017-01-01T00:00:00Z",
      "updated_at": "2017-03-11T13:04:00Z",
      "displayName": "Jane Smith",
      "email": "jane.smith@example.com",
      "image_url": "https://example.com/jane.smith.jpg"
    }
  },
  {
    "id": 27,
    "title": "Best micro-brews in LA",
    "content": "lorem ipsum ...",
    "created_at": "2017-03-11T00:00:00Z",
    "updated_at": "2017-03-11T00:00:00Z",
    "published_at": "2017-03-13T00:00:00Z",
    "author": {
      "id": 1,
      "created_at": "2017-01-01T00:00:00Z",
      "updated_at": "2017-03-11T13:04:00Z",
      "displayName": "Jane Smith",
      "email": "jane.smith@example.com",
      "image_url": "https://example.com/jane.smith.jpg"
    }
  }
] // /api/v1/posts?author_id=1
A lot of repeated data!
It Doesn't Have to be this Way
EmberJS recommends a solution called sideloading.
https://guides.emberjs.com/v1.10.0/models/the-rest-adapter/#toc_sideloaded-relationships
Sideloading attempts to partially resolve these issues in REST
- Round trip and repeat trip times
  - You can request the details of related objects to reduce trips to the API
- Over/Under fetching
  - You can request related objects to avoid under fetching
  - You only get each related object's details once per response, partially avoiding over fetching
  - You really need sparse fieldsets to solve this completely, see JSON:API
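As a rough illustration of the idea, the sketch below turns a nested list of posts into a sideloaded payload in which each author appears only once. The function name `sideload` and its parameters are assumptions made for this example, and the records are trimmed to a couple of fields.

```python
def sideload(posts, relation="author", plural="authors"):
    """Split nested posts into flat posts plus a de-duplicated related list."""
    flat_posts = []
    related_by_id = {}
    for post in posts:
        # Copy every field except the nested related object.
        flat = {key: value for key, value in post.items() if key != relation}
        related = post[relation]
        # Keep only the id on the post itself.
        flat[relation + "_id"] = related["id"]
        # Store each related object once, keyed by id.
        related_by_id.setdefault(related["id"], related)
        flat_posts.append(flat)
    return {"posts": flat_posts, plural: list(related_by_id.values())}


jane = {"id": 1, "displayName": "Jane Smith"}
nested = [
    {"id": 90, "title": "Fullstack doesn't scale!", "author": jane},
    {"id": 27, "title": "Best micro-brews in LA", "author": jane},
]
payload = sideload(nested)
```

Here `payload["authors"]` holds a single entry for Jane, while both posts carry only `author_id`.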
Blog Revisited
{
  "posts": [
    {
      "id": 90,
      "title": "Fullstack doesn't scale!",
      "content": "lorem ipsum ...",
      "created_at": "2017-05-01T00:00:00Z",
      "updated_at": "2017-05-01T00:00:00Z",
      "published_at": "2017-05-02T00:00:00Z",
      "author_id": 1
    },
    {
      "id": 27,
      "title": "Best micro-brews in LA",
      "content": "lorem ipsum ...",
      "created_at": "2017-03-11T00:00:00Z",
      "updated_at": "2017-03-11T00:00:00Z",
      "published_at": "2017-03-13T00:00:00Z",
      "author_id": 1
    }
  ],
  "authors": [
    {
      "id": 1,
      "created_at": "2017-01-01T00:00:00Z",
      "updated_at": "2017-03-11T13:04:00Z",
      "displayName": "Jane Smith",
      "email": "jane.smith@example.com",
      "image_url": "https://example.com/jane.smith.jpg"
    }
  ]
} // /api/v1/posts?author_id=1&include[]=authors
The only repeated data are ids
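On the client side, a sideloaded response is typically stitched back together by indexing the related list by id. The following is a minimal sketch of that step; `attach_authors` is a hypothetical helper and the payload is a trimmed version of the response above.

```python
def attach_authors(payload):
    """Return posts with an 'author' key looked up from the sideloaded list."""
    authors_by_id = {author["id"]: author for author in payload["authors"]}
    return [
        # dict(post, author=...) copies the post and adds the resolved author.
        dict(post, author=authors_by_id.get(post["author_id"]))
        for post in payload["posts"]
    ]


payload = {
    "posts": [
        {"id": 90, "title": "Fullstack doesn't scale!", "author_id": 1},
        {"id": 27, "title": "Best micro-brews in LA", "author_id": 1},
    ],
    "authors": [{"id": 1, "displayName": "Jane Smith"}],
}
posts = attach_authors(payload)
```

Both posts end up pointing at the same author object, so nothing is duplicated on the wire or in memory.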
Sideloading at Teem
At Teem we have two models that we use constantly and we want to load some or all of the related objects:
- Users
  - groups, organizations
- Rooms
  - floors, buildings, campuses, calendars, room_resources
Sideloading at Teem
- We currently support sideloading for all new (v4+) APIs
- Most important reason for doing this:
It minimizes the number of API calls during a hard refresh of the Single-Page-App
Sideloading in DRF v1
class RoomViewSet(viewsets.ModelViewSet):
    serializer_class = serializers.RoomSerializer
    queryset = models.Room.objects.all()
    resource_name = 'room'
    resource_name_plural = 'rooms'

    def list(self, request, **kwargs):
        page = self.paginate_queryset(
            self.filter_queryset(
                self.get_queryset()
            )
        )
        # base response will always contain the resource and the meta
        response = {
            self.resource_name_plural: self.get_serializer(
                page, many=True).data,
        }
        response.update(self.get_sideload_data(request, page))
        return Response(response)

    def get_sideload_data(self, request, rooms):
        if isinstance(rooms, models.Room):
            rooms = [rooms]
        data = {}
        sideload_calendars = self.has_sideload_field('calendars')
        sideload_licenses = self.has_sideload_field('licenses')
        sideload_campuses = self.has_sideload_field('campuses')
        sideload_buildings = self.has_sideload_field('buildings')
        sideload_floors = self.has_sideload_field('floors')
        sideload_room_resources = self.has_sideload_field('room_resources')
        sideload_room_resource_categories = self.has_sideload_field(
            'room_resource_categories')
        sideload_cloud_files = self.has_sideload_field('cloud_files')
        room_image_files = self.has_sideload_field('room_images')
        serializer_context = self.get_serializer_context()
        if sideload_licenses or sideload_campuses or sideload_buildings or \
                sideload_floors or sideload_cloud_files or room_image_files:
            licenses = set()
            floor_ids = set()
            cloud_files = []
            room_images = set()
            for room in rooms:
                for l in room.licenses:
                    licenses.add(l)
                if sideload_floors and room.floor_id:
                    floor_ids.add(room.floor_id)
                if sideload_cloud_files:
                    cloud_files += list(room.cloud_files.all())
                if room_image_files and room.room_image_id:
                    room_images.add(room.room_image)
            if sideload_licenses:
                data['licenses'] = LicenseSerializer(
                    instance=list(licenses),
                    context=serializer_context,
                    many=True).data
            if sideload_floors or sideload_campuses or sideload_buildings:
                floor_ids = list(floor_ids)
                floors = models.Floor.objects.filter(
                    or_item_query([r.floor_id for r in rooms]))
                if sideload_floors:
                    data['floors'] = serializers.FloorSerializer(
                        instance=floors,
                        context=serializer_context,
                        many=True).data
                if sideload_buildings or sideload_campuses:
                    buildings = models.Building.objects.filter(
                        or_item_query([f.building_id for f in floors]))
                    if sideload_buildings:
                        data['buildings'] = serializers.BuildingSerializer(
                            instance=buildings,
                            many=True,
                            context=serializer_context).data
                    if sideload_campuses:
                        campuses = models.Campus.objects.filter(
                            or_item_query([b.campus_id for b in buildings]))
                        data['campuses'] = serializers.CampusSerializer(
                            instance=campuses,
                            many=True,
                            context=serializer_context).data
        if sideload_room_resources:
            # note that Django has already selected the room_resources for us
            # because of the `prefetch_related` in the `get_queryset`
            room_resources = set()
            for r in rooms:
                for x in r.room_resource.all():
                    room_resources.add(x)
            data['room_resources'] = serializers.RoomResourceSerializer(
                instance=list(room_resources),
                context=serializer_context,
                many=True).data
        if sideload_room_resource_categories:
            room_resource_categories = \
                models.RoomResourceCategory.objects.filter(
                    roomresource__room__pk__in=[
                        r.id for r in rooms]).distinct()
            data['room_resource_categories'] = serializers.\
                RoomResourceCategorySerializer(
                    instance=room_resource_categories,
                    context=serializer_context,
                    many=True).data
        if sideload_cloud_files:
            cloud_files = list(set(cloud_files))
            data['cloud_files'] = CloudFileSerializer(
                instance=cloud_files,
                context=serializer_context,
                many=True).data
        if room_image_files:
            data['room_images'] = CloudFileSerializer(
                instance=list(room_images),
                context=serializer_context,
                many=True).data
        if sideload_calendars:
            # note that Django has already selected the calendars for us
            # because of the `select_related` in the `get_queryset`
            data['calendars'] = CalendarSerializer(
                instance=[r.calendar for r in rooms if r.calendar_id],
                many=True,
                context=serializer_context,
            ).data
        return data
* Code has been modified from its original version. It has been formatted to fit this screen
Sideloading in DRF v2
class UserAPIViewset(SideloadViewSet):
    queryset = models.User.objects.all()
    serializer_class = serializers.UserSerializer
    filter_backends = (
        core_filters.IdFilter,
        core_filters.BooleanFieldFilterFactory('is_active'),
        core_filters.BooleanFieldFilterFactory('is_admin', 'is_ebadmin'),
        core_filters.DateTimeFilterFactory('created_at'),
        core_filters.DateTimeFilterFactory('updated_at'),
        account_filters.GroupIdFilter,
    )
    resource_name = 'user'
    resource_name_plural = 'users'
    sideload_relations = {
        'organizations': {
            'serializer': serializers.CompanyInfoSerializer,
            'field': 'company_id'
        },
        'groups': {
            'serializer': serializers.GroupSerializer,
            'manager': True,
            'field': 'ebgroups'
        },
        'calendars': {
            'serializer': 'calendars.drf.v4.serializers.CalendarSerializer',
            'manager': True,
            'field': 'calendar_set',
        }
    }
Sideloading Implementation
class SideloadViewSet(viewsets.ModelViewSet):
    def get_sideload_data(self, request, resources):
        resources = [resources] if isinstance(resources, Model) else resources
        extra_response, context = {}, self.get_serializer_context()
        for field in self.sideload_fields_to_show(request):
            f = self.sideload_relations.get(field)
            if f is None:
                continue
            serializer = f['serializer']
            if f.get('manager', False):
                # The related objects are ManyToMany or a reverse ForeignKey
                field_obj_ids = []
                for x in resources:
                    field_obj_ids.extend(self.get_related_ids(x, f['field']))
            else:
                # The related object is a ForeignKey, a OneToOne, or a property
                field_obj_ids = [getattr(x, f['field']) for x in resources]
            # serialize the data
            if f.get('include_archived', False):
                qs = serializer.Meta.model.all_objects.filter(id__in=field_obj_ids)
            else:
                qs = serializer.Meta.model.objects.filter(id__in=field_obj_ids)
            extra_response[field] = serializer(qs, context=context, many=True).data
        return extra_response
Where all the magic happens
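The slide calls `self.sideload_fields_to_show(request)` without showing it. One plausible shape for it, written here as a standalone function over a raw query string so it runs without Django, is sketched below. The signature and the `allowed` parameter are assumptions; inside a DRF view the same list could presumably be read with `request.query_params.getlist('include[]')`.

```python
from urllib.parse import parse_qs


def sideload_fields_to_show(query_string, allowed):
    """Return the requested `include[]` names that are configured, in order."""
    requested = parse_qs(query_string).get("include[]", [])
    seen = set()
    fields = []
    for name in requested:
        # Ignore unknown relations and repeated names.
        if name in allowed and name not in seen:
            seen.add(name)
            fields.append(name)
    return fields


fields = sideload_fields_to_show(
    "author_id=1&include[]=authors&include[]=bogus&include[]=authors",
    allowed={"authors", "groups"},
)
```

Filtering against the configured relations keeps clients from requesting arbitrary attributes of the model.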
Sideloading Implementation
class SideloadViewSet(viewsets.ModelViewSet):
    resource_name = None
    resource_name_plural = None
    sideload_relations = {}

    def __init__(self, *args, **kwargs):
        """Lazy load the sideload relation serializers."""
        super(SideloadViewSet, self).__init__(*args, **kwargs)
        self.validate_resource_name()
        self.init_serializers()

    def init_serializers(self):
        """Initializes special serializers, like the ones that sideload data."""
        for field in self.sideload_relations:
            self.sideload_relations[field]['serializer'] = \
                self.get_sideload_serializer(field)

    def validate_resource_name(self):
        """Validates that `resource_name` and `resource_name_plural` are set correctly."""
        if self.resource_name is None:
            raise self.ResourceNameException(
                'You must set `resource_name` on the viewset.')
        if self.resource_name_plural is None:
            raise self.ResourceNameException(
                'You must set `resource_name_plural` on the viewset.')
Avoid circular import issues
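The idea behind allowing a dotted string for `serializer` is that the import happens when the viewset is instantiated rather than when the module is loaded, which sidesteps circular imports. A minimal standalone sketch of resolving such a path follows; `import_from_path` is a hypothetical helper, and a standard-library class stands in for a serializer. Django also ships `django.utils.module_loading.import_string`, which does much the same job.

```python
import importlib


def import_from_path(dotted_path):
    """Resolve 'package.module.Name' to the object it names."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ImportError("{!r} is not a dotted path".format(dotted_path))
    module = importlib.import_module(module_path)
    try:
        return getattr(module, attr_name)
    except AttributeError:
        # Fail loudly instead of silently returning None.
        raise ImportError(
            "Module {!r} has no attribute {!r}".format(module_path, attr_name)
        )


# A standard-library class stands in for a serializer here.
ordered_dict_cls = import_from_path("collections.OrderedDict")
```

Raising on a missing attribute is a deliberate choice in this sketch: a misspelled class name surfaces at startup rather than as a `None` serializer later.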
Sideloading Implementation
# avoid circular imports etc
import importlib

from rest_framework import viewsets
from rest_framework.serializers import SerializerMetaclass


class SideloadViewSet(viewsets.ModelViewSet):
    def get_sideload_serializer(self, field):
        fqp = self.sideload_relations.get(field, {}).get('serializer', '')
        # it is already a serializer, return now
        if isinstance(fqp, SerializerMetaclass):
            return fqp
        if not isinstance(fqp, str):
            raise self.InvalidSideloadSerializer(
                'Invalid serializer for {}'.format(field))
        # split "package.module.ClassName" into module path and class name
        app, serializer = fqp.rsplit('.', 1)
        try:
            serializer = importlib.import_module(app).__dict__.get(serializer)
        except ImportError:
            raise self.NotImportableSerializer(
                'Model path {} is not importable for'
                ' sideload_relation {}'.format(fqp, field)
            )
        return serializer


class UserAPIViewset(SideloadViewSet):
    sideload_relations = {
        'calendars': {
            'serializer': 'calendars.drf.v4.serializers.CalendarSerializer',
            'manager': True,
            'field': 'calendar_set',
        }
    }
Sideloading Implementation
# avoid circular imports etc
class SideloadViewSet(viewsets.ModelViewSet):
    def init_serializers(self):
        """Initializes special serializers, like the ones that sideload data."""
        for field in self.sideload_relations:
            self.sideload_relations[field]['serializer'] = \
                self.get_sideload_serializer(field)

    def get_sideload_serializer(self, field):
        """
        Handle importing the related serializers.
        """
        fqp = self.sideload_relations.get(field, {}).get('serializer', '')
        # it is already a serializer, return now
        if isinstance(fqp, SerializerMetaclass):
            return fqp
        if not isinstance(fqp, str):
            raise self.InvalidSideloadSerializer(
                'Invalid serializer for {}'.format(field))
        app, serializer = fqp.rsplit('.', 1)
        try:
            serializer = importlib.import_module(app).__dict__.get(serializer)
        except ImportError:
            raise self.NotImportableSerializer(
                'Model path {} is not importable for'
                ' sideload_relation {}'.format(fqp, field)
            )
        return serializer
Wins
- This implementation significantly reduces boilerplate code
- Easy for any backend dev to use
- Time to build an API is almost completely determined by the effort to build the serializers (this is generally fast)
- Seems to cover all of our needs; we haven't had to extend it in a while
Things to Consider/Improve
- Building the extra sideload data does require a nested loop.
- Fetching the related model ids for M2M or reverse relationships can result in multiple db queries. How can we do this in a single query?
- How does this work with relationships crossing services?
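One possible answer to the single-query question is to read the related ids straight from the join table with one `IN` query instead of one query per resource. The sketch below shows the idea with the standard-library `sqlite3` module; the `user_groups` table and its columns are hypothetical. In Django, a comparable query could probably be issued against the relation's auto-created `through` model, assuming the field is a `ManyToManyField`.

```python
import sqlite3

# Hypothetical join table linking users to groups.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_groups (user_id INTEGER, group_id INTEGER);
    INSERT INTO user_groups VALUES (1, 10), (1, 11), (2, 11), (3, 12);
""")


def related_ids(conn, user_ids):
    """Fetch distinct related ids for many users with a single query."""
    user_ids = list(user_ids)
    if not user_ids:
        return []
    # One "?" placeholder per id keeps the query parameterized.
    placeholders = ",".join("?" for _ in user_ids)
    rows = conn.execute(
        "SELECT DISTINCT group_id FROM user_groups "
        "WHERE user_id IN ({}) ORDER BY group_id".format(placeholders),
        user_ids,
    )
    return [row[0] for row in rows]


group_ids = related_ids(conn, [1, 2])
```

This replaces the per-resource loop with one round trip to the database, at the cost of needing to know the join table for each relation.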
if f.get('manager', False):
    # The related objects are ManyToMany or a reverse ForeignKey
    field_obj_ids = []
    for x in resources:
        field_obj_ids.extend(self.get_related_ids(x, f['field']))

for field in self.sideload_fields_to_show(request):
    field_obj_ids = [getattr(x, f['field']) for x in resources]
Thanks
Lucas Roesler (lucasroesler.com)