GraphQL

GraphQL: A graph oriented way to think about and explore data

[without the data actually having to be in
~~a graph database~~ any particular structure,
database or format at all]

Let's look at an example first...

After all that hard work building a clean REST API, it's now turning into 🍝

....real fast

Sound familiar?

It's not your fault

Building RESTful APIs is a great idea, but building applications on top of them... Well... it's often no fun 😢

No matter what, you'll end up with one of these:

1. frontend view specific endpoints in your API

(frontend requirements leaking into the API code)

2. huge payloads with loads of data your view doesn't need

(overfetching)

3. many API requests per view

(latency and potentially a soft DDoS by your own clients/frontends)

Enter GraphQL

Promise

GraphQL lets you define what the data looks like, how it's connected and how to resolve it.

GraphQL is not

a database
bound to a specific programming language or architecture
a fixed and strict set of rules of how your data must be structured

GraphQL is...

Strictly typed definitions

Used for validation of data from the users

and

for validating data returned by the resolvers
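As a rough illustration of what strict typing buys you, here is a minimal sketch of checking a value against a type definition. It does not use a GraphQL library; `USER_TYPE` and `validate` are hypothetical names, and the same kind of check would apply both to user input and to what a resolver returns.

```python
# Hypothetical, simplified type definition: field name -> (Python type, nullable).
USER_TYPE = {
    "id": (str, False),
    "name": (str, False),
    "age": (int, True),
}


def validate(value, type_def):
    """Return a list of validation errors; an empty list means the value conforms."""
    errors = []
    for field, (expected, nullable) in type_def.items():
        item = value.get(field)
        if item is None:
            if not nullable:
                errors.append(f"{field}: non-nullable field is missing or null")
        elif not isinstance(item, expected):
            errors.append(
                f"{field}: expected {expected.__name__}, got {type(item).__name__}"
            )
    # Fields that are not part of the type definition are rejected as well.
    for field in sorted(value.keys() - type_def.keys()):
        errors.append(f"{field}: not defined on the type")
    return errors
```

A real GraphQL server derives these checks from the schema, so you normally never write them by hand.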

Resolvers

Functions that fetch information for a field in the graph based on the parent object, arguments and context.
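To make the resolver idea concrete, here is a minimal sketch in plain Python with no GraphQL library. The `USERS` data, the `RESOLVERS` table and `resolve_field` are all assumptions made for illustration; the point is only the `(parent, args, context)` shape.

```python
# Hypothetical in-memory data standing in for any backend.
USERS = {
    "1": {"id": "1", "name": "Ada", "friend_ids": ["2"]},
    "2": {"id": "2", "name": "Linus", "friend_ids": []},
}


def resolve_user(parent, args, context):
    # Root field: there is no parent, so the lookup is driven by the arguments.
    return context["users"].get(args["id"])


def resolve_friends(parent, args, context):
    # Nested field: the parent is the user object resolved one level up.
    return [context["users"][friend_id] for friend_id in parent["friend_ids"]]


RESOLVERS = {
    ("Query", "user"): resolve_user,
    ("User", "friends"): resolve_friends,
}


def resolve_field(type_name, field, parent, args, context):
    resolver = RESOLVERS.get((type_name, field))
    if resolver is None:
        # Default behaviour: read the field straight off the parent object.
        return parent[field]
    return resolver(parent, args, context)


context = {"users": USERS}
user = resolve_field("Query", "user", None, {"id": "1"}, context)
friends = resolve_field("User", "friends", user, {}, context)
names = [resolve_field("User", "name", friend, {}, context) for friend in friends]
```

Each resolver only knows how to fetch its own field, which is why the data behind the graph can live anywhere.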

A typical setup of a single graph

Easy to start out, but hard to work with in a larger company since the single codebase has to be concerned with a lot of different things

A distributed graph

using Apollo Federation
(the successor to schema stitching)

...more work when getting started, but each part of the graph needs to know less about the whole picture. The advantages are better separation of concerns and, typically, much easier testing.

10000m overview

Gateway is responsible for exposing the composed schema + authentication
Subgraphs care about their own domain only and trust the user data they get in headers from the gateway

Gateway is in DMZ. Subgraphs on internal network.
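Since subgraphs trust whatever user data arrives in headers, the gateway has to make sure a client cannot supply those headers itself. A minimal sketch of that step, assuming a hypothetical header name `x-user-id` (the actual header names are not given here):

```python
TRUSTED_USER_HEADER = "x-user-id"  # hypothetical header name


def build_subgraph_headers(client_headers, session_user_id):
    """Build the headers the gateway sends on to a subgraph."""
    # Drop anything the client sent under the trusted name so it cannot be spoofed.
    headers = {
        name: value
        for name, value in client_headers.items()
        if name.lower() != TRUSTED_USER_HEADER
    }
    # Only the gateway, after checking the session, sets the trusted header.
    if session_user_id is not None:
        headers[TRUSTED_USER_HEADER] = session_user_id
    return headers
```

This only works because the subgraphs sit on the internal network and are reachable solely through the gateway.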

API request response cycle

(login)

Unauthenticated user goes to the application
Application queries GraphQL (POST request) for information about the current user to check if logged in
Since the me query is declared by the auth subgraph, the gateway forwards the request to the auth subgraph
The auth subgraph finds no user data in headers and responds to the gateway with a null value for the user data
GraphQL server responds with a null value (since the user isn't logged in)
User clicks the login-button and goes to https://graphql.remin.no/auth/login
Keycloak handles login and redirects back to gateway
Gateway sets a session cookie and redirects back to the application
Application queries GraphQL (POST request) for information about the current user to check if logged in
GraphQL server queries the subgraph like earlier, but this time it responds with user data (since the user is now logged in)
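The cycle above can be sketched as a small simulation. Everything here is a stand-in: `SESSIONS`, the header names and the function names are hypothetical, and the identity provider redirect is reduced to a single function call.

```python
import secrets

SESSIONS = {}  # session cookie value -> user data (hypothetical session store)


def auth_subgraph_me(headers):
    """The auth subgraph resolves `me` purely from the headers it receives."""
    user_id = headers.get("x-user-id")
    if user_id is None:
        return None
    return {"id": user_id, "name": headers.get("x-user-name")}


def gateway_me(cookies):
    """The gateway maps the session cookie to user headers and asks the subgraph."""
    session = SESSIONS.get(cookies.get("session"))
    headers = {}
    if session is not None:
        headers = {"x-user-id": session["id"], "x-user-name": session["name"]}
    return {"data": {"me": auth_subgraph_me(headers)}}


def complete_login(user):
    """Stand-in for the redirect back from the login provider to the gateway."""
    cookie = secrets.token_hex(16)
    SESSIONS[cookie] = user
    return cookie


before_login = gateway_me({})
cookie = complete_login({"id": "42", "name": "Ada"})
after_login = gateway_me({"session": cookie})
```

The first call returns a null `me`, the second returns the user, and the subgraph never sees the session cookie at all.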

Request process flow

100% 🌈 and 🦄?

The good parts

No more front ↔ back end data dependencies
Easier to avoid over/under fetching
Clean abstraction of data backends
You can model a meaningful interface to your data, regardless of the underlying structures and systems
Unopinionated. Doesn't care how and where you get your data
Mature and with a good ecosystem (thanks Apollo)

The ~~bad~~ tricky parts

Graphs with expensive resolver functions that are open to unauthenticated users can be a bit hard to protect against abuse
Designing the data model is challenging – once a field is exposed in the schema, it shouldn't ever be removed
A distributed graph has more complexity than a simple REST API and the startup cost is higher.
If you're lacking a good way of looking at the logs of all apps at once, debugging a distributed infra can be tiresome.

Securing your graph

You don't want abuse. There are a few ways to guard against it.

The poor man's version (cost limiting)
Apollo operations registry

The poor man's version

Set a fixed "cost" allowed for a single query
Specify cost on the different resolvers
Return an error if the cost is higher than allowed
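The three steps above can be sketched as follows. This is a simplified illustration, not any library's implementation: the query is represented as a nested dict instead of a parsed GraphQL document, and the costs in `FIELD_COSTS` are made up.

```python
MAX_COST = 100  # fixed cost allowed for a single query (made-up value)
FIELD_COSTS = {"user": 1, "friends": 10, "posts": 20}  # made-up per-field costs
DEFAULT_COST = 1


class QueryTooExpensive(Exception):
    pass


def query_cost(selection):
    """Sum field costs over a selection: field name -> nested selection dict."""
    total = 0
    for field, children in selection.items():
        total += FIELD_COSTS.get(field, DEFAULT_COST) + query_cost(children)
    return total


def check_cost(selection, max_cost=MAX_COST):
    """Return the cost, or raise before any resolver has been run."""
    cost = query_cost(selection)
    if cost > max_cost:
        raise QueryTooExpensive(f"query cost {cost} exceeds the allowed {max_cost}")
    return cost


cheap = {"user": {"name": {}}}
expensive = {"user": {"friends": {"posts": {"title": {}}}}}
```

This sketch simply adds costs together; a stricter variant would multiply nested costs by the expected list size so that deeply nested list fields get expensive quickly.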

Apollo operations registry

Upload the allowed queries to the registry at build time, and allow only queries that have been uploaded to the registry to be performed.

Note: requires an active team subscription for Apollo
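The general idea behind a registry like this is an allowlist of known operations. The sketch below illustrates that idea only; it is not Apollo's API, and `OperationAllowlist` and `operation_id` are hypothetical names.

```python
import hashlib


def operation_id(query):
    """Hash a query after collapsing whitespace, so formatting does not matter."""
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class OperationAllowlist:
    def __init__(self):
        self._ids = set()

    def register(self, query):
        # Done at build time for every query the frontends actually use.
        self._ids.add(operation_id(query))

    def is_allowed(self, query):
        # Done at request time, before the query is executed.
        return operation_id(query) in self._ids


allowlist = OperationAllowlist()
allowlist.register("query Me { me { id name } }")
```

Anything a client sends that was not registered at build time is rejected, which removes the need to estimate the cost of arbitrary queries.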

Limitations

With a federated graph you really want to avoid directives (field decorators) for the time being (not fully supported with Apollo federation ATM).

They work, but the directive implementation isn't shared across subgraphs and has to be implemented in every subgraph. Impractical and error-prone.
Take care to avoid in-memory caching; it makes scaling the subgraphs horizontally a lot more difficult. Stick to Redis or a similar store that can be shared between instances of a subgraph.

Correlation ids

A correlation ID is generated for every operation, passed to the subgraphs and [will be] returned with any error response. This way we can show the user the correlation ID and they can give it to us when they get in touch.
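A minimal sketch of that flow, assuming a hypothetical header name `x-correlation-id` and a hypothetical `correlationId` key in the error extensions; `execute` stands in for whatever actually calls the subgraphs.

```python
import uuid

CORRELATION_HEADER = "x-correlation-id"  # hypothetical header name


def handle_operation(execute, headers=None):
    """Run one operation with a fresh correlation ID attached."""
    correlation_id = str(uuid.uuid4())
    # The same ID is sent to every subgraph so their logs can be joined later.
    subgraph_headers = {**(headers or {}), CORRELATION_HEADER: correlation_id}
    try:
        return {"data": execute(subgraph_headers)}
    except Exception as exc:
        # On failure the ID is handed back so the user can quote it to support.
        return {
            "data": None,
            "errors": [
                {
                    "message": str(exc),
                    "extensions": {"correlationId": correlation_id},
                }
            ],
        }
```

Searching the logs of all subgraphs for that single ID then gives the full trail of a failed operation.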

Monitoring and logging

Apollo Gateway supports the OpenTelemetry (OTel) standard for application telemetry, which can be consumed and presented by tools such as Zipkin or Jaeger.

...the Strawberry Python GraphQL library also has a plugin for OTEL.

GraphQL introduction Remin

By Kristoffer Brabrand

Kristoffer Brabrand

Senior developer @ Behalf

github.com/kbrabrand
