Graph Databases
Why?
- Problems
- Operational Excellence project
Problem: Our schema

... plus some join tables for which we don't have models
Problem
We have to explicitly create join tables, and models for some of them

NB: The number of records we search in this example (376k) isn't astronomical, and we do index them. However, keywords and RSAs have much greater volume.
Problem
We have to explicitly include join tables in the query
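
To make the point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables (campaigns, keywords, campaign_keywords) and their contents are hypothetical stand-ins, not our real schema; the only thing it is meant to show is that the join table has to appear by name in the query.

```python
import sqlite3

# Hypothetical schema for illustration only; these are not our real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE keywords  (id INTEGER PRIMARY KEY, text TEXT);
    -- The join table exists only to record the many-to-many relationship.
    CREATE TABLE campaign_keywords (
        campaign_id INTEGER REFERENCES campaigns(id),
        keyword_id  INTEGER REFERENCES keywords(id)
    );
    INSERT INTO campaigns VALUES (1, 'Summer'), (2, 'Winter');
    INSERT INTO keywords  VALUES (1, 'sandals'), (2, 'boots'), (3, 'socks');
    INSERT INTO campaign_keywords VALUES (1, 1), (1, 3), (2, 2), (2, 3);
""")


def keywords_for_campaign(name):
    # The question is only "which keywords belong to this campaign?",
    # yet the join table must be spelled out in the query.
    rows = conn.execute(
        """
        SELECT k.text
        FROM campaigns c
        JOIN campaign_keywords ck ON ck.campaign_id = c.id
        JOIN keywords k ON k.id = ck.keyword_id
        WHERE c.name = ?
        ORDER BY k.text
        """,
        (name,),
    ).fetchall()
    return [text for (text,) in rows]


summer_keywords = keywords_for_campaign("Summer")
```

For comparison, in a graph query language such as Cypher the same question could be written as a single pattern, roughly MATCH (c:Campaign {name: 'Summer'})-[:TARGETS]->(k:Keyword) RETURN k.text, where the labels and the TARGETS relationship type are again made up for this example.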

Problem
It's worse when we apply hierarchy
NB: I haven't calculated the stats here, but I suspect hierarchy also adds a lot of volume and query time.
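
As a rough illustration of why hierarchy makes things worse, the sketch below walks up a parent/child chain with a recursive query in sqlite3. The nodes table and its levels (account, campaign, ad group, keyword) are assumptions made for the example, not our actual hierarchy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical self-referencing hierarchy, not our real schema.
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
    INSERT INTO nodes VALUES
        (1, 'account',  NULL),
        (2, 'campaign', 1),
        (3, 'ad group', 2),
        (4, 'keyword',  3);
""")


def ancestors(node_id):
    # A recursive CTE is needed because the depth of the hierarchy is not
    # known when the query is written.
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, name, parent_id, depth) AS (
            SELECT id, name, parent_id, 0 FROM nodes WHERE id = ?
            UNION ALL
            SELECT n.id, n.name, n.parent_id, chain.depth + 1
            FROM nodes n JOIN chain ON n.id = chain.parent_id
        )
        SELECT name FROM chain WHERE depth > 0 ORDER BY depth
        """,
        (node_id,),
    ).fetchall()
    return [name for (name,) in rows]


keyword_ancestors = ancestors(4)
```

In Cypher, a variable-length pattern such as MATCH (k {id: 4})-[:CHILD_OF*]->(a) RETURN a.name expresses the same walk; CHILD_OF is a hypothetical relationship type used only for this comparison.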

Solution?
Restrict the functionality. It's too hard to maintain!

Graph Databases
... might help
Definition
Very simply, a graph database is a database designed to treat the relationships between data as equally important to the data itself. It is intended to hold data without constricting it to a pre-defined model. Instead, the data is stored like we first draw it out - showing how each individual entity connects with or is related to others.
neo4j.com
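
One way to picture that definition is a tiny in-memory property graph, sketched below in plain Python. The Node and Relationship classes, the relate helper, and the Campaign/Keyword data are all invented for illustration; this is not how any particular graph database stores data.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    label: str
    props: dict
    # Outgoing relationships live on the node itself.
    out: list = field(default_factory=list, repr=False)


@dataclass(eq=False)
class Relationship:
    kind: str
    start: Node
    end: Node
    # Relationships can carry their own data, just like nodes.
    props: dict = field(default_factory=dict)


def relate(start, kind, end, **props):
    """Connect two nodes and record the relationship on the start node."""
    rel = Relationship(kind, start, end, props)
    start.out.append(rel)
    return rel


campaign = Node("Campaign", {"name": "Summer"})
keyword = Node("Keyword", {"text": "sandals"})
relate(campaign, "TARGETS", keyword, match_type="exact")

# No join table: the connection is found by following the node's own links.
targets = [r.end.props["text"] for r in campaign.out if r.kind == "TARGETS"]
```

The relationship is a record in its own right, with a type and properties, rather than a row in a table that exists only to link two others.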

Positive
- Good management of relationships
- Faster development of new hierarchy or targeting
- Simpler schema
- Has ORM
- Apache AGE Graph DB runs on Postgres
- Neo4j runs on GCP
Negative
- Graph DB support would not come from Google (our Postgres support does)
- The Apache AGE Postgres extension is not included in GCP
- We don't know whether the community is as strong as Postgres's
- Migration time
- Upskilling cost
... and probably a lot more to consider
Performance
- Efficient relationship traversal
- Roughly constant time per hop in native graph stores such as Neo4j
- Per-hop cost independent of total graph size
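
The traversal claim can be sketched in plain Python. The functions below (build_chain_graph, hop, hop_via_join_table) are hypothetical names for this toy model; the "work" counts are simply the number of items inspected, and the join-table version deliberately models an unindexed scan, which is harsher than our indexed Postgres tables would be.

```python
def build_chain_graph(size):
    """Return an adjacency map where node n links to node n + 1."""
    adjacency = {n: [] for n in range(size)}
    for n in range(size - 1):
        adjacency[n].append(n + 1)
    return adjacency


def hop(adjacency, node):
    """Follow relationships stored with the node: work depends on its degree."""
    neighbours = adjacency[node]
    return neighbours, len(neighbours)


def hop_via_join_table(rows, node):
    """Unindexed join-table lookup: every (src, dst) row is inspected."""
    matches = [dst for src, dst in rows if src == node]
    return matches, len(rows)


small = build_chain_graph(100)
large = build_chain_graph(100_000)

# Same node, same answer, same amount of work regardless of graph size.
small_neighbours, small_work = hop(small, 50)
large_neighbours, large_work = hop(large, 50)

# The same data as join-table rows: work grows with the table.
large_rows = [(src, dst) for src, dsts in large.items() for dst in dsts]
scan_neighbours, scan_work = hop_via_join_table(large_rows, 50)
```

With an index, the relational lookup is typically logarithmic in table size rather than linear, so the real gap is smaller than this toy suggests; that is one reason benchmarking against our own data matters.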
Options
https://en.wikipedia.org/wiki/Graph_database#List_of_graph_databases
- Neo4j
- ArangoDB
- Apache AGE
- ... others

Successfully used by ...

Lessons learned from ...

- DSC (Content, Content API, Newdam)
- Self-hosted the free community edition of Neo4j
- No clustering in the free edition
- They lacked experience in hosting Neo4j
- Lesson 1: Compare PG cost to hosted Neo4j
- Lesson 2: Consider cost of self-hosting
- Lesson 3: See if MongoDB Atlas suits
Next steps
- Compare how our trickiest relationships would be modelled in a graph DB
- Figure out whether a graph DB would really simplify things
- Estimate project usage cost (PG cost €70k in 2021)
- Benchmarking
- Try out the ORM
- What else do you think would be useful to know?
Graph Databases
By Daniel Barlow