Graph Databases

Why?

  • Problems
  • Operational Excellence project

Problem: Our schema

... plus some join tables for which we don't have models

Problem

We have to explicitly create join tables and some models

NB: The number of records we search in this example (376k) isn't astronomical, and we do index them. However, keywords and RSAs have much greater volume.

Problem

We have to explicitly include join tables in the query
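
To make the point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables (campaigns, keywords, campaign_keywords) and the helper keywords_for_campaign are hypothetical stand-ins, not our real schema. The thing to notice is that the join table has to be named in the query even though it holds no information of its own.

```python
import sqlite3

# Hypothetical schema for illustration only: two entities plus a join table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE keywords  (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE campaign_keywords (
        campaign_id INTEGER REFERENCES campaigns(id),
        keyword_id  INTEGER REFERENCES keywords(id)
    );
    INSERT INTO campaigns VALUES (1, 'Spring'), (2, 'Summer');
    INSERT INTO keywords  VALUES (1, 'shoes'), (2, 'boots'), (3, 'sandals');
    INSERT INTO campaign_keywords VALUES (1, 1), (1, 2), (2, 3);
""")

def keywords_for_campaign(conn, campaign_name):
    """Return the keyword texts linked to a campaign, sorted alphabetically."""
    rows = conn.execute(
        """
        SELECT k.text
        FROM campaigns c
        -- The join table must be spelled out to get from campaigns to keywords.
        JOIN campaign_keywords ck ON ck.campaign_id = c.id
        JOIN keywords k ON k.id = ck.keyword_id
        WHERE c.name = ?
        ORDER BY k.text
        """,
        (campaign_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Every extra relationship adds another pair of JOIN lines like these.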

Problem

It's worse when we apply hierarchy

NB: I haven't calculated the stats here, but I suspect hierarchy also means a lot more volume and query time.
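
As a rough sketch of why hierarchy makes things worse, the example below walks a parent/child tree with a recursive CTE in sqlite3. The org_units table, its rows, and the descendants helper are all hypothetical; our real hierarchy is more involved. Even in this toy case the query has to describe the traversal step by step.

```python
import sqlite3

# Hypothetical hierarchy: each row points at its parent via parent_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE org_units (
        id INTEGER PRIMARY KEY,
        name TEXT,
        parent_id INTEGER REFERENCES org_units(id)
    );
    INSERT INTO org_units VALUES
        (1, 'root', NULL),
        (2, 'emea', 1),
        (3, 'ireland', 2),
        (4, 'apac', 1);
""")

def descendants(conn, root_name):
    """Return the named unit and everything below it, sorted by name."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id, name) AS (
            -- Anchor: the unit we start from.
            SELECT id, name FROM org_units WHERE name = ?
            UNION ALL
            -- Recursive step: children of anything already in the subtree.
            SELECT u.id, u.name
            FROM org_units u
            JOIN subtree s ON u.parent_id = s.id
        )
        SELECT name FROM subtree ORDER BY name
        """,
        (root_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Combining a walk like this with join tables at each level is where the queries become hard to maintain.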

Solution?

Restrict the functionality. It's too hard to maintain!

Graph Databases

... might help

Definition

Very simply, a graph database is a database designed to treat the relationships between data as equally important to the data itself. It is intended to hold data without constricting it to a pre-defined model. Instead, the data is stored like we first draw it out - showing how each individual entity connects with or is related to others.

neo4j.com
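
To illustrate the definition, here is a toy in-memory property graph in plain Python. It is not how Neo4j or any other product is implemented; the Graph class and the node and relationship names are made up for this sketch. It only shows the idea that relationships are stored as data in their own right, with a type and properties, rather than being implied by a join table.

```python
class Graph:
    """A toy property graph: nodes and typed, directed relationships."""

    def __init__(self):
        self.nodes = {}      # node id -> dict of properties
        self.out_edges = {}  # node id -> list of (rel_type, target id, properties)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def relate(self, src, rel_type, dst, **props):
        # The relationship is stored directly on the source node.
        self.out_edges.setdefault(src, []).append((rel_type, dst, props))

    def neighbours(self, src, rel_type):
        """Follow relationships of one type outward from a node."""
        return [dst for (t, dst, _) in self.out_edges.get(src, []) if t == rel_type]


# Hypothetical data, drawn the way we would sketch it on a whiteboard.
g = Graph()
g.add_node("spring", kind="campaign")
g.add_node("shoes", kind="keyword")
g.add_node("boots", kind="keyword")
g.relate("spring", "TARGETS", "shoes", since=2021)
g.relate("spring", "TARGETS", "boots", since=2022)
```

Calling g.neighbours("spring", "TARGETS") follows the stored relationships directly, with no intermediate table to name.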

Positive

  • Good management of relationships
  • Faster development of new hierarchy or targeting
  • Simpler schema
  • Has ORM
  • Apache AGE Graph DB runs on Postgres
  • Neo4j runs on GCP

Negative

  • Graph DB support would not come from Google (Postgres support does)
  • The Apache AGE extension for Postgres is not included in GCP
  • We don't know whether it has as good a community as PG
  • Migration time
  • Upskilling cost

... and probably a lot more to consider

Performance

  • Efficient relationship traversal
  • Roughly constant time per relationship hop
  • Traversal cost depends on the data visited, not on total database size
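
The claims above can be sketched with a plain adjacency list in Python. This is an illustration of the idea, not a benchmark and not a description of any product's internals; the reachable function and the sample data are assumptions made for this example. The walk only touches nodes connected to the starting point, so adding unrelated data does not change the amount of work.

```python
from collections import deque

def reachable(adjacency, start):
    """Breadth-first walk; returns the visited nodes and the hops followed."""
    seen = {start}
    queue = deque([start])
    hops = 0
    while queue:
        node = queue.popleft()
        # Looking up a node's neighbours is a single dictionary access.
        for nxt in adjacency.get(node, ()):
            hops += 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, hops

# A small connected graph (hypothetical data).
small = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}

# The same graph plus 10,000 nodes that are not reachable from "a".
big = dict(small)
big.update({f"x{i}": [f"x{i + 1}"] for i in range(10_000)})

small_result = reachable(small, "a")
big_result = reachable(big, "a")
```

Both calls visit the same four nodes and follow the same three hops, even though the second graph is thousands of times larger.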

Options

https://en.wikipedia.org/wiki/Graph_database#List_of_graph_databases

  • Neo4j
  • ArangoDB
  • Apache AGE
  • ... others

Successfully used by ...

Lessons learned from ...

  • DSC (Content, Content API, Newdam)
  • Hosted the free community edition of Neo4j themselves
  • No clustering in the free edition
  • They lacked experience in hosting Neo4j
  • Lesson 1: Compare PG cost to hosted Neo4j
  • Lesson 2: Consider cost of self-hosting
  • Lesson 3: See if MongoDB Atlas suits

Next steps

  • Model our trickiest relationships in a graph DB and compare
  • Figure out if a graph DB will really help simplify
  • Estimate project usage cost (PG cost €70k in 2021)
  • Benchmarking
  • Try out the ORM
  • What else do you think would be useful to know?

Graph Databases

By Daniel Barlow
