Graph databases

An overview

Natalia Oskina

Zuhlke Engineering Ltd

Agenda

  • CAP?
  • What kind of creature is a graph DB?
  • Graph DB vs relational DB
  • Is it a silver bullet? Downsides
  • Neo4j
  • Cypher vs SQL
  • Can we use it? Real use cases
  • Demo

CAP theorem (Brewer's theorem)

  • Consistency - all nodes see the same data at the same time
  • Availability - every request receives a response about whether it succeeded or failed
  • Partition tolerance - the system continues to operate despite arbitrary partitioning due to network failures
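
A toy illustration of the trade-off these three properties imply. It assumes two in-memory replicas and a hypothetical `Replica` class and `write` function (none of this is a real database API): during a network partition the system either refuses the write, giving up availability, or accepts it on one side, giving up consistency.

```python
class Replica:
    """One node holding a single value (hypothetical, for illustration)."""

    def __init__(self):
        self.value = None


def write(replicas, value, partitioned, mode):
    """Write `value` through the first replica.

    mode="CP": refuse the write if the other replicas cannot be reached.
    mode="AP": accept the write locally even though replicas diverge.
    Returns True if the write was accepted.
    """
    if not partitioned:
        for r in replicas:          # healthy network: replicate everywhere
            r.value = value
        return True
    if mode == "CP":
        return False                # stay consistent, become unavailable
    replicas[0].value = value       # stay available, become inconsistent
    return True


cp = [Replica(), Replica()]
ap = [Replica(), Replica()]
cp_ok = write(cp, "x", partitioned=True, mode="CP")
ap_ok = write(ap, "x", partitioned=True, mode="AP")
cp_consistent = cp[0].value == cp[1].value
ap_consistent = ap[0].value == ap[1].value
```

In the CP run the write is rejected and both replicas still agree; in the AP run the write succeeds but the two replicas now hold different values.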

What is a graph?

G = (V, E), where V is a set of vertices (nodes) and E is a set of edges connecting them
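
As a minimal sketch, the definition can be written down directly as a set of vertices and a set of edges. The vertex names and the helper functions `neighbours` and `degree` are made up for illustration, and the edges are treated as undirected.

```python
# V: the set of vertices (nodes); E: the set of edges between them.
V = {"a", "b", "c", "d"}
E = {("a", "b"), ("b", "c"), ("c", "a")}


def neighbours(vertex, edges):
    """Return all vertices sharing an undirected edge with `vertex`."""
    out = set()
    for u, v in edges:
        if u == vertex:
            out.add(v)
        elif v == vertex:
            out.add(u)
    return out


def degree(vertex, edges):
    """Number of distinct neighbours of `vertex`."""
    return len(neighbours(vertex, edges))
```

Here `d` belongs to V but appears in no edge, so it is an isolated vertex with degree 0.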

Graph database

  • If you needed to build a system holding data about every movie ever made, how would you model it?
  • You would need to cover movie names, actors, directors, budgets, dates, costs, locations, ratings, shows, tickets...

What about the Bacon number?
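
The Bacon number question is a shortest-path problem, which is where a graph model pays off. A rough sketch, assuming a tiny hand-picked set of cast lists and hypothetical helper names (`co_star_graph`, `bacon_number`): build a co-star graph, then run a breadth-first search towards Kevin Bacon. The numbers it produces are only relative to this sample, not to the full filmography.

```python
from collections import deque

# Tiny sample cast lists; a real dataset would hold every movie.
CASTS = {
    "Apollo 13": ["Kevin Bacon", "Tom Hanks", "Bill Paxton"],
    "Sleepless in Seattle": ["Tom Hanks", "Meg Ryan"],
    "When Harry Met Sally": ["Meg Ryan", "Billy Crystal"],
}


def co_star_graph(casts):
    """Build an adjacency map: actor -> set of actors they appeared with."""
    graph = {}
    for cast in casts.values():
        for actor in cast:
            graph.setdefault(actor, set()).update(a for a in cast if a != actor)
    return graph


def bacon_number(graph, actor, target="Kevin Bacon"):
    """Breadth-first search; returns hops to `target`, or None if unreachable."""
    if actor not in graph:
        return None
    seen = {actor}
    queue = deque([(actor, 0)])
    while queue:
        current, distance = queue.popleft()
        if current == target:
            return distance
        for other in graph[current]:
            if other not in seen:
                seen.add(other)
                queue.append((other, distance + 1))
    return None


GRAPH = co_star_graph(CASTS)
```

Calling `bacon_number(GRAPH, "Meg Ryan")` walks Meg Ryan → Tom Hanks → Kevin Bacon within this sample.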

Graph database

"No broken links"

  • Nodes
  • Relationships
  • Properties
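
The three building blocks can be sketched as a toy in-memory property graph. The `PropertyGraph` class and its methods are hypothetical and are not Neo4j's API. The sketch also reads "No broken links" as meaning that a relationship must always have both of its end nodes, which is an assumption about the slide's intent.

```python
class PropertyGraph:
    """Toy in-memory property graph (hypothetical API, not Neo4j's)."""

    def __init__(self):
        self.nodes = {}          # node id -> dict of properties
        self.relationships = []  # (start id, type, end id, properties)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def relate(self, start, rel_type, end, **properties):
        # A relationship always needs both of its end nodes to exist.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both end nodes must exist")
        self.relationships.append((start, rel_type, end, properties))

    def delete_node(self, node_id):
        # Refuse to leave dangling relationships behind ("no broken links").
        if any(node_id in (s, e) for s, _, e, _ in self.relationships):
            raise ValueError("node still has relationships")
        del self.nodes[node_id]


g = PropertyGraph()
g.add_node("bacon", name="Kevin Bacon")
g.add_node("apollo13", title="Apollo 13", released=1995)
g.relate("bacon", "ACTED_IN", "apollo13", role="Jack Swigert")
```

Both nodes and the relationship carry their own properties, and `g.delete_node("bacon")` raises an error while the `ACTED_IN` relationship still points at that node.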

Graph DB vs Relational DB

  • Suits a chaotic data model with an irregular, complex structure
  • Handles highly complex relationships between entities
  • Relationships can differ widely in type
  • A query checks only the relationships of the node, not the whole database
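
The last point can be illustrated with a deliberately simplified comparison. The data and function names are made up, and the relational side is modelled as an unindexed row scan, which exaggerates the gap because a real relational database would normally use an index.

```python
# Relational style: one table of (person, friend) rows, scanned per lookup.
FRIEND_ROWS = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("eve", "fay")]


def friends_by_scan(person, rows):
    """Find friends by checking every row; returns (friends, rows_examined)."""
    examined = 0
    found = []
    for who, friend in rows:
        examined += 1
        if who == person:
            found.append(friend)
    return found, examined


# Graph style: each node keeps direct references to its own relationships.
ADJACENCY = {}
for who, friend in FRIEND_ROWS:
    ADJACENCY.setdefault(who, []).append(friend)


def friends_by_traversal(person, adjacency):
    """Follow only this node's relationships; returns (friends, examined)."""
    found = adjacency.get(person, [])
    return list(found), len(found)
```

The scan touches every row in the table, while the traversal touches only the relationships attached to the starting node, so its cost does not grow with the size of the rest of the data.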

Examples

  • Neo4j (Java, .NET, Python, JavaScript)
  • AllegroGraph (C#, C, Java, Lisp, Python)
  • Oracle Spatial and Graph (Java, PL/SQL)
  • ArangoDB (C, C++, JavaScript)

Neo4j

  • JVM-based
  • Scales to billions of entities
  • Supports ACID transactions
  • Queried with Cypher

Cypher

Clauses and operators such as: WHERE, ORDER BY, SKIP, LIMIT, AND, p.unitPrice > 10

Select and Return Records

SQL

SELECT p.*
FROM products AS p;

Cypher

MATCH (p:Product)
RETURN p;

Field Access, Ordering and Paging

SQL

SELECT p.ProductName, p.UnitPrice
FROM products AS p
ORDER BY p.UnitPrice DESC
LIMIT 10;

Cypher

MATCH (p:Product)
RETURN p.productName, p.unitPrice
ORDER BY p.unitPrice DESC
LIMIT 10
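
To try the SQL side of this comparison without a database server, the query can be run against an in-memory SQLite table. This is only a sketch: the `top_products` helper and the sample rows are made up, and `OFFSET` is used as the SQL counterpart of Cypher's `SKIP`.

```python
import sqlite3


def top_products(rows, limit=10, skip=0):
    """Run the ordering and paging query against an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (ProductName TEXT, UnitPrice REAL)")
    conn.executemany("INSERT INTO products VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT p.ProductName, p.UnitPrice "
        "FROM products AS p "
        "ORDER BY p.UnitPrice DESC "
        "LIMIT ? OFFSET ?",   # OFFSET plays the role of Cypher's SKIP
        (limit, skip),
    ).fetchall()
    conn.close()
    return result


SAMPLE = [("tea", 4.5), ("coffee", 12.0), ("cocoa", 8.25)]  # made-up rows
```

Calling `top_products(SAMPLE, limit=2)` returns the two most expensive rows, and passing `skip=2` moves on to the next page.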

Downsides

  • Uses a lot of storage
  • A relational DB is much faster when operating on a huge number of records
  • Licences are expensive

Panama Papers

https://offshoreleaks.icij.org/

  • 950,000 nodes
  • 1.2 million edges (4 GB)

Facebook social graph

Demo

Questions?
