Introduction to Graph Databases

Gabor Dobrei

June 1, 2016

Forrester

  • Over 25% of enterprises will be using graph databases by 2017.
  • Graph analysis is the true killer app for Big Data.

Gartner

  • Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.
  • By the end of 2018, 70% of leading organizations will have one or more pilot or PoC efforts underway utilizing graph DBs.

$whoami

Gabor Dobrei

 

twitter/@gabor_dobrei

github/gabordobrei

(meetup/neo4j-budapest-users)

  • NoSQL
  • Why graphs?
  • What's a graph DB?
  • Pros/cons

NoSQL

(Never SQL?)

  • Key-value
  • Document
  • Relational
  • Graph

Figure: volume (~size) plotted against density (~complexity), with the database families added one at a time:

  • Key-Value
  • Column
  • Document
  • RDBMS
  • Graph

Annotation on the final step: 90% of use cases

GRAPHS ARE EVERYWHERE

RELATIONSHIPS IN

  • Politics
  • Economics
  • History
  • Science
  • Transportation
  • Social Networks
  • Work, Communities

Relationships are at least as important as the things they connect.

Graphs = Whole > Σ parts

Graph: relationships are part of the data

RDBMS: relationships are part of the fixed schema
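To make "relationships are part of the data" concrete, here is a minimal in-memory sketch of a property graph in Python. The class names, node names, and relationship types (KNOWS, ATTENDS) are illustrative assumptions, not taken from the talk or from any particular graph database, and the linear scan in neighbours is only for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    """A relationship is a stored record with its own type and properties."""
    start: str
    rel_type: str
    end: str
    properties: dict = field(default_factory=dict)


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)          # node id -> properties
    relationships: list = field(default_factory=list)  # Relationship records

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def relate(self, start, rel_type, end, **properties):
        # A new relationship type is just a new value; no table is added.
        self.relationships.append(Relationship(start, rel_type, end, properties))

    def neighbours(self, node_id, rel_type):
        # Linear scan for clarity; a real store would index relationships.
        return [r.end for r in self.relationships
                if r.start == node_id and r.rel_type == rel_type]


g = Graph()
g.add_node("alice", age=34)
g.add_node("bob", age=29)
g.add_node("meetup", city="Budapest")
g.relate("alice", "KNOWS", "bob", since=2012)
g.relate("alice", "ATTENDS", "meetup")  # new relationship type, no schema change
g.relate("bob", "ATTENDS", "meetup")

print(g.neighbours("alice", "KNOWS"))
```

In a relational design, adding ATTENDS would typically mean a new join table; here it is one more record.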

Q&A

  • Complex Questions
  • Very expensive global searches/operations
  • Near-constant query time for local traversals, regardless of total data volume

A relational database may tell you the average age of everyone in this session, but a graph database will tell you who is most likely to buy you a beer.

Relational

foo

foo_bar

bar

Graph
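The contrast between the two models can be sketched in Python. The relational half uses the standard sqlite3 module with the foo, foo_bar, and bar tables from the slide; the row values and the adjacency dictionary standing in for a graph store are assumptions made up for illustration.

```python
import sqlite3

# Relational model: the relationship lives in a join table (foo_bar).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE foo_bar (
        foo_id INTEGER REFERENCES foo(id),
        bar_id INTEGER REFERENCES bar(id)
    );
    INSERT INTO foo VALUES (1, 'foo1'), (2, 'foo2');
    INSERT INTO bar VALUES (10, 'barA'), (11, 'barB');
    INSERT INTO foo_bar VALUES (1, 10), (1, 11), (2, 11);
""")


def bars_of_foo_sql(foo_name):
    """Resolve the relationship with two joins through foo_bar."""
    rows = conn.execute(
        """
        SELECT bar.name
        FROM foo
        JOIN foo_bar ON foo_bar.foo_id = foo.id
        JOIN bar ON bar.id = foo_bar.bar_id
        WHERE foo.name = ?
        ORDER BY bar.name
        """,
        (foo_name,),
    )
    return [name for (name,) in rows]


# Graph model (sketch): each node holds direct references to its neighbours.
graph = {"foo1": ["barA", "barB"], "foo2": ["barB"]}


def bars_of_foo_graph(foo_name):
    """Follow the stored references; no join table is consulted."""
    return sorted(graph.get(foo_name, []))


print(bars_of_foo_sql("foo1"), bars_of_foo_graph("foo1"))
```

Both functions return the same answer; the difference is that the relational version recomputes the connection at query time, while the graph version reads it directly from the node.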

Looks different, who cares?!

  • Sample social graph (~1,000 people)
  • Average of 50 friends per person
  • pathExists(a, b) limited to depth 4
  • Warm cache to eliminate disk I/O

Database   # persons    Query time
RDBMS      1,000        2,000 ms
Graph      1,000        2 ms
Graph      1,000,000    2 ms
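The pathExists(a, b) query from the benchmark can be sketched as a depth-limited breadth-first search. This is an assumed implementation for illustration, not the code behind the numbers above; the function name path_exists, the adjacency-dictionary representation, and the sample friends graph are all hypothetical.

```python
from collections import deque


def path_exists(graph, a, b, max_depth=4):
    """Return True if b is reachable from a in at most max_depth hops."""
    if a == b:
        return True
    visited = {a}
    frontier = deque([(a, 0)])  # (node, hops from a)
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for friend in graph.get(node, ()):
            if friend == b:
                return True
            if friend not in visited:
                visited.add(friend)
                frontier.append((friend, depth + 1))
    return False


# A chain of six people: a - b - c - d - e - f
friends = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}

print(path_exists(friends, "a", "e"))  # e is 4 hops away
print(path_exists(friends, "a", "f"))  # f is 5 hops away
```

The search only touches nodes within four hops of the start, so its cost depends on the size of that neighbourhood rather than on the total number of people stored.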

Pros

  • Powerful data model, at least as general as RDBMS
  • Easy to go from whiteboard sketch to data model
  • Fast for highly connected data
  • Easy to query

Cons

  • Hard to shard reasonably well
  • Requires a conceptual shift
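One way to see why sharding is difficult: every relationship whose endpoints land on different shards turns a local hop into a cross-machine hop. The sketch below counts such cut relationships for two placement rules on a toy ring of eight people. The graph, the two rules, and the function names are assumptions chosen for illustration, not a description of how any product shards data.

```python
def cut_edges(edges, shard_of):
    """Count relationships whose two endpoints live on different shards."""
    return sum(1 for u, v in edges if shard_of(u) != shard_of(v))


# Toy graph: 8 people in a ring, each connected to the next one.
n = 8
edges = [(i, (i + 1) % n) for i in range(n)]


def by_modulo(node):
    # Hash-style placement: spreads load evenly but ignores connectivity.
    return node % 2


def by_range(node):
    # Placement that keeps neighbouring people on the same shard.
    return node // 4


print(cut_edges(edges, by_modulo), cut_edges(edges, by_range))
```

On this ring, the modulo rule cuts all 8 relationships while the range rule cuts only 2. Real social graphs rarely have such a clean split available, which is what makes a good partitioning hard to find.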

Introduction to graph databases

By Gábor Döbrei
