Exercising Your GraphDB


@darrickwiebe

dw@xnlogic.com
@GraphTO  June 19, 2013

A Few Paradigms




BSP
Bare-metal graph APIs
SQL-Derived Languages
Stream Processing

BSP





You don't need this.




(until you've outgrown your first datacenter)

Neo4j Bare-Metal API



Neo4j's API is the best I've seen
Powerful graph algorithms
Very efficient

Complex to build on
Deeply tied to graph structure
Not composable
Vendor-specific

SQL-Derived Languages



They are *languages*


OrientSQL
SPARQL
Cypher

Cypher


START
  theater  =node:node_auto_index(theatre = 'Theatre Royal'),
  newcastle=node:node_auto_index(city = 'Newcastle'),
  bard     =node:node_auto_index('firstname:William AND
                                  lastname:Shakespeare') 
MATCH
  (newcastle)<-[:IN*1..4]-(theater)
    <-[:VENUE]-(performance)
    -[:PERFORMED]->(play)
    <-[w:WROTE]-(bard) 
WHERE
  w.date > 1608 
RETURN play 


I Do Declare!


query by pattern
easy visualization


limited exploration of data


can't combine queries
vendor-specific

Limitations of Query Languages




injection attacks
not extensible
don't work with your tools
difficult to predict behaviour
not fun to generate
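
To make the first and last points concrete, here is a minimal Ruby sketch of a query assembled by string interpolation. The helper `cypher_for_city` and the hostile input are hypothetical, invented for illustration; only the index syntax is borrowed from the Cypher slide.

```ruby
# Hypothetical helper: builds a Cypher query by pasting user input into a string.
def cypher_for_city(city)
  "START c=node:node_auto_index(city = '#{city}') RETURN c"
end

# Well-behaved input produces the query the developer had in mind.
puts cypher_for_city('Newcastle')

# Hostile input closes the quote and parenthesis early and appends its own clauses,
# so the text sent to the database is no longer the query the developer wrote.
hostile = "x') MATCH (c)-[r]-() DELETE r //"
puts cypher_for_city(hostile)
```

The second string now carries a DELETE clause that the developer never wrote, which is the risk whenever queries are generated as text.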



TinkerPop

stream processing

Blueprints




Lightweight graph adapter API
Vendor agnostic



Pipes




A whole talk in itself
Awesome
Incredibly flexible







Not actually a query language

Turns Gremlin queries into map-reduce jobs

This is your big data fallback plan



  graph.idx("node_auto_index")[[city: "Newcastle"]]
    .in("IN").loop(1){it.loops <= 4}{it.loops >= 1}
    .has("theatre", "Theatre Royal")
    .in("VENUE").out("PERFORMED")
    .inE("WROTE").has("date", T.gt, 1608)
    .outV.has("firstname", "William").has("lastname", "Shakespeare")
    .back(5)
 

==>v[6]

It's Groovy, Baby!


Gremlin console
Highly expressive
Dynamically construct queries


Query is just a composition of Pipe objects


Embeddable in Java projects
Focussed on performance
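
The idea that a query is just a composition of pipe objects can be sketched in plain Ruby. This is not the TinkerPop Pipes API: `has`, `pluck`, `compose` and `surnames_matching` are hypothetical names, and the data is invented. Each "pipe" is a lambda from one lazy stream to another, so a query can be assembled step by step at runtime.

```ruby
# Toy pipes: each returns a lambda that maps a lazy stream to a lazy stream.
def has(key, value)
  ->(stream) { stream.select { |el| el[key] == value } }
end

def pluck(key)
  ->(stream) { stream.map { |el| el[key] } }
end

# Composing pipes yields another pipe.
def compose(*pipes)
  ->(stream) { pipes.reduce(stream) { |s, pipe| pipe.call(s) } }
end

# Invented sample data.
PEOPLE = [
  { firstname: 'William',     lastname: 'Shakespeare' },
  { firstname: 'Christopher', lastname: 'Marlowe' },
  { firstname: 'William',     lastname: 'Blake' }
].freeze

# Build the query dynamically: one filter step per requested property.
def surnames_matching(filters)
  steps = filters.map { |key, value| has(key, value) }
  steps << pluck(:lastname)
  compose(*steps)
end

query = surnames_matching({ firstname: 'William' })
p query.call(PEOPLE.lazy).to_a
```

Nothing is evaluated until `to_a` pulls elements through the chain, which is roughly what stream processing means in this context.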

Pacer



  graph.v(city: 'Newcastle').
    repeat(1..4) { |r| r.in(:IN) }.
    filter(theatre: 'Theatre Royal').
    in(:VENUE).out(:PERFORMED).
    lookahead do |r|
      r.in_e(:WROTE).where('date > 1608').
      out_v(firstname: 'William', lastname: 'Shakespeare')
    end
 

#<V[6] The Tempest>
Total: 1
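
For readers without a graph database at hand, the steps of that traversal can be spelled out in plain Ruby over a toy graph. Everything here is an assumption made for illustration: the vertex ids, the edge list, the intermediate "Grey Street" vertex and the dates are invented, and the code is eager where a Pacer route would stream. Only the labels and property names come from the query above.

```ruby
# Toy data, invented for illustration.
VERTICES = {
  1 => { city: 'Newcastle' },
  2 => { street: 'Grey Street' },
  3 => { theatre: 'Theatre Royal' },
  4 => { performance: 'A' },
  5 => { performance: 'B' },
  6 => { title: 'The Tempest' },
  7 => { title: 'Julius Caesar' },
  8 => { firstname: 'William', lastname: 'Shakespeare' }
}.freeze

# Each edge is [from, label, to, properties].
EDGES = [
  [2, :IN, 1, {}], [3, :IN, 2, {}],
  [4, :VENUE, 3, {}], [5, :VENUE, 3, {}],
  [4, :PERFORMED, 6, {}], [5, :PERFORMED, 7, {}],
  [8, :WROTE, 6, { date: 1610 }], [8, :WROTE, 7, { date: 1599 }]
].freeze

def in_edges(id, label)
  EDGES.select { |_, l, to, _| l == label && to == id }
end

def out_edges(id, label)
  EDGES.select { |from, l, _, _| l == label && from == id }
end

def plays_by_bard_after_1608
  # repeat(1..4) { in(:IN) }: collect everything 1 to 4 incoming IN hops from Newcastle.
  frontier = VERTICES.select { |_, props| props[:city] == 'Newcastle' }.keys
  reached = []
  4.times do
    frontier = frontier.flat_map { |id| in_edges(id, :IN).map(&:first) }
    reached.concat(frontier)
  end

  theatres     = reached.uniq.select { |id| VERTICES[id][:theatre] == 'Theatre Royal' }
  performances = theatres.flat_map { |id| in_edges(id, :VENUE).map(&:first) }
  plays        = performances.flat_map { |id| out_edges(id, :PERFORMED).map { |e| e[2] } }

  # lookahead: keep a play only if a WROTE edge dated after 1608 leads back to the bard.
  plays.uniq.select do |id|
    in_edges(id, :WROTE).any? do |from, _, _, props|
      author = VERTICES[from]
      props[:date] > 1608 && author[:firstname] == 'William' && author[:lastname] == 'Shakespeare'
    end
  end
end

p plays_by_bard_after_1608.map { |id| VERTICES[id][:title] }
```

Note that the lookahead filters plays without moving the traversal off them, which is why the result is the play and not the author.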

Synthesis




Gremlin Pipes
Cypher
Neo4j Algorithms
Pacer's own features

More than a Query Language


Focussed on developer happiness

Powerful domain modelling
Does the right thing automatically
Streaming data processing
Irons out differences between GraphDBs
Easy to extend
Designed for the console

Does the Right Thing?



Won't dump massive results
Formats results nicely
Chooses the best index for you
Nests transactions safely
Mocks transactions if unsupported by DB
Type-specific operations in Route definitions
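
As an illustration of what nesting transactions safely can mean, here is a hypothetical sketch. It is not Pacer's implementation: `ToyGraph` and its `log` are invented for this example. Nested blocks share one real transaction, and only the outermost block begins, commits or rolls back.

```ruby
# Hypothetical sketch: nested transaction blocks collapse into one real transaction.
class ToyGraph
  attr_reader :log

  def initialize
    @depth = 0
    @log = []
  end

  def transaction
    @log << :begin if @depth.zero?      # only the outermost block opens a transaction
    @depth += 1
    begin
      result = yield
    rescue StandardError
      @log << :rollback if @depth == 1  # roll back once, at the outermost level
      raise
    ensure
      @depth -= 1
    end
    @log << :commit if @depth.zero?     # only the outermost block commits
    result
  end
end

g = ToyGraph.new
g.transaction { g.transaction { :inner_work } }
p g.log
```

An exception raised in an inner block propagates outward, and the log records a single rollback instead of a commit.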

How do you Choose?




Fit for your Data
Fit for your Team
Capabilities
Vendor Lock-in



Cypher v. Pacer

START
  theater   = node:node_auto_index(theatre = 'Theatre Royal'),
  newcastle = node:node_auto_index(city = 'Newcastle'),
  bard      = node:node_auto_index('firstname:William AND
                                    lastname:Shakespeare')
MATCH
  (newcastle)
    <-[:IN*1..4]-(theater)  <-[:VENUE]-(performance)
    -[:PERFORMED]->(play)   <-[w:WROTE]-(bard)
WHERE w.date > 1608
RETURN play

theatre = graph.v(theatre: 'Theatre Royal').first
newcastle = graph.v(city: 'Newcastle').first
bard = graph.v(firstname: 'William', lastname: 'Shakespeare').first
newcastle.repeat(1..4) { |r| r.in(:IN) }.
  is(theatre).in(:VENUE).out(:PERFORMED).
  lookahead do |r|
    r.in_e(:WROTE).where('date > 1608').
    out_v.is(bard)
  end

Gremlin v. Pacer


  graph.idx("node_auto_index")[[city: "Newcastle"]]
    .in("IN").loop(1){it.loops <= 4}{it.loops >= 1}
    .has("theatre", "Theatre Royal")
    .in("VENUE").out("PERFORMED")
    .inE("WROTE").has("date", T.gt, 1608)
    .outV.has("firstname", "William").has("lastname", "Shakespeare")
    .back(5)
 


  graph.v(city: 'Newcastle').
    repeat(1..4) { |r| r.in(:IN) }.
    filter(theatre: 'Theatre Royal').
    in(:VENUE).out(:PERFORMED).
    lookahead do |r|
      r.in_e(:WROTE).where('date > 1608').
      out_v(firstname: 'William', lastname: 'Shakespeare')
    end
 







Cypher v. Pacer (Round 2)



  START
    crook = node:node_auto_index(name='Crook nr 1'),
    atm   = node:node_auto_index('name:ATM*')
  MATCH
    p = shortestPath(crook-[*..3]-atm) 
  RETURN
    p
 


  graph.v(name: 'Crook nr 1').path_to(graph.lucene('name:ATM*'))
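
What the Cypher query asks for, a shortest path of at most three hops from the crook to any vertex whose name starts with ATM, can be sketched as a bounded breadth-first search in plain Ruby. The graph and the `shortest_path` helper are hypothetical and invented for illustration; this says nothing about how either database computes the result.

```ruby
# Toy undirected graph, invented for illustration.
NEIGHBOURS = {
  'Crook nr 1' => ['Account A'],
  'Account A'  => ['Crook nr 1', 'Card 1'],
  'Card 1'     => ['Account A', 'ATM 7'],
  'ATM 7'      => ['Card 1'],
  'ATM 9'      => []
}.freeze

# Breadth-first search: the first path that reaches a matching vertex is a shortest one.
# The block decides whether a vertex is a target; max_hops bounds the path length.
def shortest_path(start, max_hops)
  queue = [[start]]
  seen  = { start => true }
  until queue.empty?
    path = queue.shift
    return path if yield(path.last)
    next if path.length - 1 >= max_hops   # do not extend paths already at the limit

    NEIGHBOURS.fetch(path.last, []).each do |n|
      next if seen[n]

      seen[n] = true
      queue << path + [n]
    end
  end
  nil
end

p shortest_path('Crook nr 1', 3) { |name| name.start_with?('ATM') }
```

With a limit of two hops the same call returns nil, because the nearest ATM in this toy graph is three hops away.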
 

Exercising Your GraphDB

By Darrick Wiebe