Exercising Your GraphDB
@darrickwiebe
dw@xnlogic.com
@GraphTO June 19, 2013
A Few Paradigms
BSP (Bulk Synchronous Parallel)
Bare-metal graph APIs
SQL-Derived Languages
Stream Processing
BSP
You don't need this.
(until you've outgrown your first datacenter)
Neo4j Bare-Metal API
Neo4j's API is the best I've seen
Powerful graph algorithms
Very efficient
Complex to build on
Deeply tied to graph structure
Not composable
Vendor-specific
SQL Derived Languages
They are *languages*
OrientSQL
SPARQL
Cypher
Cypher
START theater=node:node_auto_index(theatre = 'Theatre Royal'),
      newcastle=node:node_auto_index(city = 'Newcastle'),
      bard=node:node_auto_index('firstname:William AND lastname:Shakespeare')
MATCH (newcastle)<-[:IN*1..4]-(theater)<-[:VENUE]-(performance)
      -[:PERFORMED]->(play)<-[w:WROTE]-(bard)
WHERE w.date > 1608
RETURN play
I Do Declare!
query by pattern
easy visualization
limited exploration of data
can't combine queries
vendor-specific
Limitations of Query Languages
injection attacks
not extensible
don't work with your tools
difficult to predict behaviour
not fun to generate
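The injection and generation points can be sketched in a few lines of Ruby. The `naive_cypher` helper and its inputs are hypothetical, written only for illustration; the query shape follows the Cypher example earlier in the deck.

```ruby
# Hypothetical helper: builds a Cypher query by pasting user input into a string.
def naive_cypher(city)
  "START c=node:node_auto_index(city = '#{city}') RETURN c"
end

# Ordinary input produces the query we intended.
puts naive_cypher("Newcastle")

# Input containing a quote closes the string literal early and
# changes the structure of the query text.
puts naive_cypher("x') MATCH (n) RETURN n //")
```

Every value that reaches the query text has to be escaped by hand, which is part of why generating query strings is no fun.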
TinkerPop
stream processing
Blueprints
Lightweight graph adapter API
Vendor agnostic
PIPES
A whole talk in itself
Awesome
Incredibly flexible
Not actually a query language
Turns Gremlin queries into map-reduce jobs
This is your big data fallback plan
graph.idx("node_auto_index")([city: "Newcastle"])
.in("IN").loop(1){it.loops <= 4}{it.loops >= 1}
.has("theatre", "Theatre Royal")
.in("VENUE").out("PERFORMED")
.inE("WROTE").has("date", T.gt, 1608)
.outV.has("firstname", "William").has("lastname", "Shakespeare")
.back(5)
==>v[6]
It's Groovy, Baby!
Gremlin console
Highly expressive
Dynamically construct queries
Query is just a composition of Pipe objects
Embeddable in Java projects
Focussed on performance
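The idea that a query is just a composition of pipe objects, and can therefore be built dynamically, can be sketched in plain Ruby with lazy enumerators. This is not the Pipes or Gremlin API: `OUT`, `OUT_STEP`, `DEDUP_STEP`, `compose`, and `hops` are made-up names, and the graph is toy data.

```ruby
# Toy adjacency list: vertex => outgoing neighbours (made-up data).
OUT = {
  "alice" => ["bob", "carol"],
  "bob"   => ["dave"],
  "carol" => ["dave", "erin"],
  "dave"  => [],
  "erin"  => []
}

# Each step is a lambda from a lazy stream of vertices to a lazy stream.
OUT_STEP   = ->(stream) { stream.flat_map { |v| OUT.fetch(v, []) } }
DEDUP_STEP = ->(stream) { stream.uniq }

# A "query" is an ordinary value: the steps chained left to right.
def compose(*steps)
  ->(stream) { steps.reduce(stream) { |s, step| step.call(s) } }
end

# Build a query at runtime: n outgoing hops, then remove duplicates.
def hops(n)
  compose(*Array.new(n, OUT_STEP), DEDUP_STEP)
end

p hops(1).call(["alice"].lazy).to_a  # => ["bob", "carol"]
p hops(2).call(["alice"].lazy).to_a  # => ["dave", "erin"]
```

Because each step is a value, steps can be stored, reordered, or added conditionally before anything is evaluated, which a query string does not allow.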
Pacer
Pacer
graph.v(city: 'Newcastle').
repeat(1..4) { |r| r.in(:IN) }.
filter(theatre: 'Theatre Royal').
in(:VENUE).out(:PERFORMED).
lookahead do |r|
r.in_e(:WROTE).where('date > 1608').
out_v(firstname: 'William', lastname: 'Shakespeare')
end
#<V[6] The Tempest>
Total: 1
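To show what the traversal above is asking for, here is a rough plain-Ruby sketch over an in-memory toy graph. It does not use Pacer. The vertex names, the edge list, the dates, and helpers such as `repeat_in` and `plays_by_bard_after` are assumptions made for illustration, and the sketch removes duplicates for simplicity.

```ruby
# Toy data (made up): vertex id => properties.
VERTICES = {
  newcastle:     { city: "Newcastle" },
  grey_street:   { street: "Grey Street" },
  theatre_royal: { theatre: "Theatre Royal" },
  perf_1:        {},
  perf_2:        {},
  the_tempest:   { title: "The Tempest" },
  julius_caesar: { title: "Julius Caesar" },
  bard:          { firstname: "William", lastname: "Shakespeare" }
}

# Each edge is [from, label, to, edge properties].
EDGES = [
  [:grey_street,   :IN,        :newcastle,     {}],
  [:theatre_royal, :IN,        :grey_street,   {}],
  [:perf_1,        :VENUE,     :theatre_royal, {}],
  [:perf_2,        :VENUE,     :theatre_royal, {}],
  [:perf_1,        :PERFORMED, :the_tempest,   {}],
  [:perf_2,        :PERFORMED, :julius_caesar, {}],
  [:bard,          :WROTE,     :the_tempest,   { date: 1610 }],
  [:bard,          :WROTE,     :julius_caesar, { date: 1599 }]
]

def in_edges(v, label)
  EDGES.select { |_from, l, to, _props| l == label && to == v }
end

def in_v(v, label)
  in_edges(v, label).map { |from, _l, _to, _props| from }
end

def out_v(v, label)
  EDGES.select { |from, l, _to, _props| from == v && l == label }
       .map { |_from, _l, to, _props| to }
end

# Vertices reached by following incoming `label` edges range.min..range.max times.
def repeat_in(start, label, range)
  frontier = [start]
  found = []
  (1..range.max).each do |depth|
    frontier = frontier.flat_map { |v| in_v(v, label) }
    found.concat(frontier) if depth >= range.min
  end
  found.uniq
end

def plays_by_bard_after(year)
  theatres = repeat_in(:newcastle, :IN, 1..4)
             .select { |v| VERTICES[v][:theatre] == "Theatre Royal" }
  plays = theatres.flat_map { |t| in_v(t, :VENUE) }
                  .flat_map { |perf| out_v(perf, :PERFORMED) }
  # The "lookahead": keep a play only if a matching WROTE edge exists.
  plays.select do |play|
    in_edges(play, :WROTE).any? do |from, _l, _to, props|
      author = VERTICES[from]
      props[:date] > year &&
        author[:firstname] == "William" && author[:lastname] == "Shakespeare"
    end
  end.uniq
end

p plays_by_bard_after(1608)  # => [:the_tempest]
```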
Synthesis
Gremlin Pipes
Cypher
Neo4j Algorithms
Pacer's own features
More than a Query Language
Focussed on developer happiness
Powerful domain modelling
Does the right thing automatically
Streaming data processing
Irons out differences between GraphDBs
Easy to extend
Designed for the console
Does the Right Thing?
Won't dump massive results
Formats results nicely
Chooses the best index for you
Nests transactions safely
Mocks transactions if unsupported by DB
Type-specific operations in Route definitions
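"Nests transactions safely" can be illustrated with one common approach: count nesting depth, and let only the outermost block begin, commit, or roll back. The `ToyTx` class below is a hypothetical sketch of that idea and makes no claim about how Pacer implements it.

```ruby
# Hypothetical sketch: only the outermost transaction block touches the "database".
class ToyTx
  attr_reader :log

  def initialize
    @depth = 0
    @log = []  # records what the outermost block did
  end

  def transaction
    @depth += 1
    @log << :begin if @depth == 1
    result = yield
    @log << :commit if @depth == 1
    result
  rescue StandardError
    # Inner blocks just re-raise; the outermost one rolls back.
    @log << :rollback if @depth == 1
    raise
  ensure
    @depth -= 1
  end
end

tx = ToyTx.new
tx.transaction { tx.transaction { :inner_work } }
p tx.log  # => [:begin, :commit]
```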
How do you Choose?
Fit for your Data
Fit for your Team
Capabilities
Vendor Lock-in
Cypher v. Pacer
START
theater=node:node_auto_index(theatre = 'Theatre Royal'),
newcastle=node:node_auto_index(city = 'Newcastle'),
bard=node:node_auto_index('firstname:William AND lastname:Shakespeare')
MATCH
(newcastle)
<-[:IN*1..4]-(theater) <-[:VENUE]-(performance)
-[:PERFORMED]->(play) <-[w:WROTE]-(bard)
WHERE w.date > 1608
RETURN play
theatre = graph.v(theatre: 'Theatre Royal').first
newcastle = graph.v(city: 'Newcastle').first
bard = graph.v(firstname: 'William', lastname: 'Shakespeare')
newcastle.repeat(1..4) { |r| r.in(:IN) }.
  is(theatre).in(:VENUE).out(:PERFORMED).
  lookahead do |r|
    r.in_e(:WROTE).where('date > 1608').
      out_v.is(bard)
  end
Gremlin v. Pacer
graph.idx("node_auto_index")([city: "Newcastle"])
.in("IN").loop(1){it.loops <= 4}{it.loops >= 1}
.has("theatre", "Theatre Royal")
.in("VENUE").out("PERFORMED")
.inE("WROTE").has("date", T.gt, 1608)
.outV.has("firstname", "William").has("lastname", "Shakespeare")
.back(5)
graph.v(city: 'Newcastle').
repeat(1..4) { |r| r.in(:IN) }.
filter(theatre: 'Theatre Royal').
in(:VENUE).out(:PERFORMED).
lookahead do |r|
r.in_e(:WROTE).where('date > 1608').
out_v(firstname: 'William', lastname: 'Shakespeare')
end
Cypher v. Pacer (Round 2)
START
crook=node:node_auto_index(name = 'Crook nr 1'),
atm=node:node_auto_index('name:ATM*')
MATCH
p = shortestPath(crook-[*..3]-atm)
RETURN
p
graph.v(name: 'Crook nr 1').path_to(graph.lucene('name:ATM*'))
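Both queries above ask for a shortest path from one vertex to any vertex in a set, and the Cypher version limits it to 3 hops. As a rough stand-in for what that means, here is a bounded breadth-first search in plain Ruby. The `LINKS` data and the `shortest_path` helper are made up for illustration and are not how Neo4j or Pacer compute paths.

```ruby
# Undirected toy graph as an adjacency list (made-up data).
LINKS = {
  "Crook nr 1" => ["Account A", "Account B"],
  "Account A"  => ["Crook nr 1", "Card 1"],
  "Account B"  => ["Crook nr 1", "Account C"],
  "Account C"  => ["Account B", "Card 2"],
  "Card 1"     => ["Account A", "ATM 7"],
  "Card 2"     => ["Account C", "ATM 9"],
  "ATM 7"      => ["Card 1"],
  "ATM 9"      => ["Card 2"]
}

# Breadth-first search: returns the first (hence shortest) path from `start`
# to a vertex accepted by the block, using at most `max_hops` edges, or nil.
def shortest_path(graph, start, max_hops)
  queue = [[start]]
  seen = { start => true }
  until queue.empty?
    path = queue.shift
    node = path.last
    return path if yield(node)
    next if path.length - 1 >= max_hops  # do not extend past the hop limit
    graph.fetch(node, []).each do |neighbour|
      next if seen[neighbour]
      seen[neighbour] = true
      queue << path + [neighbour]
    end
  end
  nil
end

p shortest_path(LINKS, "Crook nr 1", 3) { |n| n.start_with?("ATM") }
# => ["Crook nr 1", "Account A", "Card 1", "ATM 7"]
```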
By Darrick Wiebe