Neo4j
Graph Database
Graph Database?
- Made up of nodes (aka vertices/points) and relationships (aka edges)
Node
Relationship
Graph Database.
- Maps very well to many collections of data: social networks, hierarchies, etc.
- The conceptual data model is the only data model
- Relationships as a first-class concept
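To make "relationships as a first-class concept" concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how Neo4j stores anything; `createGraph`, `addNode`, `relate` and `outgoing` are made-up helper names, and the brewery name is illustrative. The point is only that a relationship is its own record with a type and properties, rather than a foreign key on a node.

```javascript
// Toy in-memory property graph: nodes and relationships are both stored
// as plain records, so a relationship can carry its own type and properties.
function createGraph() {
  const nodes = [];
  const relationships = [];
  let nextId = 0;

  return {
    addNode(labels, properties) {
      const node = { id: nextId++, labels, properties };
      nodes.push(node);
      return node;
    },
    relate(start, type, end, properties = {}) {
      const rel = { id: nextId++, type, start: start.id, end: end.id, properties };
      relationships.push(rel);
      return rel;
    },
    // Follow outgoing relationships of a given type from a node.
    outgoing(node, type) {
      return relationships
        .filter(rel => rel.start === node.id && rel.type === type)
        .map(rel => nodes.find(n => n.id === rel.end));
    }
  };
}

const graph = createGraph();
const lervig = graph.addNode(['brewery'], { name: 'Lervig' }); // illustrative name
const ipa = graph.addNode(['beer', 'ipa'], { name: 'Lervig Rye IPA' });
const brews = graph.relate(lervig, 'brews', ipa);
const brewedNames = graph.outgoing(lervig, 'brews').map(n => n.properties.name);
console.log(brewedNames);
```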
Neo4j
- Written in Java 😣 & Scala
- But! You never need to deal with Java
- Acts as a black-box server in practice, unless you are using a language on the JVM
- You do need to deal with the JVM and its appetite for resources
Neo4j
- Open source-ish (restricted enterprise features)
- Has a comprehensive HTTP REST API
- Awesome web-based console/data explorer
- The most widely and actively used graph database; lots of support available.
- Performance: don't ask me, I write JavaScript for a living
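As a rough idea of what talking to the HTTP API looks like, the sketch below builds (but does not send) a request for the transactional Cypher endpoint found in the 2.x series, `/db/data/transaction/commit`. The helper name `buildCypherRequest` is made up for this example, the `{title}` parameter syntax is the 2.x style, and authentication headers are left out, so treat it as a starting point rather than a complete client.

```javascript
// Build (but do not send) a request for Neo4j's transactional Cypher endpoint.
// Assumption: a 2.x-era server; auth headers are deliberately omitted.
function buildCypherRequest(baseUrl, statement, parameters = {}) {
  return {
    method: 'POST',
    // Strip trailing slashes so the path is not doubled up.
    url: baseUrl.replace(/\/+$/, '') + '/db/data/transaction/commit',
    headers: {
      'Content-Type': 'application/json',
      'Accept': 'application/json; charset=UTF-8'
    },
    // One request may carry several statements; this sketch sends just one.
    body: JSON.stringify({ statements: [{ statement, parameters }] })
  };
}

const request = buildCypherRequest(
  'http://localhost:7474/',
  'MATCH (b:beer { title: {title} }) RETURN b',
  { title: 'Lervig Rye IPA' }
);
console.log(request.url);
```

The resulting object can be handed to whichever HTTP client you prefer.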
Neo4j: Negatives
- Some of the most important features you would want in production are hidden behind the prohibitively expensive enterprise license.
- Hot backups
- Ways to get around this...
- Clustering (Sharding & Replication)
- JVM
- No database segmentation: no good way to share a database (but a few bad ways).
Neo4j as a graph db
- Nodes can be given one or more types ("labels").
(:beer:ipa { name: "Lervig Rye IPA"})
- Has indexing, constraints
- Supports various graph algorithms by default (shortest path, Dijkstra)
- Has its own query language (Cypher) for graph traversal, which has become reasonably mature
- Cypher is the star of the show
Setup:
wget http://neo4j.com/artifact.php?name=neo4j-co...
tar xf neo4j-community-2.3.1-unix.tar.gz
cd neo4j-community-2.3.1
bin/neo4j start
Done!
- Caveat: doesn't work super well with upstart, but you weren't using that anyway, right?
Usage example
Neo4j with Node.js (using the Seraph library which I maintain and co/es6)
let co = require('co');
let seraph = require('seraph/co');
let db = seraph('http://localhost:7474');
co(function *() {
  let jon = yield db.save({ name: 'Jon Packer' }, 'person');
  let brik = yield db.save({ name: 'BRIK Videobase AS' }, 'company');
  let rel = yield db.relate(jon, 'works_at', brik, { for: '4 years' });
  return { jon, brik, rel };
}).then(function(output) {
  console.log(output);
});
Cypher
- Sort of like an SQL for graph databases.
- Except not completely insane
- So far the only implementation is Neo4j's, but they're working to change that
- Familiar if you've ever written SQL and code.
That example again...
This time in Cypher.
CREATE (jon:person { name: 'Jon Packer' })
  -[:works_at { for: '4 years' }]->
  (brik:company { name: 'BRIK Videobase AS' })
* looks even better when you don't need to split it over 3 lines!
Cypher
- Declarative graph query language
- Before it, all the popular graph traversal methods used imperative languages
- Reasonably simple language and syntax; borrows much from SQL for familiarity
Cypher vs. SQL
- Query to get brewery, beer and stock level of Lervig Rye IPA at Bergen Bystasjonen.
SQL:
SELECT *
FROM beers
INNER JOIN breweries ON breweries.brewery_id = beers.brewery_id
INNER JOIN beer_stock ON beer_stock.beer_id = beers.beer_id
INNER JOIN stores ON beer_stock.store_id = stores.store_id
WHERE beers.beer_title = 'Lervig Rye IPA'
AND stores.store_name = 'Bergen, Bergen Storsenter Vinmonopol'
Cypher:
MATCH (ipa:beer { title: 'Lervig Rye IPA' })<-[:brews]-(lervig:brewery),
  (ipa)-[stock:in_stock]->(store:store { name: 'Bergen, Bergen Storsenter Vinmonopol' })
RETURN *
Cypher: MATCH
- The MATCH statement starts a query/traversal and specifies a subset of the graph to start with
MATCH (ipa:beer { title: 'Lervig Rye IPA' })
- Here ipa is the identifier, beer is the label, and { title: 'Lervig Rye IPA' } is the predicate
- Could also be written with a WHERE, which gives more flexibility but worse performance
MATCH (ipa:beer)
WHERE ipa.title = 'Lervig Rye IPA'
OR ipa.title = 'Lervig Galaxy IPA'
RETURN ipa
Cypher: MATCH
- MATCH can specify many different types of nodes and relationships.
MATCH (:brewery)-[:brews]->(:beer)-[:brewed_in]->(:country)
- Here [:brews] and [:brewed_in] are relationships; the arrow (->) gives their directionality
- Like nodes, relationships can specify a predicate
MATCH (b:beer)-[stock:in_stock { quantity: 25 }]-(s:store)
Cypher: WHERE
- WHERE must immediately follow a selector clause like MATCH, and further reduces that selection.
MATCH (beer:beer)-[:has_style]->(:style { name: 'India Pale Ale (IPA)' }),
  (beer)-[stockLevel:in_stock]->(store:store)
WHERE store.name =~ 'Bergen.*'
AND beer.ratebeerWeightedAverage > 3.9
RETURN *
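One detail of the query above that is easy to trip over: Cypher's `=~` has to match the whole string, which is why the pattern is 'Bergen.*' rather than just 'Bergen'. The sketch below mimics that rule with a JavaScript RegExp. `cypherRegexMatch` is a made-up helper, the second store name is invented for illustration, and Neo4j actually uses Java regular expressions, so the syntax is similar but not identical.

```javascript
// Mimic Cypher's `=~`: the pattern must match the entire string,
// so the pattern is wrapped in ^(?:...)$ before testing.
function cypherRegexMatch(value, pattern) {
  return new RegExp('^(?:' + pattern + ')$').test(value);
}

const storeNames = [
  'Bergen, Bergen Storsenter Vinmonopol',
  'Stavanger, Example Store' // invented for illustration
];

// Roughly what WHERE store.name =~ 'Bergen.*' selects.
const bergenStores = storeNames.filter(name => cypherRegexMatch(name, 'Bergen.*'));
// Without the trailing .* nothing matches, because the match is not partial.
const noPartialMatch = storeNames.filter(name => cypherRegexMatch(name, 'Bergen'));
console.log(bergenStores, noPartialMatch);
```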
Cypher: CREATE
- CREATE will create a new pattern in the graph. Every node and relationship will be created, regardless of whether or not something similar already exists.
CREATE (beer:beer { title: 'I made this up' })
<-[:brews]-(:brewery { name: 'Monadic Ale' })
- If there was already a "Monadic Ale" brewery, now there are two...
Cypher: MERGE
- MERGE is like MATCH | CREATE: it will create the entire pattern if it does not find it in the graph
MERGE (beer:beer { title: 'I made this up' })
<-[:brews]-(:brewery { name: 'Monadic Ale' })
Cypher: MERGE
- MERGE can be used in conjunction with MATCH, to create or match part of a graph
MATCH (monadic:brewery { name: 'Monadic Ale' })
MERGE (monadic)-[:brews]->(beer:beer { title: 'Katajanjoulu' })
- The MERGE will only do something if the MATCH matched a :brewery
- If there is already a beer "Katajanjoulu" brewed by "Monadic Ale", this will do nothing
- If we didn't do the match first, and the brewery already existed, a duplicate brewery would be created: MERGE either matches the entire pattern or creates it.
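The "entire pattern or nothing" rule is easier to see in a toy model. The sketch below is plain JavaScript, not Neo4j code: `createPattern`, `mergePattern` and `matchThenMerge` are made-up functions that only imitate the behaviour described above for a (brewery)-[:brews]->(beer) pattern, and 'Some other beer' is an invented title.

```javascript
// Toy model of CREATE vs. MERGE on a (brewery)-[:brews]->(beer) pattern.
// Not Neo4j code: it only mimics the "whole pattern or nothing" rule.
const breweries = [];   // { name }
const beers = [];       // { title }
const brewsRels = [];   // { brewery, beer } object references

// CREATE: always adds every part of the pattern.
function createPattern(breweryName, beerTitle) {
  const brewery = { name: breweryName };
  const beer = { title: beerTitle };
  breweries.push(brewery);
  beers.push(beer);
  brewsRels.push({ brewery, beer });
}

// MERGE on the whole pattern: do nothing if the full pattern exists,
// otherwise create every part of it, including parts that already exist.
function mergePattern(breweryName, beerTitle) {
  const found = brewsRels.some(
    rel => rel.brewery.name === breweryName && rel.beer.title === beerTitle
  );
  if (!found) createPattern(breweryName, beerTitle);
}

// MATCH the brewery first, then MERGE only the relationship and the beer.
function matchThenMerge(breweryName, beerTitle) {
  for (const brewery of breweries.filter(b => b.name === breweryName)) {
    const found = brewsRels.some(
      rel => rel.brewery === brewery && rel.beer.title === beerTitle
    );
    if (!found) {
      const beer = { title: beerTitle };
      beers.push(beer);
      brewsRels.push({ brewery, beer });
    }
  }
}

createPattern('Monadic Ale', 'I made this up');
matchThenMerge('Monadic Ale', 'Katajanjoulu');   // adds a beer only
matchThenMerge('Monadic Ale', 'Katajanjoulu');   // does nothing the second time
const breweriesAfterMatchMerge = breweries.length;
mergePattern('Monadic Ale', 'Some other beer');  // duplicates the brewery
const breweriesAfterWholeMerge = breweries.length;
console.log(breweriesAfterMatchMerge, breweriesAfterWholeMerge);
```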
Cypher: CREATE UNIQUE
- CREATE UNIQUE is the terser version of what we just did:
CREATE UNIQUE (:brewery { name: 'Monadic Ale' })
-[:brews]->(beer:beer { title: 'Katajanjoulu' })
- Assuming our Brewery already exists in the graph, a duplicate will not be created
- Only the parts that do not already exist will be created
- Will throw an error if there is ambiguity
Cypher: ..UD
- Various other commands exist that work in predictable ways, such as:
- SET - update a property
- REMOVE - remove a property
- DELETE - delete a node (error if relationships)
- DETACH DELETE - delete a node and relationships
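The difference between DELETE and DETACH DELETE can be imitated in a few lines. This is a toy model in plain JavaScript, not Neo4j code; `deleteNode` and `detachDeleteNode` are made-up names, and the error message is invented rather than copied from Neo4j.

```javascript
// Toy model of DELETE vs. DETACH DELETE; not Neo4j code.
const storedNodes = new Set(['Monadic Ale', 'Katajanjoulu']);
let storedRels = [{ start: 'Monadic Ale', type: 'brews', end: 'Katajanjoulu' }];

const touches = (rel, node) => rel.start === node || rel.end === node;

// DELETE: refuses to remove a node that still has relationships.
function deleteNode(node) {
  if (storedRels.some(rel => touches(rel, node))) {
    throw new Error('Cannot delete ' + node + ': it still has relationships');
  }
  storedNodes.delete(node);
}

// DETACH DELETE: removes the node's relationships first, then the node.
function detachDeleteNode(node) {
  storedRels = storedRels.filter(rel => !touches(rel, node));
  storedNodes.delete(node);
}

let deleteError = null;
try {
  deleteNode('Monadic Ale');       // fails: the :brews relationship is still there
} catch (err) {
  deleteError = err;
}
detachDeleteNode('Monadic Ale');   // succeeds and drops the relationship too
console.log(deleteError.message, storedNodes.size, storedRels.length);
```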
Cypher: RETURN
- RETURN declares what will be output from your query
- Various transformations can be done on the data selected by your query before returning it
- Here are a few examples. Output is shown as JS objects read by Seraph in Node.
Cypher: COLLECT
- COLLECT aggregates many rows into a collection. This works particularly well for something like a one-to-many relationship:
MATCH (veholt:brewery { name: 'Veholt Mikrobryggeri' })
-[:brews]->(beer:beer)
RETURN veholt, COLLECT(beer.title) as beers
Results in
[ { veholt: { name: 'Veholt Mikrobryggeri', id: 3385 },
beers: [ 'Veholt Humlehelvete Double IPA Originalen',
'Veholt Jimmy Red' ] } ]
If we didn't use COLLECT:
[ { veholt: { name: 'Veholt Mikrobryggeri', id: 3385 },
beer: 'Veholt Humlehelvete Double IPA Originalen' },
{ veholt: { name: 'Veholt Mikrobryggeri', id: 3385 },
beer: 'Veholt Jimmy Red' } ]
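For comparison, this is roughly the grouping you would have to write by hand in Node if the query returned one row per beer. It is a sketch only: `collectBeers` is a made-up helper, and it assumes rows shaped like the uncollected output above.

```javascript
// Rows as they would come back without COLLECT: one row per beer.
const rows = [
  { veholt: { name: 'Veholt Mikrobryggeri', id: 3385 },
    beer: 'Veholt Humlehelvete Double IPA Originalen' },
  { veholt: { name: 'Veholt Mikrobryggeri', id: 3385 },
    beer: 'Veholt Jimmy Red' }
];

// Group rows by brewery id and gather the beer titles,
// which is what COLLECT(beer.title) does inside the query.
function collectBeers(inputRows) {
  const byId = new Map();
  for (const row of inputRows) {
    if (!byId.has(row.veholt.id)) {
      byId.set(row.veholt.id, { veholt: row.veholt, beers: [] });
    }
    byId.get(row.veholt.id).beers.push(row.beer);
  }
  return Array.from(byId.values());
}

const collected = collectBeers(rows);
console.log(JSON.stringify(collected, null, 2));
```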
Cypher: Lists
- Various functions exist for working with lists:
- EXTRACT (usually called map elsewhere)
- REDUCE
- FILTER
- These functions (COLLECT included) do not have to be used as part of a RETURN clause; they can be used in various places in a query, for example in a WHERE.
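If you know JavaScript's array methods, the list functions map across fairly directly. The sketch below shows the rough correspondence, with the Cypher form (as written in the 2.x series) in the comments; the stock quantities are made-up sample data.

```javascript
const quantities = [25, 0, 12];   // made-up stock levels

// EXTRACT(q IN quantities | q * 2)            is roughly Array.prototype.map
const doubled = quantities.map(q => q * 2);

// FILTER(q IN quantities WHERE q > 0)         is roughly Array.prototype.filter
const inStock = quantities.filter(q => q > 0);

// REDUCE(total = 0, q IN quantities | total + q) is roughly Array.prototype.reduce
const total = quantities.reduce((sum, q) => sum + q, 0);

console.log(doubled, inStock, total);
```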
Web Console
Finn.
Neo4j
By jonpacker