Neo4j

Graph Database

Graph Database?

Made up of nodes (aka vertices/points) and relationships (aka edges)

Node

Relationship

Graph Database.

Maps very well to many collections of data: social networks, hierarchies, etc.
The conceptual data model is the only data model
Relationships as a first-class concept
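
To make "relationships as a first-class concept" a bit more concrete, here is a minimal in-memory sketch in plain JavaScript. The names (nodes, relationships, outgoing) are made up for illustration and say nothing about how Neo4j actually stores data; the point is only that a relationship is a record of its own, with a type, a direction and properties.

```javascript
// Hypothetical in-memory property graph; not Neo4j's storage model.
const nodes = [
  { id: 1, labels: ['person'], props: { name: 'Jon' } },
  { id: 2, labels: ['company'], props: { name: 'BRIK Videobase AS' } }
];

// A relationship is its own record: type, direction (from -> to), properties.
const relationships = [
  { id: 10, type: 'works_at', from: 1, to: 2, props: { for: '4 years' } }
];

// Follow the outgoing relationships of a given type from a node.
function outgoing(nodeId, type) {
  return relationships
    .filter(rel => rel.from === nodeId && rel.type === type)
    .map(rel => ({ rel, target: nodes.find(n => n.id === rel.to) }));
}

const jobs = outgoing(1, 'works_at');
console.log(jobs[0].target.props.name, jobs[0].rel.props.for);
```

No join table and no foreign key: the traversal just follows the relationship records.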

Neo4j

Written in Java 😣 & Scala
- But! You never need to deal with Java
- Acts as a black-box server in practice, unless you are using a language on the JVM
- You do need to deal with the JVM and its appetite for resources

Neo4j

Open source-ish (restricted enterprise features)
Has a comprehensive HTTP REST API
Awesome web-based console/data explorer
The most widely and actively used graph database, lots of support available.
Performance: don't ask me, I write javascript for a living

Neo4j: Negatives

Some of the most important features you would want in production are hidden behind the prohibitively expensive enterprise license.
- Hot backups
  - Ways to get around this...
- Clustering (Sharding & Replication)
JVM
No database segmentation: no good way to share a database (but a few bad ways).

Neo4j as a graph db

Nodes can be given one or more types ("labels").

(:beer:ipa { name: "Lervig Rye IPA"})

Has indexing, constraints
Supports various graph algorithms by default (shortest path, dijkstra)
Has its own query language (Cypher) for graph traversal, which has become reasonably mature
Cypher is the star of the show

Setup:

wget http://neo4j.com/artifact.php?name=neo4j-co...
tar xf neo4j-community-2.3.1-unix.tar.gz
cd neo4j-community-2.3.1
bin/neo4j start

Done!

Caveat: doesn't work super well with upstart, but you weren't using that anyway right?

Usage example

Neo4j with Node.js, using the Seraph library (which I maintain) together with co and ES6

let co = require('co');
let seraph = require('seraph/co');
let db = seraph('http://localhost:7474');

co(function *() {
  // Save two nodes, each with a label
  let jon = yield db.save({ name: 'Jon Packer' }, 'person');
  let brik = yield db.save({ name: 'BRIK Videobase AS' }, 'company');
  // Relate them, with a property on the relationship itself
  let rel = yield db.relate(jon, 'works_at', brik, { for: '4 years' });
  return { jon, brik, rel };
}).then(function(output) {
  console.log(output);
});

Cypher

Sort of like an SQL for graph databases.
- Except not completely insane
So far the only implementation is Neo4j's, but they're working to change that
Familiar if you've ever written SQL and code.

That example again...

This time in Cypher.

CREATE (jon:person { name: 'Jon' })
         -[:works_at { for: '4 years' }]->
         (brik:company { name: 'BRIK Videobase AS' })

* looks even better when you don't need to split it over 3 lines!

Cypher

Declarative graph query language
Before it, all the popular graph traversal methods were using imperative languages
Reasonably simple language and syntax—borrows much from SQL for familiarity

Cypher vs. SQL

Query to get the brewery, beer and stock level of Lervig Rye IPA at the Bergen Storsenter Vinmonopol.

SQL:

SELECT *
FROM beers
INNER JOIN breweries ON breweries.brewery_id = beers.brewery_id
INNER JOIN beer_stock ON beer_stock.beer_id = beers.beer_id
INNER JOIN stores ON beer_stock.store_id = stores.store_id
WHERE beers.beer_title = 'Lervig Rye IPA'
AND stores.store_name = 'Bergen, Bergen Storsenter Vinmonopol'

Cypher:

MATCH (ipa:beer { title: 'Lervig Rye IPA' })<-[:brews]-(lervig:brewery),
      (ipa)-[stock:in_stock]->(store:store { name: 'Bergen, Bergen Storsenter Vinmonopol' })
RETURN *

Cypher: MATCH

The MATCH clause starts a query/traversal and specifies the subset of the graph to start with

MATCH (ipa:beer { title: 'Lervig Rye IPA' })

In this pattern, ipa is the identifier, beer is the label, and { title: 'Lervig Rye IPA' } is the predicate.

The same match could also be written with a WHERE, which gives more flexibility but can perform worse

MATCH (ipa:beer)
WHERE ipa.title = 'Lervig Rye IPA'
OR ipa.title = 'Lervig Galaxy IPA'
RETURN ipa

Cypher: MATCH

MATCH can specify many different types of nodes and relationships.

MATCH (:brewery)-[:brews]->(:beer)-[:brewed_in]->(:country)

Here [:brews] and [:brewed_in] are relationships, and the arrows (->) give their directionality.

Like nodes, relationships can specify a predicate

MATCH (b:beer)-[stock:in_stock { quantity: 25 }]-(s:store)

Cypher: WHERE

WHERE must immediately follow a selector clause like MATCH, and further reduces that selection.

MATCH (beer:beer)-[:has_style]->(:style { name: 'India Pale Ale (IPA)' }),
      (beer)-[stockLevel:in_stock]->(store:store)
WHERE store.name =~ 'Bergen.*'
AND beer.ratebeerWeightedAverage > 3.9
RETURN *

Cypher: CREATE

CREATE will create a new pattern in the graph. Every node and relationship will be created, regardless of whether or not something similar already exists.

CREATE (beer:beer { title: 'I made this up' })
        <-[:brews]-(:brewery { name: 'Monadic Ale' })

If there was already a "Monadic Ale" brewery, now there are two...

Cypher: MERGE

MERGE is like MATCH | CREATE: it matches the pattern if it exists in the graph, and creates the entire pattern if it does not

MERGE (beer:beer { title: 'I made this up' })
        <-[:brews]-(:brewery { name: 'Monadic Ale' })

Cypher: MERGE

MERGE can be used in conjunction with MATCH, to create or match part of a graph

MATCH (monadic:brewery { name: 'Monadic Ale' })
MERGE (monadic)-[:brews]->(beer:beer { title: 'Katajanjoulu' })

The MERGE will only do something if the MATCH matched a :brewery
If there is already a beer "Katajanjoulu" brewed by "Monadic Ale", this will do nothing
If we didn't do the match first, and the brewery already existed, a duplicate brewery would be created: MERGE either matches the entire pattern or creates it.
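
The all-or-nothing behaviour is easy to trip over, so here is a rough in-memory sketch of it in plain JavaScript. Everything in it (mergeGraph, createNode, createRel, mergeBrewsPattern) is hypothetical and only models the semantics described above; it is not how Neo4j implements MERGE.

```javascript
// Hypothetical in-memory graph; a sketch of the semantics, not Neo4j's implementation.
const mergeGraph = { nodes: [], rels: [] };
let nextId = 1;

function createNode(label, props) {
  const node = { id: nextId++, label, props };
  mergeGraph.nodes.push(node);
  return node;
}

function createRel(from, type, to) {
  const rel = { id: nextId++, from: from.id, type, to: to.id };
  mergeGraph.rels.push(rel);
  return rel;
}

const findNode = (id) => mergeGraph.nodes.find(n => n.id === id);

// Models MERGE (:brewery { name })-[:brews]->(:beer { title }):
// match the whole pattern, or create the whole pattern.
function mergeBrewsPattern(breweryName, beerTitle) {
  const existing = mergeGraph.rels.find(rel => {
    const from = findNode(rel.from);
    const to = findNode(rel.to);
    return rel.type === 'brews' &&
      from.label === 'brewery' && from.props.name === breweryName &&
      to.label === 'beer' && to.props.title === beerTitle;
  });
  if (existing) return existing;
  // No full match: every part is created, even if the brewery already exists.
  const brewery = createNode('brewery', { name: breweryName });
  const beer = createNode('beer', { title: beerTitle });
  return createRel(brewery, 'brews', beer);
}

createNode('brewery', { name: 'Monadic Ale' });   // the brewery exists, the beer does not
mergeBrewsPattern('Monadic Ale', 'Katajanjoulu'); // no full match: a second brewery appears
mergeBrewsPattern('Monadic Ale', 'Katajanjoulu'); // full match this time: nothing is created
const breweryCount = mergeGraph.nodes.filter(n => n.label === 'brewery').length;
console.log(breweryCount);
```

Binding the brewery with a MATCH first is what stops the second brewery from being created.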

Cypher: CREATE UNIQUE

CREATE UNIQUE is the terser version of what we just did:

CREATE UNIQUE (:brewery { name: 'Monadic Ale' })
   -[:brews]->(beer:beer { title: 'Katajanjoulu' })

Assuming our Brewery already exists in the graph, a duplicate will not be created
Only the parts that do not already exist will be created
Will throw an error if there is ambiguity

Cypher: ..UD

Various other commands exist that work in predictable ways, such as:
- SET - update a property
- REMOVE - remove a property
- DELETE - delete a node (error if relationships)
- DETACH DELETE - delete a node and relationships
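
The difference between DELETE and DETACH DELETE can be sketched in a few lines of plain JavaScript. The store object and both functions are hypothetical stand-ins for the behaviour listed above, not Neo4j code.

```javascript
// Hypothetical in-memory graph; illustrates the semantics only.
const store = {
  nodes: new Map([[1, { name: 'Monadic Ale' }], [2, { title: 'Katajanjoulu' }]]),
  rels: [{ type: 'brews', from: 1, to: 2 }]
};

const touches = (rel, id) => rel.from === id || rel.to === id;

// DELETE: refuses to remove a node that still has relationships.
function deleteNode(id) {
  if (store.rels.some(rel => touches(rel, id))) {
    throw new Error(`node ${id} still has relationships`);
  }
  store.nodes.delete(id);
}

// DETACH DELETE: removes the node's relationships first, then the node.
function detachDeleteNode(id) {
  store.rels = store.rels.filter(rel => !touches(rel, id));
  store.nodes.delete(id);
}

let deleteFailed = false;
try {
  deleteNode(2); // the beer is still brewed by the brewery
} catch (err) {
  deleteFailed = true;
}
detachDeleteNode(2);
console.log(deleteFailed, store.nodes.size, store.rels.length);
```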

Cypher: RETURN

RETURN declares what will be output from your query
Various transformations can be done on the data selected by your query before returning it
Here are a few examples. Output is shown as JS objects read by Seraph in Node.

Cypher: COLLECT

COLLECT aggregates many rows into a collection. This works particularly well for something like a one-to-many relationship:

MATCH (veholt:brewery { name: 'Veholt Mikrobryggeri' })
        -[:brews]->(beer:beer) 
RETURN veholt, COLLECT(beer.title) as beers

Results in

[ { veholt: { name: 'Veholt Mikrobryggeri', id: 3385 },
    beers: [ 'Veholt Humlehelvete Double IPA Originalen',
             'Veholt Jimmy Red' ] } ]

If we didn't use COLLECT:

[ { veholt: { name: 'Veholt Mikrobryggeri', id: 3385 },
    beer: 'Veholt Humlehelvete Double IPA Originalen' },
  { veholt: { name: 'Veholt Mikrobryggeri', id: 3385 },
    beer: 'Veholt Jimmy Red' } ]
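
If it helps to think of it in JavaScript terms, COLLECT behaves roughly like a group-by that gathers values into an array. The sketch below takes the uncollected rows shown above and produces the collected shape; collectBeers is a hypothetical helper, not part of Seraph or Neo4j.

```javascript
// Rows as they come back without COLLECT (one row per beer).
const rows = [
  { veholt: { name: 'Veholt Mikrobryggeri', id: 3385 },
    beer: 'Veholt Humlehelvete Double IPA Originalen' },
  { veholt: { name: 'Veholt Mikrobryggeri', id: 3385 },
    beer: 'Veholt Jimmy Red' }
];

// Group rows by brewery id and gather the beer titles into one list,
// which is roughly what COLLECT(beer.title) does per grouping key.
function collectBeers(input) {
  const grouped = new Map();
  for (const { veholt, beer } of input) {
    if (!grouped.has(veholt.id)) grouped.set(veholt.id, { veholt, beers: [] });
    grouped.get(veholt.id).beers.push(beer);
  }
  return Array.from(grouped.values());
}

const collected = collectBeers(rows);
console.log(JSON.stringify(collected, null, 2));
```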

Cypher: Lists

Various functions exist for working with lists:
- EXTRACT (usually called map elsewhere)
- REDUCE
- FILTER
These functions do not have to be used as part of a RETURN clause: they can be used in various places in a query, for example in a WHERE. COLLECT is an aggregate, so it can appear in a WITH or RETURN but not directly in a WHERE.
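
For anyone coming from JavaScript, these map fairly directly onto the array methods you already know. The sketch below shows the JS equivalents, with the corresponding Cypher expression in a comment above each; the beers data (ratings and stock levels) is made up for illustration.

```javascript
// Made-up sample data, standing in for a list of beer nodes.
const beers = [
  { title: 'Lervig Rye IPA', rating: 4.0, stock: 25 },
  { title: 'Lervig Galaxy IPA', rating: 3.8, stock: 10 },
  { title: 'Veholt Jimmy Red', rating: 3.5, stock: 5 }
];

// EXTRACT(b IN beers | b.title) is roughly map
const titles = beers.map(b => b.title);

// FILTER(b IN beers WHERE b.rating > 3.9) is roughly filter
const highlyRated = beers.filter(b => b.rating > 3.9);

// REDUCE(total = 0, b IN beers | total + b.stock) is roughly reduce
const totalStock = beers.reduce((total, b) => total + b.stock, 0);

console.log(titles, highlyRated.length, totalStock);
```

Note that REDUCE puts the accumulator and its initial value first, where JavaScript's reduce takes the initial value last.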

Web Console

Finn.

jonpacker
