Intro to Graph Databases and Neo4j
About Me
@dabernathy89
What's a graph?
Nodes
And relationships between nodes
Property Graph Model
Node: Animal
  "name" : "Banana",
  "color" : "orange"
Node: Furnishing
  "type" : "carpet"
Relationship: Scratched (Animal to Furnishing)
  "timestamp" : 1417126199
Relationship: Barfed On (Animal to Furnishing)
  "timestamp" : 1417136235
- Nodes and relationships have properties
- Relationships have names and a direction
- Any number of relationships between nodes
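To make the model concrete, here is a minimal Python sketch of the same example. The `Node` and `Relationship` classes are hypothetical names invented for illustration, not Neo4j's API, and the direction (Animal to Furnishing) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    name: str    # relationships have a name...
    start: Node  # ...and a direction: start -> end
    end: Node
    properties: dict = field(default_factory=dict)


banana = Node("Animal", {"name": "Banana", "color": "orange"})
carpet = Node("Furnishing", {"type": "carpet"})

# Any number of relationships may connect the same pair of nodes.
# The Animal -> Furnishing direction is assumed for this sketch.
relationships = [
    Relationship("Scratched", banana, carpet, {"timestamp": 1417126199}),
    Relationship("Barfed On", banana, carpet, {"timestamp": 1417136235}),
]

for rel in relationships:
    print(f"{rel.start.properties['name']} -[{rel.name}]-> {rel.end.properties['type']}")
```

Both nodes and relationships carry their own property dictionaries, which is the core of the property graph model.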
Querying with Cypher
Parentheses = nodes
(user)
Arrows and square brackets = relationships
-[:has]->
Together
(user)-[:has]->(role)
Identifiers
(foo:User)-[bar:has]->()
- Let you refer to a node or relationship later in the query
Labels
- Let you differentiate between kinds of nodes
- A node can have multiple labels
(creature:Dog)
Relationship types
- Every relationship has exactly one type
[:purchased]
Properties
(user {name:"Daniel"}) -[:purchased {timestamp: 1417631772}]->
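As a rough illustration of what a pattern like the one above asks for, here is a small Python sketch that matches "a start node with these properties, followed by a relationship of this type". It is not how Cypher is implemented. The `match` function, the second user, the `viewed` type, and the book node are all hypothetical.

```python
# Each relationship is stored as (start_node, type, properties, end_node).
daniel = {"name": "Daniel"}
alice = {"name": "Alice"}           # hypothetical second user
book = {"title": "Graph Databases"}  # hypothetical purchased item

relationships = [
    (daniel, "purchased", {"timestamp": 1417631772}, book),
    (alice, "viewed", {"timestamp": 1417631800}, book),
]


def match(relationships, start_props, rel_type):
    """Return (start, end) pairs for (start {props})-[:rel_type]->(end)."""
    results = []
    for start, kind, _props, end in relationships:
        has_props = all(k in start and start[k] == v for k, v in start_props.items())
        if has_props and kind == rel_type:
            results.append((start, end))
    return results


# Roughly: MATCH (user {name:"Daniel"})-[:purchased]->(item)
found = match(relationships, {"name": "Daniel"}, "purchased")
```

Only Daniel's purchase matches here, because the other relationship has a different start node and type.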
NoSQL
Graph databases are part of the NoSQL family
... but they're pretty different.
NoSQL
2 main categories of NoSQL databases:
Aggregate-Oriented
- Key/Value
- Column Family
- Document Store
Graph Databases
When people talk about NoSQL, they're usually referring to the aggregate-oriented databases.
Graph vs Relational Databases
Let's build a to-do list app
- Users
- Tasks
- Users can be friends
- Users can create a task
- Users can assign tasks to friends or themselves
To-do List: Relational
Relationships
- Relationships between tables are expressed with joins and may require a pivot table
- Example: get the tasks owned by Daniel, and the users assigned to those tasks:
SELECT tasks.id, tasks.description, users.name AS owner, users2.name AS assignee
FROM tasks
JOIN users ON users.id = tasks.creator_id
JOIN users AS users2 ON users2.id = tasks.assignee_id
WHERE users.name = 'Daniel';
- “join-intensive query performance deteriorates as the dataset gets bigger” [1]
[1] "Graph Databases" e-book, O'Reilly, p. 8
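To try the relational version end to end, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout is an assumption reconstructed from the column names in the query, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: only the columns the query refers to.
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tasks (
        id INTEGER PRIMARY KEY,
        description TEXT,
        creator_id INTEGER REFERENCES users(id),
        assignee_id INTEGER REFERENCES users(id)
    );
    INSERT INTO users (id, name) VALUES (1, 'Daniel'), (2, 'Alice');
    INSERT INTO tasks (id, description, creator_id, assignee_id) VALUES
        (1, 'Write slides', 1, 1),
        (2, 'Review slides', 1, 2),
        (3, 'Book venue', 2, 1);
""")

# The users table is joined twice: once for the owner, once for the assignee.
rows = conn.execute(
    """
    SELECT tasks.id, tasks.description, users.name AS owner, users2.name AS assignee
    FROM tasks
    JOIN users ON users.id = tasks.creator_id
    JOIN users AS users2 ON users2.id = tasks.assignee_id
    WHERE users.name = ?
    ORDER BY tasks.id
    """,
    ("Daniel",),
).fetchall()

for row in rows:
    print(row)
```

Every extra hop (friends of assignees, for example) would need another join against `users`, which is where join-heavy queries start to get expensive.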
To-do List: Graph
Stored the same way you might describe your data naturally
Reflects your application's domain
Relationships
- Relationships between nodes are "first-class citizens"
  - just as important as nodes
- Allows for "index-free adjacency"
  - nodes point directly to connected nodes
- Same query as before: find the tasks Daniel owns and the users assigned to them:
MATCH (:User {name:"Daniel"})-[:OWNS]->(task),
(task)<-[:IS_ASSIGNED_TO]-(assignee)
RETURN task, assignee
Even with many 'joins', query performance depends only on the portion of the graph that is searched, not on the overall size of the graph.
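Index-free adjacency can be sketched in a few lines of Python: each node keeps direct references to its neighbours, so following a relationship is a pointer hop rather than a table lookup. This is a toy model with hypothetical names (`Node`, `relate`) and made-up data, not Neo4j's storage format.

```python
class Node:
    def __init__(self, **properties):
        self.properties = properties
        self.outgoing = {}  # relationship type -> list of target nodes
        self.incoming = {}  # relationship type -> list of source nodes


def relate(start, rel_type, end):
    """Record the relationship on both endpoints, so no lookup table is needed."""
    start.outgoing.setdefault(rel_type, []).append(end)
    end.incoming.setdefault(rel_type, []).append(start)


daniel = Node(name="Daniel")
alice = Node(name="Alice")
slides = Node(description="Write slides")
review = Node(description="Review slides")

relate(daniel, "OWNS", slides)
relate(daniel, "OWNS", review)
relate(daniel, "IS_ASSIGNED_TO", slides)
relate(alice, "IS_ASSIGNED_TO", review)

# Roughly: (daniel)-[:OWNS]->(task)<-[:IS_ASSIGNED_TO]-(assignee)
results = [
    (task.properties["description"], assignee.properties["name"])
    for task in daniel.outgoing.get("OWNS", [])
    for assignee in task.incoming.get("IS_ASSIGNED_TO", [])
]
print(results)
```

The traversal only ever touches Daniel, his tasks, and their assignees. Nodes that are not connected to them are never looked at.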
Performance
[Image: graph search diagram]
This graph search...
[Image: graph search diagram]
... has the same performance as this one
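The idea can be checked with a toy experiment: count how many nodes a local search touches in a small graph and in a much larger one. This is only a sketch with made-up data and hypothetical helper names. Real-world timings also depend on caching, storage, and configuration.

```python
def build_graph(extra_nodes):
    """Adjacency lists: node id -> ids of directly connected nodes."""
    graph = {"daniel": ["task1", "task2"], "task1": ["alice"], "task2": [], "alice": []}
    # Add unrelated nodes, chained together but never linked to "daniel".
    for i in range(extra_nodes):
        graph[f"other{i}"] = [f"other{i + 1}"] if i + 1 < extra_nodes else []
    return graph


def count_visited(graph, start, max_depth):
    """Breadth-first walk from start; return how many nodes were touched."""
    visited = {start}
    frontier = [start]
    for _ in range(max_depth):
        next_frontier = []
        for node in frontier:
            for neighbour in graph[node]:
                if neighbour not in visited:
                    visited.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return len(visited)


small = build_graph(extra_nodes=0)
large = build_graph(extra_nodes=100_000)

print(count_visited(small, "daniel", 2), count_visited(large, "daniel", 2))
```

Both searches touch the same four nodes, even though the second graph holds 100,000 more.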
Graph advantages summarized:
- Graph databases are a natural representation of your data.
- Graph query languages like Cypher are extremely readable.
- Graphs make complex questions simpler to ask.
- When dealing with many relationships, performance can be improved significantly over both relational and aggregate-oriented NoSQL databases.
- They're fun, especially with visualizations.
So let's switch all our apps to graph databases!
Hold your horses.
Considerations
- Don't switch just because you think it'll get you better performance
- Have you really optimized your current system?
- Getting peak performance in graph databases can also require work, such as:
  - modeling your data so queries are limited in scope
  - optimizing queries
  - using Java-based extensions to Neo4j
  - configuring load balancing
  - queueing or batching writes
- Take time to investigate the use cases that graph databases excel at
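As one example of the tuning work listed above, batching writes can be sketched like this. `send_batch` is a hypothetical placeholder for whatever client call your app uses. It is not a Neo4j API, and the batch size of 3 is arbitrary.

```python
def batches(items, size):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


sent = []  # stands in for the database; each entry is one round trip


def send_batch(rows):
    # Hypothetical placeholder: a real app would run one write query per batch here.
    sent.append(list(rows))


pending_writes = [{"description": f"task {i}"} for i in range(7)]
for chunk in batches(pending_writes, size=3):
    send_batch(chunk)

print(len(sent), [len(batch) for batch in sent])
```

Seven writes become three round trips instead of seven. The right batch size depends on your data and setup.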
PHP
- REST API wrappers
- ORM
- More at Neo4j
Resources
Videos
References / Training
Books
- "Graph Databases" - O'Reilly
- "Learning Neo4j" - Packt
PHP/Neo4j People
- Michelle Sanver
- Christophe Willemsen
- Ed Finkler
Intro to Graph Databases for PHP Developers
By Daniel Abernathy