Graphs and Neo4j - from Hydropower Plants to PCBs
#froscon @hannelita

Hi!
- Computer Engineering
- Programming
- Electronics
- Mathematics
- Physics
- Lego
- Meetups
- Coffee
- GIFs

@hannelita

Disclaimer
This content represents the speaker's personal overview.
Feedback (positive or not) accepted here: hannelita@gmail.com
Not all numeric data or names are real, in order to protect company secrets.
Modelling cases are real; apologies for the flood of technical terms.
Disclaimer
This talk is based on Neo4j 2.x; Version 3.x was released recently.
Structure - Cases
- Use case context
- Modelling with relational databases (and fails)
- Graph modelling
- Evolving the model
- Epic fails
Final Considerations
- Main benefits
- Support tools

Have you ever been left in the dark?
Running out of energy
Candle lights <3

Brazil is a huge producer of electrical energy.

Especially from hydropower plants

Case 1 - Context


How do we distribute electrical energy? How are the power plants distributed?

http://sigel.aneel.gov.br/sigel.html

Accessed on 28/3/2016

Map information
- Power plant location
- Transmission lines
- Supply capacity
- Total capacity
- Nearby cities
- Distribution
- Boundaries / States
- Hydrographic basin
- Dealers
Electrical
Political
Environmental
Economy

Challenge
Build a system that stores all this information and how the data is related.

Case 1 - Modelling with relational databases

Electrical
Political
Environmental
Economy
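CREATE TABLE power_plant (id INTEGER PRIMARY KEY);
CREATE TABLE city (id INTEGER PRIMARY KEY);
CREATE TABLE hydrographic_basin (id INTEGER PRIMARY KEY);
CREATE TABLE dealer (id INTEGER PRIMARY KEY);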
Question 1:
How do you represent a power plant neighbourhood?

- Self-relationship;
- Denormalisation (neighbours_ids)
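To make the first option concrete, here is a minimal sketch of a self-relationship using Python's built-in sqlite3 module. The power_plant_neighbour table, its columns and the sample rows are assumptions for illustration only, not part of the original model.

```python
import sqlite3


def neighbours_of(conn, plant_id):
    """Return the ids of plants adjacent to plant_id, checking both columns."""
    rows = conn.execute(
        """
        SELECT plant_b FROM power_plant_neighbour WHERE plant_a = ?
        UNION
        SELECT plant_a FROM power_plant_neighbour WHERE plant_b = ?
        """,
        (plant_id, plant_id),
    ).fetchall()
    return sorted(row[0] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE power_plant (id INTEGER PRIMARY KEY, name TEXT);
    -- Self-relationship: both columns of the join table point at power_plant.
    -- Table and column names are hypothetical.
    CREATE TABLE power_plant_neighbour (
        plant_a INTEGER REFERENCES power_plant(id),
        plant_b INTEGER REFERENCES power_plant(id)
    );
    INSERT INTO power_plant VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO power_plant_neighbour VALUES (1, 2), (2, 3);
    """
)

print(neighbours_of(conn, 2))
```

Even this small case needs a join table and a two-sided lookup, because a neighbourhood has no direction. Each extra hop (neighbours of neighbours) adds another self-join.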

Question 2:
Which is the best power plant to provide energy for a group of cities?
power_plant
id | capacity (MWh) | transmission_line (PK) | coordinate |
---|---|---|---|
1 | 95 | 22 | |
2 | 11 | 1 | |

city
id | usage (month, in MWh) | population (million) | coordinate |
---|---|---|---|
1 | 40 | 13 | |
2 | 11 | 2 | |

1. Given a coordinate data set, sum the population inside the resultant polygon.
2. Match power plant coordinates based on supply capacity
3. Check the properties in the transmission_lines table
It is not that difficult


It is not over!

4. Verify if there are industries nearby
5. Verify HDI
6. Verify dealers' interest
7. Verify if the region has alternative energy sources


Question 3:
Assuming that hydropower plants work like a tug-of-war with multiple endpoints, how do you redistribute the electrical load if one plant shuts down?

Maybe tables are not the best structures to represent information about energy distribution.

Neo4j comes to the rescue!


Quick intro - Neo4j
- Graph-oriented database
- ACID
- Structures: Node, Relationship, Index and Label
- Maintained by Neo Technology
- Open Source
- Active community

Case 1 - Graph Modelling

Step 1 - Power plants become nodes


Powered by Arrows - http://www.apcjones.com/arrows/#
CREATE (n:PowerPlant:HydropowerPlant { name : 'Itaipu', capacity : 14000 })

Usina => Power Plant
Hidreletrica => Hydropower
capacidade => capacity
Step 2 - Cities become nodes

Step 3 - Transmission lines become relationships!

Itaipu - Ivaiporã



MATCH (a:HydropowerPlant),(b:City)
WHERE a.name = 'Itaipu' AND b.name = 'Ivaipora'
CREATE (a)-[r:PROVIDES { cable_capacity : 765, rl : 330 }]->(b)

Multiple relationships for several lines


MATCH (a:HydropowerPlant),(b:City)
WHERE a.name = 'Itaipu' AND b.name = 'Cascavel Oeste'
CREATE (a)-[r:PROVIDES { cable_capacity : 500 }]->(b)

MATCH (a:City),(b:City)
WHERE a.name = 'Ivaipora' AND b.name = 'Cascavel Oeste'
CREATE (a)-[r:MESH { cable_capacity : 500 }]->(b)

Step 4 - Dealers become nodes


CREATE (n:Dealer { name : 'Fake',
percentage : 85, margin : 72 })

MATCH (a:Dealer),(b:City)
WHERE a.name = 'Fake' AND b.name = 'Cascavel Oeste'
CREATE (a)-[r:ATTENDS]->(b)

MATCH (a:Dealer),(b:PowerPlant)
WHERE a.name = 'Fake' AND b.name = 'Ita'
CREATE (a)-[r:OWNS]->(b)

Step 5 - Powerful queries
MATCH (n:PowerPlant {capacity : 14000}),
(c:City {name : 'Sao Paulo'}),
p = shortestPath((n)-[*]-(c)) RETURN p
Queries determine optimal paths for energy supply
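For readers who want to see what a shortest-path lookup does conceptually, here is a small breadth-first search sketch in plain Python. It is not how Neo4j implements shortestPath, only an illustration of the idea on an unweighted graph. The first three lines come from the earlier steps; the Ivaipora to Sao Paulo line is a hypothetical edge added so the example has a destination.

```python
from collections import deque


def build_graph(lines):
    """Turn a list of (a, b) transmission lines into an undirected adjacency map."""
    graph = {}
    for a, b in lines:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def shortest_path(graph, start, goal):
    """Return the path with the fewest hops from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue
            previous[neighbour] = node
            if neighbour == goal:
                # Walk back through the predecessors to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


lines = [
    ("Itaipu", "Ivaipora"),
    ("Itaipu", "Cascavel Oeste"),
    ("Ivaipora", "Cascavel Oeste"),
    ("Ivaipora", "Sao Paulo"),  # hypothetical line, not from the real grid
]
graph = build_graph(lines)
print(shortest_path(graph, "Itaipu", "Sao Paulo"))
```

In the database the same question is a single pattern, and relationship properties such as cable_capacity stay available for filtering along the way.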

Case 1 - Evolving the model

Important: add Indexes for the most frequently used properties
Capacity, population, coordinates

Important[2]: Labels
:City, :PowerPlant, :Region
Usually, elements that can be grouped deserve a label.

More: turn other electrical elements into nodes


CREATE (n:Component:Transformer
{ tag : 'F. Iguacu', type : 'Tertiary', mva : 1650, total : 4 })

MATCH (a:Transformer),(b:PowerPlant)
WHERE a.tag = 'F. Iguacu' AND b.name = 'Itaipu'
CREATE (a)-[r:INSTALLED]->(b)

Neo4j is flexible for modelling.

Case 1 - Epic Fails


Problem: Too many nodes for cities! (There are too many cities)

Impact: Too much information being loaded on MATCH; performance problems

Solution: Remove some :City nodes and add a :Region label, grouping cities
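The grouping idea can be sketched outside the database as well. The snippet below collapses per-city lines into per-region lines; the function name, the region_of mapping and the 'Parana' region name are assumptions for illustration, and the capacities reuse the values from the earlier steps.

```python
def group_lines_by_region(lines, region_of):
    """Collapse (plant, city, cable_capacity) lines into (plant, region) groups."""
    grouped = {}
    for plant, city, cable_capacity in lines:
        key = (plant, region_of[city])
        # Keep every line's capacity rather than summing, since the unit
        # of cable_capacity is not stated in the model.
        grouped.setdefault(key, []).append(cable_capacity)
    return grouped


lines = [
    ("Itaipu", "Ivaipora", 765),
    ("Itaipu", "Cascavel Oeste", 500),
]
# Hypothetical mapping from city to region.
region_of = {"Ivaipora": "Parana", "Cascavel Oeste": "Parana"}

regional_lines = group_lines_by_region(lines, region_of)
print(regional_lines)
```

Two city nodes become one region node here, so a MATCH over regions touches fewer nodes, at the cost of per-city detail.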

Problem: Not saving all the CREATE operations into a file

Impact: Problems with backup / replication.

Solution: Do not perform CREATE operations in the web interface!
Add the queries to a Git repository - https://github.com/hannelita/qconsp

Case 1 - Extra - Insights

Find hidden information

Example: mapping the components made a big difference for a deeper model evaluation.

Mapping components...


Case 2 - Context


A-HA! We could use graphs for (...) [complete]

PCB Routing / Trail design


Yes! But we can go further.
Let's analyse the board layout and components display.

Case 2 - Modelling with relational databases

Component
Trail
Sensor
Layer
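CREATE TABLE component (id INTEGER PRIMARY KEY);
CREATE TABLE trail (id INTEGER PRIMARY KEY);
CREATE TABLE sensor (id INTEGER PRIMARY KEY);
CREATE TABLE layer (id INTEGER PRIMARY KEY);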
Question 1:
A sensor detects a temperature rise. How would you infer whether the problem comes from a component or from the trail?

Question 1
Usually, you need extra information from the sensors nearby. How do you model that?

- Self-relationship;
- Denormalise (sensors_ids)
Déjà vu!

Question 2:
Which trails affect the most components at the same time? (e.g. if Trail A breaks, the entire system stops working)

Question 3:
Is it possible to extract hidden or unseen information from the circuit by modelling it as a graph?

Case 2 - Graph modelling

Step 1: Components become nodes
CREATE (n:Component:Primary { name : 'R1',
type : 'resistor', value : '10K' })
CREATE (n:Component:Primary { name : 'C1',
type : 'capacitor', group : 'polyester',
value : '100p' })
CREATE (n:Component:CI { name : 'CI1',
type : 'LM741', seller : 'Texas' })

Step 2: Map trails into relationships
MATCH (a:Primary),(c:CI)
WHERE a.name = 'R1' AND c.name = 'CI1'
CREATE (a)-[r:TRAILS { thickness : 2, dilation : 0.5 }]->(c)
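Once trails are relationships, Question 2 (which trails affect the most components) becomes a reachability check. The sketch below is one possible approach in plain Python, not the method used in the talk: it removes each trail in turn and counts the components that can no longer be reached from a chosen entry point. The board topology, including the LED1 component, is hypothetical.

```python
def reachable(trails, start):
    """Return the set of components connected to start through the given trails."""
    adjacency = {}
    for a, b in trails:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def trail_impact(trails, entry):
    """For each trail, count the components cut off from entry if it breaks."""
    baseline = reachable(trails, entry)
    impact = {}
    for trail in trails:
        remaining = [t for t in trails if t != trail]
        impact[trail] = len(baseline - reachable(remaining, entry))
    return impact


# Hypothetical board: R1, C1 and CI1 form a loop, LED1 hangs off CI1.
trails = [("R1", "CI1"), ("C1", "CI1"), ("R1", "C1"), ("CI1", "LED1")]
impact = trail_impact(trails, "R1")
print(impact)
```

Trails inside the loop have a redundant route, so breaking one isolates nothing; the trail to LED1 has none, so it is the critical one.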

Step 3: Map Layers into Labels
CREATE (n:Component:Primary:LAYER1
{ name : 'R1', type : 'resistor', value : '10K' })
CREATE (n:Component:Primary:LAYER2
{ name : 'C1', type : 'capacitor',
group : 'polyester', value : '100p' })
CREATE (n:Component:CI:LAYER1 { name : 'CI1',
type : 'LM741', seller : 'Texas' })
Easy to fetch all the components from a specific Layer

Case 2 - Evolving the model

Step 4: Map sensors into nodes
CREATE (n:Sensor:LAYER1
{ name : 'SS1', type : 'light'})
CREATE (n:Sensor:LAYER2
{ name : 'SS2', type : 'temperature' })
MATCH (a:Primary),(s:Sensor)
WHERE a.name = 'R1' AND s.name = 'SS1'
CREATE (s)-[r:MONITORS { light : 2 }]->(a)

MATCH (a:Primary),(s:Sensor)
WHERE a.name = 'R1' AND s.name = 'SS2'
CREATE (s)-[r:MONITORS { temperature : 37 }]->(a)

Step 5: Run the following query periodically:
MATCH (s:Sensor)-[m:MONITORS]->(c:Component)-[t:TRAILS]-()
WHERE m.temperature > 60
RETURN c.name, t.dilation
Decide whether it is the component or the trail that is damaged.

Case 2 - Epic Fails


Problem: Too many updates for the sensors; Neo4j has some writing restrictions

Impact: Bad performance and high RAM consumption

Solution: Remove some sensor nodes or move to the Enterprise edition.

Final considerations
- Flexible models
- Find hidden relations
- Easy to get started
- Active tool and active community
- It can be useful in several scenarios, beyond social networks and recommendation systems.

Tools
- Data Import (Relational Databases, MongoDB, Cassandra, JSON, CSV)
- Visualization tools
- REST API

References
- Neo4j Meetup in São Paulo
- Neo4j Slack Users
- Neo4j Training (Free)
- Arrows (Sketching tool)

Special thanks
- Neo Technology, @lyonwj, @ryguyrg and @mesirii
- B.C., for the excellent feedback and review
- @Codeminer42
- Prof. Maurílio and Prof. Justino.
Thank you :)
Questions?
hannelita@gmail.com
@hannelita


Froscon - Graphs and Neo4j - From Hydropower plants to PCBs
By Hanneli Tavante (hannelita)