## Graphs and Neo4j - from Hydropower Plants to PCBs

#froscon @hannelita

## Hi!

• Computer Engineering
• Programming
• Electronics
• Mathematics
• Physics
• Lego
• Meetups
• Coffee
• GIFs

@hannelita

# Disclaimer

## Structure - Cases

• Use case context
• Modelling with relational databases (and fails)
• Graph modelling
• Evolving the model
• Epic fails

## Final Considerations

• Main benefits
• Support tools

# Case 1 - Context

How do we distribute electrical energy? How are the power plants distributed?

http://sigel.aneel.gov.br/sigel.html

Accessed in 28/3/2016

http://sigel.aneel.gov.br/sigel.html

Access in 28/3/2016

### Map information

3
• Power plant location
• Transmission lines
• Supply capacity
• Total capacity
• Nearby cities
• Distribution
• Boundaries / States
• Hydrographic basin
• Dealers

Electrical

Political

Environmental

Economy

# Case 1 - Modelling with relational databases

Electrical

Political

Environmental

Economy

CREATE TABLE power_plant;

1

CREATE TABLE city;

1

CREATE TABLE hydrographic_basin;

1

CREATE TABLE dealer;

1

# Question 1:

## How do you represent a power plant neighbourhood?

• Self-relationship;
• Denormalisation (neighbours_ids)
1

# Question 2:

### Which is the best power plant to provide energy for a group of cities?

id capacity ( Mwh) transmission_line (PK) coordinate
1 95 22
2 11 1
1. Given a coordinate data set, sum the population inside the resultant polygon.
id usage (month, in Mwh) population (milion) coordinate
1 40 13
2 11 2

city

power_plant

# Question 2:

### Which is the best power plant to provide energy for a group of cities?

id usage (month, in Mwh) population (milion) coordinate
1 40 13
2 11 2
id capacity ( Mwh) transmission_line (PK) coordinate
1 95 22
2 11 1

2. Match power plants coordinates based on supply capacity

city

power_plant

# Question 2:

### Which is the best power plant to provide energy for a group of cities?

id usage (month, in Mwh) population (milion) coordinate
1 40 13
2 11 2
id capacity ( Mwh) transmission_line (PK) coordinate
1 95 22
2 11 1

3. Verify properties into transmission_lines table

city

power_plant

# Question 2:

### Which is the best power plant to provide energy for a group of cities?

id usage (month, in Mwh) population (milion) coordinate
1 40 13
2 11 2
id capacity ( Mwh) transmission_line (PK) coordinate
1 95 22
2 11 1

4. Verify if there are industries nearby

city

power_plant

# Question 2:

### Which is the best power plant to provide energy for a group of cities?

id usage (month, in Mwh) population (milion) coordinate
1 40 13
2 11 2
id capacity ( Mwh) transmission_line (PK) coordinate
1 95 22
2 11 1

5. Verify HDI

city

power_plant

# Question 2:

### Which is the best power plant to provide energy for a group of cities?

id usage (month, in Mwh) population (milion) coordinate
1 40 13
2 11 2
id capacity ( Mwh) transmission_line (PK) coordinate
1 95 22
2 11 1

6. Verify dealers interest.

city

power_plant

# Question 2:

### Which is the best power plant to provide energy for a group of cities?

id usage (month, in Mwh) population (milion) coordinate
1 40 13
2 11 2
id capacity ( Mwh) transmission_line (PK) coordinate
1 95 22
2 11 1

7. Verify if the region has alternative energy sources

city

power_plant

# Question 3:

### Assuming that hydropower plants work as tug-of-war with multiple endpoints, how do you redistribute the electrical charges if one plant shuts down?

Maybe tables are not the best structures to represent information about energy distribution.

# Neo4j comes to rescue!

## Quick intro - Neo4j

• Graph oriented database
• ACID
• Structures: Node, Relationship, Index and Label
• Maintained by Neotechnology
• Open Source
• Active community

# Step 1 - Power plants become nodes

CREATE (n:PowerPlant:HydropowerPlant { name : 'Itaipu', capacity : '14000' })

Usina => Power Plant

Hidreletrica => Hydropower

3

3

# Itaipu - Ivaiporã

MATCH (a:HidropowerPlant),(b:City)

WHERE a.name = 'Itaipu' AND b.name = 'Ivaipora'

CREATE (a)-[r:PROVIDES { cable_capacity : 765, rl : 330 }]->(b)

# Multiple relationships for several lines

``````MATCH (a:HidrepowerPlant),(b:City)
WHERE a.name = 'Itaipu' AND b.name = 'Cascavel Oeste'
CREATE (a)-[r:PROVIDES { cable_capacity : 500 }]->(b)

MATCH (a:City),(b:City)
WHERE a.name = 'Ivaipora' AND b.name = 'Cascavel Oeste'
CREATE (a)-[r:MESH { capacidade_cabo : 500 }]->(b)``````

# Step 4 - Dealers become nodes

3
``````CREATE (n:Dealer { name : 'Fake',
percentage : 85, margin : 72 })

MATCH (a:Dealer),(b:City)
WHERE a.name = 'Ficticio' AND b.name = 'Cascavel Oeste'
CREATE (a)-[r:ATTENDS]->(b)

MATCH (a:Dealer),(b:PowerPlant)
WHERE a.name = 'Ficticio' AND b.name = 'Ita'
CREATE (a)-[r:OWNS]->(b)``````

# Step 5 - Queries poderosas

3
``````MATCH (n:PowerPlant {capacity : 14000}),
(c:City {name : 'Sao Paulo'})
p = shortestPath((n)-[]-(c)) RETURN p``````

Queries determine optinal paths for energy supply

# Important: add Indexes for the most frequently used properties

Capacity, population, coordinates

# Important[2]: Labels

:City, :PowerPlant, :Region

Usually, elements that can be grouped deserve a label.

# More: turn other electrical elements into nodes

``````CREATE (n:Component:Transformer
{ tag : 'F. Iguacu', type : 'Terciario', mva : 1650, total : 4 })

MATCH (a:Transformer),(b:PowerPlant)
WHERE a.tag = 'F. Iguacu' AND b.name = 'Itaipu'
CREATE (a)-[r:INSTALLED]->(b)``````

# Do not perform CREATE operations into Web interface!

Add queries into a Git repository - https://github.com/hannelita/qconsp

# Case 2 - Modelling with relational databases

Component

Trail

Sensor

Layer

CREATE TABLE component;

1

CREATE TABLE trail;

1

CREATE TABLE sensor;

1

CREATE TABLE layer;

1

# Question 1

## Usually, you need extra information from the sensors nearby. How do you model that?

• Self-relationship;
• Denormalise (sensors_ids)
1

Déjà vu!

# Step 1: Components become nodes

``````CREATE (n:Component:Primary { name : 'R1',
type : 'resistor', value : '10K' })

CREATE (n:Component:Primary { name : 'C1',
type : 'capacitor', group : 'polyester',
value : '100p' })

CREATE (n:Component:CI { name : 'CI1',
type : 'LM741', seller : 'Texas' })``````

# Step 2: Map trails into relationships

``````MATCH (a:Primary),(c:CI)
WHERE a.name = 'R1' AND c.name = 'CI1'
CREATE (a)-[r:TRAILS { thickness : 2, dilation : 0.5 }]->(c)``````

# Step 3: Map Layers into Labels

``````CREATE (n:Component:Primary:LAYER1
{ name : 'R1', type : 'resistor', value : '10K' })

CREATE (n:Component:Primary:LAYER2
{ name : 'C1', type : 'capacitor',
group : 'polyester',  value : '100p' })

CREATE (n:Component:CI:LAYER1 { name : 'CI1',
type : 'LM741', seller : 'Texas' })``````

Easy to fetch all the components from a specific Layer

# Step 4: Map sensors into nodes

``````CREATE (n:Sensor:LAYER1
{ name : 'SS1', type : 'light'})

CREATE (n:Sensor:LAYER2
{ name : 'SS2', type : 'temperature' })

MATCH (aPrimary),(s:Sensor)
WHERE a.name = 'R1' AND c.name = 'SS1'
CREATE (s)-[MONITORS { light : 2 }]->(a)

MATCH (a:Primary),(s:Sensor)
WHERE a.name = 'R1' AND c.name = 'SS2'
CREATE (s)-[r:MONITORS { temperature : 37 }]->(a)``````
``````MATCH (n:Sensor)-[MONITORS]-(c:Component)
WHERE n.temperature > 60
RETURN c.name, r.dilation``````

Decide if it is the component of if it is the trail that is damaged.

Step 5: Run the following periodic query:

# Solution

## Final considerations

1
• Flexible models
• Find hidden relations
• Easy to get started
• Active tool and active community
• It can be useful in several scenarios, beyond social networks and recommendation systems.

## Tools

1
• Data Import  (Relational Databases, MongoDB, Cassandra, JSON, CSV)
• Visualization tools
• REST API

## Special thanks

• Neo Technology, @lyonwj, @ryguyrg e @mesirii
• B.C., for the excellent feedback and review
• @Codeminer42
2
• Prof. Maurílio and  Prof. Justino.
2

## Thank you :)

Questions?

hannelita@gmail.com

@hannelita