Graphs and Neo4j - from Hydropower Plants to PCBs

#qconsp @hannelita

 

Hi!

 
  • Computer Engineering
  • Programming
  • Electronics
  • Mathematics
  • Physics
  • Lego
  • Meetups
  • Coffee
  • GIFs
 


 

Disclaimer

 

This content represents the speaker's personal view.

 

Feedback (positive or not) accepted here: hannelita@gmail.com

 

Some numeric data and names have been changed to protect company secrets.

 

The modelling cases are real; apologies for the flood of technical terms.

 

Structure - Cases

 
  • Use case context
  • Modelling with relational databases (and where it fails)
  • Graph modelling
  • Evolving the model
  • Epic fails
 

Final Considerations

 
  • Main benefits
  • Support tools
 


 


 

Have you ever been left in the dark?

 

Running out of electrical energy

 

Candle lights <3

 

Brazil is a huge producer of electrical energy.

 


 

Especially from hydropower plants

 


 

Case 1 - Context

 


 

How do we distribute electrical energy? How are the power plants distributed?

 


 

http://sigel.aneel.gov.br/sigel.html

 

Accessed on 28/3/2016

 


 

Map information

  • Power plant location
  • Transmission lines
  • Supply capacity
  • Total capacity
  • Nearby cities
  • Distribution
  • Boundaries / States
  • Hydrographic basin
  • Dealers
 

Electrical

 

Political

 

Environmental

 

Economic

 


 

Challenge

 

Build a system that stores all this information and how the data is related.

 


 

Case 1 - Modelling with relational databases

 


 

Electrical

 

Political

 

Environmental

 

Economic

 

CREATE TABLE power_plant;

CREATE TABLE city;

CREATE TABLE hydrographic_basin;

CREATE TABLE dealer;

Question 1:

 

How do you represent a power plant's neighbourhood?

 
  • Self-relationship;
  • Denormalisation (neighbours_ids)
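
To make the self-relationship option concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names, the power_plant_neighbour join table and the plants 'Plant B' and 'Plant C' are assumptions for illustration only, not the real schema. It suggests why neighbourhood queries get awkward in tables: every extra hop costs another self-join.

```python
import sqlite3

# Hypothetical schema: column names, the neighbour join table and the
# plants 'Plant B' / 'Plant C' are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE power_plant (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Self-relationship: a join table that points back at power_plant.
    -- Rows are directed here: (1, 2) means plant 2 is a neighbour of plant 1.
    CREATE TABLE power_plant_neighbour (
        plant_id     INTEGER NOT NULL REFERENCES power_plant(id),
        neighbour_id INTEGER NOT NULL REFERENCES power_plant(id),
        PRIMARY KEY (plant_id, neighbour_id)
    );
    INSERT INTO power_plant VALUES (1, 'Itaipu'), (2, 'Plant B'), (3, 'Plant C');
    INSERT INTO power_plant_neighbour VALUES (1, 2), (2, 3);
""")


def neighbours_within_two_hops(plant_id):
    """Return the names of plants one or two hops away.

    Each additional hop needs one more self-join on the neighbour table.
    """
    rows = conn.execute(
        """
        SELECT p.name
        FROM power_plant_neighbour n1
        JOIN power_plant p ON p.id = n1.neighbour_id
        WHERE n1.plant_id = ?
        UNION
        SELECT p.name
        FROM power_plant_neighbour n1
        JOIN power_plant_neighbour n2 ON n2.plant_id = n1.neighbour_id
        JOIN power_plant p ON p.id = n2.neighbour_id
        WHERE n1.plant_id = ?
        """,
        (plant_id, plant_id),
    ).fetchall()
    return sorted(name for (name,) in rows)


print(neighbours_within_two_hops(1))
```

The denormalised alternative (a neighbours_ids column) avoids the join table but pushes the traversal logic into application code.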

 

Question 2:

 

Which is the best power plant to provide energy for a group of cities?

 
city
id   usage (MWh/month)   population (million)   coordinate
1    40                  13
2    11                  2

power_plant
id   capacity (MWh)   transmission_line (PK)   coordinate
1    95               22
2    11               1

  1. Given a coordinate data set, sum the population inside the resulting polygon.

  2. Match power plant coordinates based on supply capacity.

  3. Check the properties in the transmission_lines table.

It is not that difficult


 

It is not over! 

  4. Check whether there are industries nearby.

  5. Check the HDI (Human Development Index).

  6. Check the dealers' interest.

  7. Check whether the region has alternative energy sources.
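
As a rough illustration of the first two checks only, the sketch below reuses the sample rows from the city and power_plant tables. It skips the polygon geometry entirely and simply treats the given cities as the group. The class names, field names and the "smallest surplus first" ranking are assumptions. Each remaining check (transmission lines, industries, HDI, dealers, alternative sources) would be one more lookup against one more table.

```python
from dataclasses import dataclass


@dataclass
class City:
    id: int
    usage_mwh: float      # monthly usage, as in the city table
    population_m: float   # population in millions


@dataclass
class PowerPlant:
    id: int
    capacity_mwh: float   # capacity, as in the power_plant table


def total_population(cities):
    """Step 1 (simplified): sum the population of the selected cities."""
    return sum(city.population_m for city in cities)


def candidate_plants(cities, plants):
    """Step 2 (simplified): keep plants whose capacity covers the demand.

    Ranking by smallest surplus first is an assumption for illustration.
    """
    demand = sum(city.usage_mwh for city in cities)
    fits = [plant for plant in plants if plant.capacity_mwh >= demand]
    return sorted(fits, key=lambda plant: plant.capacity_mwh - demand)


# Sample rows taken from the tables on the slide.
cities = [City(1, 40, 13), City(2, 11, 2)]
plants = [PowerPlant(1, 95), PowerPlant(2, 11)]

print(total_population(cities))
print([plant.id for plant in candidate_plants(cities, plants)])
```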


 

Question 3:

 

Assuming that hydropower plants work like a tug-of-war with multiple endpoints, how do you redistribute the electrical load if one plant shuts down?
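
A toy sketch of the idea, not a real power-flow calculation: when one plant drops out, its load is shared among the remaining plants in proportion to their spare capacity. The proportional rule, the plant names and all the numbers are assumptions made up for illustration.

```python
def redistribute(loads, capacities, failed):
    """Share the failed plant's load among the others.

    loads, capacities: dicts mapping plant name -> value in the same unit.
    Each remaining plant takes a share proportional to its spare capacity
    (an illustrative assumption, not how a real grid is balanced).
    """
    spare = {
        plant: capacities[plant] - loads[plant]
        for plant in loads
        if plant != failed
    }
    spare_total = sum(spare.values())
    lost = loads[failed]
    if lost > spare_total:
        raise ValueError("not enough spare capacity to absorb the lost load")

    new_loads = {}
    for plant, room in spare.items():
        share = lost * room / spare_total if spare_total else 0.0
        new_loads[plant] = loads[plant] + share
    return new_loads


# Hypothetical plants and values.
loads = {"A": 50, "B": 30, "C": 20}
capacities = {"A": 100, "B": 60, "C": 40}
print(redistribute(loads, capacities, failed="C"))
```

Even in this simplified form, the calculation depends on which plants are connected to which consumers, which is exactly the information a table layout makes hard to traverse.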

 


Maybe tables are not the best structure for representing information about energy distribution.


 

Neo4j comes to the rescue!


 

Quick intro - Neo4j

  • Graph-oriented database
  • ACID
  • Structures: Node, Relationship, Index and Label
  • Maintained by Neo Technology
  • Open Source
  • Active community
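
To give a feel for the four structures listed above, here is a small in-memory sketch in Python. It is not how Neo4j is implemented; the Graph class, its method names and the name-based index are assumptions chosen only to mirror the vocabulary (nodes with labels and properties, typed relationships with properties, an index for lookups).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set
    properties: dict


@dataclass
class Relationship:
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


class Graph:
    """Toy property graph: nodes, relationships, labels and one index."""

    def __init__(self):
        self.nodes = []
        self.relationships = []
        self._name_index = {}  # (label, name) -> node

    def create_node(self, labels, **properties):
        node = Node(set(labels), properties)
        self.nodes.append(node)
        # Index every node by (label, name) so lookups avoid a full scan.
        if "name" in properties:
            for label in node.labels:
                self._name_index[(label, properties["name"])] = node
        return node

    def create_relationship(self, start, rel_type, end, **properties):
        rel = Relationship(start, rel_type, end, properties)
        self.relationships.append(rel)
        return rel

    def find(self, label, name):
        return self._name_index.get((label, name))


graph = Graph()
itaipu = graph.create_node({"PowerPlant", "HydropowerPlant"},
                           name="Itaipu", capacity=14000)
ivaipora = graph.create_node({"City"}, name="Ivaipora")
line = graph.create_relationship(itaipu, "PROVIDES", ivaipora,
                                 cable_capacity=765)

print(graph.find("City", "Ivaipora").properties)
```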


 

Case 1 - Graph Modelling

 


 

Step 1 - Power plants become nodes


 

Powered by Arrows - http://www.apcjones.com/arrows/#

CREATE (n:PowerPlant:HydropowerPlant { name : 'Itaipu', capacity : 14000 })
 


 

Usina => Power Plant

Hidreletrica => Hydropower

capacidade => capacity

Step 2 - Cities become nodes


 

Step 3 - Transmission lines become relationships! 


 

Itaipu - Ivaiporã


 

MATCH (a:HydropowerPlant),(b:City)
WHERE a.name = 'Itaipu' AND b.name = 'Ivaipora'
CREATE (a)-[r:PROVIDES { cable_capacity : 765, rl : 330 }]->(b)

 


 

Multiple relationships for several lines


 
MATCH (a:HydropowerPlant),(b:City)
WHERE a.name = 'Itaipu' AND b.name = 'Cascavel Oeste'
CREATE (a)-[r:PROVIDES { cable_capacity : 500 }]->(b)

MATCH (a:City),(b:City)
WHERE a.name = 'Ivaipora' AND b.name = 'Cascavel Oeste'
CREATE (a)-[r:MESH { cable_capacity : 500 }]->(b)


 

Step 4 - Dealers become nodes


 
CREATE (n:Dealer { name : 'Fake', percentage : 85, margin : 72 })

MATCH (a:Dealer),(b:City)
WHERE a.name = 'Fake' AND b.name = 'Cascavel Oeste'
CREATE (a)-[r:ATTENDS]->(b)

MATCH (a:Dealer),(b:PowerPlant)
WHERE a.name = 'Fake' AND b.name = 'Ita'
CREATE (a)-[r:OWNS]->(b)


 

Step 5 - Powerful queries

MATCH (n:PowerPlant {capacity : 14000}),
      (c:City {name : 'Sao Paulo'}),
      p = shortestPath((n)-[*]-(c))
RETURN p

Queries determine optimal paths for energy supply
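
For readers who want to see what a shortest-path lookup does in principle, here is a plain breadth-first search in Python over the lines created earlier. It counts hops only and ignores cable capacity. The Ivaipora to Sao Paulo line is a hypothetical edge added so the example has a route to follow. This is a sketch of the general technique, not of how Neo4j evaluates shortestPath internally.

```python
from collections import deque

# Transmission lines as undirected edges. The first three come from the
# CREATE statements in the slides; the last one is a hypothetical line.
lines = [
    ("Itaipu", "Ivaipora"),
    ("Itaipu", "Cascavel Oeste"),
    ("Ivaipora", "Cascavel Oeste"),
    ("Ivaipora", "Sao Paulo"),  # assumption, for illustration only
]


def shortest_path(edges, start, goal):
    """Breadth-first search: returns the path with the fewest hops, or None."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbour in adjacency.get(current, []):
            if neighbour not in previous:
                previous[neighbour] = current
                queue.append(neighbour)
    return None


print(shortest_path(lines, "Itaipu", "Sao Paulo"))
```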


 

Case 1 - Evolving the model

 


 

Important: add indexes for the most frequently used properties

Capacity, population, coordinates
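
The reason is the usual one for any index: without it, finding nodes by a property value means scanning every node. A minimal Python sketch of the difference, with made-up plants ('Plant B' and 'Plant C' are hypothetical) and a plain dictionary standing in for the index:

```python
from collections import defaultdict

# Hypothetical nodes; only Itaipu's capacity comes from the slides.
plants = [
    {"name": "Itaipu", "capacity": 14000},
    {"name": "Plant B", "capacity": 500},
    {"name": "Plant C", "capacity": 14000},
]


def scan(nodes, key, value):
    """No index: look at every node on every query."""
    return [node for node in nodes if node.get(key) == value]


def build_index(nodes, key):
    """Index: group nodes by property value once, then look up directly."""
    index = defaultdict(list)
    for node in nodes:
        if key in node:
            index[node[key]].append(node)
    return index


capacity_index = build_index(plants, "capacity")

# Both approaches return the same nodes; the index skips the scan.
print(scan(plants, "capacity", 14000) == capacity_index[14000])
```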


 

Important[2]: Labels

 

:City, :PowerPlant, :Region

Usually, elements that can be grouped deserve a label.

 


 

More evolving - turn other electrical elements into nodes


 
CREATE (n:Component:Transformer
{ tag : 'F. Iguacu', type : 'Tertiary', mva : 1650, total : 4 })

MATCH (a:Transformer),(b:PowerPlant)
WHERE a.tag = 'F. Iguacu' AND b.name = 'Itaipu'
CREATE (a)-[r:INSTALLED]->(b)


 

Neo4j is flexible for modelling.

 


 

Case 1 - Epic Fails

 


 

Problem: Too many nodes for cities (there are simply too many cities).

 

Impact: Too much information is loaded on each MATCH, causing performance problems.

 

Solution: Remove some :City nodes and add a :Region label that groups cities.
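
The grouping itself is a simple aggregation. A sketch in Python, assuming each city record carries a region name and a usage figure (the records, the "region" field and the numbers are hypothetical):

```python
from collections import defaultdict

# Hypothetical city records; region names and usage values are made up.
cities = [
    {"name": "Ivaipora", "region": "Parana", "usage": 11},
    {"name": "Cascavel Oeste", "region": "Parana", "usage": 40},
    {"name": "Sao Paulo", "region": "Sao Paulo State", "usage": 95},
]


def group_into_regions(cities):
    """Collapse many city nodes into one aggregated node per region."""
    regions = defaultdict(lambda: {"cities": 0, "usage": 0})
    for city in cities:
        region = regions[city["region"]]
        region["cities"] += 1
        region["usage"] += city["usage"]
    return dict(regions)


print(group_into_regions(cities))
```

Three city nodes become two region nodes here; on a real data set the reduction is what keeps MATCH from loading too much.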

 


 

Problem: Not all the CREATE operations were saved to a file.

 

Impact: Problems with backup and replication.

 

Solution: Do not run CREATE operations in the web interface! Keep the queries in a Git repository - https://github.com/hannelita/qconsp

 


 

Case 1 - Extra - Insights

 


 

Find hidden information


 

Example: mapping the components made a big difference for a deeper model evaluation.


 

Mapping components...


 

Case 2 - Context

 


 

A-HA! We could use graphs for (...) [complete]


 

PCB routing / trail (trace) design


 

Yes! But we can go further.

Let's analyse the board layout and component placement.


 

Case 2 - Modelling with relational databases

 


 

Component

 

Trail

 

Sensor

 

Layer

 

CREATE TABLE component;

CREATE TABLE trail;

CREATE TABLE sensor;

CREATE TABLE layer;

Question 1:

A sensor detects a temperature rise. How would you infer whether the problem comes from a component or from the trail?

 


 

Question 1

 

Usually you need extra information from the neighbouring sensors. How do you model that?

 
  • Self-relationship;
  • Denormalise (sensors_ids)

Déjà vu!


 

Question 2:

 

Which trails affect the most components at the same time? (e.g. if Trail A breaks, the entire system stops working)
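
One way to phrase this question on a graph: remove each trail in turn and count how many components can no longer be reached from the supply. The Python sketch below does exactly that by brute force. The board layout and the 'VCC' supply node are hypothetical, and a real tool would more likely use a dedicated bridge-finding algorithm.

```python
def reachable(edges, source):
    """Return every node reachable from source over undirected edges."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {source}
    stack = [source]
    while stack:
        node = stack.pop()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def trail_impact(trails, source):
    """For each trail, count the nodes cut off from source if it breaks."""
    baseline = reachable(trails, source)
    impact = {}
    for trail in trails:
        remaining = [t for t in trails if t != trail]
        impact[trail] = len(baseline - reachable(remaining, source))
    return impact


# Hypothetical board: 'VCC' is an assumed supply node, the rest reuse the
# component names from the slides.
trails = [("VCC", "R1"), ("R1", "CI1"), ("CI1", "C1"), ("R1", "C1")]

impact = trail_impact(trails, "VCC")
print(max(impact, key=impact.get), impact)
```

Here the VCC to R1 trail is the single point of failure: the other three trails form a loop, so losing any one of them disconnects nothing.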

 


 

Question 3:

 

Is it possible to extract some hidden or unseen information from the circuit by modelling it within a graph?

 


 

Case 2 - Graph modelling

 


 

Step 1: Components become nodes

CREATE (r1:Component:Primary { name : 'R1',
type : 'resistor', value : '10K' })

CREATE (c1:Component:Primary { name : 'C1',
type : 'capacitor', group : 'polyester',
value : '100p' })

CREATE (ci1:Component:CI { name : 'CI1',
type : 'LM741', seller : 'Texas' })


 

Step 2: Map trails into relationships

 
MATCH (a:Primary),(c:CI) 
WHERE a.name = 'R1' AND c.name = 'CI1' 
CREATE (a)-[r:TRAILS { thickness : 2, dilation : 0.5 }]->(c)


 

Step 3: Map Layers into Labels

 
CREATE (r1:Component:Primary:LAYER1
{ name : 'R1', type : 'resistor', value : '10K' })

CREATE (c1:Component:Primary:LAYER2
{ name : 'C1', type : 'capacitor',
group : 'polyester', value : '100p' })

CREATE (ci1:Component:CI:LAYER1 { name : 'CI1',
type : 'LM741', seller : 'Texas' })

Easy to fetch all the components from a specific Layer
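
Conceptually, fetching by layer is just a filter on the label set. A tiny Python sketch using the three components above (the dictionary representation and the with_labels helper are assumptions for illustration):

```python
# The three components from the slides, each with its set of labels.
components = [
    {"labels": {"Component", "Primary", "LAYER1"}, "name": "R1"},
    {"labels": {"Component", "Primary", "LAYER2"}, "name": "C1"},
    {"labels": {"Component", "CI", "LAYER1"}, "name": "CI1"},
]


def with_labels(nodes, *labels):
    """Return the names of nodes that carry all of the given labels."""
    wanted = set(labels)
    return [node["name"] for node in nodes if wanted <= node["labels"]]


print(with_labels(components, "Component", "LAYER1"))
```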


 

Case 2 - Evolving the model

 


 

Step 4: Map sensors into nodes

 
CREATE (s1:Sensor:LAYER1
{ name : 'SS1', type : 'light' })

CREATE (s2:Sensor:LAYER2
{ name : 'SS2', type : 'temperature' })

MATCH (a:Primary),(s:Sensor)
WHERE a.name = 'R1' AND s.name = 'SS1'
CREATE (s)-[r:MONITORS { light : 2 }]->(a)

MATCH (a:Primary),(s:Sensor)
WHERE a.name = 'R1' AND s.name = 'SS2'
CREATE (s)-[r:MONITORS { temperature : 37 }]->(a)


 
Step 5: Run the following query periodically:

MATCH (s:Sensor)-[m:MONITORS]->(c:Component)-[t:TRAILS]-()
WHERE m.temperature > 60
RETURN c.name, t.dilation

Decide whether it is the component or the trail that is damaged.
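
What happens with the query result is a judgement call. One possible rule, sketched in Python: if a hot component sits on a trail that has dilated beyond some limit, suspect the trail, otherwise suspect the component. The 60 degree limit comes from the query; the dilation limit, the record layout and the rule itself are assumptions for illustration.

```python
TEMPERATURE_LIMIT = 60   # same threshold as the periodic query
DILATION_LIMIT = 1.0     # hypothetical threshold, not from the slides


def diagnose(readings):
    """Map each overheated component to the suspected culprit.

    readings: list of dicts with 'component', 'temperature' and
    'trail_dilation' keys (a hypothetical shape for the query rows).
    """
    findings = {}
    for row in readings:
        if row["temperature"] <= TEMPERATURE_LIMIT:
            continue  # not overheated, nothing to decide
        if row["trail_dilation"] > DILATION_LIMIT:
            findings[row["component"]] = "trail"
        else:
            findings[row["component"]] = "component"
    return findings


# Made-up readings.
readings = [
    {"component": "R1", "temperature": 72, "trail_dilation": 0.5},
    {"component": "C1", "temperature": 37, "trail_dilation": 0.2},
    {"component": "CI1", "temperature": 65, "trail_dilation": 1.4},
]

print(diagnose(readings))
```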


 

Case 2 - Epic Fails

 


 

Problem: Too many updates for the sensors; Neo4j has some write restrictions.

 

Impact: Poor performance and high RAM consumption.

 

Solution: Remove some sensor nodes or move to the Enterprise edition.

 


 

Final considerations

  • Flexible models
  • Find hidden relations
  • Easy to get started
  • Active tool and active community
  • It can be useful in several scenarios, beyond social networks and recommendation systems.
 


 

Tools

  • Data Import  (Relational Databases, MongoDB, Cassandra, JSON, CSV)
  • Visualization tools
  • REST API
 


 


 

Special thanks

 
  • Neo Technology, @lyonwj, @ryguyrg and @mesirii
  • B.C., for the excellent feedback and review
  • @Codeminer42
  • Prof. Maurílio and Prof. Justino

Thank you :)

Questions?

 

 

hannelita@gmail.com

@hannelita


 

By Hanneli Tavante (hannelita)

Graphs and Neo4j - From Hydropower plants to PCBs - English version - QCON 2016
