Graph mining
Pierluigi Crescenzi
Gran Sasso Science Institute
February 1-10, 2022
Introduction
Efficient Julia Programs for Understanding Our World
Pierluigi Crescenzi
February 2022
Graph Mining
Introduction
Binary relationships
 Communications
 Talking on the phone or sending emails or messages
 Collaborations
 Coauthoring a scientific paper or coacting in a movie
 Ratings
 Indicating a "like" for a photo or rating a movie
 Membership
 Being a member of a university department or of a political party
 Dependencies
 Citing a paper within another paper or including a link to a webpage on another webpage
 Transfers
 Making a bank transfer or selling a car
The Florentine families social network
Why graphs?
 Use the many mathematical and computer tools developed in the field of graph theory
 Analyze mathematical properties of a graph, and design, analyze and develop efficient algorithms that compute these properties
 Final aim
 Reach conclusions that are interesting from the point of view of the specific domain
 Existence of patterns that perhaps were not initially foreseen
Interpreting the past
Understanding the present
Imagining the future
Some definitions
 Arc directions
 Density
 \(\delta_G = \frac{m}{\frac{n (n-1)}{2}} = \frac{2m}{n (n-1)}\), for an undirected graph \(G\) with \(n\) nodes and \(m\) edges
 Sparse if \(\delta_G\) close to 0
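As a small worked example of the density formula, here is a minimal Python sketch for a simple undirected graph. The function name `density` and the sample graph are illustrative assumptions, not taken from the slides.

```python
def density(n: int, m: int) -> float:
    """Density of a simple undirected graph with n nodes and m edges."""
    if n < 2:
        raise ValueError("density is undefined for fewer than 2 nodes")
    # m existing edges out of n(n-1)/2 possible ones
    return 2 * m / (n * (n - 1))


# Hypothetical example: a path on 4 nodes has 3 of the 6 possible edges.
edges = [(0, 1), (1, 2), (2, 3)]
print(density(4, len(edges)))  # 0.5
```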
 Node labels
 Name, age, height,...
 Edge weights
 Strength/cost of relationship
 Signed graphs: weights \(+1\) and \(-1\)
 Neighbor of node \(x\): node connected to \(x\) by an arc
 Neighborhood \(N(x)\) of \(x\): set of neighbors of \(x\)
 Degree of \(x\): \(|N(x)|\)
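The three definitions above can be illustrated with a short Python sketch that builds \(N(x)\) for every node of an undirected graph and reads off a degree. The helper name `neighborhoods` and the sample edge list are assumptions made for illustration.

```python
from collections import defaultdict


def neighborhoods(edges):
    """Map each node x to its neighborhood N(x) in an undirected graph."""
    nbrs = defaultdict(set)
    for x, y in edges:
        nbrs[x].add(y)
        nbrs[y].add(x)
    return dict(nbrs)


# Hypothetical example graph on the nodes a, b, c, d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
N = neighborhoods(edges)
print(sorted(N["c"]))  # ['a', 'b', 'd']
print(len(N["c"]))     # degree of c: 3
```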
 Self-loop
 Multi-arc
 Simple graph: no self-loop and no multi-arc
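One possible way to test whether an edge list describes a simple graph is sketched below in Python. The function name `is_simple` and its `directed` flag are illustrative assumptions. In the undirected case the pairs (x, y) and (y, x) are treated as the same arc.

```python
def is_simple(edges, directed=False):
    """Return True if the edge list has no self-loop and no multi-arc."""
    seen = set()
    for x, y in edges:
        if x == y:
            return False  # self-loop
        key = (x, y) if directed else frozenset((x, y))
        if key in seen:
            return False  # multi-arc
        seen.add(key)
    return True


print(is_simple([(1, 2), (2, 3)]))                 # True
print(is_simple([(1, 2), (2, 1)]))                 # False: same undirected arc twice
print(is_simple([(1, 2), (2, 1)], directed=True))  # True: two distinct directed arcs
print(is_simple([(1, 1)]))                         # False: self-loop
```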
 Path, eccentricity, diameter
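Using the standard definitions (the eccentricity of a node is the largest distance from it to any other node, and the diameter is the largest eccentricity), a minimal Python sketch for a connected, unweighted graph could look as follows. The function names and the sample graph are assumptions for illustration. Distances are counted in number of arcs and found with a level-by-level visit.

```python
from collections import deque


def distances(adj, s):
    """Number of arcs on a shortest path from s to each node reachable from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def eccentricity(adj, x):
    """Largest distance from x to any node (graph assumed connected)."""
    return max(distances(adj, x).values())


def diameter(adj):
    """Largest eccentricity over all nodes (graph assumed connected)."""
    return max(eccentricity(adj, x) for x in adj)


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(eccentricity(adj, 1))  # 2
print(diameter(adj))         # 3
```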
 (Strongly, weakly) connected graphs
Graph representations
 Adjacency matrix
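A minimal Python sketch of the adjacency matrix representation, assuming nodes are numbered from 0 to n-1. The function name `adjacency_matrix` and the sample graph are illustrative.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 matrix A with A[x][y] = 1 if there is an arc from x to y."""
    A = [[0] * n for _ in range(n)]
    for x, y in edges:
        A[x][y] = 1
        if not directed:
            A[y][x] = 1  # undirected graphs give a symmetric matrix
    return A


# Hypothetical example: the undirected path 0 - 1 - 2.
A = adjacency_matrix(3, [(0, 1), (1, 2)])
for row in A:
    print(row)
# [0, 1, 0]
# [1, 0, 1]
# [0, 1, 0]
```

The matrix always stores n² entries, whatever the number of arcs, but testing whether a given arc exists is a single lookup.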
 Adjacency lists
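A minimal Python sketch of the adjacency lists representation, again assuming nodes numbered from 0 to n-1. The function name `adjacency_lists` and the sample graph are illustrative.

```python
def adjacency_lists(n, edges, directed=False):
    """Build a list adj where adj[x] holds the neighbors of node x."""
    adj = [[] for _ in range(n)]
    for x, y in edges:
        adj[x].append(y)
        if not directed:
            adj[y].append(x)
    return adj


# Hypothetical example: an undirected graph with 4 nodes and 3 edges.
adj = adjacency_lists(4, [(0, 1), (0, 2), (2, 3)])
print(adj)  # [[1, 2], [0], [0, 3], [2]]
```

The space used is proportional to the number of nodes plus the number of arcs, which makes this representation a natural fit for sparse graphs.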
Breadth-first search
 Implementation

 Starting node \(s\) added to queue, marked as visited and assigned level 0 with itself as predecessor
 While queue not empty
  First node \(x\) of the queue, at level \(l\), extracted and marked as explored
  For each neighbor \(y\) of \(x\): if \(y\) neither visited nor explored, \(y\) added to queue, marked as visited and assigned level \(l + 1\) with \(x\) as predecessor
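The steps above can be sketched in Python as follows. This is one possible rendering, not necessarily the implementation used in the course. The function name `bfs`, the dictionary-based adjacency lists and the sample graph are assumptions.

```python
from collections import deque


def bfs(adj, s):
    """Breadth-first search from s; returns the (level, pred) dictionaries."""
    level = {s: 0}    # a node counts as visited once it has a level
    pred = {s: s}     # the starting node is its own predecessor
    explored = set()  # nodes already extracted from the queue
    queue = deque([s])
    while queue:
        x = queue.popleft()
        explored.add(x)
        for y in adj[x]:
            # Every explored node was visited first, so one test covers both.
            if y not in level:
                level[y] = level[x] + 1
                pred[y] = x
                queue.append(y)
    return level, pred


# Hypothetical example graph, given as adjacency lists.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
level, pred = bfs(adj, 0)
print(level)  # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
print(pred)   # {0: 0, 1: 0, 2: 0, 3: 1, 4: 3}
```

Each node enters the queue at most once and each adjacency list is scanned once, so the running time is linear in the number of nodes plus the number of arcs.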


 Applications
 Computing connected components
 Deciding whether a directed graph is strongly connected
 Two BFS, one forward and one backward
 Weighted graphs
 Dijkstra's algorithm
 Computing strongly connected components
 More sophisticated algorithm
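The first two applications can be sketched in Python on top of a plain BFS reachability routine. All function names are illustrative, and the code assumes that every node appears as a key of the adjacency dictionary. The strong connectivity test runs one BFS on the graph (forward) and one on the graph with all arcs reversed (backward).

```python
from collections import deque


def reachable(adj, s):
    """Set of nodes reachable from s, found by BFS."""
    seen = {s}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return seen


def connected_components(adj):
    """Connected components of an undirected graph: one BFS per component."""
    components, assigned = [], set()
    for s in adj:
        if s not in assigned:
            comp = reachable(adj, s)
            components.append(comp)
            assigned |= comp
    return components


def is_strongly_connected(adj):
    """Directed graph: forward BFS on adj, backward BFS on the reversed graph."""
    nodes = set(adj)
    if not nodes:
        return True
    reverse = {x: [] for x in adj}
    for x in adj:
        for y in adj[x]:
            reverse[y].append(x)
    s = next(iter(nodes))
    return reachable(adj, s) == nodes and reachable(reverse, s) == nodes


# Hypothetical examples.
undirected = {0: [1], 1: [0], 2: [3], 3: [2], 4: []}
print(connected_components(undirected))  # [{0, 1}, {2, 3}, {4}]

cycle = {0: [1], 1: [2], 2: [0]}
chain = {0: [1], 1: [2], 2: []}
print(is_strongly_connected(cycle))  # True
print(is_strongly_connected(chain))  # False
```

Two searches from a single node are enough: if that node reaches every node and every node reaches it, then any two nodes are connected in both directions through it.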