## Graph mining

### Pierluigi Crescenzi

Gran Sasso Science Institute

February 1-10, 2022

## Introduction

### Binary relationships

• Communications: talking on the phone or sending emails or messages
• Collaborations: co-authoring a scientific paper or co-acting in a movie
• Ratings: indicating a "like" for a photo or rating a movie
• Membership: being a member of a university department or of a political party
• Dependencies: citing a paper within another paper or including a link to a web page on another web page
• Transfers: making a bank transfer or selling a car

### Why graphs?

• Use the many mathematical and computational tools developed in the field of graph theory: analyze mathematical properties of a graph, and design, analyze, and develop efficient algorithms that compute these properties
• Final aim: reach conclusions that are interesting from the point of view of the specific domain, such as the existence of patterns that were not initially foreseen

### Some definitions

• Arc directions

### Some definitions

• Density of an undirected graph $$G$$ with $$n$$ nodes and $$m$$ edges
• $$\delta_G = \frac{m}{\frac{n (n-1)}{2}} = \frac{2m}{n (n-1)}$$
• $$G$$ is sparse if $$\delta_G$$ is close to 0
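
As a small illustration of the density formula, the sketch below computes $$\delta_G$$ from the node count $$n$$ and the edge count $$m$$. The function name `density` is hypothetical, and the code assumes a simple undirected graph, which is the case the formula with $$n(n-1)/2$$ possible edges covers.

```python
def density(n: int, m: int) -> float:
    """Density of a simple undirected graph with n nodes and m edges."""
    if n < 2:
        # With fewer than two nodes there are no possible edges,
        # so the ratio is undefined.
        raise ValueError("density is undefined for fewer than two nodes")
    return 2 * m / (n * (n - 1))


# A cycle on 4 nodes uses 4 of the 6 possible edges.
print(density(4, 4))
# A complete graph on 4 nodes uses all 6 possible edges.
print(density(4, 6))
```

A value near 1 means almost every possible edge is present; a value near 0 corresponds to the sparse case.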

### Some definitions

• Node labels: name, age, height, ...
• Edge weights: strength/cost of relationship
• Signed graphs: weights $$+1$$ and $$-1$$

### Some definitions

• Neighbor of node $$x$$: node connected to $$x$$ by an arc
• Neighborhood $$N(x)$$ of $$x$$: set of neighbors of $$x$$
• Degree of $$x$$: $$|N(x)|$$
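
The three definitions above can be made concrete with a minimal sketch. It assumes an undirected graph stored as a dictionary that maps each node to the set of its neighbors; the names `graph`, `neighborhood`, and `degree` are illustrative only.

```python
# Hypothetical undirected graph: each node maps to the set of its neighbors.
graph = {
    "a": {"b", "c"},
    "b": {"a"},
    "c": {"a"},
    "d": set(),  # an isolated node has an empty neighborhood
}


def neighborhood(graph, x):
    """Return N(x), the set of neighbors of x."""
    return graph[x]


def degree(graph, x):
    """Return the degree of x, that is |N(x)|."""
    return len(neighborhood(graph, x))


print(neighborhood(graph, "b"))
print(degree(graph, "a"))
print(degree(graph, "d"))
```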

### Some definitions

• Self-loop
• Multi-arc
• Simple graph: no self-loop and no multi-arc
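
One possible way to check these conditions is sketched below. It assumes the graph is given as a list of node pairs; the function name `is_simple` and its `directed` flag are hypothetical. In the undirected case the pairs `(u, v)` and `(v, u)` count as the same arc.

```python
def is_simple(edges, directed=False):
    """Return True if the edge list has no self-loop and no multi-arc."""
    seen = set()
    for u, v in edges:
        if u == v:
            return False  # self-loop
        # In an undirected graph the order of the endpoints is irrelevant.
        key = (u, v) if directed else frozenset((u, v))
        if key in seen:
            return False  # multi-arc
        seen.add(key)
    return True


print(is_simple([(1, 2), (2, 3)]))
print(is_simple([(1, 1), (1, 2)]))
print(is_simple([(1, 2), (2, 1)]))
print(is_simple([(1, 2), (2, 1)], directed=True))
```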

### Some definitions

• Path, eccentricity, diameter
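
The slide only names these notions, so the sketch below assumes the usual definitions for unweighted graphs: the distance between two nodes is the number of arcs on a shortest path, the eccentricity of a node is its largest distance to any other node, and the diameter is the largest eccentricity. All function names are hypothetical, and the graph is assumed to be a dictionary in which every node appears as a key.

```python
from collections import deque


def distances_from(graph, s):
    """Shortest-path distances (in number of arcs) from s to reachable nodes."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in graph[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def eccentricity(graph, x):
    """Largest distance from x; infinite if some node is unreachable."""
    dist = distances_from(graph, x)
    if len(dist) < len(graph):
        return float("inf")
    return max(dist.values())


def diameter(graph):
    """Largest eccentricity over all nodes."""
    return max(eccentricity(graph, x) for x in graph)


# Path graph 1 - 2 - 3 - 4.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(eccentricity(path, 1))
print(eccentricity(path, 2))
print(diameter(path))
```

Computing the diameter this way runs one traversal per node, which is fine for small graphs but costly for large ones.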

### Some definitions

• (Strongly, weakly) connected graphs

### Graph representations

• Adjacency matrix
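
A minimal sketch of this representation, assuming nodes are numbered from 0 to $$n-1$$ and the graph is unweighted; the function name `adjacency_matrix` is hypothetical. Entry `matrix[u][v]` is 1 when there is an arc from `u` to `v` and 0 otherwise.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 matrix from a list of (u, v) pairs."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        if not directed:
            # An undirected edge is stored in both directions,
            # so the matrix is symmetric.
            matrix[v][u] = 1
    return matrix


for row in adjacency_matrix(3, [(0, 1), (1, 2)]):
    print(row)
```

The matrix answers "is there an arc from `u` to `v`?" in constant time, but it always uses on the order of $$n^2$$ memory, even for sparse graphs.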

### Graph representations

• Adjacency lists
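
A minimal sketch of this representation, assuming nodes are numbered from 0 to $$n-1$$; the function name `adjacency_lists` is hypothetical. Position `u` of the result holds the list of neighbors of node `u`.

```python
def adjacency_lists(n, edges, directed=False):
    """Build one neighbor list per node from a list of (u, v) pairs."""
    lists = [[] for _ in range(n)]
    for u, v in edges:
        lists[u].append(v)
        if not directed:
            # An undirected edge appears in the lists of both endpoints.
            lists[v].append(u)
    return lists


print(adjacency_lists(3, [(0, 1), (1, 2)]))
```

Memory use is proportional to the number of nodes plus the number of arcs, which suits sparse graphs; checking whether a specific arc exists requires scanning a list.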

### Breadth-first search

• Implementation
1. Starting node $$s$$ added to queue, marked as visited and assigned to level 0 with itself as predecessor
2. While queue not empty
   1. First node $$x$$ of the queue at level $$l$$ extracted and marked as explored
   2. For each neighbor $$y$$ of $$x$$, if $$y$$ neither visited nor explored, $$y$$ added to queue, marked as visited and assigned level $$l + 1$$ with $$x$$ as a predecessor
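
The steps above can be sketched in Python as follows. This is one possible rendering, not necessarily the course's own code: it assumes the graph is a dictionary mapping each node to a list of neighbors, and the names `bfs`, `level`, `predecessor`, and `explored` are illustrative. A node counts as visited as soon as it has a level, so a separate visited set is not needed.

```python
from collections import deque


def bfs(graph, s):
    """Breadth-first search from s; returns levels, predecessors, exploration order."""
    level = {s: 0}        # visited nodes and their levels
    predecessor = {s: s}  # the starting node is its own predecessor
    explored = []         # nodes in the order they leave the queue
    queue = deque([s])
    while queue:
        x = queue.popleft()
        explored.append(x)
        for y in graph[x]:
            if y not in level:  # y is neither visited nor explored
                level[y] = level[x] + 1
                predecessor[y] = x
                queue.append(y)
    return level, predecessor, explored


# Hypothetical example graph; node "z" is not reachable from "s".
graph = {
    "s": ["a", "b"],
    "a": ["s", "c"],
    "b": ["s", "c"],
    "c": ["a", "b"],
    "z": [],
}
level, predecessor, explored = bfs(graph, "s")
print(level)
print(predecessor)
print(explored)
```

Following `predecessor` from any reached node back to `s` gives a shortest path in number of arcs. Nodes missing from `level`, such as `"z"` here, are unreachable from `s`.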

### Breadth-first search

• Applications
• Computing connected components
• Deciding whether a directed graph is strongly connected: two BFS, one forward and one backward
• Weighted graphs: Dijkstra's algorithm
• Computing strongly connected components: requires a more sophisticated algorithm
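
The first two applications can be sketched as below, under the assumption that the graph is a dictionary in which every node appears as a key. All function names are hypothetical. Connected components come from restarting the search at every node not yet assigned; strong connectivity is tested with one search on the graph and one on the graph with all arcs reversed, both from the same node.

```python
from collections import deque


def reachable(graph, s):
    """Set of nodes reachable from s, found by breadth-first search."""
    seen = {s}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in graph[x]:
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return seen


def connected_components(graph):
    """Connected components of an undirected graph, as a list of node sets."""
    components = []
    assigned = set()
    for s in graph:
        if s not in assigned:
            component = reachable(graph, s)
            components.append(component)
            assigned |= component
    return components


def reverse(graph):
    """Directed graph with every arc (x, y) replaced by (y, x)."""
    reversed_graph = {x: [] for x in graph}
    for x, successors in graph.items():
        for y in successors:
            reversed_graph[y].append(x)
    return reversed_graph


def is_strongly_connected(graph):
    """Forward and backward search from one arbitrary node."""
    if not graph:
        return True
    s = next(iter(graph))
    forward = reachable(graph, s)            # nodes that s can reach
    backward = reachable(reverse(graph), s)  # nodes that can reach s
    return len(forward) == len(graph) and len(backward) == len(graph)


print(connected_components({1: [2], 2: [1], 3: []}))
print(is_strongly_connected({1: [2], 2: [3], 3: [1]}))
print(is_strongly_connected({1: [2], 2: [3], 3: []}))
```

The two-search test works because, if every node is reachable from `s` and `s` is reachable from every node, then any two nodes are connected in both directions through `s`.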