Graph mining

Pierluigi Crescenzi

Gran Sasso Science Institute

Introduction

Efficient (Julia) Programs for Understanding Our Networked World

Binary relationships

  • Communications
    • Talking on the phone or sending emails or messages
  • Collaborations
    • Co-authoring a scientific paper or co-acting in a movie
  • Ratings
    • Indicating a "like" for a photo or rating a movie
  • Membership
    • Being a member of a university department or of a political party
  • Dependencies
    • Citing one paper in another or linking from one web page to another
  • Transfers
    • Making a bank transfer or selling a car

Why graphs?

  • Use the many mathematical and computational tools developed in the field of graph theory
    • Analyze mathematical properties of a graph, and design, analyze and develop efficient algorithms that compute these properties
  • Final aim
    • Reach conclusions that are interesting from the point of view of the specific domain
      • Existence of patterns that perhaps were not initially foreseen

The Florentine families social network

Interpreting the past

Understanding the present

Imagining the future

Some definitions

  • Arc directions: directed graphs (each arc has an orientation) vs. undirected graphs

  • Density of an undirected graph \(G\) with \(n\) nodes and \(m\) edges
    • \(\delta_G = \frac{m}{\frac{n(n-1)}{2}} = \frac{2m}{n(n-1)}\)
    • Sparse if \(\delta_G\) is close to 0
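
As a quick check of the density formula, here is a minimal Python sketch that computes it from an edge list. The function name `density`, the convention of returning 0 when there are fewer than two nodes, and the example graph are illustrative assumptions, not taken from the slides.

```python
def density(n, edges):
    """Density of an undirected simple graph with n nodes and the given edges."""
    if n < 2:
        return 0.0  # assumed convention: no node pairs, so density is 0
    m = len(edges)
    return 2 * m / (n * (n - 1))

# Hypothetical example: a path on 4 nodes has 3 of the 6 possible edges.
path_edges = [(0, 1), (1, 2), (2, 3)]
path_density = density(4, path_edges)
```

A complete graph gives density 1, while large real-world networks typically have far fewer edges than node pairs, which is what "sparse" refers to.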

  • Node labels
    • Name, age, height,...
  • Edge weights
    • Strength/cost of relationship
    • Signed graphs: weights \(+1\) and \(-1\)
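
One possible way to store node labels and edge weights is sketched below in Python. The node names, attribute values, and the reading of \(+1\) as "friendly" and \(-1\) as "hostile" are hypothetical and only serve the illustration.

```python
# Node labels: one dictionary of attributes per node (hypothetical data).
labels = {
    "ann": {"age": 31, "height": 170},
    "bob": {"age": 45, "height": 182},
    "eve": {"age": 27, "height": 165},
}

# Signed graph: each undirected edge carries weight +1 or -1.
# A frozenset key makes the edge {x, y} independent of the order of x and y.
signed_edges = {
    frozenset({"ann", "bob"}): +1,
    frozenset({"bob", "eve"}): -1,
    frozenset({"ann", "eve"}): -1,
}

def weight(x, y):
    """Weight of the undirected edge between x and y."""
    return signed_edges[frozenset({x, y})]

def positive_neighbors(x):
    """Sorted list of neighbors joined to x by an edge of weight +1."""
    result = []
    for edge, w in signed_edges.items():
        if x in edge and w > 0:
            (other,) = edge - {x}  # the endpoint that is not x
            result.append(other)
    return sorted(result)
```

For example, `weight("bob", "ann")` and `weight("ann", "bob")` return the same value because the edge is undirected.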

  • Neighbor of node \(x\): node connected to \(x\) by an arc
  • Neighborhood \(N(x)\) of \(x\): set of neighbors of \(x\)
  • Degree of \(x\): \(|N(x)|\)
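
The three definitions above can be sketched in Python for an undirected simple graph given as an edge list. The helper name `neighborhoods` and the example edges are assumptions made for this illustration.

```python
from collections import defaultdict

def neighborhoods(edges):
    """Map each node x to its neighborhood N(x) in an undirected graph."""
    nbrs = defaultdict(set)
    for x, y in edges:
        nbrs[x].add(y)
        nbrs[y].add(x)
    return nbrs

# Hypothetical example: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
N = neighborhoods(edges)

# Degree of x is the size of its neighborhood.
degree = {x: len(N[x]) for x in N}
```

In a directed graph one would normally keep separate sets of in-neighbors and out-neighbors, which this sketch does not do.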

  • Self-loop: arc connecting a node to itself
  • Multi-arc: two or more arcs connecting the same pair of nodes
  • Simple graph: no self-loops and no multi-arcs
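
A minimal Python sketch of a simplicity test for an undirected graph given as an edge list follows. The function name `is_simple` is hypothetical, and treating `(x, y)` and `(y, x)` as the same edge is an assumption that only fits the undirected case.

```python
def is_simple(edges):
    """Return True if the undirected edge list has no self-loop and no multi-arc."""
    seen = set()
    for x, y in edges:
        if x == y:
            return False  # self-loop
        key = frozenset((x, y))  # (x, y) and (y, x) denote the same edge
        if key in seen:
            return False  # multi-arc
        seen.add(key)
    return True
```

For a directed graph, the key would be the ordered pair `(x, y)` instead, so that two arcs in opposite directions are not counted as a multi-arc.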

  • Path, eccentricity, diameter
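
The slide only names these terms, so the sketch below assumes the usual definitions: the distance between two nodes is the length of a shortest path between them, the eccentricity of \(x\) is the largest distance from \(x\) to any node, and the diameter is the largest eccentricity. It also assumes a connected, unweighted graph stored as a dictionary of adjacency lists; all function names are illustrative.

```python
from collections import deque

def distances_from(adj, s):
    """Shortest-path distances (in number of edges) from s to every reachable node."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def eccentricity(adj, x):
    """Largest distance from x to any node (graph assumed connected)."""
    return max(distances_from(adj, x).values())

def diameter(adj):
    """Largest eccentricity over all nodes (graph assumed connected)."""
    return max(eccentricity(adj, x) for x in adj)

# Hypothetical example: the path 0 - 1 - 2 - 3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

Computing the diameter this way runs one search per node, which is simple but can be slow on large graphs.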

  • (Strongly, weakly) connected graphs
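
Assuming the usual definitions (a directed graph is strongly connected if every node can reach every other node along directed paths, and weakly connected if it is connected once arc directions are ignored), weak connectivity can be sketched in Python as follows. The function name and the input format (a node list plus an arc list) are assumptions for this example.

```python
def is_weakly_connected(nodes, arcs):
    """Return True if the directed graph is connected when directions are ignored."""
    nodes = list(nodes)
    if not nodes:
        return True
    # Build the underlying undirected graph.
    und = {x: set() for x in nodes}
    for x, y in arcs:
        und[x].add(y)
        und[y].add(x)
    # Visit everything reachable from an arbitrary start node.
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        x = stack.pop()
        for y in und[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(nodes)
```

The graph with arcs `(1, 2)` and `(3, 2)` is weakly connected under this test, although node 2 cannot reach nodes 1 or 3, so it is not strongly connected.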

Graph representations

  • Adjacency matrix
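
As a minimal Python sketch, an adjacency matrix can be built as a list of rows in which entry `A[x][y]` is 1 when there is an arc from `x` to `y`. The function name, the `directed` flag, and the assumption that nodes are numbered from 0 to n - 1 are illustrative choices.

```python
def adjacency_matrix(n, arcs, directed=True):
    """n x n matrix A with A[x][y] = 1 if there is an arc from x to y."""
    A = [[0] * n for _ in range(n)]
    for x, y in arcs:
        A[x][y] = 1
        if not directed:
            A[y][x] = 1  # an undirected edge is stored in both directions
    return A

# Hypothetical example: arcs 0 -> 1 and 1 -> 2.
A = adjacency_matrix(3, [(0, 1), (1, 2)])
```

Testing whether an arc exists takes constant time, but the matrix always uses space proportional to n squared, which is wasteful for sparse graphs.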

  • Adjacency lists
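
A matching Python sketch for adjacency lists stores, for each node, the list of its out-neighbors. As before, the function name, the `directed` flag, and the node numbering from 0 to n - 1 are assumptions made for the example.

```python
def adjacency_lists(n, arcs, directed=True):
    """List adj where adj[x] holds the nodes reached by an arc leaving x."""
    adj = [[] for _ in range(n)]
    for x, y in arcs:
        adj[x].append(y)
        if not directed:
            adj[y].append(x)  # an undirected edge appears in both lists
    return adj

# Hypothetical example: arcs 0 -> 1, 0 -> 2 and 1 -> 2.
adj = adjacency_lists(3, [(0, 1), (0, 2), (1, 2)])
```

The space used is proportional to the number of nodes plus the number of arcs, which suits sparse graphs, at the price of a scan of `adj[x]` to test whether a given arc exists.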

Breadth-first search

  • Implementation
    1. Starting node \(s\) added to queue, marked as visited and assigned level 0 with itself as predecessor
    2. While queue not empty
      1. First node \(x\) of the queue, at level \(l\), extracted and marked as explored
      2. For each neighbor \(y\) of \(x\), if \(y\) neither visited nor explored, \(y\) added to queue, marked as visited and assigned level \(l + 1\) with \(x\) as predecessor
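
The steps above can be sketched in Python as follows. This is one possible rendering, not the course's implementation: the graph is assumed to be a dictionary mapping each node to a list of neighbors, "visited" is represented by membership in the `level` dictionary, and the names `bfs`, `level`, `pred` and `explored` are illustrative.

```python
from collections import deque

def bfs(adj, s):
    """Breadth-first search from s; returns (level, pred) dictionaries."""
    level = {s: 0}      # nodes in level are the visited ones
    pred = {s: s}       # the start node is its own predecessor
    explored = set()
    queue = deque([s])
    while queue:
        x = queue.popleft()  # first node of the queue
        explored.add(x)
        for y in adj[x]:
            if y not in level:  # y is neither visited nor explored
                level[y] = level[x] + 1
                pred[y] = x
                queue.append(y)
    return level, pred

# Hypothetical example graph (undirected, stored in both directions).
adj = {
    "s": ["a", "b"],
    "a": ["s", "c"],
    "b": ["s", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
level, pred = bfs(adj, "s")
```

Each node enters the queue at most once and each adjacency list is scanned once, so the running time is linear in the number of nodes plus arcs. Following `pred` back from any node yields a shortest path to `s`.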

  • Applications
    • Computing connected components
    • Deciding whether a directed graph is strongly connected
      • Two BFS, one forward and one backward
  • Weighted graphs
    • Dijkstra's algorithm
  • Computing strongly connected components
    • More sophisticated algorithm
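
The applications listed above can be sketched in Python under a few assumptions: graphs are dictionaries in which every node appears as a key, the weighted graph maps each node to a list of `(neighbor, weight)` pairs with non-negative weights, and all function names are hypothetical. The strong-connectivity test follows the "two BFS" idea: one search on the graph and one on the graph with all arcs reversed. Computing the strongly connected components themselves needs a more sophisticated algorithm and is not attempted here.

```python
from collections import deque
import heapq

def reachable(adj, s):
    """Set of nodes reachable from s, computed by a BFS."""
    seen = {s}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return seen

def connected_components(adj):
    """Connected components of an undirected graph, one BFS per component."""
    components, assigned = [], set()
    for s in adj:
        if s not in assigned:
            comp = reachable(adj, s)
            assigned |= comp
            components.append(comp)
    return components

def is_strongly_connected(adj):
    """Forward BFS on adj and backward BFS on the reversed graph, same start."""
    nodes = list(adj)
    if not nodes:
        return True
    reverse = {x: [] for x in adj}
    for x in adj:
        for y in adj[x]:
            reverse[y].append(x)
    s = nodes[0]
    return (len(reachable(adj, s)) == len(nodes)
            and len(reachable(reverse, s)) == len(nodes))

def dijkstra(wadj, s):
    """Shortest-path distances from s, assuming non-negative weights."""
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale heap entry, a shorter path was already found
        for y, w in wadj[x]:
            if y not in dist or d + w < dist[y]:
                dist[y] = d + w
                heapq.heappush(heap, (d + w, y))
    return dist
```

The two searches suffice because the forward one shows that the start node reaches every node and the backward one shows that every node reaches the start node, so any two nodes are connected through it.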