Distributed Betweenness Centrality

Introduction

Betweenness Centrality

Measure of importance based on communication flow
- Nodes with high betweenness centrality lie on many shortest communication paths and can control information flow
Formally, for each node \(v\)
- \(\mathsf{bc}_v=\frac{1}{(n-1)(n-2)}\sum_{s\neq v, t\neq v}\frac{\sigma_{s,t}(v)}{\sigma_{s,t}}\) where
  - \(\sigma_{s,t}(v)\) = number of shortest \(s\)-\(t\) paths passing through \(v\)
  - \(\sigma_{s,t}\) = number of shortest \(s\)-\(t\) paths
Applies to a wide range of problems
- Social networks
- Biology
- Transport
- Scientific cooperation
- ...

Example

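For each pair \(s,t\): the number \(\sigma_{s,t}\) of shortest \(s\)-\(t\) paths, and the number \(\sigma_{s,t}(v)\) of those passing through each node \(v\) (a dash marks \(v\in\{s,t\}\))

s,t	\(\sigma_{s,t}\)	\(\sigma_{s,t}(1)\)	\(\sigma_{s,t}(2)\)	\(\sigma_{s,t}(3)\)	\(\sigma_{s,t}(4)\)	\(\sigma_{s,t}(5)\)
1,2	1	-	-	0	0	0
1,3	1	-	0	-	0	0
1,4	2	-	1	1	-	0
1,5	2	-	1	1	2	-
2,3	2	1	-	-	1	0
2,4	1	0	-	0	-	0
2,5	1	0	-	0	1	-
3,4	1	0	0	-	-	0
3,5	1	0	0	-	1	-
4,5	1	0	0	0	-	-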

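Unnormalized values, summed over unordered pairs \(\{s,t\}\) (the factor \(\frac{1}{(n-1)(n-2)}\) is omitted)
\(\mathsf{bc}_1=\frac{1}{2}\)
\(\mathsf{bc}_2=\frac{1}{2}+\frac{1}{2}=1\)
\(\mathsf{bc}_3=\frac{1}{2}+\frac{1}{2}=1\)
\(\mathsf{bc}_4=1+\frac{1}{2}+1+1=\frac{7}{2}\)
\(\mathsf{bc}_5=0\)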
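
The example can be re-derived mechanically. The sketch below is a brute-force check, not the distributed algorithm of the talk. The edge list `EDGES` is an assumption: it is the graph inferred from the table (a 4-cycle 1-2-4-3 with node 5 attached to node 4), and the function names are hypothetical. It sums \(\frac{\sigma_{s,t}(v)}{\sigma_{s,t}}\) over unordered pairs and does not normalize.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations

# Assumed example graph, inferred from the table of sigma values.
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]


def bfs_counts(adj, s):
    """Return (dist, sigma): hop distances from s and numbers of shortest paths."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:          # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def raw_betweenness(edges):
    """Unnormalized betweenness over unordered pairs (connected, undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    info = {s: bfs_counts(adj, s) for s in adj}
    bc = {v: Fraction(0) for v in adj}
    for s, t in combinations(sorted(adj), 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        for v in adj:
            if v in (s, t):
                continue
            # v lies on a shortest s-t path iff the distances add up.
            if dist_s[v] + dist_t[v] == dist_s[t]:
                bc[v] += Fraction(sigma_s[v] * sigma_t[v], sigma_s[t])
    return bc


if __name__ == "__main__":
    for node, value in sorted(raw_betweenness(EDGES).items()):
        print(node, value)
```

Exact fractions are used so that the output can be compared directly with the hand-computed values.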

Communication networks

Applications
- Wireless mesh networks design
- Security
- Transmission rates optimisation
- Topology control
- Resource placement and allocation
- Link-sensing
- Routing
- ...
Our motivation
- Frequency of hello messages for link-sensing in wireless networks: \(f(v) \approx \sqrt{\frac{\deg_v}{\mathsf{bc}_v}}\)
- Objective
  - Integrate the computation of betweenness centrality into routing protocols

Our result

Routing protocols
- Link-state
  - Each node knows the entire graph
  - The computation of the betweenness centrality may require excessive computational resources
    - \(O(nm)\) sequential time
- Distance-vector
  - Each node knows the next hop towards each target node
  - No efficient algorithm for computing betweenness centrality was known
  - We provide such an algorithm
    - Simple and fast
      - Assuming a polynomial number of shortest paths
      - Otherwise, an approximation

Simple and fast

Objective
- Design an algorithm for betweenness centrality whose complexity is similar to that of distributed Bellman-Ford

Preliminaries

Simple facts

If \(\sigma_{s,t}(v)\neq0\) then \(\sigma_{s,t}(v) = \sigma_{s,v}\sigma_{v,t}\)

If the arc \((u,v)\) belongs to a shortest path from \(s\) to \(t\), then \(\sigma_{s,t}(u,v) = \sigma_{s,u}\sigma_{v,t}\)
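
Both facts can be checked by exhaustive enumeration on a small graph. The sketch below is only a sanity check under assumptions: `EDGES` is the example graph inferred from the table of the Example slide, the graph is connected and unweighted, and all function names are hypothetical.

```python
from collections import deque

# Assumed example graph (inferred from the Example slide).
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]


def build_adj(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def shortest_paths(adj, s, t):
    """Enumerate all shortest s-t paths as tuples of nodes (tiny graphs only)."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)

    def extend(path):
        head = path[0]
        if head == s:
            yield tuple(path)
            return
        # Walk backwards along edges that decrease the distance to s by one.
        for u in adj[head]:
            if dist[u] == dist[head] - 1:
                yield from extend([u] + path)

    return list(extend([t]))


def check_product_rules(edges):
    """True if both product identities hold for every pair (s, t)."""
    adj = build_adj(edges)
    nodes = sorted(adj)
    sigma = {(s, t): len(shortest_paths(adj, s, t)) for s in nodes for t in nodes}
    for s in nodes:
        for t in nodes:
            paths = shortest_paths(adj, s, t)
            for v in nodes:
                through_v = sum(1 for p in paths if v in p)
                if through_v != 0 and through_v != sigma[s, v] * sigma[v, t]:
                    return False
            arcs = {(p[i], p[i + 1]) for p in paths for i in range(len(p) - 1)}
            for u, v in arcs:
                through_arc = sum(1 for p in paths if (u, v) in zip(p, p[1:]))
                if through_arc != sigma[s, u] * sigma[v, t]:
                    return False
    return True
```

Calling `check_product_rules(EDGES)` returns `True` on the assumed example graph. This is an illustration of the identities, not a proof.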

Simple facts

\(NH_v(t)\): set of next hops of \(v\) towards \(t\)
\(PH_v(s)\): set of nodes for which \(v\) is a predecessor on a shortest path from \(s\)

For every \(t\neq v\), \(\sigma_{v,t}=\sum_{u\in NH_v(t)}\sigma_{u,t}\)
If \(v\) is on a shortest path from \(s\) to \(t\), \(\sigma_{v,t}=\sum_{u\in PH_v(s)}\sigma_{v,t}(v,u)\)
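
The first identity is what lets a node count shortest paths from next-hop information alone. As a minimal centralized sketch (assuming an unweighted, connected graph, the example edge list inferred from the Example slide, and a hypothetical function name), the counts towards a target \(t\) can be filled in by increasing distance to \(t\):

```python
from collections import deque

# Assumed example graph (inferred from the Example slide).
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]


def count_paths_to(edges, t):
    """Return (next_hops, sigma) where sigma[v] is the number of shortest v-t paths."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    # Hop distances to t.
    dist = {t: 0}
    queue = deque([t])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)

    sigma = {t: 1}
    next_hops = {}
    # Closer nodes first, so sigma[u] is known for every next hop u of v.
    for v in sorted(dist, key=dist.get):
        if v == t:
            continue
        next_hops[v] = [u for u in adj[v] if dist[u] == dist[v] - 1]
        sigma[v] = sum(sigma[u] for u in next_hops[v])
    return next_hops, sigma
```

For target 5 on the assumed graph, node 1 has next hops 2 and 3 and therefore two shortest paths, in agreement with the row 1,5 of the table.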

Less simple fact

From the definition, \(\mathsf{bc}_v=\frac{1}{(n-1)(n-2)}\sum_{s\neq v}\mathsf{bc}_v(s)\) where
- \(\mathsf{bc}_v(s)=\sum_{t\neq v} \frac{\sigma_{s,t}(v)}{\sigma_{s,t}}\)
It is not difficult to prove that \[\mathsf{bc}_v(s)=\sigma_{s,v} \sum_{u\in PH_v(s)}\frac{\mathsf{bc}_u(s)+1}{\sigma_{s,u}}\]
- From a global definition to a local definition
  - We need information only from a subset of neighbors
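
To make the local recursion concrete, the sketch below evaluates it sequentially, processing nodes from the farthest to the closest to \(s\) so that \(\mathsf{bc}_u(s)\) is available for every \(u\in PH_v(s)\). It assumes an unweighted, connected graph and the example edge list inferred from the Example slide. Function names are hypothetical, and this is not the distributed algorithm itself.

```python
from collections import deque
from fractions import Fraction

# Assumed example graph (inferred from the Example slide).
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]


def build_adj(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def dependencies_from(adj, s):
    """Return {v: bc_v(s)} computed with the recursion over PH_v(s)."""
    dist, sigma, order = {s: 0}, {s: 1}, []
    queue = deque([s])
    while queue:
        u = queue.popleft()
        order.append(u)
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]

    bc_s = {}
    for v in reversed(order):  # farthest nodes first
        successors = [u for u in adj[v] if dist[u] == dist[v] + 1]  # PH_v(s)
        total = Fraction(0)
        for u in successors:
            total += (bc_s[u] + 1) / sigma[u]
        bc_s[v] = sigma[v] * total
    return bc_s


def betweenness(edges):
    """Normalized betweenness: sum of bc_v(s) over s != v, divided by (n-1)(n-2)."""
    adj = build_adj(edges)
    n = len(adj)
    total = {v: Fraction(0) for v in adj}
    for s in adj:
        for v, value in dependencies_from(adj, s).items():
            if v != s:
                total[v] += value
    return {v: total[v] / ((n - 1) * (n - 2)) for v in adj}
```

On the assumed graph, `betweenness(EDGES)[4]` evaluates to `Fraction(7, 12)`: the sum over ordered pairs is 7 and \((n-1)(n-2)=12\).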

The distributed algorithm

A first simple version

\(\sigma_{v,t}=\sum_{u\in NH_v(t)}\sigma_{u,t}\)

\(\mathsf{bc}_v(s)=\sigma_{s,v} \sum_{u\in PH_v(s)}\frac{\mathsf{bc}_u(s)+1}{\sigma_{s,u}}\)

\(\mathsf{bc}_v=\frac{1}{(n-1)(n-2)}\sum_{s\neq v}\mathsf{bc}_v(s)\)
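
The three identities above suggest a round-based scheme in which every node repeatedly recomputes its values from what its neighbors announced in the previous phase. The slides do not reproduce the algorithm, so the following is only a hypothetical synchronous simulation assembled from these identities. It assumes an unweighted, undirected, connected graph (so that \(\sigma_{s,v}=\sigma_{v,s}\)), uses the example edge list inferred from the Example slide, and all names are illustrative.

```python
import math
from fractions import Fraction

# Assumed example graph (inferred from the Example slide); its diameter is 3.
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]


def simulate(edges, phases):
    """Run `phases` synchronous send-receive phases and return the current bc estimates."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = sorted(adj)
    n = len(nodes)

    # state[v][s] = (distance from s, sigma_{s,v}, bc_v(s)) as currently known by v.
    state = {
        v: {s: (0, 1, Fraction(0)) if s == v else (math.inf, 0, Fraction(0))
            for s in nodes}
        for v in nodes
    }

    for _ in range(phases):
        new_state = {}
        for v in nodes:
            row = {}
            for s in nodes:
                if s == v:
                    row[s] = (0, 1, Fraction(0))
                    continue
                # Bellman-Ford step with unit weights, using last phase's messages.
                d = min(state[u][s][0] + 1 for u in adj[v])
                # Shortest-path count: sum over neighbors that precede v.
                sigma = sum(state[u][s][1] for u in adj[v]
                            if state[u][s][0] + 1 == d)
                # Dependency: sum over neighbors u that v precedes (u in PH_v(s)).
                acc = Fraction(0)
                for u in adj[v]:
                    du, su, bu = state[u][s]
                    if du == d + 1 and su > 0:
                        acc += (bu + 1) / su
                row[s] = (d, sigma, sigma * acc)
            new_state[v] = row
        state = new_state

    norm = (n - 1) * (n - 2)
    return {v: sum(state[v][s][2] for s in nodes if s != v) / norm for v in nodes}
```

On the assumed graph, `simulate(EDGES, 7)` already returns the exact normalized values, and further phases leave them unchanged. Each node sends its whole table in every phase here, which keeps the sketch short but ignores any bound on message size.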

A first simple version

Theorem
- Algorithm 2 enables every node to compute its betweenness centrality in any network \(G\) after \(2D+1\) phases, where \(D\) is the diameter of \(G\)

A more efficient quite simple version

Experimental results I

Global error

\(\frac{\|\mathsf{bc}-C\|_2}{\|\mathsf{bc}\|_2}=\frac{\sqrt{\sum_{v\in V}(\mathsf{bc}_v-C[v])^2}}{\sqrt{\sum_{v\in V}(\mathsf{bc}_v)^2}}\)

Measures how far the current values \(C\) are from the final values \(\mathsf{bc}\)

Computed at the end of each send-receive phase
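
A minimal sketch of this measure, assuming the exact values and the current estimates are stored in dictionaries keyed by node (the function name and data layout are hypothetical, and at least one exact value must be nonzero):

```python
import math


def global_error(bc, current):
    """Relative L2 distance between the current estimates and the exact values."""
    numerator = math.sqrt(sum((bc[v] - current[v]) ** 2 for v in bc))
    denominator = math.sqrt(sum(bc[v] ** 2 for v in bc))
    return numerator / denominator
```

The error is 1 when all estimates are still zero and reaches 0 once every node holds its final value.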

Grids and hypercubes

\(7\times 6\) grid and hypercube of dimension 11
- Hence, both have diameter 11

Erdős-Rényi

Erdős-Rényi graphs with 500 nodes and different diameters
- 20 samples for each diameter

Real-world networks

E-mail network with 1133 nodes and diameter 8
Autonomous system network with 3011 nodes and diameter 9

Weighted Erdős-Rényi

Randomly weighted Erdős-Rényi graphs with 500 nodes
- The diameter reported is that of the underlying unweighted graph

Weighted real-world networks

Road network with 3353 nodes
- Rome, Italy, 1999

Experimental results II

Local error

\(T_D\) : time it takes for the Bellman-Ford algorithm to converge locally
- I.e., until the distances are correctly computed
\(T_C\) : time it takes for the betweenness centrality value to converge locally
Local convergence time vs betweenness
- [Figure: plot with axes betweenness and time]
- A point at betweenness \(b\) and time \(t\) means that at least one node with betweenness centrality \(b\) converged in time \(t\)
  - The plot does not show how many such nodes there are

AS network

Black : \(T_D\)
Red : \(T_C\)

Road network

Conclusion

The open problem

Assuming a polynomial number of shortest paths
- \(\mathrm{CONGEST}(B)\)
  - Variant of the CONGEST model in which at most \(B\) words of \(O(\log n)\) bits each can be sent through each link at each round
- Our distributed algorithm for weighted graphs applies to the \(\mathrm{CONGEST}(n)\) model, and converges in \(O(D)\) rounds
- The known distributed algorithms for unweighted graphs apply to the \(\mathrm{CONGEST}(1)\) model, and converge in \(O(n)\) rounds
- Open problem
  - Compute exact betweenness centrality of weighted graphs in \(O(\frac{n}{B} + D)\) rounds in the \(\mathrm{CONGEST}(B)\) model, for \(1 \leq B \leq n\)

On Computing Betweenness Centrality in a Distributed Environment

Pierluigi Crescenzi

Pierre Fraigniaud

Ami Paz

Thank you
