Distance Evolutions in Preferential Attachment Models

Joost Jorritsma  

joint with Júlia Komjáthy

Franco-Dutch Workshop

July 2021

arXiv: tiny.cc/DistanceEvolutionPAM

Guiding 'real-world' example

Internet network: communicating routers and servers

Timeline:

~1969: 2 connected sites (UCLA, SRI)

~1971: 15 connected sites

~1989: 0.5 million users

~2020: billions of connected devices

  • Evolving/dynamic network:
    • Vertices arrive over time.
    • Connect to present vertices.
    • (Nodes may be removed or replaced.)
    • (Edges between 'old' vertices are removed or placed.)
  • [Faloutsos, Faloutsos & Faloutsos, '99]:
    • # connections per router decays as power-law: \(p_k\sim k^{-\tau}, \tau>2\).
    • Short average hopcount between routers.


  • Evolving network: Vertices arrive over time, connect to present vertices.
  • [Faloutsos, Faloutsos & Faloutsos, '99]: Short average hopcount.

1999: \(\mathrm{dist}_{\color{red}{'99}}(u_{'99}, v_{'99}) = 4\)

2005: \(\mathrm{dist}_{{\color{red}'05}}(u_{'99}, v_{'99}) = 3\)

2021: \(\mathrm{dist}_{{\color{red}'21}}(u_{'99}, v_{'99}) = 2\)

  • How does the hopcount between old nodes change in a growing network?

Preferential attachment models

Definition

  • Start with a single vertex
  • Vertices enter the network one-by-one at discrete (time-)steps \(t=2, 3,...\).
  • New vertex connects to old vertices with probability given by an increasing function of the degree:
    • Fixed outdegree, parameters \((m,\delta)\) with \(\delta>-m\): each new vertex arrives with \(m\) edges, and its \(j\)-th edge satisfies
      $$\mathbb{P}\big(v_{t+1} \overset{j}{\longrightarrow} v_i\big) \propto \mathrm{deg}_{t,j}(v_i) + \delta/m.$$
    • Variable outdegree, parameters \(\gamma,\eta\in(0,1)\): independently for every \(v_i \in \mathrm{PA}_t\),
      $$\mathbb{P}\big(v_{t+1} \longrightarrow v_i\big) = (\gamma \mathrm{deg}_{t}(v_i)+\eta)/t.$$
  • Animation
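
A minimal simulation sketch of the fixed-outdegree rule may help make the definition concrete. The function name `simulate_pam` and several conventions are assumptions not fixed by the slides: the initial vertex carries \(m\) self-loops (so that all attachment weights are positive), later vertices create no self-loops, and degrees are updated after every single edge.

```python
import random

def simulate_pam(t_max, m=2, delta=0.0, seed=0):
    """Grow a fixed-outdegree preferential attachment graph on vertices 0..t_max-1.

    Assumed conventions (not taken from the slides): vertex 0 starts with m
    self-loops, new vertices never attach to themselves, and degrees are
    updated after each of the m edges of a new vertex.
    Returns (edges, degree), where edges is a list of (new, old) pairs.
    """
    if delta <= -m:
        raise ValueError("need delta > -m")
    rng = random.Random(seed)
    degree = [2 * m]            # m self-loops on the initial vertex
    edges = [(0, 0)] * m
    for new in range(1, t_max):
        degree.append(0)
        for _ in range(m):
            # Attachment weight of old vertex i: deg(i) + delta/m (positive since deg >= m).
            weights = [degree[i] + delta / m for i in range(new)]
            old = rng.choices(range(new), weights=weights, k=1)[0]
            edges.append((new, old))
            degree[old] += 1
            degree[new] += 1
    return edges, degree

edges, degree = simulate_pam(200, m=2, delta=-1.0)
```

Recomputing all weights for every edge costs quadratic time in `t_max`; that is acceptable for illustration, but a larger experiment would need a faster sampling scheme.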

Thm. [Bollobás, Riordan, Spencer, Tusnády '01; Dereich, Mörters '09]

Limiting degree distribution decays as power-law: for \(\tau_{m,\delta}=3+\delta/m;  \tau_\gamma=1+1/\gamma\),

$$p_k \sim k^{-\tau}.$$
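
As a quick worked example of the two exponent formulas (the parameter values are chosen here purely for illustration):

$$m=2,\ \delta=-1:\quad \tau_{m,\delta}=3+\tfrac{-1}{2}=2.5; \qquad \gamma=\tfrac{2}{3}:\quad \tau_\gamma=1+\tfrac{3}{2}=2.5.$$

In particular, \(\tau\in(2,3)\) corresponds to \(-m<\delta<0\) in the fixed-outdegree model and to \(\gamma\in(1/2,1)\) in the variable-outdegree model.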

Distances on PAMs

Thm. [Dereich, Mönch, Mörters '12].

For \(\tau\in(2,3)\),  \(U_t, V_t\sim \text{Unif(Largest cluster}(t))\):

$$\mathrm{dist}_{{\color{red}t}}(U_{\color{blue}t}, V_{\color{blue}t}) \asymp  4\frac{\log\log(t)}{|\log(\tau-2)|} $$

 

Thm. [Dommers, v/d Hofstad, Hooghiemstra '10].

For \(\tau>3\), \(\exists c_1, c_2>0\):

$$c_1\preccurlyeq\frac{\mathrm{dist}_{{\color{red}t}}(U_{\color{blue}t}, V_{\color{blue}t})}{\log(t)} \preccurlyeq c_2.$$

Thm. [Bollobás, Riordan '04; Dereich, Mönch, Mörters '17].

For \(\tau=3\):

$$\mathrm{dist}_{{\color{red}t}}(U_{\color{blue}t}, V_{\color{blue}t})\asymp \frac{\log(t)}{\log\log(t)}.$$

Thm. [Dereich, Mönch, Mörters '12; J., Komjáthy '20].

For \(\tau\in(2,3)\),  \(U_t, V_t\sim \text{Unif(Largest cluster}(t))\):

$$\bigg(\left|\mathrm{dist}_{{\color{red}t}}(U_{\color{blue}t}, V_{\color{blue}t}) - 4\frac{\log\log(t)}{|\log(\tau-2)|}\right|\bigg)_{t\ge 0} $$

is a tight sequence.

Literature perspective on PAMs

  • Many static/limiting properties are well-understood
    • Degree distribution.
    • Typical graph distance.
    • Diameter of the graph.
    • Local neighborhood of a typical vertex (local weak sense).
    • Component sizes.
    • ...
  • Studying static properties allows for comparison to static (non-growing) models (configuration model, Norros-Reittu, ...).

  • Theorem statements do not display dynamics inherently present in PAMs (contrary to proofs).

    • Exception: degree evolution \((\mathrm{deg}_{t'}(v))_{t'\ge v}\) of (sets of) fixed vertices.

Goal: understand how the graph evolves for \(t'\ge t\) from the perspective of the graph at time \(t\).

Main Question

Consider \(\Big(\big(\mathrm{dist}_{{\color{red}t'}}(U_{\color{blue} t}, V_{\color{blue} t})\big)_{{\color{red}t'}\ge {\color{blue}t}}\Big)_{t\ge 0}\) and define for a function \(f(t,t')\)

$$X_t^f := \sup_{t'\ge t}\left| \mathrm{dist}_{{\color{red}t'}}(U_{\color{blue}t}, V_{\color{blue}t}) - f(t, t')\right|.$$

Q1. For \(\tau\in(2,3)\), can we identify \(f_\tau(t, t')\) s.t. \((X_t^{f_\tau})_{t\ge 0}\) is a tight sequence?

[Figure: timeline \(0 < t_0 < t_1 < t_2\) for two vertices \(U_{t_0}, V_{t_0}\), with \(\mathrm{dist}_{t_0}(U_{t_0}, V_{t_0}) = 7\), \(\mathrm{dist}_{t_1}(U_{t_0}, V_{t_0}) = 6\), \(\mathrm{dist}_{t_2}(U_{t_0}, V_{t_0}) = 2\).]

Q2. For \(\tau\in(2,3)\), identify \(f_\tau(t, t')\) s.t.

$$\Big(\sup_{t'\ge t}\left| \mathrm{dist}_{{\color{red}t'}}(U_{\color{blue}t}, V_{\color{blue}t}) - f_\tau(t, t')\right|\Big)_{t\ge 0}$$

is a tight sequence.
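
The quantity in the question can be computed on any concrete growing graph. The sketch below is only an illustration: the helper names `graph_distance` and `sup_deviation` are hypothetical, edges are stored as (new, old) pairs with the arrival time equal to the label of the newer vertex, and the supremum is truncated at a finite horizon `t_max`.

```python
from collections import deque

def graph_distance(edges, source, target, time):
    """BFS distance between source and target using only edges present at `time`."""
    adjacency = {}
    for new, old in edges:
        if new <= time:  # the edge exists once its newer endpoint has arrived
            adjacency.setdefault(new, []).append(old)
            adjacency.setdefault(old, []).append(new)
    seen = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        if x == target:
            return seen[x]
        for y in adjacency.get(x, []):
            if y not in seen:
                seen[y] = seen[x] + 1
                queue.append(y)
    return None  # source and target are not connected at this time

def sup_deviation(edges, u, v, t, t_max, f):
    """Finite-horizon proxy for X_t^f: max over t <= t' <= t_max of |dist_{t'}(u, v) - f(t, t')|.

    Assumes u and v are connected at every time in the range.
    """
    return max(abs(graph_distance(edges, u, v, s) - f(t, s)) for s in range(t, t_max + 1))

# Toy graph: a path 0-1-2-3-4, then vertex 5 joins 0 and 3, then vertex 6 joins 0 and 4.
edges = [(1, 0), (2, 1), (3, 2), (4, 3), (5, 0), (5, 3), (6, 0), (6, 4)]
evolution = [graph_distance(edges, 0, 4, s) for s in (4, 5, 6)]
```

On this toy graph the distance between the old vertices 0 and 4 shrinks as new vertices arrive, mirroring the hopcount example from the introduction.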

Main Theorem

Thm. [J., Komjáthy '20+]. For \(\tau\in(2,3)\), the sequence in Q2 is tight for

$$f_\tau(t,t') = 4\frac{\log\log(t) - \log(1\vee\log(t'/t))}{|\log(\tau-2)|}\vee 2.$$
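
To get a feeling for the shape of \(f_\tau\), the formula can be evaluated numerically. The sketch below is an illustration only; the name `f_tau` is hypothetical, and it assumes the final \(\vee\, 2\) applies to the whole expression.

```python
import math

def f_tau(t, t_prime, tau):
    """Evaluate f_tau(t, t') = max(4 (loglog t - log(max(1, log(t'/t)))) / |log(tau-2)|, 2).

    Assumption: the trailing 'v 2' in the theorem is a maximum over the whole fraction.
    """
    if not 2 < tau < 3:
        raise ValueError("tau must lie in (2, 3)")
    if t <= 1 or t_prime < t:
        raise ValueError("need 1 < t <= t_prime")
    growth = max(1.0, math.log(t_prime / t))
    value = 4 * (math.log(math.log(t)) - math.log(growth)) / abs(math.log(tau - 2))
    return max(value, 2.0)

t = 1e6
values = [f_tau(t, t * 10 ** k, 2.5) for k in range(0, 7)]
```

For \(t'\le \mathrm{e}\,t\) the function stays at its static value \(4\log\log(t)/|\log(\tau-2)|\); afterwards it decreases in \(t'\) until it reaches the floor 2 once \(t'\) is of order \(t^2\).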

Corollary. Hydrodynamic limit [J., Komjáthy '20+].
For \(\varepsilon>0\), let \(t'=T_t(a):=t\exp\big(\varepsilon\log^a(t)\big)\) for \(a\in[0,1]\), then

$$f_\tau(t,T_t(a)) \approx 4(1-a)\frac{\log\log(t)}{|\log(\tau-2)|}\vee 2,$$ and

$$ \sup_{a\in[0,1]} \left| \frac{\mathrm{dist}_{T_t(a)}(U_t, V_t)}{\log\log(t)} - (1-a)\frac{4}{|\log(\tau-2)|}\right|\overset{\mathbb{P}}{\longrightarrow} 0.$$
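
The approximation can be checked by substituting \(t'=T_t(a)\) into \(f_\tau\); the following is a sketch of that computation, valid whenever \(\varepsilon\log^a(t)\ge 1\):

$$\log\big(T_t(a)/t\big)=\varepsilon\log^a(t) \quad\Longrightarrow\quad \log\log\big(T_t(a)/t\big) = a\log\log(t)+\log(\varepsilon),$$

$$f_\tau\big(t,T_t(a)\big) = 4\,\frac{(1-a)\log\log(t)-\log(\varepsilon)}{|\log(\tau-2)|}\vee 2.$$

The term \(\log(\varepsilon)\) does not depend on \(t\), so it vanishes after dividing by \(\log\log(t)\), which leaves the linear profile \((1-a)\,4/|\log(\tau-2)|\).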


Heuristic upper bound

Statement. [J., Komjáthy '20+].

For \(\tau\in(2,3)\),
$$ \mathbb{P}\Big(\forall t'\ge t: \mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')+1/\varepsilon\Big) \ge 1-\varepsilon. $$

Weaker statement. [J., Komjáthy '20+].
For \(t\) large, there \(\exists\) nice \(t'\mapsto\varepsilon_{t'}\) such that

$$ \mathbb{P}\Big(\mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')+1/\varepsilon\Big) \ge 1-\varepsilon_{t'}. $$

Naive idea: Union bound.

Unfortunately \(\int_t^\infty \varepsilon_{t'} \mathrm{d}t'\to \infty.\)

"Proof" of statement by smart union bound.

  • Distance is a non-increasing sequence of integers.
  • Consider the times \((t_i(t))_{i\ge 0}\) at which the right-hand side crosses an integer (i.e., at which the event changes).
  • By the weak version, the statement follows if for \(t\) large \(\sum_i \varepsilon_{t_i(t)} \leq \varepsilon\).
  • The weak statement follows from minor adaptations of [Dommers, v/d Hofstad, Hooghiemstra '10; Dereich, Mönch, Mörters '12; Caravenna, Garavaglia, v/d Hofstad '19; J., Komjáthy '20].
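
To see why restricting to integer crossings helps, one can solve \(f_\tau(t,t')=k\) for \(t'\). This is a sketch, stated for \(f_\tau\) itself rather than for the shifted bound \(f_\tau+1/\varepsilon\), and for integers \(2<k\le f_\tau(t,t)\):

$$4\,\frac{\log\log(t)-\log\log(t'/t)}{|\log(\tau-2)|}=k \quad\Longleftrightarrow\quad \log(t'/t)=(\tau-2)^{k/4}\log(t) \quad\Longleftrightarrow\quad t'=t^{\,1+(\tau-2)^{k/4}}.$$

So there are at most \(f_\tau(t,t)=O(\log\log(t))\) crossing times, instead of one term for every \(t'\ge t\). Whether the resulting sum is small still depends on the error terms \(\varepsilon_{t'}\), which are not specified here.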


Step 0, \(t'=t\):

[Figure: vertex labels \(^{(t)}\) and \(^{(t')}\); bounded-length segments.]

[Figure labels: \(\mathrm{deg}_t(\cdot)\), \(\mathrm{Core}(t)\).]

  • \(\text{Core}(t'):=\{v\le t': \text{deg}_{(1-\varepsilon)t'}(v)\ge \sqrt{t'/\log(t')}\}\).
  • If \(\mathrm{deg}_{(1-\varepsilon)t'}(x)=s: \exists y: \mathrm{deg}_{(1-\varepsilon)t'}(y)\approx s^{1/(\tau-2)}, \{x\overset{2}{\leftrightarrow} y\}_{t'}\).
  • # of iterations needed to reach core: smallest \(k_t\) s.t.
    $$\big(\mathrm{deg}_{t'}(q_{t,0})\big)^{1/(\tau-2)^{k_t}} \ge \sqrt{t/\log(t)}.$$
  • \(k_t\approx\log\log(t)/|\log(\tau-2)|=\frac{1}{4}f_\tau(t,t)\) iterations for \(U_t, V_t\).
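
The iteration count in the bullets above can be checked numerically. The sketch below works on a logarithmic scale to avoid overflow; the function name `iterations_to_core` and the choice of starting degree are assumptions made for illustration.

```python
import math

def iterations_to_core(start_degree, t, tau):
    """Smallest k with start_degree ** ((1/(tau-2)) ** k) >= sqrt(t / log(t))."""
    if start_degree < 2 or not 2 < tau < 3 or t <= math.e:
        raise ValueError("need start_degree >= 2, tau in (2, 3), t > e")
    # Logarithm of the core threshold sqrt(t / log t).
    target = 0.5 * (math.log(t) - math.log(math.log(t)))
    log_degree = math.log(start_degree)
    k = 0
    while log_degree < target:
        log_degree /= (tau - 2)  # one step s -> s^{1/(tau-2)} in log scale
        k += 1
    return k

k = iterations_to_core(start_degree=2, t=1e12, tau=2.5)
heuristic = math.log(math.log(1e12)) / abs(math.log(2.5 - 2))
```

With these illustrative values, `k` should agree with the heuristic \(\log\log(t)/|\log(\tau-2)|\approx 4.8\) up to rounding.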


Step 1, \(t'>t\):

[Figure labels: \(\mathrm{deg}_t(\cdot)\), \(\mathrm{Core}(t)\), \(\mathrm{deg}_{t'}(\cdot)\), \(\mathrm{Core}(t')\).]

  • \(\text{Core}(t'):=\{v\le t': \text{deg}_{(1-\varepsilon)t'}(v)\ge \sqrt{t'/\log(t')}\}\).
  • If \(\mathrm{deg}_{(1-\varepsilon)t'}(x)=s: \exists y: \mathrm{deg}_{(1-\varepsilon)t'}(y)\approx s^{1/(\tau-2)}, \{x\overset{2}{\leftrightarrow} y\}_{t'}\).
  • # of iterations needed: smallest \(k_{t'}\) s.t.
    $$\big(\mathrm{deg}_{t'}(q_{t,0})\big)^{1/(\tau-2)^{k_{t'}}} \ge \sqrt{t'/\log(t')}.$$
  • Competing effects:
    (1) Core threshold increases;
    (2) degree increases.
  • Strong control of \(\text{deg}_{t'}(q_{t,0})\):
    Using Móri-martingale, Doob's maximal inequality:
    \(\text{deg}_{t'}(q_{t,0})\gtrsim (t'/t)^{1/(\tau-1)}\) for all \(t'>t\).
  • \(k_{t'}\le \frac{1}{4}f_\tau(t,t')\).
  • If \(t'=T_t(a)\), the number of iterations is linear in \(a\).
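
A heuristic sketch of how the degree bound turns into the iteration count, ignoring additive constants and assuming \(\log(t'/t)\ge 1\): inserting \(\mathrm{deg}_{t'}(q_{t,0})\gtrsim (t'/t)^{1/(\tau-1)}\) into the core condition and taking logarithms gives

$$\Big(\frac{1}{\tau-2}\Big)^{k}\,\frac{\log(t'/t)}{\tau-1} \ge \frac{1}{2}\log\big(t'/\log(t')\big) \quad\Longleftarrow\quad k \ge \frac{\log\log(t')-\log\log(t'/t)}{|\log(\tau-2)|}+O(1).$$

For \(t'\le t^2\) one has \(\log\log(t')\le\log\log(t)+\log 2\), so the right-hand side is \(\frac{1}{4}f_\tau(t,t')+O(1)\); for \(t'>t^2\) the difference \(\log\log(t')-\log\log(t'/t)\) is at most \(\log 2\), so boundedly many iterations suffice.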

Heuristic lower bound

Statement. [J., Komjáthy '20+]. For \(t\) sufficiently large
$$ \mathbb{P}\Big(\exists t'\ge t: \mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')-1/\varepsilon\Big) \le \varepsilon. $$

Naive guess (similar to upper bound):

1. Find sequence \((t_i)_{i\ge 0}\) at which event changes. Then

$$ \mathbb{P}\Big(\exists t'\ge t: \mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')-1/\varepsilon\Big) \leq \sum_i \mathbb{P}\Big(\mathrm{dist}_{t_i}(U_t, V_t)\le f_\tau(t,t_i)-1/\varepsilon\Big).$$

2. Bound rhs using 'old' methods [Dereich, Mönch, Mörters '12].

RHS not summable, new machinery needed.


Main idea: Exploit time-dependencies in the growing graph

There is a first time of failure.


Let* \(\mathcal{E}(t,t'):= \{\mathrm{dist}_{t'}(U_t, V_t)> f_\tau(t,t')-1/\varepsilon\}\). Recall that \(t'\mapsto f_\tau(t,t')\) is non-increasing. Then

$$\mathbb{P}\Big(\neg\mathcal{E}(t,t') \mid \bigcap_{\tilde{t}=t}^{t'-1} \mathcal{E}(t,\tilde{t}) \Big)\le\mathbb{P}\big(\text{there is a short path that traverses } t'\big) \leq^\ast C/(t'\log^3(t')).$$

Decomposing according to the first time of failure,

$$\mathbb{P}\Big(\exists t'\ge t: \mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')-1/\varepsilon\Big) \le \mathbb{P}\big(\neg \mathcal{E}(t,t)\big) + \sum_{t'=t+1}^\infty\mathbb{P}\bigg(\neg\mathcal{E}(t,t') \mid \bigcap_{\tilde{t}=t}^{t'-1} \mathcal{E}(t,\tilde{t}) \bigg)$$

$$\mathbb{P}\Big(\exists t'\ge t: \mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')-1/\varepsilon\Big) \leq \varepsilon + \sum_{t'=t+1}^\infty C/(t'\log^3(t')) \overset{t\to\infty}\longrightarrow \varepsilon$$

*In fact, we need a more advanced separation of events (using good and bad paths, inspired by [Dereich, Mönch, Mörters '12]) that makes the decomposition more interlinked.

\(\vdots\) Many computations, inductive proofs, ...
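
The summability of the bound \(C/(t'\log^3(t'))\) follows from an integral comparison; as a short check, using that \(s\mapsto 1/(s\log^3(s))\) is decreasing for \(s>1\):

$$\sum_{t'=t+1}^\infty \frac{C}{t'\log^3(t')} \le \int_t^\infty \frac{C}{s\log^3(s)}\,\mathrm{d}s = \frac{C}{2\log^2(t)} \overset{t\to\infty}{\longrightarrow} 0.$$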

Main Theorem (extended)

Weighted setting. Choose your favourite non-negative random variable \(L\) and equip every edge \(e\) upon creation with an independent copy \(L_e\) of \(L\).

Def. Weighted distance: \(d_{t'}^{(L)}(x, y):= \min_{\pi \text{ from }x\text{ to }y \text{ at time }t'} \sum_{e\in\pi} L_e\)

Weighted-distance evolution. [J., Komjáthy '20+].
Let \(\tau\in(2,3)\). Equip every edge with i.i.d. \(L_e\ge 0\), \(U_t, V_t\sim \text{Unif(Largest cluster}(t))\). If \(F_L^{(-1)}\) is not too flat, then we identify \(f_{\tau}^{(L)}(t,t')\) s.t.

$$\Big(\sup_{t'\ge t}|d^{(L)}_{t'}(U_t, V_t) - f_{\tau}^{(L)}(t,t')|\Big)_{t\geq 0}$$

forms a tight sequence of random variables.
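
On a concrete weighted graph, the weighted distance at time \(t'\) can be computed with Dijkstra's algorithm. The sketch below is an illustration only: the name `weighted_distance` and the edge format (new, old, weight), with arrival time equal to the label of the newer vertex, are assumptions.

```python
import heapq

def weighted_distance(weighted_edges, source, target, time):
    """Dijkstra: minimal total weight of a path from source to target among edges present at `time`."""
    adjacency = {}
    for new, old, weight in weighted_edges:
        if new <= time:  # the edge exists once its newer endpoint has arrived
            adjacency.setdefault(new, []).append((old, weight))
            adjacency.setdefault(old, []).append((new, weight))
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if x == target:
            return d
        if d > best.get(x, float("inf")):
            continue  # stale heap entry
        for y, w in adjacency.get(x, []):
            candidate = d + w
            if candidate < best.get(y, float("inf")):
                best[y] = candidate
                heapq.heappush(heap, (candidate, y))
    return None  # not connected at this time

# Toy example: vertex 3 arrives with two cheap edges and shortens the weighted distance.
weighted_edges = [(1, 0, 1.0), (2, 1, 1.0), (2, 0, 5.0), (3, 0, 0.5), (3, 2, 0.5)]
before = weighted_distance(weighted_edges, 0, 2, time=2)
after = weighted_distance(weighted_edges, 0, 2, time=3)
```

In this toy example the weighted distance between vertices 0 and 2 drops when vertex 3 arrives, although the hopcount (one direct edge) does not change.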

arXiv: tiny.cc/DistanceEvolutionPAM

Comment. Trivial statement for most distributions with support starting at the origin: Already \(d_t^{(L)}(U_t,V_t)\) is of constant order. [J., Komjáthy '20]

Outlook and Invitation

What about (other) properties in (other/spatial) versions of the model?

Thank you!
