Distance Evolutions in
Preferential Attachment Models
Joost Jorritsma
joint with Júlia Komjáthy
Oberwolfach Workshop
January 2021
arXiv: tiny.cc/DistanceEvolutionPAM
Guiding 'real-world' example
Internet network: communicating routers and servers
Timeline:
~1969: 2 connected sites (UCLA, SRI)
~1971: 15 connected sites
~1989: 0.5 million users
~2020: billions of connected devices
- Evolving/dynamic network:
- Vertices arrive over time.
- Connect to present vertices.
- (Nodes may be removed or replaced.)
- (Edges between 'old' vertices are removed or placed.)
- [Faloutsos, Faloutsos & Faloutsos, '99]:
- # connections per router decays as power-law: \(p_k\sim k^{-\tau}, \tau>2\).
- Short average hopcount between routers.
Guiding 'real-world' example
Internet network: communicating routers and servers
- Evolving network: Vertices arrive over time, connect to present vertices.
- [Faloutsos, Faloutsos & Faloutsos, '99]: Short average hopcount.
- How does the hopcount between old nodes change in a growing network?
[Figure: the same two nodes \(u_{'99}\), \(v_{'99}\) in three snapshots of the growing network.]
1999: \(\mathrm{dist}_{{\color{red}'99}}(u_{'99}, v_{'99}) = 4\)
2005: \(\mathrm{dist}_{{\color{red}'05}}(u_{'99}, v_{'99}) = 3\)
2021: \(\mathrm{dist}_{{\color{red}'21}}(u_{'99}, v_{'99}) = 2\)
Preferential attachment models
Definition
- Start with a single vertex
- Vertices enter the network one-by-one at discrete (time-)steps \(t=2, 3,...\).
- New vertex connects to old vertices according to some increasing function of the degree:
- Fixed outdegree: \((m,\delta): m\) edges/new vertex (\(\delta>-m\)):
$$\mathbb{P}\big(v_{t+1} \overset{j}{\longrightarrow} v_i\big) \propto \mathrm{deg}_{t,j}(v_i) + \delta/m.$$
- Variable outdegree: \(\forall v_i \in \mathrm{PA}_t\), independently, (\(\gamma,\eta\in(0,1)\))
$$\mathbb{P}\big(v_{t+1} \longrightarrow v_i\big) = (\gamma \mathrm{deg}_{t}(v_i)+\eta)/t.$$
Thm. [Bollobás, Riordan, Spencer, Tusnády '01; Dereich, Mörters '09]
Limiting degree distribution decays as power-law: for \(\tau_{m,\delta}=3+\delta/m; \tau_\gamma=1+1/\gamma\),
$$p_k \sim k^{-\tau}.$$
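The fixed-outdegree rule can be tried out with a small simulation. This is a minimal sketch, not the construction used in the cited papers: the names `simulate_pam` and `degree_counts` are hypothetical, and the initial graph (vertex 0 with \(m\) self-loops) is an assumption made here so that all attachment weights are positive.

```python
import random
from collections import Counter

def simulate_pam(n, m=1, delta=0.0, seed=0):
    """Edge list of a fixed-outdegree PA graph on vertices 0, ..., n-1."""
    if m < 1 or delta <= -m:
        raise ValueError("need m >= 1 and delta > -m")
    rng = random.Random(seed)
    # ASSUMPTION (not on the slide): vertex 0 starts with m self-loops,
    # so that every attachment weight below is strictly positive.
    deg = [2 * m]
    edges = [(0, 0)] * m
    for new in range(1, n):
        deg.append(0)
        for _ in range(m):
            # Weight of old vertex v_i: its current degree plus delta/m.
            # Degrees are updated after every single edge (the index j).
            weights = [deg[i] + delta / m for i in range(new)]
            target = rng.choices(range(new), weights=weights)[0]
            edges.append((new, target))
            deg[new] += 1
            deg[target] += 1
    return edges

def degree_counts(edges):
    """Map degree k -> number of vertices with degree k."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return Counter(deg.values())

edges = simulate_pam(n=2000, m=1, delta=0.0, seed=1)
counts = degree_counts(edges)
print(sorted(counts.items())[:5])  # smallest degrees and how often they occur
```

Plotting `counts` on a log-log scale for larger `n` gives a rough visual check of the power-law decay; the quadratic running time of this sketch limits it to moderate graph sizes.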
Distances on PAMs
Thm. [Dereich, Mönch, Mörters '12].
For \(\tau\in(2,3)\), \(U_t, V_t\sim \text{Unif(Largest cluster}(t))\):
$$\mathrm{dist}_{{\color{red}t}}(U_{\color{blue}t}, V_{\color{blue}t}) \asymp 4\frac{\log\log(t)}{|\log(\tau-2)|} $$
Thm. [Dommers, v/d Hofstad, Hooghiemstra '10].
For \(\tau>3\), \(\exists c_1, c_2>0\):
$$c_1\preccurlyeq\frac{\mathrm{dist}_{{\color{red}t}}(U_{\color{blue}t}, V_{\color{blue}t})}{\log(t)} \preccurlyeq c_2.$$
Thm. [Bollobás, Riordan '04; Dereich, Mönch, Mörters '17].
For \(\tau=3\):
$$\mathrm{dist}_{{\color{red}t}}(U_{\color{blue}t}, V_{\color{blue}t})\asymp \frac{\log(t)}{\log\log(t)}.$$
Thm. [Dereich, Mönch, Mörters '12; J., Komjáthy '20].
For \(\tau\in(2,3)\), \(U_t, V_t\sim \text{Unif(Largest cluster}(t))\):
$$\bigg(\left|\mathrm{dist}_{{\color{red}t}}(U_{\color{blue}t}, V_{\color{blue}t}) - 4\frac{\log\log(t)}{|\log(\tau-2)|}\right|\bigg)_{t\ge 0} $$
is a tight sequence.
Literature perspective on PAMs
- Many static/limiting properties are well-understood:
- Degree distribution.
- Typical graph distance.
- Diameter of the graph.
- Local neighborhood of a typical vertex (local weak sense).
- Component sizes.
- ...
- Studying static properties allows for comparison to static (non-growing) models (configuration model, Norros-Reittu, ...).
- Theorem statements do not display dynamics inherently present in PAMs (contrary to proofs).
- Exception: degree evolution \((\mathrm{deg}_{t'}(v))_{t'\ge v}\) of (sets of) fixed vertices.
Goal: understand how the graph evolves for \(t'\ge t\) from the perspective of the graph at time \(t\).
Main Question
Consider \(\Big(\big(\mathrm{dist}_{{\color{red}t'}}(U_{\color{blue} t}, V_{\color{blue} t})\big)_{{\color{red}t'}\ge {\color{blue}t}}\Big)_{t\ge 0}\) and define for a function \(f(t,t')\)
$$X_t^f := \sup_{t'\ge t}\left| \mathrm{dist}_{{\color{red}t'}}(U_{\color{blue}t}, V_{\color{blue}t}) - f(t, t')\right|.$$
Q1. For \(\tau\in(2,3)\), can we identify \(f_\tau(t, t')\) s.t.
\((X_t^{f_\tau})_{t\ge 0}\) is a tight sequence?
[Figure: timeline \(0 < t_0 < t_1 < t_2\) for the fixed vertices \(U_{t_0}\), \(V_{t_0}\): \(\mathrm{dist}_{t_0}(U_{t_0}, V_{t_0}) = 7\), \(\mathrm{dist}_{t_1}(U_{t_0}, V_{t_0}) = 6\), \(\mathrm{dist}_{t_2}(U_{t_0}, V_{t_0}) = 2\).]
Q2. For \(\tau\in(2,3)\), identify \(f_\tau(t, t')\) s.t.
$$\Big(\sup_{t'\ge t}\left| \mathrm{dist}_{{\color{red}t'}}(U_{\color{blue}t}, V_{\color{blue}t}) - f_\tau(t, t')\right|\Big)_{t\ge 0}$$
is a tight sequence.
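To make the object \(\big(\mathrm{dist}_{t'}(u, v)\big)_{t'\ge t}\) concrete, here is a minimal sketch that recomputes the graph distance between two fixed vertices as later vertices arrive. The function name `dist_at_time`, the toy edge list, and the convention that an edge exists from the arrival time of its younger endpoint are assumptions for illustration.

```python
from collections import deque

def dist_at_time(edges, u, v, t_prime):
    """Graph distance between u and v using only vertices with label <= t_prime.

    ASSUMPTION: vertices are labelled by arrival time 1, 2, ... and every
    edge (a, b) is present from time max(a, b) onwards.
    """
    adj = {}
    for a, b in edges:
        if max(a, b) <= t_prime:
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)
    seen = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return seen[x]
        for y in adj.get(x, []):
            if y not in seen:
                seen[y] = seen[x] + 1
                queue.append(y)
    return float("inf")  # u and v are not connected at time t_prime

# Toy graph (hypothetical): a path 1-2-3-4-5, then two late shortcut vertices.
edges = [(2, 1), (3, 2), (4, 3), (5, 4), (6, 1), (6, 4), (7, 1), (7, 5)]
evolution = [dist_at_time(edges, 1, 5, t) for t in range(5, 8)]
print(evolution)  # [4, 3, 2]
```

The distance between vertices 1 and 5 is non-increasing in \(t'\) because vertices and edges are only ever added in this model.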
Main Theorem
Thm. [J., Komjáthy '20+].
For \(\tau\in(2,3)\), the sequence in Q2 is tight for
$$f_\tau(t,t') = 4\frac{\log\log(t) - \log(1\vee\log(t'/t))}{|\log(\tau-2)|}\vee 2.$$
Corollary. Hydrodynamic limit [J., Komjáthy '20+].
Let \(t'=T_t(a):=t\exp\big(\log^a(t)\big)\) for \(a\in[0,1]\), then
$$f_\tau(t,T_t(a)) = 4(1-a)\frac{\log\log(t)}{|\log(\tau-2)|}\vee 2,$$
and
$$ \sup_{a\in[0,1]} \left| \frac{\mathrm{dist}_{T_t(a)}(U_t, V_t)}{\log\log(t)} - (1-a)\frac{4}{|\log(\tau-2)|}\right|\overset{\mathbb{P}}{\longrightarrow} 0.$$
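The identity for \(f_\tau(t,T_t(a))\) can be checked numerically. The sketch below evaluates the formula from the theorem directly; the function names `f_tau` and `T` and the sample values \(\tau=2.5\), \(t=10^6\) are choices made here for illustration.

```python
import math

def f_tau(t, t_prime, tau):
    """Centering function f_tau(t, t') from the theorem, for tau in (2, 3)."""
    if not (2 < tau < 3) or not (math.e < t <= t_prime):
        raise ValueError("need 2 < tau < 3 and e < t <= t_prime")
    shrink = math.log(max(1.0, math.log(t_prime / t)))
    value = 4 * (math.log(math.log(t)) - shrink) / abs(math.log(tau - 2))
    return max(value, 2.0)

def T(t, a):
    """Time scale T_t(a) = t * exp(log(t)^a) from the corollary."""
    return t * math.exp(math.log(t) ** a)

tau, t = 2.5, 1e6
for a in (0.0, 0.25, 0.5, 0.75, 1.0):
    lhs = f_tau(t, T(t, a), tau)
    rhs = max(4 * (1 - a) * math.log(math.log(t)) / abs(math.log(tau - 2)), 2.0)
    print(a, round(lhs, 6), round(rhs, 6))
```

Both columns agree because \(\log(T_t(a)/t)=\log^a(t)\), so \(\log\log(T_t(a)/t)=a\log\log(t)\).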
Heuristic upper bound
Statement. [J., Komjáthy '20+].
For \(\tau\in(2,3)\),
$$ \mathbb{P}\Big(\forall t'\ge t: \mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')+1/\varepsilon\Big) \ge 1-\varepsilon. $$
Weaker statement. [J., Komjáthy '20+].
For \(t\) large, there exists a nice function \(t'\mapsto\varepsilon_{t'}\) such that, for each fixed \(t'\ge t\),
$$ \mathbb{P}\Big(\mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')+1/\varepsilon\Big) \ge 1-\varepsilon_{t'}. $$
Naive idea: Union bound.
Unfortunately \(\int_t^\infty \varepsilon_{t'} \mathrm{d}t' = \infty.\)
"Proof" of statement by smart union bound.
- Consider the times \((t_i(t))_{i\ge 0}\) at which the right-hand side \(f_\tau(t,t')+1/\varepsilon\) crosses an integer (i.e., the event changes).
- The distance \(t'\mapsto\mathrm{dist}_{t'}(U_t, V_t)\) is non-increasing, so it suffices to check the bound at the times \(t_i(t)\).
- By the weaker statement, the statement follows if for \(t\) large \(\sum_i \varepsilon_{t_i(t)} \leq \varepsilon\).
- The weaker statement follows from minor adaptations of [Dommers, v/d Hofstad, Hooghiemstra '10; Dereich, Mönch, Mörters '12; Caravenna, Garavaglia, v/d Hofstad '19; J., Komjáthy '20].
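Written out, the smart union bound reads as follows. This is a sketch under the assumption that \(t_0(t)=t\) and that \(\lfloor f_\tau(t,t')+1/\varepsilon\rfloor\) is constant for \(t'\in[t_i, t_{i+1})\). A failure at some \(t'\in[t_i,t_{i+1})\) then forces a failure at \(t_i\), since the distance is integer-valued and non-increasing:

$$\begin{aligned}
\mathbb{P}\Big(\exists t'\ge t:\ \mathrm{dist}_{t'}(U_t, V_t) > f_\tau(t,t')+1/\varepsilon\Big)
&\le \mathbb{P}\Big(\exists i\ge 0:\ \mathrm{dist}_{t_i}(U_t, V_t) > f_\tau(t,t_i)+1/\varepsilon\Big)\\
&\le \sum_{i\ge 0}\mathbb{P}\Big(\mathrm{dist}_{t_i}(U_t, V_t) > f_\tau(t,t_i)+1/\varepsilon\Big)
\le \sum_{i\ge 0}\varepsilon_{t_i}.
\end{aligned}$$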
[Figure: Step 0, \(t'=t\); labels \(^{(t)}\) and \(^{(t')}\); annotation: bounded-length segments.]
Step 0, \(t'=t\): [Figure labels: \(\mathrm{deg}_t(\cdot)\), \(\mathrm{Core}(t)\).]
- \(\text{Core}(t'):=\{v\le t': \text{deg}_{(1-\varepsilon)t'}(v)\ge \sqrt{t'/\log(t')}\}\).
- If \(\mathrm{deg}_{(1-\varepsilon)t'}(x)=s\), then there exists \(y\) with \(\mathrm{deg}_{(1-\varepsilon)t'}(y)\approx s^{1/(\tau-2)}\) and \(\{x\overset{2}{\leftrightarrow} y\}_{t'}\) (a two-hop connection at time \(t'\)).
- # of iterations needed to reach core: smallest \(k_t\) s.t.
$$\big(\mathrm{deg}_{t'}(q_{t,0})\big)^{1/(\tau-2)^k} \ge \sqrt{t/\log(t)}.$$
- \(k_t\approx\log\log(t)/|\log(\tau-2)|=\frac{1}{4}f_\tau(t,t)\) iterations for \(U_t, V_t\).
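The iteration count can be computed directly from the displayed condition. The sketch below is only a numerical illustration: the function name `iterations_to_core` and the starting degree 2 are assumptions made here, and the comparison is done on logarithms to avoid overflow.

```python
import math

def iterations_to_core(t, tau, start_degree=2.0):
    """Smallest k with start_degree ** ((tau - 2) ** -k) >= sqrt(t / log t)."""
    if not (2 < tau < 3) or t <= math.e or start_degree <= 1:
        raise ValueError("need 2 < tau < 3, t > e and start_degree > 1")
    # Compare logarithms: the degrees grow doubly exponentially in k.
    log_threshold = 0.5 * (math.log(t) - math.log(math.log(t)))
    log_degree = math.log(start_degree)
    k = 0
    while log_degree < log_threshold:
        log_degree /= (tau - 2)  # one iteration: s -> s ** (1 / (tau - 2))
        k += 1
    return k

tau = 2.5
for t in (1e3, 1e6, 1e9, 1e12):
    heuristic = math.log(math.log(t)) / abs(math.log(tau - 2))
    print(f"t={t:.0e}  k_t={iterations_to_core(t, tau)}  "
          f"loglog(t)/|log(tau-2)|={heuristic:.2f}")
```

Each iteration costs two hops, and both \(U_t\) and \(V_t\) have to climb to the core, which is where the factor 4 in \(f_\tau(t,t)\) comes from.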
Step 1, \(t'>t\): [Figure labels: \(\mathrm{deg}_t(\cdot)\), \(\mathrm{Core}(t)\), \(\mathrm{deg}_{t'}(\cdot)\), \(\mathrm{Core}(t')\).]
- # of iterations needed: smallest \(k_{t'}\) s.t.
$$\big(\mathrm{deg}_{t'}(q_{t,0})\big)^{1/(\tau-2)^k} \ge \sqrt{t'/\log(t')}.$$
- Competing effects:
(1) Core threshold increases;
(2) degree increases.
- Strong control of \(\text{deg}_{t'}(q_{t,0})\): using the Móri martingale and Doob's maximal inequality, \(\text{deg}_{t'}(q_{t,0})\gtrsim (t'/t)^{1/(\tau-1)}\) for all \(t'>t\).
- \(k_{t'}\le \frac{1}{4}f_\tau(t,t')\).
- If \(t'=T_t(a)\), # iterations is linear in \(a\).
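A rough version of the last bullets, as a sketch only: constants are not tracked, the core threshold is bounded by \(\sqrt{t'}\), the degree is replaced by its lower bound \((t'/t)^{1/(\tau-1)}\), and \(t'\ge \mathrm{e}\,t\) is assumed. Taking logarithms, the iteration condition holds as soon as

$$(\tau-2)^{-k}\,\frac{\log(t'/t)}{\tau-1} \ge \tfrac{1}{2}\log(t'), \qquad\text{i.e.}\qquad k \ge \frac{\log\log(t') - \log\log(t'/t) + \log\frac{\tau-1}{2}}{|\log(\tau-2)|}.$$

For \(t'\le t^2\) we have \(\log\log(t')\le \log\log(t)+\log 2\), so the right-hand side is at most \(\big(\log\log(t)-\log\log(t'/t)\big)/|\log(\tau-2)|\) plus a constant. For \(t'> t^2\) we have \(\log(t'/t)\ge\frac12\log(t')\), so the right-hand side is at most \(\log(\tau-1)/|\log(\tau-2)|\). In both cases this matches \(\frac14 f_\tau(t,t')\) up to an additive constant. For \(t'=T_t(a)\), \(\log\log(t'/t)=a\log\log(t)\), which gives the linear dependence on \(a\).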
Heuristic lower bound
Statement. [J., Komjáthy '20+]. For \(t\) sufficiently large
$$ \mathbb{P}\Big(\exists t'\ge t: \mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')-1/\varepsilon\Big) \le \varepsilon. $$
Naive guess (similar to upper bound):
1. Find sequence \((t_i)_{i\ge 0}\) at which event changes. Then
$$ \mathbb{P}\Big(\exists t'\ge t: \mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')-1/\varepsilon\Big) \leq \sum_i \mathbb{P}\Big(\mathrm{dist}_{t_i}(U_t, V_t)\le f_\tau(t,t_i)-1/\varepsilon\Big).$$
2. Bound rhs using 'old' methods [Dereich, Mönch, Mörters '12].
RHS not summable, new machinery needed.
Main idea: Exploit time-dependencies in the growing graph
There is a first time of failure.
Let* \(\mathcal{E}(t,t'):= \{\mathrm{dist}_{t'}(U_t, V_t)> f_\tau(t,t')-1/\varepsilon\}\). Recall \(f_\tau(t,t')\) is non-increasing in \(t'\). Then
$$\mathbb{P}\Big(\neg\mathcal{E}(t,t') \mid \bigcap_{\tilde{t}=t}^{t'-1} \mathcal{E}(t,\tilde{t}) \Big)\le\mathbb{P}\big(\text{there is a short path that traverses } t'\big) \leq^\ast C/(t'\log^3(t')).$$
Decomposing according to the first time of failure:
$$\mathbb{P}\Big(\exists t'\ge t: \mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')-1/\varepsilon\Big) \le \mathbb{P}\big(\neg \mathcal{E}(t,t)\big) + \sum_{t'=t+1}^\infty\mathbb{P}\bigg(\neg\mathcal{E}(t,t') \mid \bigcap_{\tilde{t}=t}^{t'-1} \mathcal{E}(t,\tilde{t}) \bigg)$$
$$\mathbb{P}\Big(\exists t'\ge t: \mathrm{dist}_{t'}(U_t, V_t)\le f_\tau(t,t')-1/\varepsilon\Big) \leq \varepsilon + \sum_{t'=t+1}^\infty C/(t'\log^3(t')) \overset{t\to\infty}\longrightarrow \varepsilon$$
*In fact, we need a more advanced separation of events (using good and bad paths, inspired by [Dereich, Mönch, Mörters '12]) that makes the decomposition more interlinked.
\(\vdots\)
9 pages of computations, inductive proofs, ...
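The sum over \(t'\) vanishes as \(t\to\infty\) by comparison with an integral. The integrand is decreasing for \(x>1\), so

$$\sum_{t'=t+1}^\infty \frac{C}{t'\log^3(t')} \le \int_t^\infty \frac{C}{x\log^3(x)}\,\mathrm{d}x = \frac{C}{2\log^2(t)} \overset{t\to\infty}\longrightarrow 0.$$

By contrast, a per-time error of order \(1/t'\) or larger would not be summable, so a decay faster than \(1/t'\) is what makes the first-failure decomposition work.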
Outlook and Invitation
What about (other) properties in (other/spatial) versions of the model?
Thank you!
Weighted version. [J., Komjáthy '20+].
Let \(\tau\in(2,3)\). Equip every edge with i.i.d. \(L_e\ge 0\), \(U_t, V_t\sim \text{Unif(Largest cluster}(t))\). If \(F_L^{(-1)}\) is not too flat, then we identify \(f_{\tau}^{(L)}(t,t')\) s.t.
$$\Big(\sup_{t'\ge t}|d^{(L)}_{t'}(U_t, V_t) - f_{\tau}^{(L)}(t,t')|\Big)_{t\geq 0}$$
forms a tight sequence of random variables.
Def. Weighted distance: \(d_{t'}^{(L)}(x, y):= \min_{\pi \text{ from }x\text{ to }y \text{ at time }t'} \sum_{e\in\pi} L_e\)
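The weighted distance at time \(t'\) can be computed with Dijkstra's algorithm on the graph restricted to vertices that have arrived by time \(t'\). This is a minimal sketch; the function name `weighted_distance`, the edge-triple format, and the toy weights are assumptions for illustration.

```python
import heapq

def weighted_distance(edges, x, y, t_prime):
    """Minimum over paths from x to y at time t_prime of the summed weights.

    ASSUMPTION: edges are triples (a, b, weight) with weight >= 0, vertices
    are labelled by arrival time, and an edge exists from time max(a, b) on.
    """
    adj = {}
    for a, b, w in edges:
        if max(a, b) <= t_prime:
            adj.setdefault(a, []).append((b, w))
            adj.setdefault(b, []).append((a, w))
    best = {x: 0.0}
    heap = [(0.0, x)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == y:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")  # x and y are not connected at time t_prime

# Toy weighted graph (hypothetical weights).
edges = [(2, 1, 1.0), (3, 2, 1.0), (4, 1, 0.5), (4, 3, 2.0),
         (5, 1, 0.25), (5, 3, 0.5)]
print([weighted_distance(edges, 1, 3, t) for t in (3, 4, 5)])  # [2.0, 2.0, 0.75]
```

In the toy example the arrival of vertex 4 adds a path that is no shorter in weight, while vertex 5 creates a strictly cheaper route.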