Nicolas Underwood
Nov 26, 2021
PI: Fabien Paillusson
Research Project Grant 2021:
Assessing ergodicity in physical systems and beyond
Suppose I've invented a new medicine.
The upside: If you take it there's a 99% chance it will add 5 years to your life.
The downside: There's a 1% chance it will kill you.
Would you take it? ....And if so, would you take it multiple times?
Ensemble average: If we mandated the entire population of a country take it, then the total number of years lived increases, which appears good.
Time average: But if you kept taking it again and again, it will eventually kill you, which isn't so good.
Questions like this are the subject of decision theory. They are often misunderstood and misrepresented, and are currently something of a hot topic in economics. At heart, the question is about the disagreement between ensemble and time averages: in other words, about ergodicity.
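The disagreement between the two averages can be made concrete with a toy simulation. The sketch below (my own illustration, using the 99% / 5-year / 1% numbers above) estimates the ensemble average over many single-dose takers, and shows why the repeated, single-person gamble behaves so differently:

```python
import random

random.seed(0)
P_DEATH, GAIN = 0.01, 5   # the 1% risk and 5-year payoff from above

# Ensemble average: a large population each takes ONE dose.
n_people = 100_000
total_years = sum(0 if random.random() < P_DEATH else GAIN
                  for _ in range(n_people))
ensemble_avg = total_years / n_people   # close to 0.99 * 5 = 4.95 years gained

# Time average: ONE person keeps taking doses until the fatal draw.
def years_until_death():
    years = 0
    while random.random() >= P_DEATH:
        years += GAIN
    return years
# With probability 1 every such trajectory ends in death: the repeated
# gamble is lethal in the long run even though each dose looks favourable.
```

The ensemble average looks clearly positive, while every individual repeated-dose trajectory terminates.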
Physicists
Ergodic theory
Mathematicians
Economists, Medics, and social scientists
A much debated correspondence
Two Contentious Arguments
The subject of this new project and of this seminar
Central to another project I have ongoing and of the next seminar
If let run for an infinite time, a single trajectory constrained on a constant energy surface would fill the entirety of the available phase space.
$$\begin{matrix}\text{The infinite time average of function f}\\\text{ over a single trajectory}\end{matrix}=:\hat{f}\,\,=\,\,\bar{f}:=\begin{matrix}\text{The equilibrium ensemble}\\\text{ average of function f}\end{matrix}$$
Consequently:
Establishes:
Boltzmann appears to have held a "resolutely finitist" view of the energy surface, viewing it as divisible into a countable number of cells, which would be visited equally by a single trajectory.
and also:
The independence of the time average on initial conditions.
Birkhoff
von Neumann
Tatyana
Afanasyeva
Paul Ehrenfest
Khinchin
Boltzmann
Ergodicity regarded as a property of the dynamics resulting in \(\hat{f}=\bar{f}\)
A rough sketch of how the notion of ergodicity changed over time
Certain functions can be ergodic, and others not so. Ergodicity is then a property of both the dynamics and the function being averaged. The test of ergodicity is whether \(\hat{f}=\bar{f}\)
The property that \(\hat{f}=\bar{f}\) became the accepted, pragmatic definition of ergodicity, rather than its consequence
Here we will first cover Metric Transitivity, as it helps to introduce some of the formal mathematical background of the subject. We'll then touch on Khinchin's self-similarity argument, to contrast its quite different formulation, before moving on to our proposed (much more operational) approach to the question of ergodicity.
Probability space
We begin with a Kolmogorov style probability space, the tuple
$$(\Omega, \sigma, \mu),$$
where \(\Omega\) is the set of possible states, \(\sigma\) is a sigma algebra of its subsets (the measurable events), and \(\mu\) is a measure.
A probability space is a measure space in which the measure of the whole space is unity, \(\mu(\Omega)=1\).
Measure preserving dynamics
We add to the probability space a measure preserving dynamics \(T\),
$$(\Omega, \sigma, \mu, T).$$
The measure is preserved by the dynamics in the sense that for all \(\omega\in \sigma \),
$$\mu(T(\omega)) = \mu(\omega).$$
Although it is possible to consider other circumstances (more discussion on this as we go), following on from its origins in classical mechanics, ergodic theory as a discipline is usually introduced in terms of a measure preserving dynamics.
So we'll start there:
The measure maps elements of the sigma algebra \(\sigma\) onto probabilities,
$$\mu: \sigma \rightarrow [0,1],$$
so that if \(\Omega\) is spanned by coordinates \(x\in\Omega\), and \(\omega\) is a member of the sigma algebra, \(\omega\in\sigma\), then
$$ \mu: \omega \mapsto \mu(\omega)=\int_\omega d\mu =\int_\omega \mu(x)dx=\begin{matrix}\text{proportion of}\\ \text{total states in }\omega\end{matrix} \in [0,1]$$
In simple terms, \(\mu(x)=\) the density of states at coordinates \(x\).
Classical mechanics preserves the Liouville measure - simply the volume of the phase space region.
This means, for instance, that a uniform distribution remains uniform
How physicists tend to think of the measure
Reasons to account for a non-uniform measure
We already have seen that the measure of any volume is conserved by the dynamics
$$\mu(T(\omega)) = \mu(\omega).$$
Definition (Invariant Volume)
An invariant volume \(\omega_\text{inv}\) is a region of the state space that is unchanged by the dynamics
$$T(\omega_\text{inv}) = \omega_\text{inv}$$
The whole space is an invariant volume
The path followed by a single trajectory is an invariant volume (albeit of measure zero)
Definition (Metrically Indecomposable Invariant Volume)
A metrically indecomposable (MI) volume, \(\omega_\text{MI}\), is an invariant volume that cannot be decomposed into two smaller invariant volumes
$$\omega_\text{MI}\neq \omega_\text{1,inv}+\omega_\text{2,inv}$$
A very crudely sketched out invariant volume
In this case the volume is not MI, as we could divide into smaller invariant volumes
Theorem (Birkhoff I)
The time average of a function over a trajectory in the long time limit,
$$\hat{f}(x) := \lim_{t\rightarrow\infty} \frac{1}{t} \int_0^t f(T(x,t'))dt'$$
exists "almost everywhere" in \(\Omega\). (We'll state this without proof.)
Note: this means we should regard infinite time averages as functions of entire trajectories.
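As a concrete (discrete-time) illustration of Birkhoff's time average, consider the irrational rotation \(T(x)=x+\alpha \bmod 1\), a standard textbook example of a measure-preserving, ergodic map. This minimal sketch (my own, not from the project) shows the Birkhoff average converging to the ensemble average:

```python
import math

alpha = math.sqrt(2) % 1.0                      # irrational rotation angle
f = lambda x: math.sin(2 * math.pi * x) ** 2    # observable; ensemble average is 1/2

def time_average(x0, n):
    """Birkhoff average (1/n) * sum_k f(T^k x0) for the map T(x) = x + alpha mod 1."""
    x, total = x0, 0.0
    for _ in range(n):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / n

# For irrational alpha the rotation is ergodic w.r.t. Lebesgue measure, so the
# time average tends to the ensemble average 1/2 for every starting point.
avg = time_average(0.123, 100_000)
```

Here the independence of the result from the initial condition \(x_0\) is exactly the property noted earlier.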
Lemma
The infinite time average \(\hat{f}(x)\) is constant almost everywhere on an MI volume, \(\omega_\text{MI}\).
Proof
Definition - Metric Transitivity (MT)
A system for which almost the entire space \(\Omega\) is metrically indecomposable is called Metrically Transitive (MT).
Theorem (Birkhoff II - Metric Transitivity)
A system which is MT satisfies
$$\hat{f}(x)=\bar{f}(x)$$
for almost all states \(x\in\Omega\).
Proof
MT is a rigorous notion of ergodicity, very much in the spirit of Boltzmann, and which guarantees \(\hat{f}=\bar{f}\). So are familiar systems MT?
Is the simple Hamiltonian system I've been using as an example MT?
No..... Trajectories follow paths upon closed surfaces of constant energy. Any interval between two energies is an invariant volume, and these may be divided ad nauseam, so not MT.
What if we considered only systems of a given energy?
Well..... Restricting to a microcanonical (MC) ensemble, with implied measure \(\mu_\text{MC}\), would indeed make this particular system MT.
However....
\(E_1\)
\(E_2\)
...Even the smallest change to the potential breaks this in an interesting/complicated way.
The chief qualitative difference between this system and the last is that I have slightly modified the potential to add this local minimum here.
This changes the energy surfaces so that between \(\sim -0.6 \) and \(\sim 0.0\) they become disconnected.
Above these energies MT \(=\checkmark\)
Below these energies MT \(=\checkmark\)
Between the energies MT \(=\times\)
(....and clearly \(\hat{f}\neq\bar{f}\) for an arbitrary \(f\).)
Aleksander Khinchin
Resulting Inequality
$$\text{Prob}\left(\frac{|\hat{f}-\bar{f}|}{|\bar{f}|}\geq K_1 N^{-1/4}\right)\leq K_2N^{-1/4},$$
where \(K_1\) and \(K_2\) are \(O(1)\). In short, as a system grows larger, the probability of finding a deviation from ergodicity beyond a certain size shrinks, vanishing in the thermodynamic limit.
Assumptions of Khinchin's scheme:
Interactions would break this condition. Khinchin argued that a physical Hamiltonian need only be approximately separable.
For instance kinetic energy or pressure; but this suffers a similar conceptual problem with interaction potentials.
Requires an infinite time
(no experiment is run for an infinite time)
Khinchin's Self-Similarity Argument
Birkhoff's Metric Transitivity
Only applies to very large systems - many interesting systems are relatively small
May apply (with difficulty) to short range interactions, but doesn't apply to long range interactions
Only applies to specific additive functions
Aside from simple model systems (e.g. hard spheres in a box) very difficult to assess/prove for realistic systems with non-singular interactions
Doesn't allow us to assess or measure a lack of ergodicity (meaning \(\hat{f}\neq\bar{f}\)), or prove its absence
...or, how can we add to this picture?
Defining a practical notion of ergodicity
The commonly held notion of ergodicity is simply the equality of the two averages,
$$\hat{f}:=\lim_{t\rightarrow\infty}\frac{1}{t}\int_0^t f\left(T(x,t')\right)\mathrm{d}t',\quad \quad \bar{f}:=\frac{1}{\mu(\Omega)}\int_\Omega f \mathrm{d}\mu$$
Empirical cumulative distribution functions (eCDFs)
We can translate samples \(f_t\) and \(f_e\) into eCDFs,
$$F^n_t(f_t):=\frac{1}{n}\sum_i^n\Theta\left(f_t-f^{(i)}_t\right), \quad F^n_e(f_e):=\frac{1}{n}\sum_i^n\Theta\left(f_e-f^{(i)}_e\right),$$
where \(\Theta\) is the Heaviside step function.
For example - Here is a randomly sampled normal distribution:
eCDFs make finite jumps of \(1/n\) at each value in the sample.
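A minimal sketch of building an eCDF from a sample. The sampled normal distribution here is my own illustrative choice, mirroring the example above:

```python
import random

def ecdf(sample):
    """Empirical CDF: F_n(x) = (1/n) * #{ i : sample[i] <= x },
    i.e. a sum of unit steps Theta(x - f_i) weighted by 1/n."""
    s, n = sorted(sample), len(sample)
    def F(x):
        # count of sample points <= x
        return sum(1 for v in s if v <= x) / n
    return F

random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]
F = ecdf(sample)
# F jumps by 1/n at each sample point; F(0) should lie near Phi(0) = 0.5
```

The returned function is a right-continuous step function, jumping by \(1/n\) at each sample value as described above.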
Assessing the similarity of two distributions
To assess the difference between two distributions we can use the Kolmogorov-Smirnov (KS) metric,
$$d_K(A,B):=\text{sup}_x\left|A(x)-B(x)\right|,$$
which is the maximum "vertical" distance between distributions.
For eCDFs of equal sample size \(n\) the KS metric takes quantised values \(d_K=m/n\) for integer \(m\) between \(0\) and \(n\).
This makes KS distances of eCDFs very quick to assess computationally.
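A sketch of how the KS metric between two eCDFs can be computed in a single sweep through their merged sorted order, exploiting the quantisation noted above (the function and the test samples are my own illustration):

```python
def ks_distance(a, b):
    """Kolmogorov-Smirnov distance sup_x |F_a(x) - F_b(x)| between the
    eCDFs of two samples, via one sweep through their merged sorted order.
    For equal sample sizes n the result is always a multiple of 1/n."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # advance past ties so both eCDFs jump together at a shared value
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d
```

After sorting, the sweep is linear in the sample sizes, which is what makes eCDF comparisons so quick.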
Convergence of \(d_K(F^n_t,F^n_e)\)
$$d_K(A,C)\leq d_K(A,B)+d_K(B,C)$$
What is it possible to say about \(d_K(F_t,F_e)\)?
While it isn't possible to know \(d_K(F_t,F_e)\) from finite samples, we can still try to inquire about it, for instance by,
Uniform distribution
Normal distribution
Kolmogorov-Smirnov distribution
Remarkably, this is true regardless of the underlying distribution!
This allows us to hypothesis test using IE as the null hypothesis.
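A sketch of the resulting test, using the asymptotic Kolmogorov survival function \(Q(\lambda)=2\sum_{k\geq 1}(-1)^{k-1}e^{-2k^2\lambda^2}\) at \(\lambda=d_K\sqrt{nm/(n+m)}\) as the p-value (the sample sizes and distances below are illustrative assumptions of mine, not values from the project):

```python
import math

def ks_pvalue(d, n, m):
    """Asymptotic p-value for a two-sample KS distance d between samples
    of sizes n and m: the Kolmogorov distribution's survival function at
    lam = d * sqrt(n*m/(n+m)). The limiting law is the same whatever
    the underlying distribution of the data."""
    lam = d * math.sqrt(n * m / (n + m))
    q = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return max(0.0, min(1.0, q))

# With n = m = 500, a distance d = 0.04 is unremarkable (large p-value),
# while d = 0.15 would reject the null of identical distributions.
p_small = ks_pvalue(0.04, 500, 500)
p_large = ks_pvalue(0.15, 500, 500)
```

Rejecting the null at some significance level \(\alpha\) then amounts to checking whether the p-value falls below \(\alpha\).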
A figure made by Fabien showing convergence to the KS distribution with increasing \(n\)
The equivalent figures made by me with a Mathematica code
Standard hypothesis testing algorithm
A plot from that paper.
Another figure I've borrowed from Fabien:
Preliminary analysis of the mean contact number on a \(10\times 10\) lattice at different densities.
The Kob-Andersen model is a discrete lattice model introduced to study glassy phase transitions through ergodicity. A particle is permitted to jump between sites only if it has no more than \(m\) occupied neighbours before and after the move.
(Note: You may have to forgive me here. At this point it is not just the logic, but my understanding that becomes fuzzy...)
It may be wasteful to simply reject a hypothesis of ergodicity once a somewhat arbitrary cut-off has been reached: the system in question may still be approximately ergodic, and approximate statistical mechanics may follow. Instead, we might tailor a new definition of ergodicity based upon fuzzy logic.
Fuzzy logic applied to Statistical Hypothesis Testing
The familiar law of Modus Tollens (contraposition) from classical logic says that
$$\text{If }A\implies B\text{ then } \lnot B\implies \lnot A$$
In the context of hypothesis testing, the uncertainties involved mean this does not carry over directly,
$$P(E|H)<0.01 \;\not\Rightarrow\; P(\lnot H| E)>0.99$$
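A quick Bayesian counterexample makes the point; all the numbers below are illustrative assumptions of mine, not taken from the paper:

```python
# All numbers here are illustrative assumptions.
p_H = 0.9               # prior credence in the hypothesis H
p_E_given_H = 0.005     # "significant" evidence: P(E|H) < 0.01
p_E_given_notH = 0.001  # ...but E is even less likely if H is false

# Bayes' theorem: P(not H | E) = P(E | not H) P(not H) / P(E)
p_E = p_E_given_H * p_H + p_E_given_notH * (1 - p_H)
p_notH_given_E = p_E_given_notH * (1 - p_H) / p_E
# p_notH_given_E is about 0.02: the evidence clears the 1% significance
# bar against H, yet does almost nothing to make H improbable.
```

The significance threshold on \(P(E|H)\) says nothing by itself about \(P(\lnot H|E)\) without the priors and the likelihood under the alternative.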
Matt Booth and Fabien published a paper earlier this year on how fuzzy logic may be used to address this issue.
A figure from Booth, Paillusson, A Fuzzy Take on the Logical Issues of Statistical Hypothesis Testing. Philosophies 2021, 6, 21.
\(r\)-fuzzy limits
$$\underset{n\rightarrow\infty}{\text{Fuzlim}}\,S_n=a \text{ iff }\forall \epsilon>0\,\,\exists n_\epsilon \text{ s.t. } \forall n>n_\epsilon,\ |S_n-a|\leq r+\epsilon$$
\(r\)-fuzzy convergence
A sequence \(s(t)\) is called \(r\)-fuzzy convergent w.r.t. a norm \(||\cdot||\) if for any \(\epsilon>0\) \(\exists\) \(t_\epsilon\) s.t. for all \(k\) we have
$$||s(t)-s(t+k)||\leq r+\epsilon \text{ whenever } t>t_\epsilon.$$
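A rough numerical sketch of the definition: the sequence below oscillates forever inside a band of width \(r\) plus a decaying term, so it never converges in the ordinary sense, yet it is \(r\)-fuzzy convergent. The checking function and its grid of \(t\) and \(k\) values are my own crude finite spot check, not a proof:

```python
import math

r = 0.1
# Oscillates in a band of half-width r/2, plus a decaying 1/t term:
# not convergent in the ordinary sense, but r-fuzzy convergent.
s = lambda t: (r / 2) * math.sin(t) + 1.0 / t

def r_fuzzy_cauchy(s, r, eps, t_eps, k_max=10_000):
    """Spot-check |s(t) - s(t+k)| <= r + eps on a finite grid of
    t > t_eps and k values (a heuristic test, not a proof)."""
    return all(abs(s(t) - s(t + k)) <= r + eps
               for t in range(t_eps + 1, t_eps + 200)
               for k in range(1, k_max, 97))
```

With \(r=0.1\) the check passes, while the same check with \(r=0\) (ordinary Cauchy convergence) fails, as the oscillation never dies out.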
KS distance
Triangle identity for upper limit
GC theorem
Animations of explicit calculations
$$\hat{f}:=\lim_{t\rightarrow\infty}\frac{1}{t}\int_0^t f\left(T(x,t')\right)\mathrm{d}t'\quad = \quad\frac{1}{\mu(X)}\int_X f \mathrm{d}\mu =: \bar{f}$$
Unitary Evolution
\( \psi\) obeys standard unitary Schrödinger evolution. The evolution of the system configuration \(x(t)\) is determined by finding a current \(j(x,t)\) consistent with
$$\nabla\cdot j(x,t)=-\frac{\partial |\psi(x,t)|^2}{\partial t},$$
and inferring the law of evolution
$$\dot{x}=\frac{j(x,t)}{|\psi(x,t)|^2}.$$
Once one solution is found, others follow by adding an incompressible current, \({\nabla\cdot j_\text{inc}(x,t)=0}\), so that
$$\dot{x}'=\dot{x}+\frac{j_\text{inc}}{|\psi|^2}.$$
Canonically this means
Similarly, for a bosonic field:
$$\dot{\phi}(y)\sim\text{Im}\left(\frac{1}{\psi}\frac{\delta\psi}{\delta\phi(y)}\right)\quad \text{or}\quad \dot{\phi}(y)\sim\frac{\delta S}{\delta\phi(y)},$$
where \(\psi[\phi]=|\psi[\phi]|\exp(iS[\phi])\).
Non-canonical solution by Green's functions:
In an arbitrary basis \(|x\rangle\), for \(x\in\mathbb{R}^n\),
$$\dot{x}\sim\frac{1}{|\psi(x)|^2} \int_\Omega\mathrm{d}^nx' \frac{\widehat{\Delta x}}{|\Delta x|^{n-1}} \frac{\partial |\psi(x')|^2}{\partial t}.$$
Non-canonical solutions are also possible
For point particles \(i\):
$$\dot{q}_i =\frac{\hbar}{m_i}\text{Im}\left(\frac{\partial_{q_i} \psi}{\psi}\right)\quad\text{or}\quad \dot{q}_i = \frac{\partial_{q_i} S}{m_i},$$
where \(\psi(q_1,q_2,...)=|\psi(q_1,q_2,...)|e^{iS(q_1,q_2,...)/\hbar}\).
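For a plane wave \(\psi=e^{ikx}\) the 1-D guidance law above reduces to the constant velocity \(\hbar k/m\). A minimal numerical check (my own sketch: finite-difference derivative, units \(\hbar=m=1\)):

```python
import cmath

hbar, m, k = 1.0, 1.0, 2.0   # units and wavenumber chosen for illustration

def psi(x):
    # plane wave exp(i k x)
    return cmath.exp(1j * k * x)

def velocity(x, dx=1e-6):
    """Guidance law v = (hbar/m) * Im(psi'(x)/psi(x)), with the spatial
    derivative taken by a central finite difference."""
    dpsi = (psi(x + dx) - psi(x - dx)) / (2 * dx)
    return (hbar / m) * (dpsi / psi(x)).imag

v = velocity(0.3)   # should recover hbar * k / m = 2.0
```

The velocity is independent of \(x\), as expected for a plane wave, since the phase \(S=\hbar kx\) has constant gradient.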