ISING LECTURES-2026
28th Annual Workshop on Critical Phenomena and Complex Systems
Jan Korbel
Slides available at: slides.com/jankorbel
Personal web: jankorbel.eu
This talk will show you how important Boltzmann's formula is for the statistical physics of complex systems
What if we have 1 mol (\(\approx 10^{23}\)) particles?
Let's have canonical coordinates \(\mathbf{q}(t)\), \(\mathbf{p}(t)\) evolving by Hamiltonian dynamics
$$\dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}}\qquad \dot{\mathbf{p}} = - \frac{\partial H}{\partial \mathbf{q}}$$
Let \(\rho(\mathbf{q},\mathbf{p},t)\) be a probability distribution in phase space. Then \(\frac{\mathrm{d} \rho}{\mathrm{d} t} = 0\) (Liouville's theorem).
Consequence: \( \frac{\mathrm{d} S(\rho)}{\mathrm{d} t}= - \frac{\mathrm{d}}{\mathrm{d} t} \left(\int \rho(t) \ln \rho(t)\right) = 0.\)
1. Law of large numbers (LLN)
\( \sum_{i=1}^n X_i \rightarrow n \bar{X} \quad \mathrm{for} \ n \gg 1\)
2. Central limit theorem (CLT)
\( (\frac{1}{n} \sum_{i=1}^n X_i - \bar{X}) \rightarrow \frac{1}{\sqrt{n}} \mathcal{N}(0,\sigma^2)\)
Consequence: a large number of i.i.d. subsystems can be described by very few parameters for \(N \gg 1\)
\(\Rightarrow\) e.g., a box with 1 mol of gas particles
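A minimal numerical sketch of both statements (not part of the original slides; the sample sizes are arbitrary):

```python
# LLN + CLT for dice: the sample mean concentrates around 3.5,
# and its spread shrinks like 1/sqrt(n).
import random

random.seed(1)

def sample_mean(n):
    return sum(random.randint(1, 6) for _ in range(n)) / n

xbar, sigma = 3.5, (35 / 12) ** 0.5   # exact mean and std of a fair die
for n in (10, 100, 10000):
    means = [sample_mean(n) for _ in range(500)]
    avg = sum(means) / len(means)
    std = (sum((m - avg) ** 2 for m in means) / len(means)) ** 0.5
    print(f"n={n:6d}  <mean>={avg:.3f}  std={std:.4f}  CLT predicts {sigma/n**0.5:.4f}")
```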
Stars & bars theorem (|*)
Consider a die with 6 states
Let us throw the die 5 times. The resulting sequence, e.g. \((6,5,4,2,2)\), is a
Microstate
The histogram of this sequence, \((n_1,\dots,n_6) = (0,2,0,1,1,1)\), is a
Mesostate
The average value, \(\bar{X} = 3.8\), is a Macrostate
Each step is a coarse-graining
# micro: \(6^5 =7776\)
# meso: \(\binom{6+5-1}{5} =252\)
# macro: \( 5\cdot 6-5\cdot 1 + 1 = 26\)
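These three counts can be checked by brute force; a short sketch (naive enumeration, fine for 5 dice):

```python
from itertools import product
from math import comb

micro = list(product(range(1, 7), repeat=5))            # all ordered sequences
meso = {tuple(sorted(m, reverse=True)) for m in micro}  # histograms <-> sorted sequences
macro = {sum(m) for m in micro}                         # distinct sums <-> distinct averages

print(len(micro), 6**5)               # 7776 7776
print(len(meso), comb(6 + 5 - 1, 5))  # 252 252
print(len(macro))                     # 26 (sums 5, 6, ..., 30)
```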
Now we come back to the formula on Boltzmann's grave
Question: how do we calculate the multiplicity \(W\) of a mesostate?
Answer: see the combinatorics lecture.
Full answer: 1.) permute all states, 2.) take care of overcounting
1.) Permutation of all states: \(5! = 120\)
2.) Overcounting: permutations of the two identical outcomes, \(2! = 2\)
Together: \(W(0,2,0,1,1,1) = \frac{5!}{2!} =60\)
We use the Boltzmann formula (we set \(k_B = 1\) ) and Stirling's approximation \(\ln x! \approx x \ln x - x\):
$$S = \log W \approx n\log n - n - \sum_{i} \left(n_i \log n_i - n_i \right)$$
Here, we use the normalization \(n = \sum_i n_i\) and introduce probabilities \(p_i = n_i/n\):
$$ S = - n\sum_i p_i \log p_i$$
Finally, the entropy per particle is
\( \mathcal{S} = S/n = -\sum_i p_i \log p_i \)
We have actually ended up with the formula known as Shannon entropy in information theory
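A quick numerical check of the Stirling step (a sketch; the histogram is the example above, the scale factor is arbitrary):

```python
from math import lgamma, log

def log_W(hist):
    n = sum(hist)
    return lgamma(n + 1) - sum(lgamma(k + 1) for k in hist)

def shannon(hist):
    n = sum(hist)
    return -sum(k / n * log(k / n) for k in hist if k > 0)

for scale in (1, 100):
    h = [scale * k for k in (0, 2, 0, 1, 1, 1)]
    n = sum(h)
    print(f"n={n:3d}:  log W = {log_W(h):8.3f}   n*S = {n * shannon(h):8.3f}")
# n=5:   log 60 ~ 4.09 vs 6.66 (Stirling is crude for small n);
# n=500: ~657 vs ~666 -- agreement up to sub-extensive corrections.
```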
BGS entropy is a consequence of Boltzmann's formula
By using the relation \(n_i/n = p_i\), we actually used the law of large numbers. It says that for large \(n\), the relative frequency of an event converges to its probability, i.e.,
\(p_i = \lim_{n \rightarrow \infty} \frac{n_i(n)}{n}\)
In physics, this limit is called the thermodynamic limit.
What actually is \(p_i\)?
[Figure: \(p_i\) is the relative frequency of face \(i\) in the sample, e.g. \(P(\,\cdot\,) = 1/3\) in the example shown.]
Possible mesostates: \((0,2,0,1,1,1)\), \((0,1,2,0,1,1)\), \((0,0,1,0,4,0)\), etc.
How many mesostates are there?
We introduce the following notation: the mesostate with histogram \((0,2,0,1,1,1)\) is written as the sorted sequence \((6,5,4,2,2)\).
| (6,6,5,1,1) | (6,6,4,2,1) | (6,6,3,3,1) |
|---|---|---|
| (6,6,3,2,2) | (6,5,5,2,1) | (6,5,4,3,1) |
| (6,5,4,2,2) | (6,5,3,3,2) | (6,4,4,4,1) |
| (6,4,4,3,2) | (6,4,3,3,3) | (5,5,5,3,1) |
| (5,5,4,4,1) | (5,5,4,3,2) | (5,4,4,4,2) |
| (5,4,4,3,3) | (4,4,4,4,3) | (5,5,5,2,2) |
| (5,5,3,3,3) |
All mesostates with \(\bar{X} = 3.8\)
Q: What is the probability of observing such a mesostate?
A: It is proportional to its multiplicity!
| W(6,6,5,1,1)=30 | W(6,6,4,2,1)=60 | W(6,6,3,3,1)=30 |
|---|---|---|
| W(6,6,3,2,2)=30 | W(6,5,5,2,1)=60 | W(6,5,4,3,1)=120 |
| W(6,5,4,2,2)=60 | W(6,5,3,3,2)=60 | W(6,4,4,4,1)=20 |
| W(6,4,4,3,2)=60 | W(6,4,3,3,3)=20 | W(5,5,5,3,1)=20 |
| W(5,5,4,4,1)=30 | W(5,5,4,3,2)=60 | W(5,4,4,4,2)=20 |
| W(5,4,4,3,3)=30 | W(4,4,4,4,3)=5 | W(5,5,5,2,2)=10 |
| W(5,5,3,3,3)=10 |
$$W(n_1,\dots,n_k) = \frac{n!}{n_1! n_2! \dots n_k!}$$
What is \(P(3)\), the probability of rolling a 3? Denote by \(P_3(m)\) the fraction of 3s in mesostate \(m\):
| P3(6,6,5,1,1)=0 | P3(6,6,4,2,1)=0 | P3(6,6,3,3,1)=0.4 |
|---|---|---|
| P3(6,6,3,2,2)=0.2 | P3(6,5,5,2,1)=0 | P3(6,5,4,3,1)=0.2 |
| P3(6,5,4,2,2)=0 | P3(6,5,3,3,2)=0.4 | P3(6,4,4,4,1)=0 |
| P3(6,4,4,3,2)=0.2 | P3(6,4,3,3,3)=0.6 | P3(5,5,5,3,1)=0.2 |
| P3(5,5,4,4,1)=0 | P3(5,5,4,3,2)=0.2 | P3(5,4,4,4,2)=0 |
| P3(5,4,4,3,3)=0.4 | P3(4,4,4,4,3)=0.2 | P3(5,5,5,2,2)=0 |
| P3(5,5,3,3,3)=0.6 |
$$P(3) = \frac{\sum_{m_i} P_3(m_i) W(m_i)}{\sum_{m_i} W(m_i)} = \frac{125}{735} \approx 17.0\%$$
here \(m_i\) are the mesostates that satisfy the constraint
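The same number can be obtained by brute force over all \(6^5\) microstates (a sketch; it can equally be read off the two tables above):

```python
from itertools import product
from fractions import Fraction

count3 = total = 0
for seq in product(range(1, 7), repeat=5):
    if sum(seq) == 19:            # mean 3.8 <=> sum 19
        total += 5
        count3 += seq.count(3)
print(Fraction(count3, total))    # 25/147 = 125/735 ~ 17.0 %
```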
This is quite complicated!
But what happens if we increase the number of dice?
Will it get more complicated or easier?
Let us double every count and denote \(W_2(n_1,\dots,n_k) := W(2n_1,\dots,2n_k)\):
$$W(2n_1,\dots,2n_k) = \frac{(2n)!}{(2n_1)! (2n_2)! \dots (2n_k)!}$$
What happens to the multiplicities?

| W2(6,6,5,1,1)=3150 | W2(6,6,4,2,1)=18900 | W2(6,6,3,3,1)=3150 |
|---|---|---|
| W2(6,6,3,2,2)=3150 | W2(6,5,5,2,1)=18900 | W2(6,5,4,3,1)=113400 |
| W2(6,5,4,2,2)=18900 | W2(6,5,3,3,2)=18900 | W2(6,4,4,4,1)=1260 |
| W2(6,4,4,3,2)=18900 | W2(6,4,3,3,3)=1260 | W2(5,5,5,3,1)=1260 |
| W2(5,5,4,4,1)=3150 | W2(5,5,4,3,2)=18900 | W2(5,4,4,4,2)=1260 |
| W2(5,4,4,3,3)=3150 | W2(4,4,4,4,3)=45 | W2(5,5,5,2,2)=210 |
| W2(5,5,3,3,3)=210 |

Note: there are also other mesostates of the doubled system that are not double-configurations
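A sketch recomputing a few \(W_2\) values directly from the definition; it also shows how doubling sharpens the weight of the dominant mesostate:

```python
from math import factorial
from collections import Counter

def W(hist):
    out = factorial(sum(hist))
    for k in hist:
        out //= factorial(k)
    return out

def hist(meso):
    c = Counter(meso)
    return [c[face] for face in range(1, 7)]

for meso in [(6, 5, 4, 3, 1), (4, 4, 4, 4, 3)]:
    h = hist(meso)
    print(meso, W(h), W([2 * k for k in h]))
# ratio W(6,5,4,3,1)/W(4,4,4,4,3): 120/5 = 24 for n=5,
# but 113400/45 = 2520 for n=10 -- the dominant mesostate dominates even more.
```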
As we saw in the previous lecture, the multinomial multiplicity
$$W(n_1,\dots,n_k) = \binom{n}{n_1,\dots,n_k}$$
leads to Boltzmann-Gibbs-Shannon entropy.
$$S = - \sum_k p_k \log p_k$$
Are there systems with non-multinomial multiplicity?
What is their entropy?
1. Maxwell-Boltzmann statistics - \(N\) distinguishable particles, \(N_i\) particles in state \(\epsilon_i\)
Multiplicity can be calculated as
$$W(N_1,\dots,N_k) = \binom{N}{N_1} \binom{N-N_1}{N_2} \dots \binom{N-\sum_{i=1}^{k-1}N_i}{N_k} = N! \prod_{i=1}^k \frac{1}{N_i!}$$
If \(\epsilon_i\) has degeneracy \(g_i\), then
$$W(N_1,\dots,N_k) = N! \prod_{i=1}^k \frac{g_i^{N_i}}{N_i!}$$
Then, we get
$$S_{MB} = - N \sum_{i=1}^k p_i \log \frac{p_i}{g_i}$$
2. Bose-Einstein statistics - \(N\) indistinguishable particles, \(N_i\) particles in state \(\epsilon_i\) with degeneracy \(g_i\)
Multiplicity can be calculated as
$$W(N_1,\dots,N_k) = \prod_{i=1}^k \binom{N_i + g_i-1}{N_i}$$
Let us introduce \(\alpha_i = g_i/N\). Then, we get
$$S_{BE} = N \sum_{i=1}^k \left[(\alpha_i + p_i) \log (\alpha_i +p_i) - \alpha_i \log \alpha_i - p_i \log p_i\right]$$
(via stars & bars)
3. Fermi-Dirac statistics - \(N\) indistinguishable particles, \(N_i\) particles in state \(\epsilon_i\) with degeneracy \(g_i\), maximally 1 particle per sub-level (thus \(N_i \leq g_i\))
Multiplicity can be calculated as
$$W(N_1,\dots,N_k) = \prod_{i=1}^k \binom{g_i}{N_i}$$
Let us introduce \(\alpha_i = g_i/N\). Then, we get
$$S_{FD} = N \sum_{i=1}^k \left[-(\alpha_i - p_i) \log (\alpha_i -p_i) + \alpha_i \log \alpha_i - p_i \log p_i\right]$$
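All three Stirling forms can be checked against the exact multiplicities; a minimal sketch with assumed occupations \(N_i\) and degeneracies \(g_i\):

```python
from math import comb, log

N_occ = [300, 200, 100]    # assumed occupations N_i
g     = [400, 400, 400]    # assumed degeneracies g_i (N_i <= g_i, so FD is allowed)
N = sum(N_occ)
p = [Ni / N for Ni in N_occ]
a = [gi / N for gi in g]   # alpha_i = g_i / N

exact_BE = sum(log(comb(Ni + gi - 1, Ni)) for Ni, gi in zip(N_occ, g))
S_BE = N * sum((ai + pi) * log(ai + pi) - ai * log(ai) - pi * log(pi)
               for ai, pi in zip(a, p))
exact_FD = sum(log(comb(gi, Ni)) for Ni, gi in zip(N_occ, g))
S_FD = N * sum(-(ai - pi) * log(ai - pi) + ai * log(ai) - pi * log(pi)
               for ai, pi in zip(a, p))
print(f"BE: exact {exact_BE:.1f}  vs Stirling {S_BE:.1f}")
print(f"FD: exact {exact_FD:.1f}  vs Stirling {S_FD:.1f}")
```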
Let us start with a simple example: coin tossing.
The states are:
But! Let us make a small change: we consider magnetic coins.
The bound (or sticky) state is simply
State space grows super-exponentially (\(W(n) \sim n^n \sim e^{n \log n}\) )
picture from: Jensen et al 2018 J. Phys. A: Math. Theor. 51 375002
[Figure: microstates vs. mesostates for magnetic coins. For three coins there are \(3! = 6\) ordered configurations, \((1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1)\). Permuting the coins inside a bound pair does not change the mesostate, so the microstates group into pairs, e.g. \((1,2,3)\) with \((2,1,3)\), \((1,3,2)\) with \((3,1,2)\), \((2,3,1)\) with \((3,2,1)\).]
We have \(n_i^{(j)}\) molecules of size \(j\) in a state \(s_i^{(j)}\)
$$ S = \log W \approx n \log n - \sum_{ij} \left(n_i^{(j)} \log n_i^{(j)} - n_i^{(j)} + {\color{red} n_i^{(j)} \log j!}\right)$$
Introduce "probabilities" \(\wp_i^{(j)} = n_i^{(j)}/n\)
$$\mathcal{S} = S/n = - \sum_{ij} \wp_i^{(j)} (\log \wp_i^{(j)} {\color{red}- 1}) {\color{red}- \sum_{ij} \wp_i^{(j)}\log \frac{j!}{n^{j-1}}}$$
Finite interaction range: concentration \(c = n/b\), where \(b\) is the size of the system (number of available sites)
$$\mathcal{S} = S/n = - \sum_{ij} \wp_i^{(j)} (\log \wp_i^{(j)} {\color{red}- 1}) {\color{red}- \sum_{ij} \wp_i^{(j)}\log \frac{j!}{{\color{purple}c^{j-1}}}}$$
Nat. Comm. 12 (2021) 1127
Corominas Murtra, Hanel, Thurner PNAS 112(17) (2015) 5348
The number of states is \(n\). Let us denote the states as \(x_n \rightarrow \dots \rightarrow x_1\), where \(x_1\) is the ground state, where the process restarts. Let us sample \(R\) relaxation sequences \(x = (x_{k_1},\dots,x_1)\).
The sequences can be visualised as
How many of these sequences contain a state \(x_{j}\) exactly \(k_j\) times?
Each run must contain \(x_1\)
Number of runs \(R \equiv k_1\), number of them containing \(x_j\) is \(k_j\)
Multiplicity of these sequences: \(\binom{k_1}{k_j}\)
By multiplying the multiplicity for each state we get
$$W(k_1,\dots,k_n) = \prod_{j=2}^n \binom{k_1}{k_j}$$
$$ \log W \approx \sum_{j=2}^n \left[k_1 \log k_1 - \cancel{k_1} - k_j \log k_j + \cancel{k_j} - (k_1-k_j) \log (k_1 - k_j) + \cancel{(k_1-k_j)}\right]$$
$$\approx \sum_{j=2}^n \left[ k_1 \log k_1 \textcolor{red}{-k_j \log k_1} - k_j \log \frac{k_j}{\textcolor{red}{k_1}} - (k_1-k_j) \log(k_1-k_j) \right]$$
By introducing \(p_j = k_j/N\), where \(N\) is the total number of steps, we get
$$S_{SSR}(p) = - N \sum_{j=2}^n \left[p_j \log \left(\frac{p_j}{p_1}\right) + (p_1-p_j) \log \left(1-\frac{p_j}{p_1}\right)\right]$$
Hanel, Corominas Murtra, Thurner New J. Phys. (2017) 19 033008
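A short simulation sketch of the SSR process (assuming each run starts from the top state \(n\) and jumps uniformly downward); the visiting frequencies follow Zipf's law \(p_i \propto 1/i\):

```python
import random
from collections import Counter

random.seed(2)
n, runs = 100, 20000
visits = Counter()
for _ in range(runs):
    state = n
    while state > 1:
        visits[state] += 1
        state = random.randint(1, state - 1)   # uniform jump to a lower state
    visits[1] += 1                             # every run ends in the ground state

for i in (1, 2, 5, 10, 50):
    print(f"state {i:3d}: frequency {visits[i]/runs:.3f}   prediction 1/i = {1/i:.3f}")
```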
We have \(c\) colors, initially \(n_i(0) \equiv n_i\) balls of color \(c_i\). After a ball is drawn, we return \(\delta\) balls of the same color to the urn.
After \(N\) draws, the number of balls in the urn is
$$n_i(N) = n_i + \delta k_i$$
where \(k_i\) is the number of draws of color \(c_i\). The total number of balls after \(N\) draws is \(n(N) = \sum_{i} n_i(N) = n + \delta N\), where \(n = \sum_i n_i\)
The probability of drawing a ball of color \(c_i\) in the \(N\)-th draw is \(p_i(N) = n_i(N)/n(N)\). The probability of a sequence
\(\ \mathcal{I} =\{i_1,\dots,i_N\} \) is
$$p(\mathcal{I}) = \frac{\prod_{j=1}^c n_{j}^{(\delta,k_j)}}{n^{(\delta,N)}}$$
where \(m^{(\delta,r)} = m(m+\delta)\cdots(m+(r-1)\delta)\) is the rising factorial with increment \(\delta\)
A histogram \(\mathcal{K} = \{k_1,\dots,k_c\}\) is defined by \(k_m = \sum_{j = 1}^N \delta(i_j,m) \)
Thus the probability of observing a histogram is
$$ p(\mathcal{K}) = \binom{N}{k_1,\dots,k_c} p(\mathcal{I}) $$
\(n_{j}^{(\delta,k_{j})} \approx k_{j}! \, \delta^{k_{j}} (k_j+1)^{n_j/\delta} \)
\( p(\mathcal{K}) = \frac{N!}{\prod_{j=1}^c \cancel{k_j!}} \frac{\prod_{j=1}^c \cancel{k_j!} \delta^{k_j} (k_j+1)^{n_j/\delta}}{n^{(\delta,N)}} \)
... technical calculation ...
$$S_{Pólya}(p) = \log p(\mathcal{K}) \approx - \sum_{i=1}^c \log(p_i + 1/N)$$
This example is rather theoretical, but provides us a useful hint of what happens if there are correlations in the sample space
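A Monte Carlo sketch checking the histogram probability \(p(\mathcal{K})\) for a small urn (two colors; the parameter values are arbitrary):

```python
import random
from math import comb

def rising(m, r, delta):          # m (m+delta) ... (m + (r-1) delta)
    out = 1
    for s in range(r):
        out *= m + s * delta
    return out

n_init, delta, N = [2, 1], 2, 4   # initial balls per color, reinforcement, draws
target = (3, 1)                   # histogram K = (k_1, k_2)

exact = (comb(N, target[0])
         * rising(n_init[0], target[0], delta) * rising(n_init[1], target[1], delta)
         / rising(sum(n_init), N, delta))

random.seed(3)
hits, trials = 0, 100000
for _ in range(trials):
    balls, k = list(n_init), [0, 0]
    for _ in range(N):
        i = 0 if random.random() < balls[0] / sum(balls) else 1
        k[i] += 1
        balls[i] += delta
    hits += (tuple(k) == target)
print(f"exact {exact:.4f}  vs  simulated {hits/trials:.4f}")
```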
Motivation: finite versions of \(\exp\) and \(\log\)
\(\exp(x) = \lim_{n \rightarrow \infty} \left(1+\frac{x}{n}\right)^n\)
So define
\(\exp_q(x) := \left(1 + (1-q) x\right)^{1/(1-q)}\)
\(\log_q(x) := \frac{x^{1-q}-1}{1-q}\)
Let us find an operation s.t.
\(\exp_q(x) \otimes_q \exp_q(y) \equiv \exp_q(x+y)\)
\(\Rightarrow a \otimes_q b = \left[a^{1-q} + b^{1-q}-1\right]^{1/(1-q)}\)
Suyari, Physica A 368 (2006) 63-82
In analogy to \(n! = 1 \cdot 2 \cdot \dots \cdot n\) introduce \(n!_q := 1 \otimes_q 2 \otimes_q \dots \otimes_q n\)
It is then easy to show that \(\log_q n!_q = \frac{\sum_{k=1}^n k^{1-q} - n}{1-q}\), which can be used for a generalized Stirling approximation \(\log_q n!_q \approx \frac{n}{2-q} \left(\log_q n - 1\right)\)
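A numerical sketch of the generalized Stirling approximation (with the \(-n/(2-q)\) term written out, mirroring the classical \(x \ln x - x\)):

```python
def log_q(x, q):
    return (x ** (1 - q) - 1) / (1 - q)

def log_q_factorial(n, q):        # log_q(n!_q) = (sum_k k^(1-q) - n) / (1-q)
    return (sum(k ** (1 - q) for k in range(1, n + 1)) - n) / (1 - q)

q = 1.5
for n in (10, 100, 1000):
    exact = log_q_factorial(n, q)
    stirling = n / (2 - q) * (log_q(n, q) - 1)
    print(f"n={n:5d}:  exact {exact:9.2f}   Stirling {stirling:9.2f}")
```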
Let us now consider a q-deformed multinomial factor
$$\binom{n}{n_1,\dots,n_k}_q := n!_q \oslash_q (n_1!_q \otimes_q \dots \otimes_q n_k!_q) $$
$$ = \left[\sum_{l=1}^n l^{1-q} - \sum_{i_1=1}^{n_1} i_1^{1-q} - \dots - \sum_{i_k=1}^{n_k} i_k^{1-q}\right]^{1/(1-q)}$$
Let us consider that the multiplicity is given by a q-multinomial factor \( W(n_1,\dots,n_k) = \binom{n}{n_1,\dots,n_k}_q\)
In this case, it is more convenient to define entropy as \(S = \log_q W\), which gives us:
$$\log_q W = \frac{n^{2-q}}{2-q} \frac{\sum_{i=1}^k p_i^{2-q}-1}{q-1} = \frac{n^{2-q}}{2-q} S_{2-q}(p)$$
This entropy is known as Tsallis entropy
Note that the prefactor is not \(n\) but \(n^{2-q}\) (non-extensivity); we will discuss this later
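A sketch verifying that \(\log_q\) of the q-multinomial approaches \(\frac{n^{2-q}}{2-q} S_{2-q}(p)\) as the counts grow (values of \(q\) and counts are arbitrary):

```python
def sum_pow(n, q):
    return sum(k ** (1 - q) for k in range(1, n + 1))

def log_q_multinomial(counts, q):
    # log_q n!_q - sum_i log_q n_i!_q; the linear terms -n + sum_i n_i cancel
    n = sum(counts)
    return (sum_pow(n, q) - sum(sum_pow(ni, q) for ni in counts)) / (1 - q)

def tsallis(p, q):
    return (sum(pi ** q for pi in p) - 1) / (1 - q)

q = 0.8
for scale in (1, 10, 100):
    counts = [scale * c for c in (2, 1, 1, 1)]
    n = sum(counts)
    p = [c / n for c in counts]
    lhs = log_q_multinomial(counts, q)
    rhs = n ** (2 - q) / (2 - q) * tsallis(p, 2 - q)
    print(f"n={n:4d}:  log_q W = {lhs:10.2f}   (n^(2-q)/(2-q)) S_(2-q) = {rhs:10.2f}")
```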
Until now, we have been just counting states; let us now discuss the relation with energy.
We consider that the states describe the energy of the system (either a Hamiltonian or a more general energy functional)
Therefore, entropy is defined as
$$ S(E) := \log W(E)$$
There are a few typical situations:
1. Isolated system = microcanonical ensemble
Let \(H(s)\) be the energy of a state \(s\). Multiplicity is then \(W(E) = \sum_{s} \delta(H(s) - E)\)
Phenomena like negative "temperature" can occur: \(\frac{1}{T} = \frac{\mathrm{d} S(E)}{\mathrm{d} E} < 0\)
2. Closed system = canonical ensemble
Total system is composed of the system of interest (S) and the heat reservoir/bath (B). They are weakly coupled i.e., \(H_{tot}(s,b) = H_S(s) + H_B(b)\) (no interaction energy)
$$W(E_{tot}) = \sum_{s,b} \delta(H_{tot}(s,b) -E_{tot})$$
3. Open system = grand canonical ensemble (the system also exchanges particles; here we focus on case 2)
For the canonical case, the multiplicity can be further rewritten as
$$= \int \mathrm{d} E_S\, \sum_s \delta (H_{S}(s)-E_S) \sum_b \delta(H_B(b)-(E_{tot}-E_S)) $$
$$= \int \mathrm{d} E_S \, W_S(E_S) W_B(E_{tot}-E_S) $$
This is hard to calculate. Typically, the dominant contribution is from the maximal configuration of the integrand, which we obtain from
$$\frac{\partial W_S(E_S) W_B(E_{tot}-E_S)}{\partial E_S} \stackrel{!}{=} 0 \Rightarrow \frac{W'_S(E_S)}{W_S(E_S)} = \frac{W'_B(E_{tot}-E_S)}{W_B(E_{tot}-E_S)}$$
As a consequence \(\frac{\partial S_S(E_S)}{\partial E_S} \stackrel{!}{=} \frac{\partial S_B(E_B)}{\partial E_B}\Big|_{E_B = E_{tot}-E_S} =: \frac{1}{k_B T}\)
and \(S_B(E_{tot} - E_S) = \underbrace{S_B(E_{tot})}_{\mathrm{bath\ entropy}} - \frac{\partial S_B}{\partial E} E_S + \dots\), so the total entropy becomes \(S_B(E_{tot}) + \underbrace{S_S(E_S) - \beta E_S}_{\mathrm{free\ entropy}} + \dots\) with \(\beta = 1/k_B T\)
This is the emergence of the Maximum Entropy principle
[Figure: system S weakly coupled to heat bath B; coarse-graining takes the deterministic picture to the statistical picture.]
General approach - method of Lagrange multipliers
Maximize \(L(p) = S(p) - \alpha \sum_i p_i - \sum_k \lambda_k \sum_i I_{i,k} p_i\), where the \(\lambda_k\) enforce the constraints \(\sum_i I_{i,k}\, p_i = \mathrm{const}\)
$$\frac{\partial L}{\partial p_i} = \frac{\partial S(p)}{\partial p_i} - \alpha - \sum_k \lambda_k I_{i,k} \stackrel{!}{=} 0$$
If \(\psi_i(p) = \frac{\partial S(p)}{\partial p_i}\) is invertible in \(p_i\), we get that
$$ p^{\star}_i = \psi_i^{(-1)}\left(\alpha + \sum_k \lambda_k I_{i,k}\right)$$
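A minimal sketch of the machinery for BGS entropy: stationarity gives \(p_i^\star \propto e^{-\beta \epsilon_i}\), and \(\beta\) is fixed by the energy constraint via bisection (the energy levels and target \(U\) are assumed for illustration):

```python
from math import exp

eps = [0.0, 1.0, 2.0, 3.0]   # assumed energy levels
U = 1.2                      # assumed target mean energy

def p_of_beta(beta):
    w = [exp(-beta * e) for e in eps]
    Z = sum(w)
    return [wi / Z for wi in w]

lo, hi = -50.0, 50.0         # mean energy decreases monotonically in beta
for _ in range(100):
    mid = (lo + hi) / 2
    mean_E = sum(pi * e for pi, e in zip(p_of_beta(mid), eps))
    lo, hi = (lo, mid) if mean_E < U else (mid, hi)

beta = (lo + hi) / 2
print(f"beta = {beta:.4f}   p* = {[round(x, 4) for x in p_of_beta(beta)]}")
```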
Legendre structure of thermodynamics - interpretation of L
$$L(p) = S(p) - \beta U(p) = \Psi(p) = - \beta F(p)$$
where \(\Psi\) is the free entropy and \(F\) is the free energy
Maxwell-Boltzmann
\(S_{MB} = - \sum_{i=1}^k p_i \log \frac{p_i}{g_i}\)
\(p_i^\star = \frac{g_i}{Z} \exp(-\epsilon_i/ T) \)
Bose-Einstein
\(S_{BE} = \sum_{i=1}^k \left[(\alpha_i + p_i) \log (\alpha_i +p_i) - \alpha_i \log \alpha_i - p_i \log p_i\right]\)
\(p_i^\star = \frac{\alpha_i}{Z} \frac{1}{\exp(\epsilon_i/ T)-1} \)
Fermi-Dirac
\(S_{FD} = \sum_{i=1}^k \left[-(\alpha_i - p_i) \log (\alpha_i -p_i) + \alpha_i \log \alpha_i - p_i \log p_i\right]\)
\(p_i^\star = \frac{\alpha_i}{Z} \frac{1}{\exp(\epsilon_i/ T)+1} \)
$$S(\wp) =- \sum_{ij} \wp_i^{(j)} (\log \wp_i^{(j)} - 1) - \sum_{ij} \wp_{i}^{(j)} \log \frac{j!}{n^{j-1}}$$
Normalization: \(\sum_{ij} j \wp_{i}^{(j)} =1 \) Energy: \(\sum_{ij} \epsilon_{i}^{(j)} \wp_{i}^{(j)} = U\)
where \( j \wp_i^{(j)} = p_i^{(j)}\)
MaxEnt distribution: \(\wp_i^{(j)} = \frac{n^{j-1}}{j!} \exp(-\alpha j -\beta \epsilon_i^{(j)})\)
The normalization condition gives \(\sum_j j \mathcal{Z}_j e^{-\alpha j} = 1 \)
where \(\mathcal{Z}_j = \frac{n^{j-1}}{j!} \sum_i \exp(-\beta \epsilon_i^{(j)}) \) is the partial partition function
We get a polynomial equation in \(e^{-\alpha}\)
Average number of molecules \(\mathcal{M} = \sum_{ij} \wp_{i}^{(j)}\)
Free energy: \(F = U - T S = -\frac{\alpha}{\beta} - \frac{\mathcal{M}}{\beta} \)
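A small sketch of the polynomial equation for \(e^{-\alpha}\), for molecules of sizes \(j = 1, 2\) only (the values of \(\mathcal{Z}_1, \mathcal{Z}_2\) are assumed for illustration):

```python
import numpy as np

Z1, Z2 = 1.5, 0.3                  # assumed partial partition functions
# normalization: 1*Z1*x + 2*Z2*x^2 = 1, with x = e^{-alpha}
roots = np.roots([2 * Z2, Z1, -1])
x = max(r.real for r in roots if abs(r.imag) < 1e-12 and r.real > 0)
print(f"e^-alpha = {x:.4f},  alpha = {-np.log(x):.4f}")
```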
$$S_q(p) = \frac{\sum_i p_i^q-1}{1-q}$$
MaxEnt distribution is: \(p_i^\star = \exp_q(\alpha+\beta \epsilon_i)\)
Note that this is not equal in general to \(q_i^\star =\frac{\exp_q(\beta \epsilon_i)}{\sum_i \exp_q( \beta \epsilon_i)}\)
However, it is possible to use the identity
$$\exp_q(x+y) = \exp_q(x) \exp_q\left(\frac{y}{\exp_q(x)^{1-q}}\right)$$
The MaxEnt distribution of Tsallis entropy can be expressed as
$$p_i^\star(\beta) = \exp_q(\alpha+\beta \epsilon_i) = \exp_q(\alpha) \exp_q(\tilde{\beta}\epsilon_i) = q_i^\star(\tilde{\beta}) $$
where \( \tilde{\beta} = \frac{\beta}{\exp_q(\alpha)^{1-q}}\)
(sometimes called self-referential temperature)
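A one-liner sketch checking the identity (the values of \(q, x, y\) are arbitrary, within the domain where the q-exponentials are defined):

```python
def exp_q(x, q):
    return (1 + (1 - q) * x) ** (1 / (1 - q))

q, x, y = 1.3, 0.4, -0.7
lhs = exp_q(x + y, q)
rhs = exp_q(x, q) * exp_q(y / exp_q(x, q) ** (1 - q), q)
print(lhs, rhs)   # agree to machine precision
```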
Cross-entropy and relative entropy
The constraint terms of MaxEnt correspond to a cross-entropy. For the case of expected energy, they can be expressed through the cross-entropy
$$S_{cross}(p|q) = - \sum_i p_i \log q_i $$
where \(q_i\) are prior probabilities. By taking \(q^\star_i = \frac{1}{Z}e^{-\beta \epsilon_i}\) we get
$$S_{cross}(p|q^\star) = \beta\sum_i p_i \epsilon_i + \ln Z$$
However, for path-dependent processes, the natural constraints might not be of this form
Kullback-Leibler divergence
$$D_{KL}(p||q) = -S(p) + S_{cross}(p,q) $$
From the multiplicity of trajectory histograms, we have shown that the entropy of SSR is
$$S_{SSR}(p) = - N \sum_{j=2}^n \left[p_j \log \left(\frac{p_j}{p_1}\right) + (p_1-p_j) \log \left(1-\frac{p_j}{p_1}\right)\right]$$
Let us now consider that after each run (when the system reaches the ground state) we drive the ball to a random state with probability \(q_i\)
After each jump the effective space reduces
One can see that the probability of sampling a histogram \(k_i\) is
$$G(k|q) = \prod_{i=1}^n \frac{q_i^{k_i}}{Q_{i-1}^{k_i}} $$
where \(Q_i = \sum_{j=1}^i q_j\) and \(Q_0 \equiv 1\).
The corresponding cross-entropy is
$$S_{cross}(p|q) = - \sum_{i=1}^n p_i \log q_i - \sum_{i=2}^n p_i \log Q_{i-1} $$
By assuming \(q_i \propto e^{-\beta \epsilon_i}\), the cross-entropy becomes
$$S_{cross}(p|q) = \beta \sum_{i=1}^n p_i \epsilon_i + \beta \sum_{i=2}^n p_i f_i = \mathcal{E} + \mathcal{F}$$
where \(\beta f_i = -\log Q_{i-1}\) plays the role of a free energy
Probability of observing a histogram
$$ p(\mathcal{K}) = \binom{N}{k_1,\dots,k_c} p(\mathcal{I}) $$
By carefully taking into account the initial number of balls in the urn \(n_i\) we end with
$$S_{Pólya}(p) = - \sum_{i=1}^c \log(p_i + 1/N)$$
$$S_{Pólya}(p|q) = - \sum_{i=1}^c \left[\frac{q_i}{\gamma} \log \left(p_i + \frac{1}{N}\right) - \log\left(1+\frac{1}{N\gamma} \frac{q_i - \gamma}{p_i+\frac{1}{N}}\right)+ \log q_i\right]$$
where \(q_i = n_i/N, \gamma=\delta/N\)
By taking \(N \rightarrow \infty\) (with some assumptions on \(p\) and \(q\)), we get
$$S_{Pólya}(p) = - \sum_{i=1}^c \log p_i$$
$$S_{Pólya}(p|q) = - \sum_{i=1}^c \left[\frac{q_i}{\gamma} \log p_i + \log q_i\right]$$
As we already found out, the MaxEnt principle can be seen as a special case of the principle of minimum relative entropy
$$p^\star = \arg\min_p D(p||q)$$
In many cases, the divergence can be expressed as \(D(p||q) = - S(p) + S_{cross}(p,q)\)
It connects information theory, thermodynamics and geometry
Priors \(q\) can be obtained from theoretical models or measurements
Posteriors \(p\) can be from parametric family or from a special class of probability distributions
Relative entropy is well defined for both discrete and continuous distributions
Let us now consider the whole trajectory \(\pmb{x}(t)\) with probability \(p(\pmb{x}(t))\)
We define the caliber as the KL divergence of path probabilities
$$S_{cal}(p|q) = \int \mathcal{D} \pmb{x}(t) p(\pmb{x}) \log \frac{p(\pmb{x}(t))}{q(\pmb{x}(t))}$$
N.B.: Entropy production can be written in terms of caliber as
$$\Sigma_t = S_{cal}[p(\pmb{x}(t))|\tilde{p}(\tilde{\pmb{x}}(t))]$$
where the tilde denotes time reversal
Now we take the opposite approach compared to Lecture II.
We postulate the properties we think entropy should have
and derive the corresponding entropic functional.
These axiomatic approaches have a different nature; we will discuss their possible connections.
Introduced independently by Shannon and Khinchin
Motivated by information theory
SK1 (continuity): \(S(p)\) depends continuously on \(p\)
SK2 (maximality): \(S(p)\) is maximal for the uniform distribution \(p_i = 1/n\)
SK3 (expandability): adding a state with zero probability does not change \(S\)
SK4 (additivity): \(S(A \cup B) = S(A) + S(B|A)\)
These four axioms uniquely determine Shannon entropy
\(S(P) = - \sum_i p_i \log p_i\)
SK axioms serve as a starting point for other axiomatic schemes
Several axiomatic schemes generalize axiom SK4.
One possibility is to generalize additivity. The most prominent example is q-additivity
$$ S(A \cup B) = S(A) \oplus_q S(B|A)$$
where \(x \oplus_q y = x + y + (1-q) xy\) is q-addition
\(S(B|A)= \sum_i \rho_i^A(q)\, S(B|A=a_i)\) is the conditional entropy
and \(\rho_i(q) = p_i^q/\sum_k p_k^q\) is the escort distribution.
This uniquely determines Tsallis entropy
$$S_q(p) = \frac{1}{1-q}\left(\sum_i p_i^q-1\right)$$
Abe, Phys. Lett. A 271 (2000) 74.
Another possibility is to consider a different type of averaging
In the original SK axioms, the conditional entropy is defined as the arithmetic average of \(S(B|A=a_i)\)
We can use an alternative averaging, such as the Kolmogorov-Nagumo average
$$\langle X \rangle_f = f^{-1} \left(\sum_i p_i f(x_i)\right)$$
By keeping additivity, but taking \(S(B|A)= f^{-1}\left(\sum_i \rho_i^A(q)\, f(S(B|A=a_i))\right)\)
for \(f(x) = \frac{e^{(1-q)x}-1}{1-q}\) we uniquely obtain Rényi entropy
$$R_q(p) = \frac{1}{1-q}\log \sum_i p_i^q$$
Jizba, Arimitsu, Annals of Physics 312 (1) (2004)17-59
P.J., J.K. Phys. Rev. Lett. 122 (2019), 120601
Are the axioms set by information theory (Shannon-Khinchin) and by statistical inference (Shore-Johnson) different, or can we find some overlap?
Let us consider the 4th SK axiom
in the form equivalent to composability axiom by P. Tempesta:
4. \(S(A \cup B) = f[f^{-1}(S(A)) \cdot f^{-1}(S(B|A))]\)
\(S(B|A) = S(B)\) if B is independent of A.
Entropies fulfilling both the SK (Shannon-Khinchin) and SJ (Shore-Johnson) axioms: $$S_q^f(P) = f\left[\left(\sum_i p_i^q\right)^{1/(1-q)}\right] = f\left[\exp_q\left( \sum_i p_i \log_q(1/p_i) \right)\right]$$
Phys. Rev. E 101, 042126 (2020)
Lieb and Yngvason define thermodynamic entropy via adiabatic accessibility, using axioms such as:
$$X \sim (\lambda X, (1-\lambda)X)$$
Entropy in LY framework should fulfill these axioms
Does BE satisfy LY axioms?
Appropriate for "simple" systems
We have been mentioning the issue of extensivity before
Let us see how the multiplicity and entropy scale with the size \(N\)
This allows us to introduce a classification of entropies
J.K., R.H., S.T. New J. Phys. 20 (2018) 093007
| Process | \(W(N)\) | \(S(W)\) | \(d_0\) | \(d_1\) | \(d_2\) |
|---|---|---|---|---|---|
| Random walk | \(2^N\) | \(\log W\) | 0 | 1 | 0 |
| Aging random walk | \(\approx 2^{\sqrt{N}/2} \sim 2^{N^{1/2}}\) | \((\log W)^2\) | 0 | 2 | 0 |
| Magnetic coins * | \(\approx N^{N/2} e^{2 \sqrt{N}} \sim e^{N \log N}\) | \(\log W/\log \log W\) | 0 | 1 | -1 |
| Random network | \(2^{\binom{N}{2}} \sim 2^{N^2}\) | \((\log W)^{1/2}\) | 0 | 1/2 | 0 |
| Random walk cascade | \(2^{2^N}-1 \sim 2^{2^N}\) | \(\log \log W\) | 0 | 0 | 1 |

* H. Jensen et al. J. Phys. A: Math. Theor. 51 375002
How does the classification change if we add one more scaling exponent?
R.H., S.T. EPL 93 (2011) 20006
To fulfill SK axiom 2 (maximality): \(d_l > 0\), to fulfill SK axiom 3 (expandability): \(d_0 < 1\)
Thank you for your attention
csh.ac.at