Fall 2025, Prof Sarah Dean
What about some nonlinear approx?
I.e. train deep network so $$\hat x_{t+1} = NN(\hat x_t)$$
Can we predict stability of predictions?
Figure: predictions labeled unstable and stable
"What we do"
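Figure: complex plane \(\mathbb C\) with the unit circle (radius \(1\)); regions labeled stable (inside), unstable (outside), inconclusive (on the circle)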
Consider the dynamics of gradient descent on a twice differentiable function \(\ell:\mathbb R^d\to\mathbb R\)
\(\theta_{t+1} = \theta_t - \alpha\nabla \ell(\theta_t)\)
Figure: loss landscape with labeled critical points: global max, local max, global min, local min, saddle
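taking \(\gamma_i\) to be the eigenvalues of the Hessian \(\nabla^2\ell(\theta_{eq})\), the Jacobian \(I-\alpha\nabla^2\ell(\theta_{eq})\) has eigenvalues \(1-\alpha\gamma_i\)
if any \(\gamma_i\leq 0\), then \(1-\alpha\gamma_i\geq 1\) and \(\theta_{eq}\) is not strictly stable
i.e. \(\theta_{eq}\) is a saddle, local maximum, or degenerate critical point of \(\ell\)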
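The eigenvalue condition can be checked numerically on a quadratic. This is a minimal sketch, assuming \(\gamma_i\) are the eigenvalues of the Hessian and \(\ell(\theta)=\tfrac12\theta^\top H\theta\), so that \(\theta_{eq}=0\) and the gradient descent map has Jacobian \(I-\alpha H\). The matrices, the step size, and the function names are hypothetical choices for illustration.

```python
import numpy as np

def gd_jacobian_eigs(hessian, alpha):
    """Eigenvalues 1 - alpha * gamma_i of the Jacobian I - alpha * H."""
    gammas = np.linalg.eigvalsh(hessian)  # H is symmetric, so eigvalsh applies
    return 1.0 - alpha * gammas

def is_strictly_stable(hessian, alpha):
    """True when every Jacobian eigenvalue lies strictly inside the unit circle."""
    return bool(np.all(np.abs(gd_jacobian_eigs(hessian, alpha)) < 1.0))

alpha = 0.1  # hypothetical step size
examples = {
    "local min": np.diag([1.0, 2.0]),
    "saddle": np.diag([1.0, -2.0]),
    "local max": np.diag([-1.0, -2.0]),
    "degenerate": np.diag([1.0, 0.0]),
}
for name, H in examples.items():
    print(name, gd_jacobian_eigs(H, alpha), is_strictly_stable(H, alpha))
```

Only the local min passes the check here. Positive \(\gamma_i\) alone is not enough: strict stability also needs \(\alpha\gamma_i<2\) for every \(i\), so a step size that is too large makes even a minimum unstable.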
"Why we do it"
Consider linear approximation of nonlinear dynamics
states tend to evolve away from middle equilibrium and towards right (or left)
middle equilibrium has a slope \(>1\) while right/left have slopes \(<1\)
example: discrete-time damped pendulum
\(\theta_{t+1} = \theta_t + h \omega_t\)
\(\omega_{t+1} =\omega_t + h\left(\frac{g}{\ell}\sin\theta_t-d\omega_t\right)\)
angle \(\theta\)
angular velocity \(\omega\)
gravity \(g\)
length \(\ell\)
using \(\sin x\approx \sin x_0 + \cos x_0(x - x_0)\):
\(\omega_{t+1}\approx (1-dh)\omega_t + h\frac{g}{\ell}\left(\sin \theta_{eq}+\cos\theta_{eq}(\theta_t-\theta_{eq})\right)\)
equilibria at \(\theta=k\pi\), \(\omega=0\) for \(k\in\mathbb Z\)
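The update equations above can be simulated directly. This is a minimal sketch; the parameter values (\(h=0.01\), \(g=9.8\), length \(1\), \(d=0.5\)) and the function names are assumptions chosen for illustration, not values from the lecture.

```python
import numpy as np

def pendulum_step(theta, omega, h=0.01, g=9.8, length=1.0, d=0.5):
    """One step of the discrete-time damped pendulum, with the sign convention above."""
    theta_next = theta + h * omega
    omega_next = omega + h * ((g / length) * np.sin(theta) - d * omega)
    return theta_next, omega_next

def simulate(theta0, omega0, steps, **params):
    """Roll the dynamics forward; returns an array of shape (steps + 1, 2)."""
    traj = np.empty((steps + 1, 2))
    traj[0] = theta0, omega0
    for t in range(steps):
        traj[t + 1] = pendulum_step(*traj[t], **params)
    return traj

# Start slightly off the equilibrium at theta = 0 and slightly off theta = pi.
near_zero = simulate(theta0=0.01, omega0=0.0, steps=10_000)
near_pi = simulate(theta0=np.pi - 0.01, omega0=0.0, steps=10_000)
print("final angle starting near 0: ", near_zero[-1, 0])
print("final angle starting near pi:", near_pi[-1, 0])
```

With these parameters, the trajectory started near \(\theta=0\) leaves that equilibrium, while the one started near \(\theta=\pi\) stays close to it.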
$$\begin{bmatrix}\theta_{t+1}-\theta_{eq}\\ \omega_{t+1}\end{bmatrix} \approx \begin{bmatrix} 1 & h\\ h \frac{g}{\ell}\cos(\theta_{eq})& 1-dh\end{bmatrix}\begin{bmatrix}\theta_{t}-\theta_{eq}\\ \omega_{t}\end{bmatrix} $$
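at \(\theta_{eq}=0\), real eigenvalues \(0<\lambda_2<1<\lambda_1\) (for small \(h\)), so the equilibrium is unstable
at \(\theta_{eq}=\pi\), complex eigenvalues for small \(d\), with \(|\lambda|^2 = 1-dh+h^2\frac{g}{\ell}<1\) when \(h<\frac{d\ell}{g}\), so the equilibrium is stable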
Exercise: work out the details of this analysis (simulation notebook)
eigenvalues of the linearized pendulum dynamics: \(\lambda = 1-h\frac{d}{2} \pm h\sqrt{\left(\frac{d}{2}\right)^2+\frac{g}{\ell}\cos(\theta_{eq})}\)
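As a sanity check, the closed-form eigenvalues can be compared against a numerical eigendecomposition of the linearized matrix. This is a sketch with assumed parameter values (\(h=0.01\), \(g=9.8\), length \(1\), \(d=0.5\)); the function names are hypothetical.

```python
import numpy as np

def linearized_matrix(theta_eq, h, g, length, d):
    """Matrix of the linearized pendulum dynamics around (theta_eq, 0)."""
    return np.array([
        [1.0, h],
        [h * (g / length) * np.cos(theta_eq), 1.0 - d * h],
    ])

def closed_form_eigs(theta_eq, h, g, length, d):
    """lambda = 1 - h d/2 +/- h sqrt((d/2)^2 + (g/l) cos(theta_eq))."""
    disc = (d / 2) ** 2 + (g / length) * np.cos(theta_eq)
    root = np.sqrt(complex(disc))  # complex sqrt handles a negative discriminant
    return np.array([1 - h * d / 2 + h * root, 1 - h * d / 2 - h * root])

params = dict(h=0.01, g=9.8, length=1.0, d=0.5)  # assumed values
for theta_eq in (0.0, np.pi):
    numeric = np.linalg.eigvals(linearized_matrix(theta_eq, **params))
    formula = closed_form_eigs(theta_eq, **params)
    print(f"theta_eq={theta_eq:.3f}")
    print("  |lambda| numeric:", np.sort(np.abs(numeric)))
    print("  |lambda| formula:", np.sort(np.abs(formula)))
```

For these values, one eigenvalue at \(\theta_{eq}=0\) has magnitude above \(1\), and both eigenvalues at \(\theta_{eq}=\pi\) have magnitude below \(1\).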
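Linearization via Taylor Series:
\(s_{t+1} = F(s_t) \approx F(s_{eq}) + J(s_{eq})(s_t - s_{eq})\), where \(J\) is the Jacobian of \(F\) and \(F(s_{eq})=s_{eq}\) at an equilibrium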
The Jacobian \(J\) of \(G:\mathbb R^{n}\to\mathbb R^{m}\) is defined as $$ J(x) = \begin{bmatrix}\frac{\partial G_1}{\partial x_1} & \dots & \frac{\partial G_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial G_m}{\partial x_1} &\dots & \frac{\partial G_m}{\partial x_n}\end{bmatrix}$$
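When an analytic Jacobian is inconvenient, the definition above can be approximated with central finite differences. This is a minimal sketch; `numerical_jacobian` and the example map `G` are hypothetical names introduced for illustration.

```python
import numpy as np

def numerical_jacobian(G, x, eps=1e-6):
    """Approximate the m x n Jacobian of G: R^n -> R^m at x, one column per input."""
    x = np.asarray(x, dtype=float)
    m = np.asarray(G(x)).size
    J = np.empty((m, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        # Central difference in coordinate j gives column j: dG_i / dx_j for all i.
        J[:, j] = (np.asarray(G(x + step)) - np.asarray(G(x - step))) / (2 * eps)
    return J

def G(x):
    """Example map from R^2 to R^3."""
    return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

x0 = np.array([1.0, 2.0])
J_numeric = numerical_jacobian(G, x0)
J_exact = np.array([[x0[1], x0[0]], [np.cos(x0[0]), 0.0], [0.0, 2 * x0[1]]])
print(J_numeric)
print("max abs error:", np.max(np.abs(J_numeric - J_exact)))
```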
"Why we do it"
Definition: A Lyapunov function \(V:\mathcal S\to \mathbb R\) for \(F\) is continuous and
Reference: Bof, Carli, Schenato, "Lyapunov Theory for Discrete Time Systems"
Theorem (1.2, 1.4): Suppose that \(F\) is locally Lipschitz, \(s_{eq}=0\) is a fixed point, and \(V\) is a Lyapunov function. Then, \(s_{eq}=0\) is
Theorem (3.3): Suppose \(F\) is locally Lipschitz, \(0\) is a fixed point, and let \(\{\lambda_i\}_{i=1}^d\subset \mathbb C\) be the eigenvalues of the Jacobian \(J(0)\). Then \(0\) is
Proof sketch.
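Figure: complex plane \(\mathbb C\) with the unit circle (radius \(1\)); regions labeled stable (inside), unstable (outside), inconclusive (on the circle)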
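The eigenvalue test can be written as a small helper. This is a sketch, assuming the classification is: stable when all \(|\lambda_i|<1\), unstable when some \(|\lambda_i|>1\), and inconclusive otherwise. The function name and the tolerance are hypothetical.

```python
import numpy as np

def classify_equilibrium(J, tol=1e-9):
    """Classify a fixed point from the eigenvalues of its Jacobian J."""
    rho = np.max(np.abs(np.linalg.eigvals(J)))  # largest eigenvalue magnitude
    if rho < 1 - tol:
        return "stable"
    if rho > 1 + tol:
        return "unstable"
    return "inconclusive"  # an eigenvalue on the unit circle, none outside

print(classify_equilibrium(np.diag([0.5, 0.9])))
print(classify_equilibrium(np.diag([1.2, 0.5])))
print(classify_equilibrium(np.array([[0.0, -1.0], [1.0, 0.0]])))  # rotation by 90 degrees
```

The tolerance exists only because computed eigenvalues on the unit circle are rarely exactly of magnitude \(1\) in floating point.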
"Why we do it"
\(r=1.5\) stable at \(s=\frac{1}{3}\)
\(r=3.25\) limit cycle
\(r=3.75\) chaotic
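These three regimes can be reproduced in a few lines. The sketch below assumes the values of \(r\) refer to the logistic map \(s_{t+1}=r s_t(1-s_t)\), which is consistent with the fixed point \(s=\frac13\) at \(r=1.5\). The initial condition and the function names are hypothetical.

```python
import numpy as np

def logistic_step(s, r):
    """One step of the (assumed) logistic map."""
    return r * s * (1.0 - s)

def simulate(r, s0=0.2, steps=1000):
    """Iterate the map from s0; returns an array of length steps + 1."""
    traj = np.empty(steps + 1)
    traj[0] = s0
    for t in range(steps):
        traj[t + 1] = logistic_step(traj[t], r)
    return traj

def fixed_point_slope(r):
    """Derivative r(1 - 2s) of the map at the nonzero fixed point s = 1 - 1/r."""
    s_eq = 1.0 - 1.0 / r
    return r * (1.0 - 2.0 * s_eq)

for r in (1.5, 3.25, 3.75):
    tail = simulate(r)[-4:]
    print(f"r={r}: slope at fixed point = {fixed_point_slope(r):+.2f}, last iterates = {tail}")
```

The slope at the nonzero fixed point simplifies to \(2-r\), so the linearization predicts stability for \(r=1.5\) and instability for \(r=3.25\) and \(r=3.75\). What the state does after leaving an unstable fixed point (limit cycle or chaos) is not determined by the linearization.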
Next time: learning nonlinear models
References: Bof, Carli, Schenato, "Lyapunov Theory for Discrete Time Systems"