Prof Sarah Dean
(diagram) Supervised learning: training data \(\{(x_i, y_i)\}\) \(\to\) model \(f:\mathcal X\to\mathcal Y\)
(diagram) A policy maps observations to actions
(diagram) Online learning: at each time \(t\), a model \(f_t:\mathcal X\to\mathcal Y\) maps the observation \(x_t\) to a prediction
Goal: cumulatively over time, predictions \(\hat y_t = f_t(x_t)\) are close to true \(y_t\)
Accumulate data \(\{(x_t, y_t)\}\) over time
$$\theta_t = \underbrace{\Big(\sum_{k=1}^{t-1}x_k x_k^\top + \lambda I\Big)^{-1}}_{A_{t-1}^{-1}}\underbrace{\sum_{k=1}^{t-1}x_ky_k }_{b_{t-1}}$$
Follow the (Regularized) Leader
$$\theta_t = \arg\min_\theta \sum_{k=1}^{t-1} (\theta^\top x_k-y_k)^2 + \lambda\|\theta\|_2^2$$
Online Gradient Descent
$$\theta_t = \theta_{t-1} - \alpha (\theta_{t-1}^\top x_{t-1}-y_{t-1})x_{t-1}$$
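A minimal numpy sketch of both update rules on a synthetic data stream (the stream, step size \(\alpha\), and regularization \(\lambda\) below are illustrative choices, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 3, 200
theta_star = rng.normal(size=d)         # unknown parameter generating the stream (synthetic)
lam, alpha = 1.0, 0.05                  # regularization and step size (illustrative values)

theta_ftrl, theta_ogd = np.zeros(d), np.zeros(d)
A, b = lam * np.eye(d), np.zeros(d)     # running sums A_{t-1}, b_{t-1} for FTRL

for t in range(T):
    x = rng.normal(size=d)
    y = theta_star @ x + 0.1 * rng.normal()

    yhat_ftrl, yhat_ogd = theta_ftrl @ x, theta_ogd @ x   # predictions before seeing y

    # FTRL: re-solve the regularized least-squares problem on all data so far
    A += np.outer(x, x)
    b += y * x
    theta_ftrl = np.linalg.solve(A, b)

    # OGD: one gradient step on the most recent squared loss
    theta_ogd = theta_ogd - alpha * (theta_ogd @ x - y) * x
```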
Sherman-Morrison formula: \(\displaystyle (A+uv^\top)^{-1} = A^{-1} - \frac{A^{-1}uv^\top A^{-1}}{1+v^\top A^{-1}u} \)
Recursive FTRL: maintain \(A_t = A_{t-1} + x_tx_t^\top\) and \(b_t = b_{t-1} + y_tx_t\), updating \(A_t^{-1}\) from \(A_{t-1}^{-1}\) with the Sherman-Morrison formula, so that \(\theta_{t+1} = A_t^{-1}b_t\) costs \(O(d^2)\) per step
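A sketch of the recursive implementation, maintaining \(A_t^{-1}\) directly with the Sherman-Morrison rank-one update (class and method names below are my own):

```python
import numpy as np

class RecursiveFTRL:
    """Recursive FTRL / regularized least squares, maintaining A_t^{-1} directly."""

    def __init__(self, d, lam=1.0):
        self.A_inv = np.eye(d) / lam   # (lambda * I)^{-1}
        self.b = np.zeros(d)
        self.theta = np.zeros(d)

    def predict(self, x):
        return self.theta @ x

    def update(self, x, y):
        # Sherman-Morrison: (A + x x^T)^{-1} from A^{-1}, O(d^2) instead of O(d^3)
        Ax = self.A_inv @ x
        self.A_inv -= np.outer(Ax, Ax) / (1.0 + x @ Ax)
        self.b += y * x
        self.theta = self.A_inv @ self.b
```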
A world that evolves over time
$$ s_{t+1} = F(s_t)$$
(Autonomous) discrete-time dynamical system where \(F:\mathcal S\to\mathcal S\)
\(\mathcal S\) is the state space. The state is sufficient for predicting its future.
Given an initial state \(s_0\), the solution to the difference equation, i.e. the trajectory: $$ (s_0, F(s_0), F(F(s_0)), ... ) $$
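A small sketch of computing a trajectory by iterating \(F\) (the particular map and horizon below are placeholders):

```python
import numpy as np

def trajectory(F, s0, T):
    """Return the trajectory (s_0, F(s_0), F(F(s_0)), ...) with T steps."""
    states = [np.asarray(s0, dtype=float)]
    for _ in range(T):
        states.append(F(states[-1]))
    return np.stack(states)

# illustrative example: a contracting nonlinear map
traj = trajectory(lambda s: 0.9 * s + 0.1 * np.sin(s), s0=np.array([1.0, -2.0]), T=50)
```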
What might trajectories look like?
An equilibrium point \(s_\mathrm{eq}\) satisfies
\(s_{eq} = F(s_{eq})\)
An equilibrium point \(s_{eq}\) is stable if trajectories starting near it remain near it, asymptotically stable if they additionally converge to it, and unstable otherwise
examples:
Consider \(\mathcal S = \mathbb R^n\) and linear dynamics
$$ s_{t+1} = As_t$$
Suppose that \(s_0=v\) is an eigenvector of \(A\) with eigenvalue \(\lambda\). Then
$$ s_{t} =\lambda^t v$$
(figure: trajectory grows geometrically when \(\lambda>1\))
Consider \(\mathcal S = \mathbb R^n\) and linear dynamics
$$ s_{t+1} = As_t$$
If \(A\) is similar to a real diagonal matrix: \(A=VDV^{-1} = \begin{bmatrix} |&&|\\v_1&\dots& v_n\\|&&|\end{bmatrix} \begin{bmatrix} \lambda_1&&\\&\ddots&\\&&\lambda_n\end{bmatrix} \begin{bmatrix} -&u_1^\top &-\\&\vdots&\\-&u_n^\top&-\end{bmatrix} \)
then \(\displaystyle s_t = \sum_{i=1}^n v_i \lambda_i^t (u_i^\top s_0)\) is a weighted combination of (right) eigenvectors
General case: real eigenvalues with geometric multiplicity equal to algebraic multiplicity
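A numerical check of the modal decomposition above for a diagonalizable matrix (the symmetric random \(A\) is just a convenient way to get real eigenvalues):

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 3, 10
A = rng.normal(size=(n, n))
A = (A + A.T) / 2                      # symmetric, hence real eigenvalues and diagonalizable

lams, V = np.linalg.eig(A)             # A = V diag(lams) V^{-1}
U = np.linalg.inv(V)                   # rows of V^{-1} are the left eigenvectors u_i^T

s0 = rng.normal(size=n)
s_direct = np.linalg.matrix_power(A, t) @ s0
s_modes = sum(V[:, i] * lams[i] ** t * (U[i, :] @ s0) for i in range(n))

assert np.allclose(s_direct, s_modes)  # s_t equals the weighted combination of eigenvectors
```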
Example 1: \(\displaystyle s_{t+1} = \begin{bmatrix} \lambda_1 & \\ & \lambda_2 \end{bmatrix} s_t \)
\(0<\lambda_2<\lambda_1<1\)
\(0<\lambda_2<1<\lambda_1\)
\(1<\lambda_2<\lambda_1\)
Exercise: what do trajectories look like when \(\lambda_1\) and/or \(\lambda_2\) is negative? (demo notebook)
Example 2: \(\displaystyle s_{t+1} = \begin{bmatrix} \alpha & -\beta\\\beta & \alpha\end{bmatrix} s_t \)
\(0<\alpha^2+\beta^2<1\)
\(1<\alpha^2+\beta^2\)
Exercise: what do trajectories look like when \(\alpha\) is negative? (demo notebook)
General case: pair of complex eigenvalues
\(\lambda = \alpha \pm i \beta\)
$$\begin{bmatrix}1\\0\end{bmatrix} \to \begin{bmatrix}\alpha\\ \beta\end{bmatrix} $$
rotation by \(\arctan(\beta/\alpha)\)
scale by \(\sqrt{\alpha^2+\beta^2}\)
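A quick check that the \(2\times 2\) block acts by rotation and scaling (the values of \(\alpha,\beta\) are illustrative):

```python
import numpy as np

alpha, beta = 0.6, 0.5                       # illustrative values
R = np.array([[alpha, -beta], [beta, alpha]])

e1 = np.array([1.0, 0.0])
assert np.allclose(R @ e1, [alpha, beta])    # e_1 maps to (alpha, beta)

s = np.array([1.0, 2.0])
assert np.isclose(np.linalg.norm(R @ s), np.sqrt(alpha**2 + beta**2) * np.linalg.norm(s))
angle = np.arctan2(beta, alpha)              # rotation angle arctan(beta/alpha)
```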
Example 3: \(\displaystyle s_{t+1} = \begin{bmatrix} \lambda & 1\\ & \lambda\end{bmatrix} s_t \)
\(0<\lambda<1\)
\(1<\lambda\)
Exercise: what do trajectories look like when \(\lambda\) is negative? (demo notebook)
General case: eigenvalues with geometric multiplicity less than algebraic multiplicity (non-diagonalizable \(A\))
$$ \left(\begin{bmatrix} \lambda & \\ & \lambda\end{bmatrix} + \begin{bmatrix} & 1\\ & \end{bmatrix} \right)^t$$
$$ =\begin{bmatrix} \lambda^t & t\lambda^{t-1}\\ & \lambda^t\end{bmatrix} $$
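A numerical check of this closed form for powers of a \(2\times 2\) Jordan block (the value of \(\lambda\) and the power \(t\) are arbitrary):

```python
import numpy as np

lam, t = 0.9, 7
J = np.array([[lam, 1.0], [0.0, lam]])

lhs = np.linalg.matrix_power(J, t)
rhs = np.array([[lam**t, t * lam**(t - 1)], [0.0, lam**t]])
assert np.allclose(lhs, rhs)   # [[lam, 1], [0, lam]]^t = [[lam^t, t lam^(t-1)], [0, lam^t]]
```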
All matrices are similar to a matrix of Jordan canonical form
\(\begin{bmatrix} J_1&&\\&\ddots&\\&&J_p\end{bmatrix} \)
where \(J_i = \begin{bmatrix}\lambda_i & 1 & &\\ & \ddots & \ddots &\\ &&\ddots &1\\ && &\lambda_i \end{bmatrix}\in\mathbb R^{m_i\times m_i}\)
\(m_i\) is the size of the \(i\)th Jordan block; the number of blocks with eigenvalue \(\lambda_i\) equals its geometric multiplicity
Reference: Ch 3d and 4 in Callier & Desoer, "Linear Systems Theory"
Theorem: Let \(\{\lambda_i\}_{i=1}^n\subset \mathbb C\) be the eigenvalues of \(A\).
Then for \(s_{t+1}=As_t\), the equilibrium \(s_{eq}=0\) is asymptotically stable if \(|\lambda_i|<1\) for all \(i\), and unstable if \(|\lambda_i|>1\) for some \(i\)
(figure: eigenvalues plotted in the complex plane \(\mathbb C\) relative to the unit circle)
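A sketch of checking this condition numerically via the spectral radius (the matrix below is a placeholder):

```python
import numpy as np

def spectral_radius(A):
    """Largest eigenvalue magnitude of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

A = np.array([[0.9, 0.2], [0.0, 0.5]])   # illustrative example
rho = spectral_radius(A)
print("asymptotically stable" if rho < 1 else "not asymptotically stable", f"(rho = {rho:.3f})")
```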
Linearization via Taylor Series:
\(s_{t+1} = F(s_t) \)
Stability via linear approximation of nonlinear \(F\)
The Jacobian \(J\) of \(G:\mathbb R^{n}\to\mathbb R^{m}\) is defined as $$ J(x) = \begin{bmatrix}\frac{\partial G_1}{\partial x_1} & \dots & \frac{\partial G_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial G_m}{\partial x_1} &\dots & \frac{\partial G_m}{\partial x_n}\end{bmatrix}$$
\(F(s_t) = F(s_{eq}) + J(s_{eq}) (s_t - s_{eq}) \) + higher order terms
\(= s_{eq} + J(s_{eq}) (s_t - s_{eq}) \) + higher order terms, since \(F(s_{eq})=s_{eq}\)
\(s_{t+1}-s_{eq} \approx J(s_{eq})(s_t-s_{eq})\)
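A sketch of this approximation for a simple nonlinear map, with the Jacobian computed by finite differences (the map, the equilibrium, and the perturbation size are illustrative):

```python
import numpy as np

def F(s):
    # illustrative nonlinear map with equilibrium s_eq = 0
    return np.array([0.5 * s[0] + 0.1 * np.sin(s[1]), 0.8 * s[1] + 0.2 * s[0] ** 2])

def numerical_jacobian(F, s, eps=1e-6):
    """Finite-difference Jacobian of F at s."""
    n = len(s)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        J[:, j] = (F(s + e) - F(s - e)) / (2 * eps)
    return J

s_eq = np.zeros(2)
J = numerical_jacobian(F, s_eq)
s = s_eq + np.array([0.01, -0.02])      # small perturbation from the equilibrium
print(F(s) - s_eq)                      # nonlinear step
print(J @ (s - s_eq))                   # linearized prediction, approximately equal
```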
Consider the dynamics of gradient descent on a twice differentiable function \(g:\mathbb R^d\to\mathbb R\)
\(\theta_{t+1} = \theta_t - \alpha\nabla g(\theta_t)\)
Jacobian \(J(\theta) = I - \alpha \nabla^2 g(\theta)\)
the eigenvalues of \(J(\theta_{eq})\) are \(1-\alpha\gamma_i\), where \(\gamma_i\) are the eigenvalues of the Hessian \(\nabla^2 g(\theta_{eq})\)
if any \(\gamma_i< 0\), then some \(|1-\alpha\gamma_i|>1\), so \(\theta_{eq}\) is not stable
i.e. a saddle or local maximum of \(g\) (for a degenerate critical point with some \(\gamma_i=0\), the linearization is inconclusive)
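A sketch checking the linearization at a fixed point of gradient descent for a quadratic \(g\) whose Hessian eigenvalues \(\gamma_i\) we control (the values of \(\gamma_i\) and \(\alpha\) are illustrative):

```python
import numpy as np

gammas = np.array([2.0, 0.5])           # Hessian eigenvalues at theta_eq (illustrative)
H = np.diag(gammas)                     # g(theta) = 0.5 * theta^T H theta, minimum at 0
alpha = 0.1

J = np.eye(2) - alpha * H               # Jacobian of the gradient descent map at theta_eq
eigs = np.abs(np.linalg.eigvals(J))     # these equal |1 - alpha * gamma_i|
print("asymptotically stable:", np.all(eigs < 1))   # true iff 0 < alpha * gamma_i < 2 for all i
```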
Definition: A Lyapunov function \(V:\mathcal S\to \mathbb R\) for \(F,s_{eq}\) is continuous and satisfies \(V(s_{eq})=0\), \(V(s)>0\) for \(s\neq s_{eq}\), and \(V(F(s)) - V(s)\leq 0\) for all \(s\) in a neighborhood of \(s_{eq}\)
Reference: Bof, Carli, Schenato, "Lyapunov Theory for Discrete Time Systems"
Theorem (1.2, 1.4): Suppose that \(F\) is locally Lipschitz, \(s_{eq}\) is a fixed point, and \(V\) is a Lyapunov function for \(F,s_{eq}\). Then, \(s_{eq}\) is stable; if additionally \(V(F(s)) - V(s)< 0\) for all \(s\neq s_{eq}\) near \(s_{eq}\), then \(s_{eq}\) is asymptotically stable
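A crude numerical sketch of checking the Lyapunov conditions on sampled states near \(s_{eq}=0\) (the map \(F\) and the candidate \(V\) below are illustrative choices, not from the lecture):

```python
import numpy as np

F = lambda s: np.array([0.5 * s[0] + 0.1 * s[1] ** 2, 0.8 * s[1]])   # illustrative map, s_eq = 0
V = lambda s: float(s @ s)                                            # candidate Lyapunov function

rng = np.random.default_rng(0)
samples = 0.1 * rng.normal(size=(1000, 2))                            # states near the equilibrium
positive = all(V(s) > 0 for s in samples)                             # V positive away from s_eq
decreasing = all(V(F(s)) - V(s) <= 0 for s in samples)                # V non-increasing along F
print("Lyapunov conditions hold on samples:", positive and decreasing)
```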
Theorem (3.3): Suppose \(F\) is locally Lipschitz, \(s_{eq}\) is a fixed point, and let \(\{\lambda_i\}_{i=1}^n\subset \mathbb C\) be the eigenvalues of the Jacobian \(J(s_{eq})\). Then \(s_{eq}\) is asymptotically stable if \(|\lambda_i|<1\) for all \(i\), and unstable if \(|\lambda_i|>1\) for some \(i\)
Next time: actions, disturbances, measurement
References: Bof, Carli, Schenato, "Lyapunov Theory for Discrete Time Systems"; Callier & Desoer, "Linear Systems Theory"
By Sarah Dean