# Dynamical systems' based neural networks

Davide Murari

davide.murari@ntnu.no

Theoretical and computational aspects of dynamical systems

HB60

What are neural networks?

They are compositions of parametric functions

$$\mathcal{N}(x) = f_{\theta_k}\circ ... \circ f_{\theta_1}(x)$$

ResNets

$$\Sigma(z) = [\sigma(z_1),...,\sigma(z_n)],$$

$$\sigma:\mathbb{R}\rightarrow\mathbb{R}$$

$$f_{\theta}(x) = x + B\Sigma(Ax+b),\quad \theta = (A,B,b)$$
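A minimal NumPy sketch of the residual layer and of the composition $$\mathcal{N} = f_{\theta_k}\circ ... \circ f_{\theta_1}$$. The choice of tanh for $$\sigma$$, the dimensions, and the random parameters are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def sigma(z):
    # Scalar activation applied entrywise; tanh is an assumed choice.
    return np.tanh(z)

def resnet_layer(x, A, B, b):
    # f_theta(x) = x + B Sigma(A x + b), with theta = (A, B, b)
    return x + B @ sigma(A @ x + b)

def resnet(x, thetas):
    # N(x) = f_{theta_k} o ... o f_{theta_1}(x)
    for A, B, b in thetas:
        x = resnet_layer(x, A, B, b)
    return x

rng = np.random.default_rng(0)
n, width, k = 3, 5, 4  # input dimension, hidden width, number of layers (all assumed)
thetas = [
    (rng.normal(size=(width, n)), rng.normal(size=(n, width)), rng.normal(size=width))
    for _ in range(k)
]
x0 = rng.normal(size=n)
y = resnet(x0, thetas)
```

Each layer keeps the dimension of its input, which is what allows the layer to be read as a perturbation of the identity.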

Neural networks motivated by dynamical systems

$$\mathcal{N}(x) = \Psi_{f_k}^{h_k}\circ ...\circ \Psi_{f_1}^{h_1}(x)$$

$$\dot{x}(t) = f(x(t),\theta(t))=:f_{s(t)}(x(t))$$

Where $$f_i(x) = f(x,\theta_i)$$

$$\theta(t)\equiv \theta_i,\quad t\in [t_i,t_{i+1})$$

[Figure: time grid $$t_0, t_1, t_2, \cdots, t_i, t_{i+1}, \cdots, t_M=T$$, with a brace labelled $$h_i$$]
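The composition of numerical flow maps can be sketched once a one-step method is fixed. The sketch below assumes explicit Euler for $$\Psi^{h}$$, the vector field $$f(x,\theta) = B\tanh(Ax+b)$$, and step sizes $$h_i = t_{i+1}-t_i$$; the slides do not prescribe any of these, so treat them as one possible instance.

```python
import numpy as np

def f(x, theta):
    # Assumed parametric vector field f(x, theta) = B tanh(A x + b)
    A, B, b = theta
    return B @ np.tanh(A @ x + b)

def euler_step(x, theta, h):
    # Psi^h_f(x) = x + h f(x, theta): one explicit Euler step (assumed integrator)
    return x + h * f(x, theta)

def network(x, thetas, hs, step=euler_step):
    # theta(t) is piecewise constant: theta_i is used on [t_i, t_{i+1}),
    # and h_i is taken to be t_{i+1} - t_i (an assumption about the figure).
    for theta, h in zip(thetas, hs):
        x = step(x, theta, h)
    return x

rng = np.random.default_rng(1)
n, M = 2, 10
thetas = [
    (rng.normal(size=(n, n)), rng.normal(size=(n, n)), rng.normal(size=n))
    for _ in range(M)
]
hs = np.full(M, 0.1)  # uniform steps, so T = t_M = 1.0
x0 = np.array([1.0, -0.5])
out = network(x0, thetas, hs)
```

Passing a different `step` function swaps the integrator without touching the rest of the network, which is the design freedom the following slides exploit.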


What if I want a network with a certain property?

GENERAL IDEA

1. Property $$\mathcal{P}$$
2. Family $$\mathcal{F}$$ of vector fields that satisfy $$\mathcal{P}$$
3. Integrator $$\Psi^h$$ that preserves $$\mathcal{P}$$

EXAMPLE

1. $$\mathcal{P}=$$ Volume preservation
2. $$X_{\theta}(x,v) = \begin{bmatrix} \Sigma(Av+a) \\ \Sigma(Bx+b) \end{bmatrix}$$, $$\mathcal{F}=\{X_{\theta}:\,\,\theta\in\mathcal{A}\}$$
3. $$x_{n+1}=x_n+h\Sigma(Av_n+a),\quad v_{n+1}=v_n+h\Sigma(Bx_{n+1}+b)$$
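The volume-preserving update in the example can be checked numerically: each half-step is a shear, so the Jacobian of the full step should have determinant 1. The sketch below assumes tanh for $$\sigma$$, equal dimensions for $$x$$ and $$v$$, and random parameters; the finite-difference Jacobian is only a diagnostic, not part of the method.

```python
import numpy as np

def sigma(z):
    return np.tanh(z)  # assumed activation

def vp_step(x, v, A, a, B, b, h):
    # x_{n+1} = x_n + h Sigma(A v_n + a)
    x_new = x + h * sigma(A @ v + a)
    # v_{n+1} = v_n + h Sigma(B x_{n+1} + b)
    v_new = v + h * sigma(B @ x_new + b)
    return x_new, v_new

def jacobian_fd(F, z, eps=1e-6):
    # Central finite-difference Jacobian of F at z
    n = z.size
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        J[:, j] = (F(z + e) - F(z - e)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
d, h = 2, 0.3
A, B = rng.normal(size=(d, d)), rng.normal(size=(d, d))
a, b = rng.normal(size=d), rng.normal(size=d)

def step_flat(z):
    # Same step acting on the stacked state z = (x, v)
    x_new, v_new = vp_step(z[:d], z[d:], A, a, B, b, h)
    return np.concatenate([x_new, v_new])

z0 = rng.normal(size=2 * d)
det_J = np.linalg.det(jacobian_fd(step_flat, z0))  # close to 1 up to finite-difference error
```

The determinant stays at 1 for any step size $$h$$ and any parameters, because the property comes from the structure of the update rather than from the accuracy of the integrator.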

# Mass-preserving networks

$$\dot{y} = \begin{bmatrix} 0 & -y_3y_1^2 & y_2y_3 \\ y_3y_1^2 & 0 & -\sin{y_1} \\ -y_2y_3 & \sin{y_1} & 0\end{bmatrix}\boldsymbol{1}$$

$$\dot{x}(t) = \{A_{\theta}(x(t))-A_{\theta}(x(t))^T\}\boldsymbol{1}$$

$$\mathrm{vec}(A_{\theta}(x)) = \Sigma(Ux+u)$$
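Since $$A_{\theta}-A_{\theta}^T$$ is skew-symmetric, $$\boldsymbol{1}^T(A_{\theta}-A_{\theta}^T)\boldsymbol{1}=0$$, so the sum of the components of $$x$$ is conserved along solutions. A small NumPy check of this, assuming tanh for $$\sigma$$ and random parameters; the explicit Euler step at the end is an assumed integrator, used here only because it preserves linear invariants.

```python
import numpy as np

def mass_preserving_field(x, U, u):
    # vec(A_theta(x)) = Sigma(U x + u), reshaped to an n x n matrix
    n = x.size
    A = np.tanh(U @ x + u).reshape(n, n)  # tanh is an assumed activation
    # x' = (A - A^T) 1: a skew-symmetric matrix applied to the vector of ones
    return (A - A.T) @ np.ones(n)

rng = np.random.default_rng(0)
n = 3
U, u = rng.normal(size=(n * n, n)), rng.normal(size=n * n)
x = rng.normal(size=n)

rhs = mass_preserving_field(x, U, u)
# 1^T (A - A^T) 1 = 0, so the total mass sum(x) has zero time derivative
mass_rate = rhs.sum()

# Explicit Euler (assumed here) preserves linear invariants such as sum(x)
h = 0.1
x_next = x + h * rhs
```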

Lipschitz-constrained networks

$$\Sigma(x) = \max\left\{x,\frac{x}{2}\right\}$$

[Figure: graph of the activation, with slopes labelled $$m=1$$ and $$m=\frac{1}{2}$$]

We consider orthogonal weight matrices


$$X_{\theta_i}(x) := - \nabla V_{\theta_i}(x) = -A_i^T\Sigma(A_ix+b_i)$$

$$\Psi^{h_C}_{X_{\theta_i}}(x) = x - {h_C}A_i^T\Sigma(A_ix+b_i)$$

$$\|\Psi^{h_C}_{X_{\theta_i}}(y) - \Psi^{h_C}_{X_{\theta_i}}(x)\|\leq \sqrt{1-{h_C}+{h_C}^2}\,\|y-x\|$$

$$Y_{\theta_i}(x) := \Sigma(W_ix + v_i)$$

$$\Psi^{h_E}_{Y_{\theta_i}}(x) = x + {h_E}\Sigma(W_ix+v_i)$$

$$\|\Psi^{h_E}_{Y_{\theta_i}}(y) - \Psi^{h_E}_{Y_{\theta_i}}(x)\|\leq (1+{h_E})\|y-x\|$$
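The two Lipschitz bounds can be probed empirically. The sketch below uses $$\Sigma(x)=\max\{x,x/2\}$$ and orthogonal $$A_i$$, $$W_i$$ as on the slides; the dimension, the step sizes, and sampling random pairs $$(x,y)$$ are assumptions made for illustration. Random sampling only gives a lower estimate of the true Lipschitz constant, so this is a sanity check rather than a proof.

```python
import numpy as np

def leaky_relu(z):
    # Sigma(x) = max{x, x/2}, applied entrywise
    return np.maximum(z, 0.5 * z)

def random_orthogonal(n, rng):
    # QR factorisation of a Gaussian matrix gives an orthogonal Q
    Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return Q

def contractive_step(x, A, b, h_C):
    # Psi^{h_C}_X(x) = x - h_C A^T Sigma(A x + b)
    return x - h_C * A.T @ leaky_relu(A @ x + b)

def expansive_step(x, W, v, h_E):
    # Psi^{h_E}_Y(x) = x + h_E Sigma(W x + v)
    return x + h_E * leaky_relu(W @ x + v)

rng = np.random.default_rng(0)
n, h_C, h_E = 4, 0.5, 0.1  # assumed dimension and step sizes
A, W = random_orthogonal(n, rng), random_orthogonal(n, rng)
b, v = rng.normal(size=n), rng.normal(size=n)

worst_C = worst_E = 0.0
for _ in range(1000):
    x, y = rng.normal(size=n), rng.normal(size=n)
    dist = np.linalg.norm(y - x)
    gap_C = contractive_step(y, A, b, h_C) - contractive_step(x, A, b, h_C)
    gap_E = expansive_step(y, W, v, h_E) - expansive_step(x, W, v, h_E)
    worst_C = max(worst_C, np.linalg.norm(gap_C) / dist)
    worst_E = max(worst_E, np.linalg.norm(gap_E) / dist)

bound_C = np.sqrt(1 - h_C + h_C**2)  # bound for the gradient step
bound_E = 1 + h_E                    # bound for the expansive step
```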


$$\mathcal{N}(x)=\Psi_{X_{\theta_{2k}}}^{h_{2k}} \circ \Psi_{Y_{\theta_{2k-1}}}^{h_{2k-1}} \circ ... \circ \Psi_{X_{\theta_2}}^{h_2} \circ \Psi_{Y_{\theta_{1}}}^{h_1}(x)$$

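We impose:

$$\|\mathcal{N}(x)-\mathcal{N}(y)\|\leq \|x-y\|$$

which is guaranteed if, in each pair of consecutive steps, the gradient step size $$h_C$$ and the expansive step size $$h_E$$ satisfy

$$\sqrt{1-{h_C}+{h_C}^2}(1+h_E)\leq 1$$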
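The step-size condition can be solved for the largest admissible $$h_E$$ given $$h_C$$. The helper names below are hypothetical, introduced only for this illustration.

```python
import numpy as np

def contraction_factor(h_C):
    # Lipschitz bound of the gradient step: sqrt(1 - h_C + h_C^2)
    return np.sqrt(1.0 - h_C + h_C**2)

def max_expansive_step(h_C):
    # Largest h_E with sqrt(1 - h_C + h_C^2) * (1 + h_E) <= 1
    return 1.0 / contraction_factor(h_C) - 1.0

h_C = 0.5                      # minimises 1 - h_C + h_C^2 over h_C
h_E = max_expansive_step(h_C)  # equals 2/sqrt(3) - 1, about 0.1547
product = contraction_factor(h_C) * (1.0 + h_E)
```

For $$h_C\in(0,1)$$ the contraction factor is below 1, which leaves room for a positive $$h_E$$; at $$h_C=1$$ the factor equals 1 and forces $$h_E=0$$.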

[Figure: an input $$X$$ with label Plane, and the perturbed input $$X+\delta$$, $$\|\delta\|_2=0.3$$, with label Cat]

# Thank you for your attention

$$f_i = \nabla U_i + X_S^i$$

$$U_i(x) = \int_0^1 x^Tf_i(tx)dt$$

Then $$F$$ can be approximated with flow maps of gradient and sphere-preserving vector fields.
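The decomposition can be checked numerically. With $$U(x)=\int_0^1 x^Tf(tx)dt$$ one has $$x^T\nabla U(x) = x^Tf(x)$$, so the remainder $$X_S = f-\nabla U$$ satisfies $$x^TX_S(x)=0$$, which is how "sphere-preserving" is read here. The vector field `f`, the quadrature rule, and the finite-difference gradient below are assumptions chosen for the illustration.

```python
import numpy as np

def f(x):
    # Hypothetical smooth vector field on R^3, chosen only for illustration
    return np.array([x[1] ** 2 + x[2], np.sin(x[0]), x[0] * x[1]])

# Gauss-Legendre nodes and weights on [-1, 1], mapped to [0, 1]
nodes, weights = np.polynomial.legendre.leggauss(20)
ts, ws = 0.5 * (nodes + 1.0), 0.5 * weights

def U(x):
    # U(x) = int_0^1 x^T f(t x) dt, approximated by quadrature
    return sum(w * (x @ f(t * x)) for t, w in zip(ts, ws))

def grad_fd(F, x, eps=1e-5):
    # Central finite-difference gradient of a scalar function
    g = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = eps
        g[j] = (F(x + e) - F(x - e)) / (2 * eps)
    return g

x = np.array([0.7, -0.4, 1.1])
X_S = f(x) - grad_fd(U, x)  # remainder after removing the gradient part
radial_component = x @ X_S  # vanishes up to numerical error: X_S is tangent to the sphere through x
```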