Deep learning mathematics

Institute for quantitative theory and methods (QTM)
Jeremy Jacobson
Lecturer

Image by Chris Benson



Deep neural networks (DNNs)
95% of neural network inference workload in Google datacenters
https://arxiv.org/ftp/arxiv/papers/1704/1704.04760.pdf
Table 1 appears on next slide

Neural networks
- Mathematical intuition
- Definition

Kolmogorov
Every continuous function of several variables defined on the unit cube can be represented as a superposition of continuous functions of one variable and the operation of addition (1957).
f(x_1,x_2, \ldots, x_n) = \sum\limits_{i=1}^{2n+1}f_i(\sum\limits_{j=1}^{n}\phi_{i,j}(x_j))


[Network diagram: the inputs x_1, x_2, \ldots, x_n feed the inner functions \phi_{i,j}; each sum \sum_j \phi_{i,j}(x_j) passes through an outer function f_i, and the outputs f_1, \ldots, f_{2n+1} are added to give f.]
f(x_1,x_2,\cdots,x_n) = \phi(\sum\limits_{i=1}^n w_i x_i+\theta)
\mathbb{R}^n \stackrel{f}{\rightarrow}\mathbb{R}^1
- one "hidden layer"
- one "node"
- "activation" \phi
- "threshold" \theta
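As a minimal sketch (the function names and the choice of sigmoid for \phi are illustrative assumptions, not from the slides), the single-node map above is a few lines of NumPy:

```python
import numpy as np

def sigmoid(z):
    # one common continuous choice for the "activation" phi
    return 1.0 / (1.0 + np.exp(-z))

def single_node(x, w, theta, phi=sigmoid):
    # f(x_1, ..., x_n) = phi(sum_i w_i * x_i + theta)
    return phi(np.dot(w, x) + theta)

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -0.25, 0.1])
y = single_node(x, w, theta=0.2)  # a single real number: R^n -> R^1
```

Here the weighted sum is 0.3, the threshold shifts it to 0.5, and the activation squashes the result into (0, 1).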
Definition of a feedforward neural network
f(x_1,x_2,\cdots,x_n) = \sum\limits_{i=1}^{2}W_i\phi_i(\sum\limits_{j=1}^n w_{i,j} x_j+\theta_i)
\mathbb{R}^n \stackrel{f}{\rightarrow}\mathbb{R}^1
- one "hidden layer"
- two "nodes"
Definition of a feedforward neural network
(vector notation)
f(\vec{x}) = \vec{W}^T\phi(\vec{w}^T\vec{x}+\vec{\theta})+\eta
\mathbb{R}^n \stackrel{f}{\rightarrow}\mathbb{R}^1
Here \vec{w} and \vec{\theta} collect the hidden-layer weights and thresholds, \vec{W} the output weights, and \eta the output bias.
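A minimal NumPy sketch of this one-hidden-layer map (the shapes and the tanh activation are illustrative assumptions; the slides leave \phi generic):

```python
import numpy as np

def feedforward(x, w, theta, W, eta, phi=np.tanh):
    # f(x) = W^T phi(w x + theta) + eta
    # w: (m, n) hidden-layer weights, theta: (m,) thresholds,
    # W: (m,) output weights, eta: scalar output bias
    hidden = phi(w @ x + theta)  # (m,) hidden activations
    return W @ hidden + eta      # scalar: R^n -> R^1

rng = np.random.default_rng(0)
n, m = 3, 4                      # input dimension, number of hidden nodes
x = rng.normal(size=n)
y = feedforward(x,
                w=rng.normal(size=(m, n)),
                theta=rng.normal(size=m),
                W=rng.normal(size=m),
                eta=0.0)
```

Stacking the per-node weights w_{i,j} into the matrix w is what turns the double sum of the previous slide into two matrix products.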

Neural network approach to counting real roots of polynomial systems

Mourrain, Pavlidis, Tasoulis, Vrahatis:
univariate polynomials of degree 2
ax^2+bx+c=0
x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}
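The class label (how many real roots a quadratic has) can be read off the sign of the discriminant; a small helper like the following (a hypothetical name, not code from the paper) is enough to label training data:

```python
def count_real_roots(a, b, c):
    # real roots of a*x^2 + b*x + c = 0, assuming a != 0:
    # the sign of the discriminant b^2 - 4ac decides
    disc = b * b - 4 * a * c
    if disc > 0:
        return 2   # two distinct real roots
    if disc == 0:
        return 1   # one repeated real root
    return 0       # complex conjugate pair, zero real roots
```

For example, count_real_roots(1, 0, -1) is 2 (x^2 - 1) while count_real_roots(1, 0, 1) is 0 (x^2 + 1); the two classes in the results below correspond to the generic cases 0 and 2.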

Reproducing results using Google's TensorFlow and ML workbench
- Google Cloud Platform Datalab (https://cloud.google.com/datalab/)
- TensorFlow
- the high-level ML Workbench framework:

import google.datalab.contrib.mlworkbench.commands
Our results:
| Class | Classification Accuracy |
|---|---|
| Class 1: Zero real roots | 99.17% |
| Class 2: Two real roots | 100% |
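The reproduction above runs through Datalab's ML Workbench. As a dependency-free sketch of the same classification task (the architecture, data size, and learning rate here are arbitrary assumptions, and this toy run is unrelated to the accuracies reported above), a one-hidden-layer network as defined earlier can be trained with plain NumPy gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic coefficients (a, b, c); class 1 = two real roots
X = rng.uniform(-1.0, 1.0, size=(4000, 3))
y = (X[:, 1] ** 2 - 4.0 * X[:, 0] * X[:, 2] > 0).astype(float)

n, m = 3, 16                           # inputs, hidden nodes
w = rng.normal(scale=0.5, size=(m, n))
theta = np.zeros(m)
W = rng.normal(scale=0.5, size=m)
eta = 0.0
lr = 0.5

for _ in range(2000):
    H = np.tanh(X @ w.T + theta)               # hidden activations, (N, m)
    p = 1.0 / (1.0 + np.exp(-(H @ W + eta)))   # predicted P(two real roots)
    g = (p - y) / len(y)                       # d(mean cross-entropy)/d(logit)
    gH = np.outer(g, W) * (1.0 - H ** 2)       # backprop through tanh
    W -= lr * (H.T @ g)
    eta -= lr * g.sum()
    w -= lr * (gH.T @ X)
    theta -= lr * gH.sum(axis=0)

accuracy = float(((p > 0.5) == y).mean())
```

Full-batch gradient descent is enough here because the decision boundary b^2 = 4ac is smooth and low-dimensional; the ML Workbench run in the slides handles the same task with a managed TensorFlow pipeline instead.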


Thank you!