Optimization
A very very short introduction
It all starts with a function
z = f(x, y)

What (x, y) values will minimize my function?
f(\bm{x})
What can you tell me about your function?
Convexity


Convexity guarantees that any local minimum is a global minimum
Differentiability
f^\prime(a) = \lim\limits_{h \rightarrow 0 } \frac{ f(a +h) - f(a) } {h}
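The limit above can be checked numerically with a forward difference; a minimal sketch (the function, point, and step size are my own illustrative choices):

```python
# Approximate f'(a) with the forward difference (f(a + h) - f(a)) / h,
# i.e. the definition above with a small but finite h.

def forward_difference(f, a, h=1e-6):
    return (f(a + h) - f(a)) / h

# Example: f(x) = x**2, so f'(3) = 6.
approx = forward_difference(lambda x: x ** 2, 3.0)
print(approx)  # close to 6
```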


The idea behind Gradient Descent

Follow the slope!
\nabla f = \left[\begin{matrix} \frac{ \partial f}{ \partial x} \\ \frac{ \partial f}{ \partial y} \end{matrix}\right]
Gradient descent algorithm
1. Start from some point x_0
2. Compute the gradient at the current point
3. Take a step in the direction opposite to the gradient
4. Go back to 2
\bm{x_{n+1}} = \bm{x_{n}} - { \mu} \nabla f( \bm{x_{n}} )
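The update rule can be sketched in a few lines; a toy example on f(x, y) = x² + y² (my own choice of function, starting point, and step size), with the gradient computed analytically:

```python
import numpy as np

def grad_f(x):
    return 2.0 * x          # gradient of ||x||^2

x = np.array([2.0, -1.5])   # starting point x_0
mu = 0.1                    # fixed step size
for _ in range(100):
    x = x - mu * grad_f(x)  # x_{n+1} = x_n - mu * grad f(x_n)

print(x)  # converges towards the minimum at (0, 0)
```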
How big of a step should I take?
\bm{x_{n+1}} = \bm{x_{n}} - {\color{red} \mu} \nabla f( \bm{x_{n}} )

- Fixed step: if \nabla f is L-Lipschitz, convergence is guaranteed for \mu < 1/L
- Line search: \min\limits_{\mu > 0} f( \bm{x} - \mu \nabla f( \bm{x} ) )
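An exact line search is rarely computed; in practice a backtracking rule is common. A sketch of backtracking with the Armijo sufficient-decrease condition (this specific rule and its constants are my additions, not from the slides):

```python
import numpy as np

def backtracking(f, grad, x, mu0=1.0, beta=0.5, c=1e-4):
    """Shrink mu until f decreases enough along -grad f(x) (Armijo rule)."""
    g = grad(x)
    mu = mu0
    while f(x - mu * g) > f(x) - c * mu * g @ g:
        mu *= beta          # step too big: shrink it
    return mu

# Toy example: f(x) = ||x||^2.
f = lambda x: float(x @ x)
grad = lambda x: 2.0 * x
mu = backtracking(f, grad, np.array([3.0, 4.0]))
```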
Let's have a look at scipy.optimize
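A minimal `scipy.optimize` sketch, minimizing the Rosenbrock function with BFGS (the test function and starting point are my own choices; BFGS is one of the quasi-Newton methods discussed below):

```python
import numpy as np
from scipy.optimize import minimize

def rosen(x):
    """Rosenbrock function, a classic non-convex test case; minimum at (1, 1)."""
    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

res = minimize(rosen, x0=[-1.0, 2.0], method="BFGS")
print(res.x)  # close to (1, 1)
```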

Newton's method

Newton and quasi-Newton
- Newton's update:
\bm{x_{n+1}} = \bm{x_{n}} - \left[ \nabla^2 f( \bm{x_{n}} ) \right]^{-1} \nabla f( \bm{x_{n}} )
\nabla^2 f = \left[\begin{matrix} \frac{ \partial^2 f}{ \partial x \partial x} & \frac{\partial^2 f}{ \partial x \partial y} \\ \frac{ \partial^2 f}{ \partial y \partial x} & \frac{ \partial^2 f}{ \partial y \partial y} \end{matrix}\right]
Computing and inverting the Hessian can be very costly; quasi-Newton methods work around it.
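A sketch of the Newton update on a 2-d quadratic (my own toy problem); note the Hessian system is solved with `np.linalg.solve` rather than forming an explicit inverse:

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # positive definite
b = np.array([1.0, -1.0])

def grad(x):
    return A @ x - b          # gradient of 0.5 x^T A x - b^T x

def hess(x):
    return A                  # Hessian is constant for a quadratic

x = np.array([5.0, 5.0])
for _ in range(5):
    x = x - np.linalg.solve(hess(x), grad(x))   # Newton update

print(x)  # for a quadratic, Newton converges in a single step
```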
Why is this useful?
Deep Neural Networks


Example loss function for regression:
L = \| y - f_w(x) \|^2
L = \sum\limits_{i=1}^{N} \left( y_i - f_w(x_i) \right)^2
Stochastic Gradient Descent
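SGD replaces the full-sum gradient with the gradient on a single random sample per step. A minimal sketch for linear regression (the data, step size, and iteration count are my own toy setup):

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([2.0, -3.0])
X = rng.normal(size=(200, 2))
y = X @ w_true + 0.01 * rng.normal(size=200)

w = np.zeros(2)
mu = 0.01
for _ in range(2000):
    i = rng.integers(len(X))                 # pick one random sample
    grad_i = 2.0 * (X[i] @ w - y[i]) * X[i]  # gradient of (y_i - w . x_i)^2
    w = w - mu * grad_i

print(w)  # close to w_true
```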

It's not all bad


Inverse Problems



Deconvolution
Inpainting
Denoising
y=Ax+n
A is non-invertible or ill-conditioned
Regularization
L = \| y - Ax \|^2 + R(x)



R(x) = \lambda \| x \|^2
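With a quadratic penalty the minimizer has a closed form (Tikhonov / ridge regularization): x = (AᵀA + λI)⁻¹ Aᵀy. A toy sketch with a well-behaved random A (my own illustrative setup):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 20))
x_true = rng.normal(size=20)
y = A @ x_true + 0.1 * rng.normal(size=50)

lam = 0.1
# Closed-form minimizer of ||y - A x||^2 + lam ||x||^2
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(20), A.T @ y)
print(np.linalg.norm(x_hat - x_true))  # small reconstruction error
```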
Check out the Numerical Tours
Another example



L = \| y - Ax \|^2 + R(x)
R(x) = \lambda \| \Phi x \|_1
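The ℓ₁ penalty is non-differentiable, so plain gradient descent no longer applies; a standard approach is proximal gradient descent (ISTA), which alternates a gradient step on ‖y − Ax‖² with soft-thresholding. A minimal sketch assuming Φ = I (a simplification; the sparse recovery problem below is my own toy example):

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(2)
A = rng.normal(size=(30, 60))            # underdetermined: 30 obs, 60 unknowns
x_true = np.zeros(60)
x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]   # sparse signal
y = A @ x_true

lam = 0.1
mu = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)   # step <= 1/L, L = 2 ||A||^2
x = np.zeros(60)
for _ in range(2000):
    # gradient step on ||y - A x||^2, then soft-threshold (ISTA)
    x = soft_threshold(x - mu * 2.0 * A.T @ (A @ x - y), mu * lam)

print(np.flatnonzero(np.abs(x) > 0.5))  # recovers the sparse support
```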

What I haven't talked about
- Constrained optimization
- Simulated annealing
- NP-Hard problems
- ...
By eiffl
Practical statistics series