Stochastic Truncated Newton with Conjugate Gradient

Luis Manuel Román García

ITAM, 2017

Presentation Overview

  1. Newton step

  2. Conjugate Gradient

  3. Stochastic Truncated Newton with Conjugate Gradient

  4. Conclusion

Newton Step

Motivation

Suppose we want to find the root of a twice-differentiable function f(x). Starting from an initial guess x_0, the iteration

x_1 = x_0 - \nabla_x f(x_0)^{-1}f(x_0)

produces successive approximations x_1, x_2, x_3, ... that approach the root.

[Figure: Newton iterates x_0, x_1, x_2, x_3 converging to the root of f(x)]
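As a quick illustration (a sketch of mine, not from the slides), here is the scalar form of this iteration in Python, assuming callables f and fprime for the function and its derivative:

# Minimal sketch of Newton's root-finding iteration for a scalar function f.
def newton_root(f, fprime, x0, tol=1e-10, max_iter=50):
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)   # x_{k+1} = x_k - f'(x_k)^{-1} f(x_k)
        x -= step
        if abs(step) < tol:
            break
    return x

# Example: the positive root of f(x) = x^2 - 2 is sqrt(2) ≈ 1.41421356.
print(newton_root(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0))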

Newton Step

Taking this idea one step further, we could look for the root not of the original function but of its gradient:

\nabla f(x)

In this case the iteration:

x_{k+1} = x_k - \nabla_x f(x_k)^{-1}f(x_k)

becomes:

x_{k+1} = x_k - \nabla_{xx}^{2}f(x_k)^{-1}\nabla f(x_k)
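A minimal sketch of this iteration in Python (mine, not from the slides), assuming callables grad and hess that return the gradient and Hessian of f:

import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-8, max_iter=50):
    """Newton's method for minimization: x_{k+1} = x_k - [Hess f(x_k)]^{-1} grad f(x_k)."""
    x = x0.astype(float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        # Solve the Newton system instead of inverting the Hessian explicitly.
        x = x - np.linalg.solve(hess(x), g)
    return x

# Example: f(x) = x_1^4 + x_2^2, whose minimizer is the origin.
grad = lambda x: np.array([4 * x[0]**3, 2 * x[1]])
hess = lambda x: np.array([[12 * x[0]**2, 0.0], [0.0, 2.0]])
print(newton_minimize(grad, hess, np.array([2.0, 3.0])))   # ≈ [0, 0]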

Newton Step

Advantages:

  • Quadratic convergence near the optimum              

 

Disadvantages:

  • It is necessary to compute and store the whole Hessian at every iteration. 

Conjugate Gradient

Motivation

Suppose we want to find the solution of the linear system

Ax = b

Options:

  1. Gaussian elimination (scaled partial pivoting): \approx \frac{n^3}{3}
  2. Gram-Schmidt procedure: \approx n^3
  3. Householder reduction: \approx \frac{2n^3}{3} (unconditionally stable)
  4. Givens reduction: \approx \frac{4n^3}{3} (unconditionally stable)

But what happens when n >> 0?

Motivation

HINT: If A is symmetric positive definite, solving

Ax = b

is equivalent to finding the minimum of:

\phi(x) = \frac{1}{2}x^tAx - b^tx
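A quick numeric check of this equivalence (my own illustration, not from the slides): the gradient of \phi is Ax - b, so the minimizer of \phi coincides with the solution of the linear system.

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)       # symmetric positive definite by construction
b = rng.standard_normal(5)

x_star = np.linalg.solve(A, b)    # solution of Ax = b
grad_phi = A @ x_star - b         # gradient of phi evaluated at x_star
print(np.allclose(grad_phi, 0))   # True: x_star is the (unique) minimizer of phi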

Motivation

If A were a diagonal matrix, the procedure would be straightforward:

Just find the minimum along each coordinate axis.

[Figure: coordinate-wise minimization steps x_0, x_1 on the level sets of \phi]

Motivation

Of course, in the real world A is rarely diagonal, so minimizing along the coordinate axes is no longer guaranteed to terminate in a finite number of steps.

Thankfully, since A is positive definite, there is a matrix S such that

A = SA'S^t

where A' is diagonal. Hence, we can instead minimize along the conjugate directions p_k (for instance, the columns of S).

Conjugate Gradient

r_0\gets Ax_0-b, \quad p_0\gets -r_0, \quad k\gets 0

While \quad r_k \neq 0:

\alpha_k \gets-\frac{r_k^tp_k}{p_k^tAp_k}

x_{k+1} \gets x_k + \alpha_kp_k

r_{k+1} \gets Ax_{k+1} - b

\beta_{k+1} \gets\frac{r_{k+1}^tAp_k}{p_k^tAp_k}

p_{k+1} \gets -r_{k+1} + \beta_{k+1}p_k

k \gets k+1
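A direct NumPy transcription of the pseudocode above (a sketch of mine, not part of the slides):

import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
    """Solve Ax = b for symmetric positive definite A with conjugate gradient."""
    x = x0.astype(float)
    r = A @ x - b                        # r_0 <- A x_0 - b
    p = -r                               # p_0 <- -r_0
    max_iter = max_iter if max_iter is not None else len(b)
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:      # while r_k != 0
            break
        Ap = A @ p
        alpha = -(r @ p) / (p @ Ap)      # alpha_k
        x = x + alpha * p                # x_{k+1}
        r = A @ x - b                    # r_{k+1}
        beta = (r @ Ap) / (p @ Ap)       # beta_{k+1}
        p = -r + beta * p                # p_{k+1}
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b, np.zeros(2)))   # ≈ [0.0909, 0.6364]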

Conjugate Gradient

A remarkable property of CG is that the number of iterations is bounded above by the number of distinct eigenvalues of A. Moreover, if A has eigenvalues

\lambda_1\leq \lambda_2\leq...\leq\lambda_n

Then: 

\|x_{k}-x^*\|_A\leq2\bigg(\frac{\sqrt{\frac{\lambda_n}{\lambda_1}}-1}{\sqrt{\frac{\lambda_n}{\lambda_1}}+1}\bigg)^k\|x_0-x^*\|_A
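A small experiment (my own, reusing the conjugate_gradient sketch above) illustrating the first property: if A has only three distinct eigenvalues, CG essentially reaches the exact solution in three iterations.

import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))   # random orthogonal matrix
eigs = np.repeat([1.0, 4.0, 9.0], [20, 20, 10])      # only 3 distinct eigenvalues
A = Q @ np.diag(eigs) @ Q.T
b = rng.standard_normal(50)

x = conjugate_gradient(A, b, np.zeros(50), max_iter=3)
print(np.linalg.norm(A @ x - b))   # residual near machine precision after 3 steps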

Stochastic Truncated Newton with Conjugate Gradient

 

Newton CG

Recall that the Newton step is given by the following formula: 

x_{k+1} = x_k - \nabla_{xx}^{2}f(x_k)^{-1}\nabla f(x_k)

which is equivalent to solving the linear system:

\nabla_{xx}^{2}f(x_k)\,p_k = -\nabla f(x_k), \qquad x_{k+1} = x_k + p_k

When close enough to the optimum, the Hessian is symmetric positive definite. Sound familiar?

Newton CG

Advantage:

 

- No need to store the whole Hessian

CG only needs the Hessian-vector products

\nabla_{xx}^{2}f(x_k)\,p

which can be approximated via central finite differences of the gradient with a complexity of roughly

O(n^2)

per Newton step, so the Hessian never has to be formed explicitly (a sketch follows below).

Disadvantage:

 

- Less precision in the descent direction
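To make the idea concrete, here is a rough Hessian-free sketch (mine, not the thesis implementation, written with the standard CG recurrences): CG touches the Hessian only through products with a vector p, and each product is approximated by a central finite difference of the gradient. grad is an assumed callable.

import numpy as np

def hess_vec_fd(grad, x, p, eps=1e-5):
    # Central finite difference: Hess f(x) p ≈ (grad f(x + eps p) - grad f(x - eps p)) / (2 eps)
    return (grad(x + eps * p) - grad(x - eps * p)) / (2 * eps)

def newton_cg_step(grad, x, cg_iters=10, tol=1e-8):
    """Approximately solve Hess f(x) d = -grad f(x) with CG, never forming the Hessian."""
    g = grad(x)
    d = np.zeros_like(x)
    r = g.copy()                 # residual of Hess f(x) d + grad f(x) at d = 0
    p = -r
    for _ in range(cg_iters):
        if np.linalg.norm(r) < tol:
            break
        Hp = hess_vec_fd(grad, x, p)
        alpha = (r @ r) / (p @ Hp)
        d = d + alpha * p
        r_new = r + alpha * Hp
        p = -r_new + (r_new @ r_new) / (r @ r) * p
        r = r_new
    return d

# Example on the quadratic f(x) = 0.5 x^T A x - b^T x, whose gradient is Ax - b:
A = np.array([[4.0, 1.0], [1.0, 3.0]]); b = np.array([1.0, 2.0])
grad = lambda x: A @ x - b
x = np.zeros(2)
print(x + newton_cg_step(grad, x))   # one Newton-CG step ≈ A^{-1} b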

Newton CG

Now, if we add a stochastic element and use sampling instead of the whole data set, much of the per-iteration cost can be reduced, although the number of iterations may grow. This introduces several hyper-parameters (a rough sketch of one such step follows the list below):

 

- The size of the sample (both for the Hessian and the gradient)

- The number of CG iterations

- The size of the step
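Below is a rough sketch of one such stochastic step (my own illustration, not the thesis code), written for a least-squares loss; the batch sizes, number of CG iterations, and step size play the role of the hyper-parameters listed above.

import numpy as np

def stochastic_newton_cg_step(Adata, y, x, grad_batch=500, hess_batch=200,
                              cg_iters=5, step_size=1.0, rng=None):
    """One stochastic truncated Newton-CG step for f(x) = (1/N) sum_i 0.5 (a_i^T x - y_i)^2."""
    rng = rng if rng is not None else np.random.default_rng()
    n = len(y)
    # Subsampled gradient on one mini-batch ...
    ig = rng.choice(n, size=min(grad_batch, n), replace=False)
    Ag, yg = Adata[ig], y[ig]
    g = Ag.T @ (Ag @ x - yg) / len(ig)
    # ... and subsampled Hessian-vector products on a (smaller) independent batch.
    ih = rng.choice(n, size=min(hess_batch, n), replace=False)
    Ah = Adata[ih]
    hvp = lambda p: Ah.T @ (Ah @ p) / len(ih)
    # Truncated CG on the sampled Newton system H d = -g.
    d, r, p = np.zeros_like(x), g.copy(), -g.copy()
    for _ in range(cg_iters):
        if np.linalg.norm(r) < 1e-12:
            break
        Hp = hvp(p)
        alpha = (r @ r) / (p @ Hp)
        d = d + alpha * p
        r_new = r + alpha * Hp
        p = -r_new + (r_new @ r_new) / (r @ r) * p
        r = r_new
    return x + step_size * d

# Toy usage on noiseless synthetic least squares; the error typically shrinks per step.
rng = np.random.default_rng(0)
Adata = rng.standard_normal((2000, 10)); x_true = rng.standard_normal(10)
y = Adata @ x_true
x = np.zeros(10)
for _ in range(20):
    x = stochastic_newton_cg_step(Adata, y, x, rng=rng)
print(np.linalg.norm(x - x_true))   # should be small, up to sampling noise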

 

Conclusion
