# Recap

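To compute $$A^{k}\mathbf{v} = S\Lambda^{k}S^{-1}\mathbf{v}$$ using the eigenbasis:

$$[\mathbf{v}]_S = S^{-1}\mathbf{v}$$ translates $$\mathbf{v}$$ to the eigenbasis: $$O(n^2)$$

$$[A^k\mathbf{v}]_S = \Lambda^k[\mathbf{v}]_S$$ applies the diagonal matrix $$\Lambda^k$$: $$O(nk)$$

$$A^k\mathbf{v} = S[A^k\mathbf{v}]_S$$ translates back to the standard basis: $$O(n^2)$$

Total: $$O(n^2 + nk + n^2)$$ instead of $$O(kn^3)$$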

### EVD/Diagonalization/Eigenbasis is useful when the same matrix $$A$$ operates on many vectors repeatedly (i.e., if we want to apply $$A^k$$ to many vectors)

(the one-time cost of computing the EVD is then justified in the long run)

Computing $$A^k\mathbf{v}$$ for a vector $$\mathbf{v}$$ by first forming $$A^k$$ costs $$O(kn^3)$$.

(diagonalisation leads to computational efficiency)
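
The cost comparison above can be checked numerically. The following is a minimal NumPy sketch, not a benchmark: the matrix `A` is built from a chosen `S` and chosen eigenvalues `lam` (both are illustrative assumptions), and the helper names `power_apply_direct` and `power_apply_eig` are hypothetical.

```python
import numpy as np

def power_apply_direct(A, v, k):
    """Form A^k with k matrix-matrix products (O(k n^3)), then apply it to v."""
    Ak = np.eye(A.shape[0])
    for _ in range(k):
        Ak = Ak @ A
    return Ak @ v

def power_apply_eig(S, lam, S_inv, v, k):
    """Compute S Lambda^k S^{-1} v from a precomputed eigendecomposition."""
    coords = S_inv @ v            # [v]_S, a matrix-vector product: O(n^2)
    coords = (lam ** k) * coords  # Lambda^k acts entrywise on the coordinates
    return S @ coords             # back to the standard basis: O(n^2)

rng = np.random.default_rng(0)
n, k = 5, 8

# Assumption for this sketch: a well-conditioned S and hand-picked eigenvalues.
S = rng.standard_normal((n, n)) + n * np.eye(n)
lam = np.array([1.1, 0.9, 0.5, -0.3, 0.2])
S_inv = np.linalg.inv(S)          # one-time cost, reused for every vector
A = S @ np.diag(lam) @ S_inv

v = rng.standard_normal(n)
direct = power_apply_direct(A, v, k)
via_eig = power_apply_eig(S, lam, S_inv, v, k)
print(np.allclose(direct, via_eig))
```

Once `S`, `lam` and `S_inv` are available, each additional vector only costs the three cheap steps inside `power_apply_eig`.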

### Even better for symmetric matrices

$$A = Q\Lambda Q^\top$$ (the eigenvectors form an orthonormal basis)
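
As a quick illustration of the symmetric case, the sketch below uses `np.linalg.eigh` (NumPy's routine for symmetric matrices) on a random symmetric matrix; the matrix itself is an arbitrary example, not one from the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                 # symmetric by construction

lam, Q = np.linalg.eigh(A)        # eigenvalues and orthonormal eigenvectors

# Q has orthonormal columns, so Q^T plays the role of S^{-1}.
orthonormal = np.allclose(Q.T @ Q, np.eye(4))
reconstructs = np.allclose(Q @ np.diag(lam) @ Q.T, A)
print(orthonormal, reconstructs)
```

No matrix inverse is needed here: translating to the eigenbasis is just a multiplication by `Q.T`.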

# Wishlist

### Can we diagonalise rectangular matrices?

$$\underbrace{A}_{m\times n}\underbrace{\mathbf{x}}_{n \times 1} = \underbrace{U}_{m\times m}~\underbrace{\Sigma}_{m\times n}~\underbrace{V^\top}_{n\times n}\underbrace{\mathbf{x}}_{n \times 1}$$

### Translate back to the standard basis

$$\Sigma$$ is diagonal (all off-diagonal elements are 0), and the columns of $$U$$ and of $$V$$ are orthonormal.

### Recap: square matrices

$$A = S\Lambda S^{-1}$$

$$A = Q\Lambda Q^\top$$ (symmetric)

### Yes, we can!

(true for all matrices)
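
To make the shapes concrete, here is a small sketch using `np.linalg.svd` on a random $$5 \times 3$$ matrix (the sizes and the random data are assumptions chosen for illustration). NumPy returns the singular values as a vector, so the rectangular `Sigma` is assembled by hand.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 3
A = rng.standard_normal((m, n))

# full_matrices=True gives U of shape (m, m) and V^T of shape (n, n).
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Place the singular values on the diagonal of an m x n matrix.
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

x = rng.standard_normal(n)
print(U.shape, Sigma.shape, Vt.shape)
print(np.allclose(A @ x, U @ (Sigma @ (Vt @ x))))
```

Reading the product right to left: `Vt @ x` changes basis, `Sigma` scales, and `U` translates back to the standard basis.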

# The 4 fundamental subspaces: basis

### $$\mathbf{v_1, v_2, \dots, v_r, v_{r+1}, \dots, v_n}$$ are orthonormal

(same argument)

### $$\mathbf{v_1, v_2, \dots, v_r, v_{r+1}, \dots, v_n}$$ are orthonormal

$$A\mathbf{v_i} = \sigma_i\mathbf{u_i}~~\forall i\leq r$$

$$\therefore A \begin{bmatrix} \uparrow & & \uparrow \\ \mathbf{v}_1 & \dots & \mathbf{v}_r \\ \downarrow & & \downarrow \end{bmatrix} = \begin{bmatrix} \uparrow & & \uparrow \\ \mathbf{u}_1 & \dots & \mathbf{u}_r \\ \downarrow & & \downarrow \end{bmatrix} \begin{bmatrix} \sigma_1 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \sigma_r \end{bmatrix}$$

# Finding $$U$$ and $$V$$

$$\underbrace{A}_{m\times n}~\underbrace{V_r}_{n \times r} = \underbrace{U_r}_{m \times r}~\underbrace{\Sigma}_{r \times r}$$

(we don't know what such $$V$$ and $$U$$ are; we are just hoping that they exist)

$$\therefore A \begin{bmatrix} \uparrow & & \uparrow & \uparrow & & \uparrow \\ \mathbf{v}_1 & \dots & \mathbf{v}_r & \mathbf{v}_{r+1} & \dots & \mathbf{v}_n \\ \downarrow & & \downarrow & \downarrow & & \downarrow \end{bmatrix} = \begin{bmatrix} \uparrow & & \uparrow & \uparrow & & \uparrow \\ \mathbf{u}_1 & \dots & \mathbf{u}_r & \mathbf{u}_{r+1} & \dots & \mathbf{u}_m \\ \downarrow & & \downarrow & \downarrow & & \downarrow \end{bmatrix} \begin{bmatrix} \sigma_1 & \dots & 0 & 0 & \dots & 0 \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \dots & \sigma_r & 0 & \dots & 0 \\ 0 & \dots & 0 & 0 & \dots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \dots & 0 & 0 & \dots & 0 \end{bmatrix}$$

### If $$V_r$$ and $$U_r$$ exist then

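$$\mathbf{v}_{r+1}, \dots, \mathbf{v}_n$$ lie in the null space of $$A$$, so the first $$r$$ columns of the product $$AV$$ will be $$AV_r$$ and the last $$n-r$$ columns will be 0. To match, $$\Sigma$$ gets $$n-r$$ zero columns and $$m-r$$ zero rows:

$$\underbrace{A}_{m\times n}~\underbrace{V}_{n \times n} = \underbrace{U}_{m \times m}~\underbrace{\Sigma}_{m \times n}$$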

### $$V$$ and $$U$$ also exist

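The last $$m-r$$ columns of $$U$$ will not contribute (they multiply the zero rows of $$\Sigma$$), and hence the first $$r$$ columns of $$U\Sigma$$ will be the same as $$U_r \Sigma$$ (with the $$r \times r$$ block of $$\Sigma$$) and the last $$n-r$$ columns will be 0. This matches $$AV$$, whose first $$r$$ columns are $$AV_r$$.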
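
The column-by-column argument can be checked on a rank-deficient example. In this sketch the rank-$$r$$ matrix is built as a product of two random factors (an assumption made only so that $$r$$ is known in advance), and `Sigma_r` is a hypothetical name for the $$r \times r$$ block of singular values.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 4, 5, 2
# Product of an (m x r) and an (r x n) matrix: rank r by construction.
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

U, s, Vt = np.linalg.svd(A)       # full U (m x m) and V^T (n x n)
V = Vt.T
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

AV = A @ V
U_r, V_r, Sigma_r = U[:, :r], V[:, :r], np.diag(s[:r])

print(np.allclose(AV, U @ Sigma))             # A V = U Sigma
print(np.allclose(AV[:, :r], U_r @ Sigma_r))  # first r columns: A V_r = U_r Sigma_r
print(np.allclose(AV[:, r:], 0))              # last n - r columns vanish
```

The singular values beyond the first `r` come out as tiny floating-point numbers rather than exact zeros, which is why the checks use `np.allclose`.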

# Finding $$U$$ and $$V$$

$$AV=U\Sigma$$

$$A=U\Sigma V^\top$$

$$A^\top A=(U\Sigma V^\top)^\top U\Sigma V^\top$$

$$A^\top A=V\Sigma^\top U^\top U\Sigma V^\top$$

$$A^\top A=V\Sigma^\top\Sigma V^\top$$

Here $$\Sigma^\top\Sigma$$ is diagonal, and $$V$$ and $$V^\top$$ are orthogonal.

### $$V$$ is thus the matrix of the $$n$$ eigenvectors of $$A^\top A$$

We know that such a $$V$$ always exists because $$A^\top A$$ is a symmetric matrix.

$$AV=U\Sigma$$

$$A=U\Sigma V^\top$$

$$AA^\top =U\Sigma V^\top(U\Sigma V^\top)^\top$$

$$AA^\top =U\Sigma V^\top V\Sigma^\top U^\top$$

$$AA^\top=U\Sigma\Sigma^\top U^\top$$

Here $$\Sigma\Sigma^\top$$ is diagonal, and $$U$$ and $$U^\top$$ are orthogonal.

### $$U$$ is thus the matrix of the $$m$$ eigenvectors of $$AA^\top$$

We know that such a $$U$$ always exists because $$AA^\top$$ is a symmetric matrix.

### $$\Sigma^\top\Sigma$$ contains the eigenvalues of $$A^\top A$$

HW5: Prove that the nonzero eigenvalues of $$AA^\top$$ and $$A^\top A$$ are always equal.
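
A numerical check is not a proof, but it can make the claim concrete before attempting the homework. The sketch below compares the eigenvalues of $$A^\top A$$ and $$AA^\top$$ with the squared singular values from `np.linalg.svd`; the random $$3 \times 5$$ matrix is an arbitrary example.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 3, 5
A = rng.standard_normal((m, n))

s = np.linalg.svd(A, compute_uv=False)        # singular values, descending
eig_AtA = np.linalg.eigvalsh(A.T @ A)[::-1]   # n eigenvalues, reordered to descending
eig_AAt = np.linalg.eigvalsh(A @ A.T)[::-1]   # m eigenvalues, reordered to descending

k = min(m, n)
print(np.allclose(eig_AtA[:k], s ** 2))   # leading eigenvalues of A^T A are sigma_i^2
print(np.allclose(eig_AAt[:k], s ** 2))   # eigenvalues of A A^T are the same sigma_i^2
print(np.allclose(eig_AtA[k:], 0))        # the extra n - m eigenvalues of A^T A are 0
```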

# Finding $$U$$ and $$V$$

$$\underbrace{A}_{m\times n} = \underbrace{U}_{m\times m}~\underbrace{\Sigma}_{m\times n}~\underbrace{V^\top}_{n\times n}$$

$$U$$: eigenvectors of $$AA^\top$$

$$V^\top$$: transpose of the eigenvectors of $$A^\top A$$

$$\Sigma$$: square roots of the eigenvalues of $$A^\top A$$ or $$AA^\top$$

### Because $$U$$ and $$V$$ always exist, the SVD of any matrix $$A$$ is always possible

(since their columns are eigenvectors of a symmetric matrix)
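
The recipe can be tried by hand: take $$V$$ and the $$\sigma_i$$ from the eigendecomposition of $$A^\top A$$, then recover each $$\mathbf{u}_i$$ from $$A\mathbf{v}_i = \sigma_i\mathbf{u}_i$$. This is a sketch under the assumption that $$A$$ has full column rank (so every $$\sigma_i > 0$$); it produces only the first $$r = n$$ columns of $$U$$, and it is not how library SVD routines are implemented.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 4, 3
A = rng.standard_normal((m, n))   # assumed full column rank, so r = n

# Eigenvectors of the symmetric matrix A^T A give V.
lam, V = np.linalg.eigh(A.T @ A)  # eigh returns ascending eigenvalues
order = np.argsort(lam)[::-1]     # reorder to descending
lam, V = lam[order], V[:, order]

sigma = np.sqrt(lam)              # singular values
U_r = (A @ V) / sigma             # column i is A v_i / sigma_i

print(np.allclose(U_r.T @ U_r, np.eye(n)))          # the u_i are orthonormal
print(np.allclose(U_r @ np.diag(sigma) @ V.T, A))   # A = U_r Sigma V^T
print(np.allclose(sigma, np.linalg.svd(A, compute_uv=False)))
```

Deriving the $$\mathbf{u}_i$$ from the $$\mathbf{v}_i$$, instead of running a second independent eigendecomposition of $$AA^\top$$, keeps the signs of the two sets of vectors consistent.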

# Some questions

$$\therefore A \begin{bmatrix} \uparrow & & \uparrow & \uparrow & & \uparrow \\ \mathbf{v}_1 & \dots & \mathbf{v}_r & \mathbf{v}_{r+1} & \dots & \mathbf{v}_n \\ \downarrow & & \downarrow & \downarrow & & \downarrow \end{bmatrix} = \begin{bmatrix} \uparrow & & \uparrow & \uparrow & & \uparrow \\ \mathbf{u}_1 & \dots & \mathbf{u}_r & \mathbf{u}_{r+1} & \dots & \mathbf{u}_m \\ \downarrow & & \downarrow & \downarrow & & \downarrow \end{bmatrix} \begin{bmatrix} \sigma_1 & \dots & 0 & 0 & \dots & 0 \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \dots & \sigma_r & 0 & \dots & 0 \\ 0 & \dots & 0 & 0 & \dots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \dots & 0 & 0 & \dots & 0 \end{bmatrix}$$

### How do we know that these form a basis for the rowspace of A?

So far we only know that $$\mathbf{u}_1, \dots, \mathbf{u}_m$$ are the eigenvectors of $$AA^\top$$ and that $$\mathbf{v}_1, \dots, \mathbf{v}_n$$ are the eigenvectors of $$A^\top A$$.
Please work this out! You really need to see this on your own! HW5

# Why do we care about SVD?

$$A=U\Sigma V^\top$$

$$\therefore A= \begin{bmatrix} \uparrow & & \uparrow & & \uparrow \\ \mathbf{u}_1 & \dots & \mathbf{u}_r & \dots & \mathbf{u}_m \\ \downarrow & & \downarrow & & \downarrow \end{bmatrix} \begin{bmatrix} \sigma_1 & \dots & 0 & 0 & \dots & 0 \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \dots & \sigma_r & 0 & \dots & 0 \\ 0 & \dots & 0 & 0 & \dots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \dots & 0 & 0 & \dots & 0 \end{bmatrix} \begin{bmatrix} \leftarrow & \mathbf{v}_{1}^\top & \rightarrow \\ & \vdots & \\ \leftarrow & \mathbf{v}_{r}^\top & \rightarrow \\ & \vdots & \\ \leftarrow & \mathbf{v}_{n}^\top & \rightarrow \end{bmatrix}$$

$$\therefore A= \begin{bmatrix} \uparrow & & \uparrow & \uparrow & & \uparrow \\ \sigma_1\mathbf{u}_1 & \dots & \sigma_r\mathbf{u}_r & \mathbf{0} & \dots & \mathbf{0} \\ \downarrow & & \downarrow & \downarrow & & \downarrow \end{bmatrix} \begin{bmatrix} \leftarrow & \mathbf{v}_{1}^\top & \rightarrow \\ & \vdots & \\ \leftarrow & \mathbf{v}_{r}^\top & \rightarrow \\ & \vdots & \\ \leftarrow & \mathbf{v}_{n}^\top & \rightarrow \end{bmatrix}$$ (the last $$n-r$$ columns of the first matrix are 0)

$$\therefore A=\sigma_1\mathbf{u_1}\mathbf{v_1}^\top+\sigma_2\mathbf{u_2}\mathbf{v_2}^\top+\cdots+\sigma_r\mathbf{u_r}\mathbf{v_r}^\top$$

We can sort the terms of this sum according to their $$\sigma_i$$, with the largest $$\sigma$$ first and the smallest $$\sigma$$ last.
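
The sum of rank-one terms can be rebuilt explicitly. In this sketch (random $$4 \times 6$$ matrix, chosen only for illustration) each term is formed with `np.outer`; `np.linalg.svd` already returns the singular values in descending order, so the terms arrive sorted.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 6))

U, s, Vt = np.linalg.svd(A)

# Accumulate sigma_i * u_i * v_i^T, one rank-one matrix at a time.
A_sum = np.zeros_like(A)
for i in range(len(s)):
    A_sum += s[i] * np.outer(U[:, i], Vt[i, :])

print(np.allclose(A_sum, A))
print(np.all(np.diff(s) <= 0))    # singular values are non-increasing
```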

# Best rank-k approximation

Frobenius norm: $$||A||_F = \sqrt{\sum_{i=1}^m\sum_{j=1}^n |A_{ij}|^2}$$

$$A=\sigma_1\mathbf{u_1}\mathbf{v_1}^\top+\sigma_2\mathbf{u_2}\mathbf{v_2}^\top+\cdots+\sigma_k\mathbf{u_k}\mathbf{v_k}^\top+\cdots+\sigma_r\mathbf{u_r}\mathbf{v_r}^\top$$

$$\hat{A}_k=\sigma_1\mathbf{u_1}\mathbf{v_1}^\top+\sigma_2\mathbf{u_2}\mathbf{v_2}^\top+\cdots+\sigma_k\mathbf{u_k}\mathbf{v_k}^\top$$

$$\hat{A}_k$$ is a rank-$$k$$ approximation of $$A$$: we dropped the last $$r - k$$ terms.

### i.e. $$||A - \hat{A}_k||_F$$ is minimum when

$$\hat{A}_k=U_k\Sigma_kV_k^\top$$

(we will not prove this)
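
The truncation can be sketched in a few lines of NumPy. The helper name `rank_k_approx` is hypothetical, and the comparison against one random rank-$$k$$ matrix `B` only illustrates the claim; it does not prove optimality.

```python
import numpy as np

def rank_k_approx(A, k):
    """Keep the k largest singular values and their singular vectors."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(7)
A = rng.standard_normal((6, 5))
k = 2
A_k = rank_k_approx(A, k)

s = np.linalg.svd(A, compute_uv=False)
err = np.linalg.norm(A - A_k, ord="fro")

print(np.linalg.matrix_rank(A_k))
# The error equals the root of the sum of squares of the dropped singular values.
print(np.isclose(err, np.sqrt(np.sum(s[k:] ** 2))))

# Any other rank-k matrix, for example a random one, is no closer to A.
B = rng.standard_normal((6, k)) @ rng.standard_normal((k, 5))
print(err <= np.linalg.norm(A - B, ord="fro"))
```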