# Day 27: Proof of the Singular Value Decomposition

Theorem (Singular Value Decomposition - part 1). If $$B$$ is an $$m\times n$$ matrix and $$p=\min\{m,n\}$$, then there are orthonormal bases $$\{u_{1},\ldots,u_{m}\}$$ for $$\mathbb{R}^{m}$$ and $$\{v_{1},\ldots,v_{n}\}$$ for $$\mathbb{R}^{n}$$, and nonnegative scalars $$\sigma_{1},\ldots,\sigma_{p}$$ such that

$Bv_{i} = \begin{cases} \sigma_{i}u_{i}, & i\leq p,\\ 0, & i>p.\end{cases}$

Notes:

• The numbers $$\sigma_{1},\sigma_{2},\ldots,\sigma_{p}$$ are called the singular values of $$B$$.
• The sequence of singular values is denoted with the symbol $$\sigma(B)$$.
• The vectors $$v_{1},v_{2},\ldots,v_{n}$$ are called the right singular vectors of $$B$$.
• The vectors $$u_{1},u_{2},\ldots,u_{m}$$ are called the left singular vectors of $$B$$.
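The statement of the theorem can be checked numerically. Below is a minimal sketch using NumPy's `np.linalg.svd`; the matrix $$B$$ here is an arbitrary illustrative choice, not one of the examples below.

```python
import numpy as np

# Numerical illustration of the theorem; B is an arbitrary 3 x 5 matrix
# chosen for this sketch (not one of the examples below).
B = np.array([[1., 2., 0., 1., 3.],
              [0., 1., 4., 2., 1.],
              [2., 0., 1., 1., 0.]])
m, n = B.shape
p = min(m, n)

# full_matrices=True returns the complete orthonormal bases:
# U is m x m, Vt is n x n, and s holds the p singular values.
U, s, Vt = np.linalg.svd(B, full_matrices=True)
V = Vt.T

for i in range(p):                    # B v_i = sigma_i u_i for i <= p
    assert np.allclose(B @ V[:, i], s[i] * U[:, i])
for i in range(p, n):                 # B v_i = 0 for i > p
    assert np.allclose(B @ V[:, i], 0)
```

The columns of `U` and `V` are the left and right singular vectors, and NumPy returns the singular values in nonincreasing order.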

Proof. Since $$B^{\top}B$$ is symmetric and positive semidefinite, there is an orthonormal basis $$v_{1},\ldots,v_{n}$$ of eigenvectors of $$B^{\top}B$$ with associated eigenvalues $$\lambda_{1}\geq \lambda_{2}\geq\cdots\geq \lambda_{n}\geq 0.$$

Let $$r$$ be the rank of $$B$$, which equals the rank of $$B^{\top}B$$, so that $$\lambda_{r}>0$$ and $$\lambda_{i}=0$$ for all $$i>r$$.

For each $$i\in\{1,\ldots,r\}$$ define $u_{i} = \frac{Bv_{i}}{\sqrt{\lambda_{i}}}.$ Note that $u_{i}\cdot u_{j} = \frac{1}{\sqrt{\lambda_{i}\lambda_{j}}}(Bv_{i})^{\top}(Bv_{j}) = \frac{1}{\sqrt{\lambda_{i}\lambda_{j}}}v_{i}^{\top}B^{\top}Bv_{j} = \frac{1}{\sqrt{\lambda_{i}\lambda_{j}}}v_{i}^{\top}\lambda_{j}v_{j}$

Proof continued. $u_{i}\cdot u_{j} = \frac{1}{\sqrt{\lambda_{i}\lambda_{j}}}v_{i}^{\top}\lambda_{j}v_{j} = \frac{\lambda_{j}}{\sqrt{\lambda_{i}\lambda_{j}}} v_{i}\cdot v_{j} = \begin{cases} 1 & i=j,\\ 0 & i\neq j\end{cases}$

From this we see that $$\{u_{1},\ldots,u_{r}\}$$ is an orthonormal set. Next, note that $BB^{\top} u_{i} = BB^{\top}\frac{Bv_{i}}{\sqrt{\lambda_{i}}} = \frac{1}{\sqrt{\lambda_{i}}}BB^{\top}Bv_{i} = \frac{B\lambda_{i}v_{i}}{\sqrt{\lambda_{i}}} = \lambda_{i}u_{i}.$

Since $$r=\operatorname{rank}(B) = \operatorname{rank}(BB^{\top})$$, we see that $$\{u_{1},\ldots,u_{r}\}$$ is an orthonormal basis for $$C(BB^{\top})$$. We can complete this to an orthonormal basis $$\{u_{1},\ldots,u_{m}\}$$ for $$\mathbb{R}^{m}$$. By the definition of $$u_{i}$$, for $$i=1,\ldots,r$$ we have

$Bv_{i} = \sqrt{\lambda_{i}}u_{i}.$

For $$i>r$$ we have $$\|Bv_{i}\|^{2} = v_{i}^{\top}B^{\top}Bv_{i} = \lambda_{i} = 0$$, and hence $$Bv_{i}=0$$. Setting $$\sigma_{i} = \sqrt{\lambda_{i}}$$ for $$i=1,\ldots,p$$ completes the proof. $$\Box$$
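The construction in the proof translates directly into code. Here is a sketch, assuming NumPy, applied to the matrix $$B$$ of the example that follows:

```python
import numpy as np

# Sketch of the construction in the proof, assuming NumPy; applied to the
# matrix B of the example that follows.
B = np.array([[7., 3., 7., 3.],
              [3., 7., 3., 7.]])

# Orthonormal eigenvectors of the symmetric matrix B^T B.
lam, V = np.linalg.eigh(B.T @ B)      # eigenvalues in ascending order
order = np.argsort(lam)[::-1]         # reorder: lambda_1 >= lambda_2 >= ...
lam, V = lam[order], V[:, order]

r = int(np.sum(lam > 1e-10))          # r = rank(B)
# u_i = B v_i / sqrt(lambda_i) for i = 1, ..., r.
U = np.column_stack([B @ V[:, i] / np.sqrt(lam[i]) for i in range(r)])
sigma = np.sqrt(np.maximum(lam[:min(B.shape)], 0.0))

assert np.allclose(U.T @ U, np.eye(r))            # {u_i} is orthonormal
for i in range(r):
    assert np.allclose(B @ V[:, i], sigma[i] * U[:, i])
```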

Example. Consider the matrix

$B = \begin{bmatrix} 7 & 3 & 7 & 3\\ 3 & 7 & 3 & 7\end{bmatrix}$

We compute

$B^{\top}B = \begin{bmatrix} 58 & 42 & 58 & 42\\ 42 & 58 & 42 & 58\\ 58 & 42 & 58 & 42\\ 42 & 58 & 42 & 58\end{bmatrix}$

Since this matrix is symmetric and positive semidefinite (not definite), it has a spectral decomposition with nonnegative eigenvalues:

$B^{\top}B = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1\\ 1 & 1 & -1 & -1\\ 1 & -1 & -1 & 1\end{bmatrix}\begin{bmatrix} 200 & 0 & 0 & 0\\ 0 & 32 & 0 & 0 \\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\end{bmatrix}\frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1\\ 1 & 1 & -1 & -1\\ 1 & -1 & -1 & 1\end{bmatrix}$

Note that the spectral decomposition is not unique. For example, the same matrix also has the spectral decomposition

$B^{\top}B = \frac{1}{2}\begin{bmatrix} 1 & 1 & \sqrt{2} & 0\\ 1 & -1 & 0 & \sqrt{2}\\ 1 & 1 & -\sqrt{2} & 0\\ 1 & -1 & 0 & -\sqrt{2}\end{bmatrix}\begin{bmatrix} 200 & 0 & 0 & 0\\ 0 & 32 & 0 & 0 \\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\end{bmatrix}\frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1\\ \sqrt{2} & 0 & -\sqrt{2} & 0\\ 0 & \sqrt{2} & 0 & -\sqrt{2} \end{bmatrix}$
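Both spectral decompositions above can be verified numerically (a sketch assuming NumPy):

```python
import numpy as np

# Both eigenvector matrices diagonalize B^T B, illustrating that the
# spectral decomposition is not unique.
A = np.array([[58., 42., 58., 42.],
              [42., 58., 42., 58.],
              [58., 42., 58., 42.],
              [42., 58., 42., 58.]])
D = np.diag([200., 32., 0., 0.])

V1 = 0.5 * np.array([[1.,  1.,  1.,  1.],
                     [1., -1.,  1., -1.],
                     [1.,  1., -1., -1.],
                     [1., -1., -1.,  1.]])
s2 = np.sqrt(2.)
V2 = 0.5 * np.array([[1.,  1.,  s2,  0.],
                     [1., -1.,  0.,  s2],
                     [1.,  1., -s2,  0.],
                     [1., -1.,  0., -s2]])

assert np.allclose(V1 @ D @ V1.T, A)
assert np.allclose(V2 @ D @ V2.T, A)
```

The two factorizations differ only in the eigenvectors chosen for the eigenvalue $$0$$, which is why both reproduce $$B^{\top}B$$.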

Example continued. Let

$V = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1\\ 1 & 1 & -1 & -1\\ 1 & -1 & -1 & 1\end{bmatrix}$

and let $$v_{i}$$ denote the $$i$$th column of $$V$$. Since the columns of $$V$$ are a basis of eigenvectors of $$B^{\top}B$$, we see that these are the right singular vectors of $$B$$.

From the proof, we can see that the singular values of $$B$$ are

$\sigma_{1} = \sqrt{200} = 10\sqrt{2}\quad\text{and}\quad\sigma_{2} = \sqrt{32} = 4\sqrt{2},$

and the left singular vectors are

$u_{1} = \frac{1}{10\sqrt{2}}Bv_{1} = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\ 1\end{bmatrix}\quad\text{and}\quad u_{2} = \frac{1}{4\sqrt{2}}Bv_{2} = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\ -1\end{bmatrix}$

Theorem (Singular Value Decomposition - Outer product form). If $$B$$ is an $$m\times n$$ matrix and $$p=\min\{m,n\}$$, then there are orthonormal bases $$\{u_{1},\ldots,u_{m}\}$$ for $$\mathbb{R}^{m}$$ and $$\{v_{1},\ldots,v_{n}\}$$ for $$\mathbb{R}^{n}$$, and nonnegative scalars $$\sigma_{1},\ldots,\sigma_{p}$$ such that

$B = \sum_{i=1}^{p}\sigma_{i}u_{i}v_{i}^{\top}.$

Proof. From the previous theorem we have orthonormal bases $$\{u_{1},\ldots,u_{m}\}$$ for $$\mathbb{R}^{m}$$ and $$\{v_{1},\ldots,v_{n}\}$$ for $$\mathbb{R}^{n}$$ and nonnegative scalars $$\sigma_{1},\ldots,\sigma_{p}$$ such that $$Bv_{i}=\sigma_{i}u_{i}$$ for all $$i\leq p$$ and $$Bv_{i} = 0$$ for $$i>p$$.

Define the matrix

$C: = \sum_{i=1}^{p}\sigma_{i}u_{i}v_{i}^{\top}$

Note that for $$i_{0}\leq p$$, since $$v_{1},\ldots,v_{n}$$ is orthonormal,

$Cv_{i_{0}} = \sum_{i=1}^{p}\sigma_{i}u_{i}v_{i}^{\top}v_{i_{0}} = \sigma_{i_{0}}u_{i_{0}} = Bv_{i_{0}}$

Proof continued. If $$i_{0}>p$$ then $$v_{i_{0}}$$ is orthogonal to $$v_{i}$$ for all $$i\leq p$$ and hence

$Cv_{i_{0}} = 0 = Bv_{i_{0}}.$

Thus, we see that $$Cv_{i} = Bv_{i}$$ for all $$i\in\{1,2,\ldots,n\}$$.

Next, suppose that there is a vector $$v\in\mathbb{R}^{n}$$ such that $$Cv\neq Bv$$. This implies that $$(C-B)v \neq 0$$. Since $$v_{1},v_{2},\ldots,v_{n}$$ is a basis for $$\mathbb{R}^{n}$$, there are scalars $$\alpha_{1},\alpha_{2},\ldots,\alpha_{n}$$ such that

$v = \alpha_{1}v_{1} + \alpha_{2}v_{2} + \cdots + \alpha_{n}v_{n}.$

Finally, we multiply by $$C-B$$ on both sides and we see that

$0\neq (C-B)v = \alpha_{1}(C-B)v_{1} + \alpha_{2}(C-B)v_{2} + \cdots + \alpha_{n}(C-B)v_{n}$

$= \alpha_{1}0 + \alpha_{2}0 + \cdots + \alpha_{n}0 = 0$

This contradiction shows that $$(C-B)v=0$$ for all $$v\in\mathbb{R}^{n}$$, and hence $$C=B$$. $$\Box$$

Example continued. Recall the matrix

$B = \begin{bmatrix} 7 & 3 & 7 & 3\\ 3 & 7 & 3 & 7\end{bmatrix}$

We already saw that the right singular vectors of $$B$$ are $v_{1} = \frac{1}{2}\left[\begin{array}{r} 1\\ 1\\ 1\\ 1\end{array}\right],\ v_{2} = \frac{1}{2}\left[\begin{array}{r} 1\\ -1\\ 1\\ -1\end{array}\right],\ v_{3} = \frac{1}{2}\left[\begin{array}{r} 1\\ 1\\ -1\\ -1\end{array}\right],\ v_{4} = \frac{1}{2}\left[\begin{array}{r} 1\\ -1\\ -1\\ 1\end{array}\right],$ the singular values of $$B$$ are $\sigma_{1} = \sqrt{200} = 10\sqrt{2}\quad\text{and}\quad\sigma_{2} = \sqrt{32} = 4\sqrt{2},$ and the left singular vectors are $u_{1} = \frac{1}{10\sqrt{2}}Bv_{1} = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\ 1\end{bmatrix}\quad\text{and}\quad u_{2} = \frac{1}{4\sqrt{2}}Bv_{2} = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\ -1\end{bmatrix}$

Hence, the outer product form of the singular value decomposition of $$B$$ is $B = \sigma_{1}u_{1}v_{1}^{\top} + \sigma_{2}u_{2}v_{2}^{\top}=\begin{bmatrix} 5 & 5 & 5 & 5\\ 5 & 5 & 5 & 5\end{bmatrix} + \begin{bmatrix} 2 & -2 & 2 & -2\\ -2 & 2 & -2 & 2\end{bmatrix}$
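As a sanity check, the two outer products can be recomputed with NumPy (`np.outer` forms $$u_{i}v_{i}^{\top}$$):

```python
import numpy as np

# Recompute the two outer products in the decomposition above.
B = np.array([[7., 3., 7., 3.],
              [3., 7., 3., 7.]])
v1 = 0.5 * np.array([1.,  1., 1.,  1.])
v2 = 0.5 * np.array([1., -1., 1., -1.])
u1 = np.array([1.,  1.]) / np.sqrt(2.)
u2 = np.array([1., -1.]) / np.sqrt(2.)
s1, s2 = np.sqrt(200.), np.sqrt(32.)

term1 = s1 * np.outer(u1, v1)   # the all-fives matrix
term2 = s2 * np.outer(u2, v2)   # the matrix of +/- 2's
assert np.allclose(term1, 5 * np.ones((2, 4)))
assert np.allclose(term1 + term2, B)
```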

Theorem (Singular Value Decomposition - Matrix Form). If $$B$$ is an $$m\times n$$ matrix, then there is an $$m\times m$$ orthogonal matrix $$R$$, an $$n\times n$$ orthogonal matrix $$Q$$, and an $$m\times n$$ diagonal matrix $$\Sigma$$ such that

$B = R\Sigma Q^{\top}.$

Proof. By the outer product form of the singular value decomposition there are orthonormal bases $$\{u_{1},\ldots,u_{m}\}$$ for $$\mathbb{R}^{m}$$ and $$\{v_{1},\ldots,v_{n}\}$$ for $$\mathbb{R}^{n}$$, and nonnegative scalars $$\sigma_{1},\ldots,\sigma_{p}$$ such that

$B = \sum_{i=1}^{p}\sigma_{i}u_{i}v_{i}^{\top}.$

Let $$R$$ be the matrix with columns $$u_{1},\ldots,u_{m}$$, and let $$Q$$ be the matrix with columns $$v_{1},\ldots,v_{n}$$. Since the columns of $$R$$ and $$Q$$ are orthonormal bases, these matrices are orthogonal.

Let $$\Sigma$$ be the $$m\times n$$ diagonal matrix with $$\sigma_{1},\sigma_{2},\ldots,\sigma_{p}$$ on the diagonal and zeros elsewhere.

Proof continued.

$R\Sigma Q^{\top} =\begin{bmatrix} \vert & \vert & & \vert\\ u_{1} & u_{2} & \cdots & u_{m}\\ \vert & \vert & & \vert\end{bmatrix}\begin{bmatrix} \sigma_{1} & & & & & &\\ & \sigma_{2} & & & & &\\ & & \ddots & & & & \\ & & & \sigma_{p} & & & \\ & & & & 0 & &\\ & & & & & \ddots & \\ & & & & & & 0\end{bmatrix}\begin{bmatrix} - & v_{1}^{\top} & - \\ - & v_{2}^{\top} & - \\ & \vdots & \\ - & v_{n}^{\top} & -\end{bmatrix}$

$=\begin{bmatrix} \vert & \vert & & \vert\\ u_{1} & u_{2} & \cdots & u_{m}\\ \vert & \vert & & \vert\end{bmatrix}\begin{bmatrix} - & \sigma_{1} v_{1}^{\top} & - \\ - & \sigma_{2}v_{2}^{\top} & - \\ & \vdots & \\ - & \sigma_{p}v_{p}^{\top} & - \\ - & 0 & -\\ & \vdots & \\ - & 0 & -\end{bmatrix} = \sum_{i=1}^{p}\sigma_{i}u_{i}v_{i}^{\top} = B.$

$$\Box$$

Continuing the last Example. We had

$B = 10\sqrt{2}u_{1}v_{1}^{\top} + 4\sqrt{2}u_{2}v_{2}^{\top}=\begin{bmatrix} 5 & 5 & 5 & 5\\ 5 & 5 & 5 & 5\end{bmatrix} + \left[\begin{array}{rrrr} 2 & -2 & 2 & -2\\ -2 & 2 & -2 & 2\end{array}\right]$

$$v_{1} = \frac{1}{2}\begin{bmatrix} 1\\ 1\\ 1\\ 1\end{bmatrix},\ v_{2} = \frac{1}{2}\begin{bmatrix} \phantom{-}1\\ -1\\ \phantom{-}1\\ -1\end{bmatrix},\ v_{3} = \frac{1}{2}\begin{bmatrix} \phantom{-}1\\ \phantom{-}1\\ -1\\ -1\end{bmatrix},\ v_{4} = \frac{1}{2}\begin{bmatrix} \phantom{-}1\\ -1\\ -1\\ \phantom{-}1\end{bmatrix}$$

and

$$u_{1} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1\\ 1\end{bmatrix},\ u_{2} = \frac{1}{\sqrt{2}}\begin{bmatrix} \phantom{-}1\\ -1\end{bmatrix}$$

The matrix form of the singular value decomposition of $$B$$ is

$B = \begin{bmatrix} 7 & 3 & 7 & 3\\ 3 & 7 & 3 & 7\end{bmatrix} = \left(\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & \phantom{-}1\\ 1 & -1\end{bmatrix}\right)\begin{bmatrix} 10\sqrt{2} & 0 & 0 & 0\\ 0 & 4\sqrt{2} & 0 & 0\end{bmatrix}\left(\frac{1}{2}\left[\begin{array}{rrrr} 1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1\\ 1 & 1 & -1 & -1\\ 1 & -1 & -1 & 1\end{array}\right]\right)$
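The matrix form for this example can likewise be checked numerically (a sketch assuming NumPy):

```python
import numpy as np

# Assemble R, Sigma, and Q for the running example and check B = R Sigma Q^T.
B = np.array([[7., 3., 7., 3.],
              [3., 7., 3., 7.]])
R = np.array([[1.,  1.],
              [1., -1.]]) / np.sqrt(2.)
Sigma = np.array([[10 * np.sqrt(2.), 0., 0., 0.],
                  [0., 4 * np.sqrt(2.), 0., 0.]])
Q = 0.5 * np.array([[1.,  1.,  1.,  1.],
                    [1., -1.,  1., -1.],
                    [1.,  1., -1., -1.],
                    [1., -1., -1.,  1.]])
assert np.allclose(R @ Sigma @ Q.T, B)
```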

Example 2. Let

$B = \begin{bmatrix} 2 & -1\\ 2 & 1\end{bmatrix}.$

Then

$B^{\top}B = \begin{bmatrix} 8 & 0\\ 0 & 2\end{bmatrix} = \begin{bmatrix} 1 & 0\\ 0 & 1\end{bmatrix}\begin{bmatrix} 8 & 0\\ 0 & 2\end{bmatrix}\begin{bmatrix} 1 & 0\\ 0 & 1\end{bmatrix},$

so the right singular vectors are

$v_{1} = \begin{bmatrix}1\\ 0\end{bmatrix}\quad\text{and}\quad v_{2} = \begin{bmatrix} 0\\ 1\end{bmatrix},$

the singular values are $$\sigma_{1} = \sqrt{8} = 2\sqrt{2}$$ and $$\sigma_{2} = \sqrt{2}$$, and the left singular vectors are

$u_{1} = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\ 1\end{bmatrix}\quad\text{and}\quad u_{2} = \frac{1}{\sqrt{2}}\begin{bmatrix} -1\\ 1\end{bmatrix}$

The outer product form of the singular value decomposition is

$B = 2\sqrt{2}\left(\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 0\\ 1 & 0\end{bmatrix}\right) + \sqrt{2}\left(\frac{1}{\sqrt{2}}\begin{bmatrix} 0 & -1\\ 0 & 1\end{bmatrix}\right)$

and the matrix form is

$B = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1\\ 1 & 1\end{bmatrix}\begin{bmatrix} 2\sqrt{2} & 0\\ 0 & \sqrt{2}\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}$
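Both forms found in Example 2 can be verified with a few lines of NumPy:

```python
import numpy as np

# Check both forms of the singular value decomposition from Example 2.
B = np.array([[2., -1.],
              [2.,  1.]])
u1 = np.array([1., 1.]) / np.sqrt(2.)
u2 = np.array([-1., 1.]) / np.sqrt(2.)
v1 = np.array([1., 0.])
v2 = np.array([0., 1.])
s1, s2 = 2 * np.sqrt(2.), np.sqrt(2.)

# Outer product form.
assert np.allclose(s1 * np.outer(u1, v1) + s2 * np.outer(u2, v2), B)
# Matrix form: R Sigma Q^T with Q the identity.
R = np.column_stack([u1, u2])
Sigma = np.diag([s1, s2])
Q = np.eye(2)
assert np.allclose(R @ Sigma @ Q.T, B)
```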

Example 3. Let

$B = \begin{bmatrix} 1 & 1\\ 1 & -1\\ 1 & 1\end{bmatrix},$

then

$B^{\top}B = \begin{bmatrix} 3 & 1\\ 1 & 3\end{bmatrix} = \left(\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1\\ 1 & -1\end{bmatrix}\right)\begin{bmatrix}4 & 0\\ 0 & 2\end{bmatrix} \left(\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1\\ 1 & -1\end{bmatrix}\right)$

Note that

$v_{1} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1\\ 1\end{bmatrix}\quad\text{and}\quad v_{2} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1\\ -1\end{bmatrix}$

are a basis of right singular vectors of $$B$$. The singular values of $$B$$ are $$2$$ and $$\sqrt{2}$$. Two of the left singular vectors are

$u_{1} = \frac{Bv_{1}}{2} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1\\ 0\\ 1\end{bmatrix}\quad\text{and}\quad u_{2} = \frac{Bv_{2}}{\sqrt{2}} = \begin{bmatrix} 0\\ 1\\ 0\end{bmatrix}.$

However, $$\{u_{1},u_{2}\}$$ is not a basis of left singular vectors. We must complete it to a basis!

Example 3 continued. We need an orthonormal basis for $$\operatorname{span}\{u_{1},u_{2}\}^{\bot}$$. If we form the matrix

$X = \begin{bmatrix} u_{1}^{\top}\\ u_{2}^{\top}\end{bmatrix},$

then $$\operatorname{span}\{u_{1},u_{2}\}^{\bot} = N(X).$$

We find a basis for $$N(X)$$. Then, if necessary, use Gram-Schmidt to find an orthonormal basis for $$N(X)$$. In this case we find that
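One standard way to compute an orthonormal basis for $$N(X)$$ is to take the trailing rows of $$V^{\top}$$ from an SVD of $$X$$; those rows are automatically orthonormal, so no separate Gram-Schmidt step is needed. A sketch assuming NumPy (`scipy.linalg.null_space` does the same thing):

```python
import numpy as np

# Orthonormal basis for N(X): the trailing rows of Vt from an SVD of X.
# (A sketch assuming NumPy; scipy.linalg.null_space gives the same result.)
u1 = np.array([1., 0., 1.]) / np.sqrt(2.)
u2 = np.array([0., 1., 0.])
X = np.vstack([u1, u2])

_, s, Vt = np.linalg.svd(X)           # full SVD: Vt is 3 x 3
rank = int(np.sum(s > 1e-10))         # rank(X) = 2
u3 = Vt[rank]                         # spans N(X), already unit length

assert np.allclose(X @ u3, 0)         # u3 is orthogonal to u1 and u2
assert np.isclose(np.linalg.norm(u3), 1.0)
```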

$u_{3} = \frac{1}{\sqrt{2}}\begin{bmatrix} -1\\ 0\\ 1\end{bmatrix}$

is a unit vector spanning $$N(X) = \operatorname{span}\{u_{1},u_{2}\}^{\bot}$$. Thus $$\{u_{1},u_{2},u_{3}\}$$ is an orthonormal basis of left singular vectors of $$B$$, and we have the singular value decompositions:

$B = 2u_{1}v_{1}^{\top} + \sqrt{2}u_{2}v_{2}^{\top}$

Example 3 continued.

$B = \left(\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 0 & -1\\ 0 & \sqrt{2} & 0\\ 1 & 0 & 1\end{bmatrix}\right)\begin{bmatrix} 2 & 0\\ 0 & \sqrt{2}\\ 0 & 0\end{bmatrix}\left(\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1\\ 1 & -1\end{bmatrix}\right)$
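Finally, both decompositions from Example 3 can be checked numerically (assuming NumPy):

```python
import numpy as np

# Verify both singular value decompositions found in Example 3.
B = np.array([[1.,  1.],
              [1., -1.],
              [1.,  1.]])
v1 = np.array([1.,  1.]) / np.sqrt(2.)
v2 = np.array([1., -1.]) / np.sqrt(2.)
u1 = np.array([1., 0., 1.]) / np.sqrt(2.)
u2 = np.array([0., 1., 0.])
u3 = np.array([-1., 0., 1.]) / np.sqrt(2.)

# Outer product form: B = 2 u1 v1^T + sqrt(2) u2 v2^T.
assert np.allclose(2 * np.outer(u1, v1) + np.sqrt(2.) * np.outer(u2, v2), B)

# Matrix form: B = R Sigma Q^T.
R = np.column_stack([u1, u2, u3])
Sigma = np.array([[2., 0.],
                  [0., np.sqrt(2.)],
                  [0., 0.]])
Q = np.column_stack([v1, v2])
assert np.allclose(R @ Sigma @ Q.T, B)
```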

Example continued. Let $$\displaystyle{B = \begin{bmatrix} 2 & -1\\ 2 & 1\end{bmatrix}.}$$

[Figure: an animation in which the blue vectors go through all vectors of length $$1$$, with a red vector highlighted.]

By John Jasper
