# Day 22: More on Diagonalizable Matrices and the Power Method

### Diagonalizable matrices

Theorem. If $$A$$ is an $$n\times n$$ matrix and $$\{v_{1},\ldots,v_{n}\}$$ is a basis of eigenvectors of $$A$$ with corresponding eigenvalues $$\{\lambda_{1},\ldots,\lambda_{n}\}$$, then

$A = X\Lambda X^{-1}$

where

$X = \begin{bmatrix} | & | & | & & |\\ v_{1} & v_{2} & v_{3} & \cdots & v_{n}\\ | & | & | & & |\end{bmatrix}$

and

$\Lambda = \text{diag}(\lambda_{1},\lambda_{2},\ldots,\lambda_{n}) = \begin{bmatrix} \lambda_{1} & 0 & \cdots & 0\\ 0 & \lambda_{2} & & \vdots\\ \vdots & & \ddots & 0\\ 0 & \cdots & 0 & \lambda_{n}\end{bmatrix}$

### Diagonalizable matrices

Proof.  We will show that $$Ax=X\Lambda X^{-1} x$$ for an arbitrary $$x\in\R^{n}$$. Since $$\{v_{1},\ldots, v_{n}\}$$ is a basis, there are scalars $$\alpha_{1},\ldots,\alpha_{n}$$ such that

$x = \alpha_{1}v_{1} + \alpha_{2}v_{2} + \cdots + \alpha_{n}v_{n}$

Multiplying by $$A$$ on the left we have

$Ax = \alpha_{1}Av_{1} + \alpha_{2}Av_{2} + \cdots + \alpha_{n}Av_{n}$

$= \alpha_{1}\lambda_{1}v_{1} + \alpha_{2}\lambda_{2}v_{2} + \cdots + \alpha_{n}\lambda_{n}v_{n}$

By the definition of $$X$$ we see that $$Xe_{i} = v_{i}$$, and hence $$X^{-1}v_{i} = e_{i}$$ for $$i=1,\ldots,n$$. Hence,

$X\Lambda X^{-1}x = X\Lambda X^{-1}\big(\alpha_{1}v_{1} + \alpha_{2}v_{2} + \cdots + \alpha_{n}v_{n}\big)$

$= X\Lambda \big(\alpha_{1}X^{-1}v_{1} + \alpha_{2}X^{-1}v_{2} + \cdots + \alpha_{n}X^{-1}v_{n}\big)$

$= X\Lambda \big(\alpha_{1}e_{1} + \alpha_{2}e_{2} + \cdots + \alpha_{n}e_{n}\big)$

$= X\big(\alpha_{1}\lambda_{1} e_{1} + \alpha_{2}\lambda_{2} e_{2} + \cdots + \alpha_{n}\lambda_{n}e_{n}\big)$

$= \alpha_{1}\lambda_{1} v_{1} + \alpha_{2}\lambda_{2} v_{2} + \cdots + \alpha_{n}\lambda_{n}v_{n}.$

$$\Box$$

### Diagonalizable matrices

Example. Consider the matrix

$A=\left[\begin{array}{rrr} 2 & -1 & 2\\ 0 & 3 & -2\\ 0 & 0 & -1\end{array}\right]$

We want to find $$\lambda\in\R$$ such that

$A-\lambda I=\left[\begin{array}{rrr} 2-\lambda & -1 & 2\\ 0 & 3-\lambda & -2\\ 0 & 0 & -1-\lambda\end{array}\right]$

is not invertible. We know that an upper-triangular matrix is invertible if and only if it has no zeros on its diagonal. Hence, the eigenvalues of $$A$$ are $$2$$, $$3$$, and $$-1$$.

Hence, we need to find each of the subspaces $$N(A-\lambda I)$$ for $$\lambda=2,3,$$ and $$-1$$.

### Diagonalizable matrices

$A-(-1) I=\left[\begin{array}{rrr} 3 & -1 & 2\\ 0 & 4 & -2\\ 0 & 0 & 0\end{array}\right]$

Row reducing we get

$\text{rref}(A-(-1)I) = \left[\begin{array}{rrr} 1 &0 & \frac{1}{2}\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 0\end{array}\right]$

Hence, we see that

$N(A-(-1)I) = \text{span}\left\{\begin{bmatrix} -\frac{1}{2}\\ \frac{1}{2}\\ 1\end{bmatrix}\right\}$

Similarly

$N(A-3I) = \text{span}\left\{\begin{bmatrix} -1\\ 1\\ 0\end{bmatrix}\right\}\quad\text{and }\quad N(A-2I) = \text{span}\left\{\begin{bmatrix} 1\\ 0\\ 0\end{bmatrix}\right\}$

### Diagonalizable matrices

Selecting a nonzero vector from each eigenspace and forming the matrix with those columns:

$X=\begin{bmatrix} 1 & -1 & -\frac{1}{2}\\ 0 & 1 & \frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}$

Since the columns of $$X$$ form a basis for $$\R^{3}$$, we see that $$X$$ is invertible. Indeed,

$X^{-1} = \begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}.$

Finally,

$\begin{bmatrix} 1 & -1 & -\frac{1}{2}\\ 0 & 1 & \frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix} 2 & 0 & 0\\ 0 & 3 & 0\\ 0 & 0 & -1\end{bmatrix}\begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 1\end{bmatrix} = \left[\begin{array}{rrr} 2 & -1 & 2\\ 0 & 3 & -2\\ 0 & 0 & -1\end{array}\right]$

Hence, $$A$$ is diagonalizable!

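As a quick numerical sanity check (a NumPy sketch added here; the notes themselves don't use Python), we can confirm that the product $$X\Lambda X^{-1}$$ reproduces $$A$$:

```python
import numpy as np

# The matrices from the example above, with eigenvalues ordered 2, 3, -1.
X = np.array([[1.0, -1.0, -0.5],
              [0.0,  1.0,  0.5],
              [0.0,  0.0,  1.0]])
Lam = np.diag([2.0, 3.0, -1.0])
A = np.array([[2.0, -1.0,  2.0],
              [0.0,  3.0, -2.0],
              [0.0,  0.0, -1.0]])

# X Lam X^{-1} should equal A (up to floating-point rounding).
reconstructed = X @ Lam @ np.linalg.inv(X)
print(np.allclose(reconstructed, A))  # True
```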

### Diagonalizable matrices

Another Example. Consider the matrix

$A = \left[\begin{array}{rrr} -1 &\ \ 0 & 0\\ -8 & 3 & 8\\ 0 & 0 & -1\end{array}\right].$

The eigenvalues of $$A$$ are $$3$$ and $$-1$$. Diagonalize $$A$$, that is, find an invertible matrix $$X$$ and a diagonal matrix $$\Lambda$$ such that $$A=X\Lambda X^{-1}$$.

$\text{rref}(A-3I) = \begin{bmatrix} 1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0\end{bmatrix}$

Therefore

$\left\{\begin{bmatrix} 0\\ 1\\ 0\end{bmatrix}\right\}$

is a basis for the eigenspace of $$A$$ associated with the eigenvalue $$3$$.

### Diagonalizable matrices

Another Example. Next we note that

$\text{rref}(A-(-1)I) = \begin{bmatrix} 1 & -\frac{1}{2} & -1\\ 0 & 0 & 0\\ 0 & 0 & 0\end{bmatrix}.$

From this we see that

$\left\{\begin{bmatrix} 1/2\\ 1\\ 0\end{bmatrix},\begin{bmatrix} 1\\ 0\\ 1\end{bmatrix}\right\}$

is a basis for the eigenspace of $$A$$ associated with the eigenvalue $$-1$$.

It is simple to verify that

$\left\{\begin{bmatrix} 0\\ 1\\ 0\end{bmatrix},\begin{bmatrix} 1/2\\ 1\\ 0\end{bmatrix},\begin{bmatrix} 1\\ 0\\ 1\end{bmatrix}\right\}$

is a basis of eigenvectors of $$A$$.

### Diagonalizable matrices

Another Example. If we set

$X = \left[\begin{array}{rrr} 0 & \frac{1}{2} & 1\\ 1 & 1 & 0\\ 0 & 0 & 1\end{array}\right]$

and

$\Lambda = \left[\begin{array}{rrr} 3 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & -1\end{array}\right]$

then

$X^{-1} = \left[\begin{array}{rrr} -2 &\ \ 1 & 2\\ 2 & 0 & -2\\ 0 & 0 & 1\end{array}\right]$

and

$X\Lambda X^{-1} = \left[\begin{array}{rrr} 0 & \frac{1}{2} & 1\\ 1 & 1 & 0\\ 0 & 0 & 1\end{array}\right]\left[\begin{array}{rrr} 3 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & -1\end{array}\right]\left[\begin{array}{rrr} -2 &\ \ 1 & 2\\ 2 & 0 & -2\\ 0 & 0 & 1\end{array}\right]$

$= \left[\begin{array}{rrr} -1 &\ \ 0 & 0\\ -8 & 3 & 8\\ 0 & 0 & -1\end{array}\right] = A$
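The claimed eigenpairs are easy to verify directly; here is a short NumPy check (NumPy is an assumption, not part of the notes) that $$Av = \lambda v$$ holds for each basis vector:

```python
import numpy as np

A = np.array([[-1.0, 0.0,  0.0],
              [-8.0, 3.0,  8.0],
              [ 0.0, 0.0, -1.0]])

# Eigenpairs taken from the example above.
pairs = [( 3.0, np.array([0.0, 1.0, 0.0])),
         (-1.0, np.array([0.5, 1.0, 0.0])),
         (-1.0, np.array([1.0, 0.0, 1.0]))]

for lam, v in pairs:
    # Each v should satisfy A v = lam v.
    print(np.allclose(A @ v, lam * v))  # True
```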

### Finding Eigenvalues

Theorem. Let $$A$$ be an $$n\times n$$ matrix, and let $$v$$ be a nonzero vector. Define the sequence $$(v_{k})$$ by $$v_{1} = \frac{1}{\|v\|}v$$, and

$v_{k+1} = \frac{Av_{k}}{\|Av_{k}\|}\quad\text{for all }k\in\N.$

If $$A$$ is diagonalizable with eigenvalues $$\lambda_{1},\ldots,\lambda_{n}$$, and $$|\lambda_{1}|>|\lambda_{2}|\geq|\lambda_{3}|\geq\cdots\geq |\lambda_{n}|$$, then the sequence

$v_{1}^{\top}Av_{1}, v_{2}^{\top}Av_{2}, v_{3}^{\top}Av_{3},\ldots,v_{k}^{\top}Av_{k},\ldots$

converges to the dominant eigenvalue $$\lambda_{1}$$ for almost all initial vectors $$v$$.

Example. Let

$A = \left[\begin{array}{rrrr} -1 & -3 & 7 & 5\\ -3 & -1 & 5 & 7\\ 7 & 5 & -1 & -3\\ 5 & 7 & -3 & -1\end{array}\right].$

Set $$v_{1} = \begin{bmatrix} 1 & 0 & 0 & 0\end{bmatrix}^{\top}$$. Then,

$v_{1} = \begin{bmatrix}1\\ 0\\ 0\\ 0\end{bmatrix},\ v_{2} = \begin{bmatrix}-0.1091\\ -0.3273\\ 0.7638\\ 0.5455\end{bmatrix}, v_{3} = \begin{bmatrix} 0.6355\\ 0.5750\\ -0.3329\\ -0.3934 \end{bmatrix}, v_{4} = \begin{bmatrix} -0.4263 \\ -0.4418\\ 0.5658\\ 0.5503 \end{bmatrix},$

$v_{5} = \begin{bmatrix} 0.5322 \\ 0.5283 \\ -0.4659\\ -0.4698 \end{bmatrix},\ldots,v_{12} = \begin{bmatrix} -0.4998\\ -0.4998 \\ 0.5002\\ 0.5002 \end{bmatrix}, v_{13} = \begin{bmatrix} 0.5001\\ 0.5001\\ -0.4999\\ -0.4999 \end{bmatrix},\ldots$

Note that this sequence of vectors is not converging: the entries flip sign at each step.

Example Continued. However,

$v_{1}^{\top}Av_{1} = -1,\quad v_{2}^{\top}Av_{2} = -10.4762,\quad v_{3}^{\top}Av_{3} = -14.5201,$

$v_{4}^{\top}Av_{4}= -15.6261,\quad v_{5}^{\top}Av_{5} = -15.9063, \quad \ldots$

$v_{12}^{\top}Av_{12} = -15.999994277954329,\quad\ldots\quad,v_{40}^{\top}Av_{40} = -16\quad\text{(to machine precision)}$

Hence, it looks like $$-16$$ is an eigenvalue of $$A$$.
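The computation above is easy to reproduce. A NumPy sketch (the notes use MATLAB-style snippets; Python here is an assumption) of the iteration, printing the final Rayleigh quotient:

```python
import numpy as np

A = np.array([[-1., -3.,  7.,  5.],
              [-3., -1.,  5.,  7.],
              [ 7.,  5., -1., -3.],
              [ 5.,  7., -3., -1.]])

v = np.array([1., 0., 0., 0.])            # v_1 from the example
for k in range(40):
    rayleigh = v @ A @ v                  # v_k^T A v_k
    v = A @ v / np.linalg.norm(A @ v)     # v_{k+1} = A v_k / ||A v_k||
print(round(rayleigh, 6))  # -16.0, matching v_40^T A v_40 above
```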

### Why Does this Power Method Work?

If $$A$$ is diagonalizable, then there is an invertible matrix $$X$$ and a diagonal matrix $$\Lambda$$ such that $$A = X\Lambda X^{-1}$$. Hence,

$A^2 = (X\Lambda X^{-1})(X\Lambda X^{-1}) = X\Lambda\Lambda X^{-1} = X\Lambda^{2} X^{-1}$

$A^3 = AA^2 = (X\Lambda X^{-1})(X\Lambda^2 X^{-1}) = X\Lambda\Lambda^2 X^{-1} = X\Lambda^{3} X^{-1}$

$A^4 = AA^3 = (X\Lambda X^{-1})(X\Lambda^3 X^{-1}) = X\Lambda\Lambda^3 X^{-1} = X\Lambda^{4} X^{-1}$

$\vdots$

In general,

$A^{k} = X\Lambda^{k} X^{-1}.$
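A quick numerical check of $$A^{k}=X\Lambda^{k}X^{-1}$$, using the $$3\times 3$$ example from earlier (a NumPy sketch, an addition to the notes):

```python
import numpy as np

# The earlier 3x3 example, where A = X Lam X^{-1}.
A = np.array([[2., -1.,  2.],
              [0.,  3., -2.],
              [0.,  0., -1.]])
X = np.array([[1., -1., -0.5],
              [0.,  1.,  0.5],
              [0.,  0.,  1. ]])
Lam = np.diag([2., 3., -1.])

k = 5
lhs = np.linalg.matrix_power(A, k)                            # A^k directly
rhs = X @ np.linalg.matrix_power(Lam, k) @ np.linalg.inv(X)   # X Lam^k X^{-1}
print(np.allclose(lhs, rhs))  # True
```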

### Why Does this Power Method Work?

Since $$\Lambda$$ is a diagonal matrix, we have

$\Lambda = \begin{bmatrix} \lambda_{1} & 0 & \cdots & 0\\ 0 & \lambda_{2} & & \vdots\\ \vdots & & \ddots & \vdots\\ 0 & \cdots & \cdots & \lambda_{n}\end{bmatrix}\quad\text{and}\quad \Lambda^{k} = \begin{bmatrix} \lambda_{1}^{k} & 0 & \cdots & 0\\ 0 & \lambda_{2}^{k} & & \vdots\\ \vdots & & \ddots & \vdots\\ 0 & \cdots & \cdots & \lambda_{n}^{k}\end{bmatrix}$

If we assume $$|\lambda_{1}|> |\lambda_{2}|\geq\cdots\geq |\lambda_{n}|$$, then we have

$\frac{A^{k}}{\lambda_{1}^{k}} = X\begin{bmatrix} 1 & 0 & \cdots & 0\\ 0 & (\lambda_{2}/\lambda_{1})^{k} & & \vdots\\ \vdots & & \ddots & \vdots\\ 0 & \cdots & \cdots & (\lambda_{n}/\lambda_{1})^{k}\end{bmatrix} X^{-1} \approx X\begin{bmatrix} 1 & 0 & \cdots & 0\\ 0 & 0 & & \vdots\\ \vdots & & \ddots & \vdots\\ 0 & \cdots & \cdots & 0\end{bmatrix} X^{-1}$

when $$k$$ is very large.

Hence, if $$X^{-1}v$$ has a nonzero first coordinate, then $$A^{k}v/\lambda_{1}^{k}$$ is approximately a scalar multiple of the first column of $$X.$$ That is, $$A^{k}v/\lambda_{1}^{k}\approx \alpha x_{1}$$ and hence $$A^{k}v\approx \beta x_{1}.$$

### Why Does this Power Method Work?

Note that

$v_{k+1} = \frac{A^{k}v}{\|A^{k}v\|}\approx \pm \frac{x_{1}}{\|x_{1}\|}$

Hence,

$v_{k+1}^{\top}Av_{k+1} \approx \left( \pm \frac{x_{1}}{\|x_{1}\|}\right)^{\top}A\left( \pm \frac{x_{1}}{\|x_{1}\|}\right) = \frac{1}{\|x_{1}\|^{2}} x_{1}^{\top}\lambda_{1}x_{1} = \lambda_{1}$

### What can go wrong?

If $$\lambda$$ and $$-\lambda$$ are both eigenvalues:

Example. Consider the matrix $$A = \begin{bmatrix} 1 & 0\\ 0 & -1\end{bmatrix}.$$

If we start with a vector $$v_{1}=\begin{bmatrix} \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}}\end{bmatrix}$$, then

$v_{2} = \frac{Av_{1}}{\|Av_{1}\|} = \begin{bmatrix} \frac{1}{\sqrt{2}}\\ -\frac{1}{\sqrt{2}}\end{bmatrix},\ v_{3} = \begin{bmatrix} \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}}\end{bmatrix},\ v_{4} = \begin{bmatrix} \frac{1}{\sqrt{2}}\\ -\frac{1}{\sqrt{2}}\end{bmatrix},\ v_{5} = \begin{bmatrix} \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}}\end{bmatrix},\ldots$

Worse yet:

$v_{2}^{\top}Av_{2} = 0,\ v_{3}^{\top}Av_{3} = 0,\ v_{4}^{\top}Av_{4} = 0,\ldots$
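A few lines of NumPy (an illustration added here, not from the notes) make the failure concrete: the iterates oscillate and every Rayleigh quotient is $$0$$:

```python
import numpy as np

A = np.array([[1.,  0.],
              [0., -1.]])
v = np.array([1., 1.]) / np.sqrt(2.)      # v_1

for k in range(2, 6):
    v = A @ v / np.linalg.norm(A @ v)     # v_k
    # The second entry flips sign each step, and v^T A v = 1/2 - 1/2 = 0.
    print(k, v, v @ A @ v)
```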

### How can we fix it?

Example. Shift the matrix by the identity: $$A+I = \begin{bmatrix} 2 & 0\\ 0 & 0\end{bmatrix}.$$

If we start with a vector $$v_{1}=\begin{bmatrix} \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}}\end{bmatrix}$$, then

$v_{2} = \frac{(A+I)v_{1}}{\|(A+I)v_{1}\|} = \begin{bmatrix} 1\\ 0\end{bmatrix},\ v_{3} = \begin{bmatrix} 1\\ 0\end{bmatrix},\ v_{4} = \begin{bmatrix} 1\\ 0\end{bmatrix},\ v_{5} = \begin{bmatrix} 1\\ 0\end{bmatrix},\ldots$

And

$v_{2}^{\top}(A+I)v_{2} = 2,\ v_{3}^{\top}(A+I)v_{3} = 2,\ v_{4}^{\top}(A+I)v_{4} = 2,\ldots$

From this we can conclude that $$2-1=1$$ is an eigenvalue of $$A$$.

Be careful: what if the eigenvalues of $$A$$ were $$-2$$ and $$0$$? We wouldn't know that in advance, and adding $$I$$ would create exactly the problem we saw above.

### How can we fix it?

Proposition. If $$A$$ is a matrix with eigenvalues $$\lambda_{1},\ldots,\lambda_{n}$$ and associated eigenvectors $$v_{1},\ldots,v_{n}$$, then $$A+cI$$ has eigenvalues $$\lambda_{1}+c,\ldots,\lambda_{n}+c$$ with the same associated eigenvectors.

Proof.

$(A+cI)v_{j} = Av_{j} + cIv_{j} = \lambda_{j}v_{j}+cv_{j} = (\lambda_{j}+c)v_{j}.\quad\Box$

To avoid the problem that $$\lambda$$ and $$-\lambda$$ are both eigenvalues of $$A$$ we first take a random number $$c$$, then use the power method to find the eigenvalues of $$A+cI$$. Subtracting $$c$$ from each of the eigenvalues of $$A+cI$$ we obtain the eigenvalues of $$A.$$
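A sketch of this shift-and-subtract trick on the troublesome matrix above (NumPy, with a fixed shift $$c=0.7$$ for reproducibility; in practice $$c$$ would be random):

```python
import numpy as np

A = np.array([[1.,  0.],
              [0., -1.]])
c = 0.7                      # fixed here for reproducibility; normally random
B = A + c * np.eye(2)        # eigenvalues 1.7 and -0.3: no tie in absolute value

v = np.array([1., 1.]) / np.sqrt(2.)
for _ in range(60):
    v = B @ v / np.linalg.norm(B @ v)    # power method on A + cI
mu = v @ B @ v               # converges to 1 + c
print(round(mu - c, 6))      # 1.0, an eigenvalue of A
```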

### How do we code it?

Input: Matrix $$A$$

1. Let $$v$$ be a random vector: $$\texttt{v=randn(n,1);}$$
2. Normalize $$v$$: $$\texttt{v=v/norm(v);}$$
3. Compute an approximate eigenvalue: $$\texttt{la = v'*A*v;}$$
4. Check whether $$\texttt{la}$$ is an eigenvalue and $$v$$ is the eigenvector: $$\texttt{if norm(la*v-A*v)<thresh, done; end;}$$
5. If not, update $$v$$: $$\texttt{v = A*v/norm(A*v);}$$ and go back to step $$3$$.

Once this algorithm terminates with $$\texttt{done}$$, then $$\texttt{la}$$ is (approximately) an eigenvalue and $$\texttt{v}$$ is a normalized eigenvector.

Notes: If the algorithm takes too long, you should start over with a new random vector.

If $$A$$ has two opposite eigenvalues, then the algorithm may not work. So, we start by adding $$\texttt{c*eye(n)}$$ to our matrix $$A$$ where $$c$$ is a random number, then subtract $$c$$ from the eigenvalue we get at the end.
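Putting the steps and both notes together, here is one possible Python/NumPy translation of the algorithm (the original snippets are MATLAB; the tolerance and iteration cap are choices, not part of the notes):

```python
import numpy as np

def power_method(A, thresh=1e-10, max_iter=10000, rng=None):
    """Approximate an eigenvalue/eigenvector pair of A by the power method,
    using a random shift c so that opposite eigenvalues cannot tie."""
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[0]
    c = rng.standard_normal()           # random shift, as in the note above
    B = A + c * np.eye(n)               # run the method on A + cI
    v = rng.standard_normal(n)          # step 1: random vector
    v = v / np.linalg.norm(v)           # step 2: normalize
    for _ in range(max_iter):
        la = v @ B @ v                  # step 3: approximate eigenvalue of B
        if np.linalg.norm(la * v - B @ v) < thresh:   # step 4: converged?
            return la - c, v            # subtract c: an eigenvalue of A
        v = B @ v / np.linalg.norm(B @ v)             # step 5: update v
    raise RuntimeError("too many iterations; retry with a new random vector")

la, v = power_method(np.array([[2., 0.],
                               [0., 1.]]), rng=np.random.default_rng(1))
print(round(la, 6))  # prints an eigenvalue of A (2.0 or, for rare shifts, 1.0)
```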

By John Jasper
