Triangular matrices, similar matrices, diagonalizable matrices.
Now, given a matrix \(A\), we have two new problems: finding the eigenvalues of \(A\), and finding the corresponding eigenvectors. Finding the eigenvectors that go with a known eigenvalue \(\lambda\) is a null space computation, so we have to work on the harder problem of finding the eigenvalues.
Example. One type of matrix whose eigenvalues are easy to identify is a diagonal matrix, for example
\[\Lambda = \begin{bmatrix} -1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 5\end{bmatrix}\]
The standard basis vectors in \(\mathbb{R}^{3}\) are all eigenvectors of \(\Lambda\), and the eigenvalues of \(\Lambda\) are \(-1,2,\) and \(5\).
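To verify one of these directly: multiplying \(\Lambda\) by the second standard basis vector gives
\[\Lambda e_{2} = \begin{bmatrix} -1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 5\end{bmatrix}\begin{bmatrix} 0\\ 1\\ 0\end{bmatrix} = \begin{bmatrix} 0\\ 2\\ 0\end{bmatrix} = 2e_{2},\]
so \(e_{2}\) is an eigenvector with eigenvalue \(2\); the computations for \(e_{1}\) and \(e_{3}\) are analogous.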
Theorem. Let \(A\) be a square matrix and let \(\lambda\) be a scalar. The following are equivalent:
(1) \(\lambda\) is an eigenvalue of \(A\);
(2) \(N(A-\lambda I)\neq\{0\}\);
(3) the columns of \(A-\lambda I\) are linearly dependent;
(4) \(A-\lambda I\) is not invertible.
Example. Assume there is a number \(\lambda\in\R\) such that
\[\begin{bmatrix}1 & -1\\ 1 & 1\end{bmatrix} - \lambda\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = \begin{bmatrix} 1-\lambda & -1\\ 1 & 1-\lambda\end{bmatrix}\]
has dependent columns. Then there is a constant \(c\) such that
\[c\begin{bmatrix} 1-\lambda\\1\end{bmatrix} = \begin{bmatrix} -1\\ 1-\lambda \end{bmatrix}\]
This means \(c=1-\lambda\), and hence \((1-\lambda)^2=-1\); but the square of a real number is never negative, so no such \(\lambda\) exists.
This matrix has no (real) eigenvalues!
Notice: All triangular matrices are SQUARE!
Definition. An \(n\times n\) matrix \(A\) is lower triangular if for every \(i,j\) such that \(1\leq i<j\leq n\) the entry of \(A\) in row \(i\) column \(j\) is zero.
A square matrix \(A\) is upper triangular if \(A^{\top}\) is lower triangular.
Examples.
\(\begin{bmatrix}1 & 0 & 0\\ 1 & 2 & 0\\ 1 & -2 & 0.2\end{bmatrix}\) is lower triangular
\(\begin{bmatrix}1 & 1 & -1\\ 0 & 1 & 0\\ 0 & 0 & -0.2\end{bmatrix}\) is upper triangular
\(\begin{bmatrix}2 & -3 & 0\\ 0 & 0 & 0\\ 0 & 0 & -0.2\end{bmatrix}\) is upper triangular.
\(\begin{bmatrix} 0 & 1\\ 0 & 1\end{bmatrix}\) is upper triangular.
\(\begin{bmatrix}1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & 0\end{bmatrix}\) is upper triangular and lower triangular.
\(\begin{bmatrix} 3 & 2 & 0 & 0\\ 3 & 1 & 4 & 0 \\ 0 & -2 & 3 & 1\\ 0 & 0 & 1 & -2\end{bmatrix}\) is not upper triangular or lower triangular.
Definition. If a square matrix \(A\) is both upper triangular and lower triangular, then we say that \(A\) is a diagonal matrix.
A square matrix is diagonal if and only if the only nonzero entries in the matrix are on the main diagonal. The zero matrix and the identity matrix are two examples of diagonal matrices.
Theorem. A triangular matrix \(A\) is invertible if and only if all of the entries on the main diagonal are nonzero.
Proof. An upper triangular matrix with all diagonal entries nonzero has the form
\[\begin{bmatrix}d_{1} & \ast & \ast & \cdots & \ast\\ 0 & d_{2} & \ast & \cdots & \ast\\ 0 & 0 & d_{3} & \cdots & \ast\\[-1ex] \vdots & \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & \cdots & d_{n}\end{bmatrix}\]
where all of the \(d_{i}\)'s are nonzero. We can create a pivot in each row, so the matrix row reduces to the identity and is invertible. Conversely, if some diagonal entry \(d_{i}\) is zero, then the first \(i\) columns all lie in \(\operatorname{span}\{e_{1},\ldots,e_{i-1}\}\), so they are linearly dependent and the matrix is not invertible. The lower triangular case follows by taking transposes. \(\Box\)
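For example, the matrix \(\begin{bmatrix}2 & -3 & 0\\ 0 & 0 & 0\\ 0 & 0 & -0.2\end{bmatrix}\) from the examples above has a zero diagonal entry, and indeed its first two columns, \([2\ \ 0\ \ 0]^{\top}\) and \([-3\ \ 0\ \ 0]^{\top}\), are linearly dependent, so that matrix is not invertible.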
Theorem. If \(A\) is a triangular matrix, then the eigenvalues of \(A\) are the entries on the main diagonal of \(A\).
Proof. We will assume \(A\) is upper triangular; the proof for lower triangular matrices is similar. If \(\lambda\) is an entry on the diagonal of \(A\), then \(A-\lambda I\) is upper triangular with a zero on its diagonal, so \(A-\lambda I\) is not invertible and \(\lambda\) is an eigenvalue of \(A\). Conversely, if \(\lambda\) is not one of the diagonal entries of \(A\), then \(A-\lambda I\) is upper triangular with every diagonal entry nonzero, hence invertible, so \(\lambda\) is not an eigenvalue of \(A\). \(\Box\)
Example. Consider the matrix \(A = \begin{bmatrix} 1 & 1 & 2\\ 0 & 2 & 0\\ 0 & 0 & 3\end{bmatrix}\)
Note that \(A-2I = \begin{bmatrix} -1 & 1 & 2\\ 0 & 0 & 0\\ 0 & 0 & 1\end{bmatrix}\), and \(\operatorname{rref}(A-2I) = \begin{bmatrix} 1 & -1 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0\end{bmatrix}\). From this we can see that \[x=[1\ \ 1\ \ 0]^{\top}\in N(\operatorname{rref}(A-2I))=N(A-2I),\] and thus \(x\) is an eigenvector of \(A\) with eigenvalue \(2\).
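As a quick check:
\[Ax = \begin{bmatrix} 1 & 1 & 2\\ 0 & 2 & 0\\ 0 & 0 & 3\end{bmatrix}\begin{bmatrix} 1\\ 1\\ 0\end{bmatrix} = \begin{bmatrix} 2\\ 2\\ 0\end{bmatrix} = 2x.\]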
Definition. Two matrices \(A\) and \(B\) are called similar if there is an invertible matrix \(X\) such that
\[A=XBX^{-1}.\]
(Note that this definition implies that \(A\) and \(B\) are both square and the same size.)
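Note that similarity is a symmetric relation: if \(A=XBX^{-1}\), then multiplying on the left by \(X^{-1}\) and on the right by \(X\) gives
\[X^{-1}AX = X^{-1}\big(XBX^{-1}\big)X = B,\]
so \(B\) is similar to \(A\) via the invertible matrix \(X^{-1}\).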
Theorem. If \(A\) and \(B\) are similar, then they have the exact same set of eigenvalues.
Proof. Assume \(A=XBX^{-1}\) and \(\lambda\) is an eigenvalue of \(B\). This means that \(Bx=\lambda x\) for some nonzero vector \(x\). Set \(y=Xx\). Since \(X\) is invertible, \(y\) is not the zero vector, and
\[Ay = XBX^{-1}Xx = XBx=X\lambda x = \lambda Xx = \lambda y\]
hence \(\lambda\) is an eigenvalue of \(A\). Since \(B=X^{-1}AX\), the other direction is similar. \(\Box\)
Example. Consider the diagonal matrix
\[A = \begin{bmatrix} 1 & 0 & 0\\ 0 & -2 & 0\\ 0 & 0 & 3\end{bmatrix},\]
and the invertible matrix
\[X = \begin{bmatrix} 1 & 2 & 0\\ 0 & 1 & -1\\ 0 & 0 & 1\end{bmatrix}\quad\text{with inverse }\quad X^{-1} = \begin{bmatrix} 1 & -2 & -2\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{bmatrix}.\]
The matrix \[XAX^{-1}=\begin{bmatrix} 1 & 2 & 0\\ 0 & 1 & -1\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix} 1 & 0 & 0\\ 0 & -2 & 0\\ 0 & 0 & 3\end{bmatrix}\begin{bmatrix} 1 & -2 & -2\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & -6 & -6\\ 0 & -2 & -5\\ 0 & 0 & 3 \end{bmatrix},\]
is similar to \(A\).
Example continued. The eigenvalues and eigenvectors of \(A\) are obvious. The eigenvectors of \(XAX^{-1}\) are not.
\[A = \begin{bmatrix} 1 & 0 & 0\\ 0 & -2 & 0\\ 0 & 0 & 3\end{bmatrix}:\qquad
\begin{array}{c|c}
\text{Eigenvalue } \lambda & \text{Eigenspace } N(A-\lambda I)\\ \hline
1 & \text{span}\left\{\begin{bmatrix} 1\\ 0\\ 0\end{bmatrix}\right\}\\
-2 & \text{span}\left\{\begin{bmatrix} 0\\ 1\\ 0\end{bmatrix}\right\}\\
3 & \text{span}\left\{\begin{bmatrix} 0\\ 0\\ 1\end{bmatrix}\right\}
\end{array}\]
\[XAX^{-1} = \begin{bmatrix} 1 & -6 & -6\\ 0 & -2 & -5\\ 0 & 0 & 3 \end{bmatrix}:\qquad
\begin{array}{c|c}
\text{Eigenvalue } \lambda & \text{Eigenspace } N(XAX^{-1}-\lambda I)\\ \hline
1 & \text{span}\left\{\begin{bmatrix} 1\\ 0\\ 0\end{bmatrix}\right\}\\
-2 & \text{span}\left\{\begin{bmatrix} 2\\ 1\\ 0\end{bmatrix}\right\}\\
3 & \text{span}\left\{\begin{bmatrix} 0\\ -1\\ 1\end{bmatrix}\right\}
\end{array}\]
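Notice that each eigenvector of \(XAX^{-1}\) listed above is \(X\) times the corresponding eigenvector of \(A\) (for example, \(Xe_{2} = [2\ \ 1\ \ 0]^{\top}\)), exactly as in the proof of the theorem on similar matrices. As a direct check of the middle row of the second table:
\[\begin{bmatrix} 1 & -6 & -6\\ 0 & -2 & -5\\ 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} 2\\ 1\\ 0\end{bmatrix} = \begin{bmatrix} -4\\ -2\\ 0\end{bmatrix} = -2\begin{bmatrix} 2\\ 1\\ 0\end{bmatrix}.\]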
Definition. A matrix \(A\) is called diagonalizable if it is similar to a diagonal matrix.
Example. Take the diagonal matrix
\[A = \begin{bmatrix} 2 & 0 & 0 & 0\\ 0 & 3 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 0\end{bmatrix}\]
Set
\[B = \begin{bmatrix}-3 & -2 & 0 & 4\\ 1 & 2 & -2 & -3\\ -1 & -2 & 3 & 3\\ -3 & -3 & -1 & 5\end{bmatrix}\begin{bmatrix} 2 & 0 & 0 & 0\\ 0 & 3 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 0\end{bmatrix}\begin{bmatrix} 1 & -8 & -6 & -2\\ 4 & -14 & -11 & -5\\ 0 & 1 & 1 & 0\\ 3 & -13 & -10 & -4\end{bmatrix}\]
\[= \left[\begin{array}{rrrr}-30 & 132 & 102 & 42\\ 26 & -98 & -76 & -34\\ -26 & 97 & 75 & 34\\ -42 & 175 & 136 & 57\end{array}\right]\]
Since the two outer matrices in this product are inverses of each other, \(B\) is similar to the diagonal matrix \(A\). So \(B\) is diagonalizable.
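By the theorem on similar matrices, the eigenvalues of \(B\) are \(2, 3, -1,\) and \(0\). As a quick check, the first column of the left-hand factor above, \([-3\ \ 1\ \ -1\ \ -3]^{\top}\), is an eigenvector of \(B\) with eigenvalue \(2\):
\[B\begin{bmatrix} -3\\ 1\\ -1\\ -3\end{bmatrix} = \begin{bmatrix} -6\\ 2\\ -2\\ -6\end{bmatrix} = 2\begin{bmatrix} -3\\ 1\\ -1\\ -3\end{bmatrix}.\]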
Theorem. If \(A\) is an \(n\times n\) diagonalizable matrix, that is,
\[A = X\Lambda X^{-1}\]
for some invertible matrix \(X\) and diagonal matrix \(\Lambda\), then both of the following are true:
(1) the diagonal entries of \(\Lambda\) are the eigenvalues of \(A\), and
(2) the columns of \(X\) form a basis of \(\mathbb{R}^{n}\) consisting of eigenvectors of \(A\).
Proof. Since \(A\) and \(\Lambda\) are similar, they have the same eigenvalues, and the eigenvalues of the triangular (indeed diagonal) matrix \(\Lambda\) are its diagonal entries; this proves (1). For (2), multiplying \(A = X\Lambda X^{-1}\) on the right by \(X\) gives \(AX = X\Lambda\). The \(i\)-th column of \(AX\) is \(Ax_{i}\), where \(x_{i}\) is the \(i\)-th column of \(X\), and the \(i\)-th column of \(X\Lambda\) is \(\lambda_{i}x_{i}\), where \(\lambda_{i}\) is the \(i\)-th diagonal entry of \(\Lambda\). Hence \(Ax_{i}=\lambda_{i}x_{i}\). Since \(X\) is invertible, its columns are nonzero and form a basis of \(\mathbb{R}^{n}\), so each \(x_{i}\) is an eigenvector of \(A\). \(\Box\)
Theorem. If \(A\) is an \(n\times n\) matrix and \(\{v_{1},\ldots,v_{n}\}\) is a basis of eigenvectors of \(A\) with corresponding eigenvalues \(\{\lambda_{1},\ldots,\lambda_{n}\}\), then
\[A = X\Lambda X^{-1}\]
where
\[X = \begin{bmatrix} | & | & | & & |\\ v_{1} & v_{2} & v_{3} & \cdots & v_{n}\\ | & | & | & & |\end{bmatrix}\]
and
\[\Lambda = \text{diag}(\lambda_{1},\lambda_{2},\ldots,\lambda_{n}) = \begin{bmatrix} \lambda_{1} & 0 & \cdots & 0\\ 0 & \lambda_{2} & & \vdots\\ \vdots & & \ddots & 0\\ 0 & \cdots & 0 & \lambda_{n}\end{bmatrix}\]
Proof. Since \(\{v_{1},\ldots,v_{n}\}\) is a basis of \(\mathbb{R}^{n}\), the matrix \(X\) is invertible, so \(X\Lambda X^{-1}\) makes sense. We will show that \(Ax=X\Lambda X^{-1} x\) for an arbitrary \(x\in\mathbb{R}^{n}\). Since \(\{v_{1},\ldots, v_{n}\}\) is a basis, there are scalars \(\alpha_{1},\ldots,\alpha_{n}\) such that
\[x = \alpha_{1}v_{1} + \alpha_{2}v_{2} + \cdots + \alpha_{n}v_{n}\]
Multiplying by \(A\) on the left we have
\[Ax = \alpha_{1}Av_{1} + \alpha_{2}Av_{2} + \cdots + \alpha_{n}Av_{n}\]
\[ = \alpha_{1}\lambda_{1}v_{1} + \alpha_{2}\lambda_{2}v_{2} + \cdots + \alpha_{n}\lambda_{n}v_{n}\]
By the definition of \(X\) we see that \(Xe_{i} = v_{i}\), and hence \(X^{-1}v_{i} = e_{i}\) for \(i=1,\ldots,n\). Hence,
\[X\Lambda X^{-1}x = X\Lambda X^{-1}\big(\alpha_{1}v_{1} + \alpha_{2}v_{2} + \cdots + \alpha_{n}v_{n}\big)\]
\[ = X\Lambda \big(\alpha_{1}X^{-1}v_{1} + \alpha_{2}X^{-1}v_{2} + \cdots + \alpha_{n}X^{-1}v_{n}\big)\]
\[ = X\Lambda \big(\alpha_{1}e_{1} + \alpha_{2}e_{2} + \cdots + \alpha_{n}e_{n}\big)\]
\[ = X\big(\alpha_{1}\lambda_{1} e_{1} + \alpha_{2}\lambda_{2} e_{2} + \cdots + \alpha_{n}\lambda_{n}e_{n}\big)\]
\[ = \alpha_{1}\lambda_{1} v_{1} + \alpha_{2}\lambda_{2} v_{2} + \cdots + \alpha_{n}\lambda_{n}v_{n},\]
which equals \(Ax\) by the computation above.
\(\Box\)
Example. Consider the matrix
\[A=\left[\begin{array}{rrr} 2 & -1 & 2\\ 0 & 3 & -2\\ 0 & 0 & -1\end{array}\right]\]
Since \(A\) is upper triangular, the eigenvalues of \(A\) are the entries on the main diagonal. Thus, the eigenvalues of \(A\) are \(2,3,\) and \(-1\).
By the previous theorem, to diagonalize \(A\) we need to find a basis for \(\mathbb{R}^{3}\) consisting of eigenvectors of \(A\).
Since we know all the eigenvalues of \(A\), we can find all of the eigenvectors by finding \(N(A-\lambda I)\) for \(\lambda=2,3,-1\).
\[A-(-1) I=\left[\begin{array}{rrr} 3 & -1 & 2\\ 0 & 4 & -2\\ 0 & 0 & 0\end{array}\right]\]
Row reducing we get
\[\text{rref}(A-(-1)I) = \left[\begin{array}{rrr} 1 &0 & \frac{1}{2}\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 0\end{array}\right]\]
Hence, we see that
\[N(A-(-1)I) = \text{span}\left\{\begin{bmatrix} -\frac{1}{2}\\ \frac{1}{2}\\ 1\end{bmatrix}\right\}\]
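As a quick check:
\[A\begin{bmatrix} -\frac{1}{2}\\ \frac{1}{2}\\ 1\end{bmatrix} = \left[\begin{array}{rrr} 2 & -1 & 2\\ 0 & 3 & -2\\ 0 & 0 & -1\end{array}\right]\begin{bmatrix} -\frac{1}{2}\\ \frac{1}{2}\\ 1\end{bmatrix} = \begin{bmatrix} \frac{1}{2}\\ -\frac{1}{2}\\ -1\end{bmatrix} = (-1)\begin{bmatrix} -\frac{1}{2}\\ \frac{1}{2}\\ 1\end{bmatrix}.\]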
Similarly
\[N(A-3I) = \text{span}\left\{\begin{bmatrix} -1\\ 1\\ 0\end{bmatrix}\right\}\quad\text{and }\quad N(A-2I) = \text{span}\left\{\begin{bmatrix} 1\\ 0\\ 0\end{bmatrix}\right\}\]
Select a nonzero vector from each eigenspace and form the matrix with those vectors as columns:
\[X=\begin{bmatrix} 1 & -1 & -\frac{1}{2}\\ 0 & 1 & \frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}\]
Since the columns of \(X\) form a basis for \(\R^{3}\), we see that \(X\) is invertible. Indeed,
\[X^{-1} = \begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}.\]
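One can check directly that
\[XX^{-1} = \begin{bmatrix} 1 & -1 & -\frac{1}{2}\\ 0 & 1 & \frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 1\end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{bmatrix}.\]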
Finally,
\[\begin{bmatrix} 1 & -1 & -\frac{1}{2}\\ 0 & 1 & \frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix} 2 & 0 & 0\\ 0 & 3 & 0\\ 0 & 0 & -1\end{bmatrix}\begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 1\end{bmatrix} = \left[\begin{array}{rrr} 2 & -1 & 2\\ 0 & 3 & -2\\ 0 & 0 & -1\end{array}\right]\]
where the three matrices in the product on the left are \(X\), \(\Lambda\), and \(X^{-1}\), and the product on the right is \(A\). Hence, \(A\) is diagonalizable!
Take \[J=\begin{bmatrix} 1 & 1\\ 0 & 1\end{bmatrix}.\]
Since \(J\) is upper triangular, we see that \(1\) is the only eigenvalue of \(J\).
Next we compute \(\operatorname{rref}(J-I) = \begin{bmatrix} 0 & 1\\ 0 & 0\end{bmatrix}\), and from this we see that
\[N(J-I) = \operatorname{span}\left\{\begin{bmatrix} 1\\ 0\end{bmatrix}\right\}.\]
That is, the eigenvectors of \(J\) are exactly \([a\ \ 0]^{\top}\) for \(a\neq 0\). No two of these are linearly independent, and hence there is no basis for \(\mathbb{R}^{2}\) consisting of eigenvectors of \(J\).
Hence, \(J\) is not diagonalizable.
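Another way to see this: if \(J\) were diagonalizable, say \(J = X\Lambda X^{-1}\) with \(\Lambda\) diagonal, then the diagonal entries of \(\Lambda\) would be eigenvalues of \(J\), forcing \(\Lambda = I\). But then
\[J = XIX^{-1} = XX^{-1} = I,\]
which is false, so no such \(X\) and \(\Lambda\) exist.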