Triangular matrices, similar matrices, diagonalizable matrices.
Now, given a matrix \(A\), we have two new problems: finding the eigenvalues of \(A\), and finding the corresponding eigenvectors. Finding the eigenvectors that go with a known eigenvalue \(\lambda\) is a null space computation, so we have to work on the harder problem of finding the eigenvalues.
Example. One type of matrix whose eigenvalues are easy to identify is a diagonal matrix, for example
\[\Lambda = \begin{bmatrix} -1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 5\end{bmatrix}\]
The standard basis vectors in \(\mathbb{R}^{3}\) are all eigenvectors of \(\Lambda\), and the eigenvalues of \(\Lambda\) are \(-1,2,\) and \(5\).
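To verify one of these directly: multiplying \(\Lambda\) by the second standard basis vector gives
\[\Lambda e_{2} = \begin{bmatrix} -1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 5\end{bmatrix}\begin{bmatrix} 0\\ 1\\ 0\end{bmatrix} = \begin{bmatrix} 0\\ 2\\ 0\end{bmatrix} = 2e_{2},\]
so \(e_{2}\) is an eigenvector with eigenvalue \(2\); the computations for \(e_{1}\) and \(e_{3}\) are analogous.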
Theorem. Let \(A\) be a square matrix and let \(\lambda\) be a scalar. The following are equivalent:
(1) \(\lambda\) is an eigenvalue of \(A\);
(2) \(N(A-\lambda I)\neq\{0\}\);
(3) the columns of \(A-\lambda I\) are linearly dependent;
(4) \(A-\lambda I\) is not invertible.
Example. Assume there is a number \(\lambda\in\R\) such that
\[\begin{bmatrix}1 & -1\\ 1 & 1\end{bmatrix} - \lambda\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = \begin{bmatrix} 1-\lambda & -1\\ 1 & 1-\lambda\end{bmatrix}\]
has dependent columns. Then there is a constant \(c\) such that
\[c\begin{bmatrix} 1-\lambda\\1\end{bmatrix} = \begin{bmatrix} -1\\ 1-\lambda \end{bmatrix}\]
This means \(c=1-\lambda\), and hence \((1-\lambda)^2=-1\); but the square of a real number is never negative, so no such \(\lambda\) exists.
This matrix has no (real) eigenvalues!
Notice: All triangular matrices are SQUARE!
Definition. An \(n\times n\) matrix \(A\) is lower triangular if for every \(i,j\) such that \(1\leq i<j\leq n\) the entry of \(A\) in row \(i\) column \(j\) is zero.
A square matrix \(A\) is upper triangular if \(A^{\top}\) is lower triangular.
Examples.
\(\begin{bmatrix}1 & 0 & 0\\ 1 & 2 & 0\\ 1 & -2 & 0.2\end{bmatrix}\) is lower triangular
\(\begin{bmatrix}1 & 1 & -1\\ 0 & 1 & 0\\ 0 & 0 & -0.2\end{bmatrix}\) is upper triangular
\(\begin{bmatrix}2 & -3 & 0\\ 0 & 0 & 0\\ 0 & 0 & -0.2\end{bmatrix}\) is upper triangular.
\(\begin{bmatrix} 0 & 1\\ 0 & 1\end{bmatrix}\) is upper triangular.
\(\begin{bmatrix}1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & 0\end{bmatrix}\) is upper triangular and lower triangular.
\(\begin{bmatrix} 3 & 2 & 0 & 0\\ 3 & 1 & 4 & 0 \\ 0 & -2 & 3 & 1\\ 0 & 0 & 1 & -2\end{bmatrix}\) is not upper triangular or lower triangular.
Definition. If a square matrix \(A\) is both upper triangular and lower triangular, then we say that \(A\) is a diagonal matrix.
A square matrix is diagonal if and only if the only nonzero entries in the matrix are on the main diagonal. The zero matrix and the identity matrix are two examples of diagonal matrices.
Theorem. A triangular matrix \(A\) is invertible if and only if all of the entries on the main diagonal are nonzero.
Proof. An upper triangular matrix with all diagonal entries nonzero has the form
\[\begin{bmatrix}d_{1} & \ast & \ast & \cdots & \ast\\ 0 & d_{2} & \ast & \cdots & \ast\\ 0 & 0 & d_{3} & \cdots & \ast\\[-1ex] \vdots & \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & \cdots & d_{n}\end{bmatrix}\]
where all of the \(d_{i}\)'s are nonzero. We can create a pivot in each row, so the matrix row reduces to the identity and is invertible. Conversely, if some diagonal entry \(d_{i}\) is zero, then the first \(i\) columns all lie in \(\operatorname{span}\{e_{1},\ldots,e_{i-1}\}\), so they are linearly dependent and the matrix is not invertible. The lower triangular case follows by taking transposes. \(\Box\)
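For example, the matrix \(\begin{bmatrix}2 & -3 & 0\\ 0 & 0 & 0\\ 0 & 0 & -0.2\end{bmatrix}\) from the examples above has a zero diagonal entry, and indeed its first two columns, \([2\ \ 0\ \ 0]^{\top}\) and \([-3\ \ 0\ \ 0]^{\top}\), are linearly dependent, so that matrix is not invertible.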
Theorem. If \(A\) is a triangular matrix, then the eigenvalues of \(A\) are the entries on the main diagonal of \(A\).
Proof. We will assume \(A\) is upper triangular; the proof for lower triangular matrices is similar. If \(\lambda\) is an entry on the diagonal of \(A\), then \(A-\lambda I\) is upper triangular with a zero on its diagonal, so \(A-\lambda I\) is not invertible and \(\lambda\) is an eigenvalue of \(A\). Conversely, if \(\lambda\) is not one of the diagonal entries of \(A\), then \(A-\lambda I\) is upper triangular with every diagonal entry nonzero, hence invertible, so \(\lambda\) is not an eigenvalue of \(A\). \(\Box\)
Example. Consider the matrix \(A = \begin{bmatrix} 1 & 1 & 2\\ 0 & 2 & 0\\ 0 & 0 & 3\end{bmatrix}\)
Note that \(A-2I = \begin{bmatrix} -1 & 1 & 2\\ 0 & 0 & 0\\ 0 & 0 & 1\end{bmatrix}\), and \(\operatorname{rref}(A-2I) = \begin{bmatrix} 1 & -1 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0\end{bmatrix}\). From this we can see that \[x=[1\ \ 1\ \ 0]^{\top}\in N(\operatorname{rref}(A-2I))=N(A-2I),\] and thus \(x\) is an eigenvector of \(A\) with eigenvalue \(2\).
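As a quick check:
\[Ax = \begin{bmatrix} 1 & 1 & 2\\ 0 & 2 & 0\\ 0 & 0 & 3\end{bmatrix}\begin{bmatrix} 1\\ 1\\ 0\end{bmatrix} = \begin{bmatrix} 2\\ 2\\ 0\end{bmatrix} = 2x.\]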
Definition. Two matrices \(A\) and \(B\) are called similar if there is an invertible matrix \(X\) such that
\[A=XBX^{-1}.\]
(Note that this definition implies that \(A\) and \(B\) are both square and the same size.)
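Note that similarity is a symmetric relation: if \(A=XBX^{-1}\), then multiplying on the left by \(X^{-1}\) and on the right by \(X\) gives
\[X^{-1}AX = X^{-1}\big(XBX^{-1}\big)X = B,\]
so \(B\) is similar to \(A\) via the invertible matrix \(X^{-1}\).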
Theorem. If \(A\) and \(B\) are similar, then they have the exact same set of eigenvalues.
Proof. Assume \(A=XBX^{-1}\) and \(\lambda\) is an eigenvalue of \(B\). This means that \(Bx=\lambda x\) for some nonzero vector \(x\). Set \(y=Xx\). Since \(X\) is invertible, \(y\) is not the zero vector, and
\[Ay = XBX^{-1}Xx = XBx=X\lambda x = \lambda Xx = \lambda y\]
hence \(\lambda\) is an eigenvalue of \(A\). Since \(B=X^{-1}AX\), the other direction is similar. \(\Box\)
Example. Consider the diagonal matrix
\[A = \begin{bmatrix} 1 & 0 & 0\\ 0 & -2 & 0\\ 0 & 0 & 3\end{bmatrix},\]
and the invertible matrix
\[X = \begin{bmatrix} 1 & 2 & 0\\ 0 & 1 & -1\\ 0 & 0 & 1\end{bmatrix}\quad\text{with inverse }\quad X^{-1} = \begin{bmatrix} 1 & -2 & -2\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{bmatrix}.\]
The matrix \[XAX^{-1}=\begin{bmatrix} 1 & 2 & 0\\ 0 & 1 & -1\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix} 1 & 0 & 0\\ 0 & -2 & 0\\ 0 & 0 & 3\end{bmatrix}\begin{bmatrix} 1 & -2 & -2\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & -6 & -6\\ 0 & -2 & -5\\ 0 & 0 & 3 \end{bmatrix},\]
is similar to \(A\).
Example continued. The eigenvalues and eigenvectors of \(A\) are obvious. The eigenvectors of \(XAX^{-1}\) are not.
\[A = \begin{bmatrix} 1 & 0 & 0\\ 0 & -2 & 0\\ 0 & 0 & 3\end{bmatrix}:\qquad
\begin{array}{c|c}
\text{Eigenvalue } \lambda & \text{Eigenspace } N(A-\lambda I)\\ \hline
1 & \text{span}\left\{\begin{bmatrix} 1\\ 0\\ 0\end{bmatrix}\right\}\\
-2 & \text{span}\left\{\begin{bmatrix} 0\\ 1\\ 0\end{bmatrix}\right\}\\
3 & \text{span}\left\{\begin{bmatrix} 0\\ 0\\ 1\end{bmatrix}\right\}
\end{array}\]
\[XAX^{-1} = \begin{bmatrix} 1 & -6 & -6\\ 0 & -2 & -5\\ 0 & 0 & 3 \end{bmatrix}:\qquad
\begin{array}{c|c}
\text{Eigenvalue } \lambda & \text{Eigenspace } N(XAX^{-1}-\lambda I)\\ \hline
1 & \text{span}\left\{\begin{bmatrix} 1\\ 0\\ 0\end{bmatrix}\right\}\\
-2 & \text{span}\left\{\begin{bmatrix} 2\\ 1\\ 0\end{bmatrix}\right\}\\
3 & \text{span}\left\{\begin{bmatrix} 0\\ -1\\ 1\end{bmatrix}\right\}
\end{array}\]
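Notice that each eigenvector of \(XAX^{-1}\) listed above is \(X\) times the corresponding eigenvector of \(A\) (for example, \(Xe_{2} = [2\ \ 1\ \ 0]^{\top}\)), exactly as in the proof of the theorem on similar matrices. As a direct check of the middle row of the second table:
\[\begin{bmatrix} 1 & -6 & -6\\ 0 & -2 & -5\\ 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} 2\\ 1\\ 0\end{bmatrix} = \begin{bmatrix} -4\\ -2\\ 0\end{bmatrix} = -2\begin{bmatrix} 2\\ 1\\ 0\end{bmatrix}.\]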
Definition. A matrix \(A\) is called diagonalizable if it is similar to a diagonal matrix.
Example. Take the diagonal matrix
\[A = \begin{bmatrix} 2 & 0 & 0 & 0\\ 0 & 3 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 0\end{bmatrix}\]
Set
\[B = \begin{bmatrix}-3 & -2 & 0 & 4\\ 1 & 2 & -2 & -3\\ -1 & -2 & 3 & 3\\ -3 & -3 & -1 & 5\end{bmatrix}\begin{bmatrix} 2 & 0 & 0 & 0\\ 0 & 3 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 0\end{bmatrix}\begin{bmatrix} 1 & -8 & -6 & -2\\ 4 & -14 & -11 & -5\\ 0 & 1 & 1 & 0\\ 3 & -13 & -10 & -4\end{bmatrix}\]
\[= \left[\begin{array}{rrrr}-30 & 132 & 102 & 42\\ 26 & -98 & -76 & -34\\ -26 & 97 & 75 & 34\\ -42 & 175 & 136 & 57\end{array}\right]\]
Since the two outer matrices in this product are inverses of each other, \(B\) is similar to the diagonal matrix \(A\). So \(B\) is diagonalizable.
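By the theorem on similar matrices, the eigenvalues of \(B\) are \(2, 3, -1,\) and \(0\). As a quick check, the first column of the left-hand factor above, \([-3\ \ 1\ \ -1\ \ -3]^{\top}\), is an eigenvector of \(B\) with eigenvalue \(2\):
\[B\begin{bmatrix} -3\\ 1\\ -1\\ -3\end{bmatrix} = \begin{bmatrix} -6\\ 2\\ -2\\ -6\end{bmatrix} = 2\begin{bmatrix} -3\\ 1\\ -1\\ -3\end{bmatrix}.\]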
Theorem. If \(A\) is an \(n\times n\) diagonalizable matrix, that is,
\[A = X\Lambda X^{-1}\]
for some invertible matrix \(X\) and diagonal matrix \(\Lambda\), then both of the following are true:
(1) the diagonal entries of \(\Lambda\) are the eigenvalues of \(A\), and
(2) the columns of \(X\) form a basis of \(\mathbb{R}^{n}\) consisting of eigenvectors of \(A\).
Proof. Since \(A\) and \(\Lambda\) are similar, they have the same eigenvalues, and the eigenvalues of the triangular (indeed diagonal) matrix \(\Lambda\) are its diagonal entries; this proves (1). For (2), multiplying \(A = X\Lambda X^{-1}\) on the right by \(X\) gives \(AX = X\Lambda\). The \(i\)-th column of \(AX\) is \(Ax_{i}\), where \(x_{i}\) is the \(i\)-th column of \(X\), and the \(i\)-th column of \(X\Lambda\) is \(\lambda_{i}x_{i}\), where \(\lambda_{i}\) is the \(i\)-th diagonal entry of \(\Lambda\). Hence \(Ax_{i}=\lambda_{i}x_{i}\). Since \(X\) is invertible, its columns are nonzero and form a basis of \(\mathbb{R}^{n}\), so each \(x_{i}\) is an eigenvector of \(A\). \(\Box\)
Theorem. If \(A\) is an \(n\times n\) matrix and \(\{v_{1},\ldots,v_{n}\}\) is a basis of eigenvectors of \(A\) with corresponding eigenvalues \(\{\lambda_{1},\ldots,\lambda_{n}\}\), then
\[A = X\Lambda X^{-1}\]
where
\[X = \begin{bmatrix} | & | & | & & |\\ v_{1} & v_{2} & v_{3} & \cdots & v_{n}\\ | & | & | & & |\end{bmatrix}\]
and
\[\Lambda = \text{diag}(\lambda_{1},\lambda_{2},\ldots,\lambda_{n}) = \begin{bmatrix} \lambda_{1} & 0 & \cdots & 0\\ 0 & \lambda_{2} & & \vdots\\ \vdots & & \ddots & 0\\ 0 & \cdots & 0 & \lambda_{n}\end{bmatrix}\]
Proof. Since \(\{v_{1},\ldots,v_{n}\}\) is a basis of \(\mathbb{R}^{n}\), the matrix \(X\) is invertible, so \(X\Lambda X^{-1}\) makes sense. We will show that \(Ax=X\Lambda X^{-1} x\) for an arbitrary \(x\in\mathbb{R}^{n}\). Since \(\{v_{1},\ldots, v_{n}\}\) is a basis, there are scalars \(\alpha_{1},\ldots,\alpha_{n}\) such that
\[x = \alpha_{1}v_{1} + \alpha_{2}v_{2} + \cdots + \alpha_{n}v_{n}\]
Multiplying by \(A\) on the left we have
\[Ax = \alpha_{1}Av_{1} + \alpha_{2}Av_{2} + \cdots + \alpha_{n}Av_{n}\]
\[ = \alpha_{1}\lambda_{1}v_{1} + \alpha_{2}\lambda_{2}v_{2} + \cdots + \alpha_{n}\lambda_{n}v_{n}\]
By the definition of \(X\) we see that \(Xe_{i} = v_{i}\), and hence \(X^{-1}v_{i} = e_{i}\) for \(i=1,\ldots,n\). Hence,
\[X\Lambda X^{-1}x = X\Lambda X^{-1}\big(\alpha_{1}v_{1} + \alpha_{2}v_{2} + \cdots + \alpha_{n}v_{n}\big)\]
\[ = X\Lambda \big(\alpha_{1}X^{-1}v_{1} + \alpha_{2}X^{-1}v_{2} + \cdots + \alpha_{n}X^{-1}v_{n}\big)\]
\[ = X\Lambda \big(\alpha_{1}e_{1} + \alpha_{2}e_{2} + \cdots + \alpha_{n}e_{n}\big)\]
\[ = X\big(\alpha_{1}\lambda_{1} e_{1} + \alpha_{2}\lambda_{2} e_{2} + \cdots + \alpha_{n}\lambda_{n}e_{n}\big)\]
\[ = \alpha_{1}\lambda_{1} v_{1} + \alpha_{2}\lambda_{2} v_{2} + \cdots + \alpha_{n}\lambda_{n}v_{n},\]
which equals \(Ax\) by the computation above.
\(\Box\)
Example. Consider the matrix
\[A=\left[\begin{array}{rrr} 2 & -1 & 2\\ 0 & 3 & -2\\ 0 & 0 & -1\end{array}\right]\]
Since \(A\) is upper triangular, the eigenvalues of \(A\) are the entries on the main diagonal. Thus, the eigenvalues of \(A\) are \(2,3,\) and \(-1\).
By the previous theorem, to diagonalize \(A\) we need to find a basis for \(\mathbb{R}^{3}\) consisting of eigenvectors of \(A\).
Since we know all the eigenvalues of \(A\), we can find all of the eigenvectors by finding \(N(A-\lambda I)\) for \(\lambda=2,3,-1\).
\[A-(-1) I=\left[\begin{array}{rrr} 3 & -1 & 2\\ 0 & 4 & -2\\ 0 & 0 & 0\end{array}\right]\]
Row reducing we get
\[\text{rref}(A-(-1)I) = \left[\begin{array}{rrr} 1 &0 & \frac{1}{2}\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 0\end{array}\right]\]
Hence, we see that
\[N(A-(-1)I) = \text{span}\left\{\begin{bmatrix} -\frac{1}{2}\\ \frac{1}{2}\\ 1\end{bmatrix}\right\}\]
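As a quick check:
\[A\begin{bmatrix} -\frac{1}{2}\\ \frac{1}{2}\\ 1\end{bmatrix} = \left[\begin{array}{rrr} 2 & -1 & 2\\ 0 & 3 & -2\\ 0 & 0 & -1\end{array}\right]\begin{bmatrix} -\frac{1}{2}\\ \frac{1}{2}\\ 1\end{bmatrix} = \begin{bmatrix} \frac{1}{2}\\ -\frac{1}{2}\\ -1\end{bmatrix} = (-1)\begin{bmatrix} -\frac{1}{2}\\ \frac{1}{2}\\ 1\end{bmatrix}.\]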
Similarly
\[N(A-3I) = \text{span}\left\{\begin{bmatrix} -1\\ 1\\ 0\end{bmatrix}\right\}\quad\text{and }\quad N(A-2I) = \text{span}\left\{\begin{bmatrix} 1\\ 0\\ 0\end{bmatrix}\right\}\]
Select a nonzero vector from each eigenspace and form the matrix with those vectors as columns:
\[X=\begin{bmatrix} 1 & -1 & -\frac{1}{2}\\ 0 & 1 & \frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}\]
Since the columns of \(X\) form a basis for \(\R^{3}\), we see that \(X\) is invertible. Indeed,
\[X^{-1} = \begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}.\]
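One can check directly that
\[XX^{-1} = \begin{bmatrix} 1 & -1 & -\frac{1}{2}\\ 0 & 1 & \frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 1\end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{bmatrix}.\]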
Finally,
\[\begin{bmatrix} 1 & -1 & -\frac{1}{2}\\ 0 & 1 & \frac{1}{2}\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix} 2 & 0 & 0\\ 0 & 3 & 0\\ 0 & 0 & -1\end{bmatrix}\begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & -\frac{1}{2}\\ 0 & 0 & 1\end{bmatrix} = \left[\begin{array}{rrr} 2 & -1 & 2\\ 0 & 3 & -2\\ 0 & 0 & -1\end{array}\right]\]
where the three matrices in the product on the left are \(X\), \(\Lambda\), and \(X^{-1}\), and the product on the right is \(A\). Hence, \(A\) is diagonalizable!
Take \[J=\begin{bmatrix} 1 & 1\\ 0 & 1\end{bmatrix}.\]
Since \(J\) is upper triangular, we see that \(1\) is the only eigenvalue of \(J\).
Next we compute \(\operatorname{rref}(J-I) = \begin{bmatrix} 0 & 1\\ 0 & 0\end{bmatrix}\), and from this we see that
\[N(J-I) = \operatorname{span}\left\{\begin{bmatrix} 1\\ 0\end{bmatrix}\right\}.\]
That is, the eigenvectors of \(J\) are exactly \([a\ \ 0]^{\top}\) for \(a\neq 0\). No two of these are linearly independent, and hence there is no basis for \(\mathbb{R}^{2}\) consisting of eigenvectors of \(J\).
Hence, \(J\) is not diagonalizable.
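Another way to see this: if \(J\) were diagonalizable, say \(J = X\Lambda X^{-1}\) with \(\Lambda\) diagonal, then the diagonal entries of \(\Lambda\) would be eigenvalues of \(J\), forcing \(\Lambda = I\). But then
\[J = XIX^{-1} = XX^{-1} = I,\]
which is false, so no such \(X\) and \(\Lambda\) exist.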