Symmetric matrices
Definition. A matrix \(A\) is called symmetric if \(A=A^{\top}\).
Example. Consider the matrix
\[A = \begin{bmatrix} 1 & -1 \\ 1 & \phantom{-}1\end{bmatrix}.\]
This matrix has no real eigenvalues: its characteristic polynomial is \((1-\lambda)^{2}+1\), which has no real roots. If we look at \(A\) as a complex matrix, however, then it does have eigenvalues:
\[\begin{bmatrix} 1 & -1 \\ 1 & \phantom{-}1\end{bmatrix}\begin{bmatrix} 1\\ i\end{bmatrix} = \begin{bmatrix} 1-i\\ 1+i\end{bmatrix} = (1-i)\begin{bmatrix} 1\\ i\end{bmatrix}\]
Hence \(1-i\) is a complex eigenvalue of \(A\).
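This can be checked numerically. Below is a minimal sketch using NumPy (`numpy.linalg.eigvals` returns the eigenvalues of a real matrix as complex numbers):

```python
import numpy as np

A = np.array([[1., -1.],
              [1.,  1.]])

# The complex eigenvalues of A:
print(np.linalg.eigvals(A))               # [1.+1.j 1.-1.j]

# Verify the eigenvector computation from the example:
v = np.array([1.0, 1.0j])                 # the vector (1, i)
print(np.allclose(A @ v, (1 - 1j) * v))   # True
```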
Definition. Let \(A\) be an \(n\times n\) matrix. We say that \(\lambda\in\mathbb{C}\) is a complex eigenvalue of \(A\) if there exists a nonzero vector \(v\in\mathbb{C}^{n}\) such that \(Av=\lambda v\).
Note: If we say that \(\lambda\) is an eigenvalue of \(A\), without the word "complex," then we mean it in the previously defined sense, that is, \(\lambda\in\mathbb{R}\) and there is some nonzero vector \(v\in\mathbb{R}^{n}\) such that \(Av=\lambda v\). Note that a complex eigenvalue which happens to be real is also an eigenvalue in this sense; this is Proposition 1 below.
Notation. If \(c\in\mathbb{C}\), then \(c=a+ib\) for some \(a,b\in\mathbb{R}\).
The complex conjugate of \(c\) is \(\overline{c} = a-ib\).
The modulus of \(c\) is \(|c| = \sqrt{c\overline{c}} = \sqrt{a^2+b^2}\).
(Note that \(c\overline{c} = |c|^2\).)
If \(v=\begin{bmatrix} v_{1}\\ v_{2}\\ \vdots\\ v_{n}\end{bmatrix}\in\mathbb{C}^{n},\) then \(\overline{v} = \begin{bmatrix} \overline{v_{1}}\\ \overline{v_{2}}\\ \vdots\\ \overline{v_{n}}\end{bmatrix},\) and the adjoint of \(v\) is the row vector \(v^{\ast} = \overline{v}^{\top}=\begin{bmatrix} \overline{v_{1}} & \overline{v_{2}} & \cdots & \overline{v_{n}}\end{bmatrix}\in\mathbb{C}^{1\times n}.\)
Theorem. If \(A\) is an \(n\times n\) matrix, then there is a number \(\lambda\in\mathbb{C}\) and a nonzero vector \(v\in\mathbb{C}^{n}\) such that
\[Av = \lambda v.\]
That is, any square matrix has a complex eigenvalue. (This follows from the fundamental theorem of algebra: the characteristic polynomial \(\det(A-\lambda I)\) has degree \(n\), so it has a root \(\lambda\in\mathbb{C}\).)
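For illustration, here is a short NumPy sketch (the matrices are chosen only for illustration): `eigvals` always returns \(n\) eigenvalues over \(\mathbb{C}\), even for a real matrix with no real eigenvalue at all.

```python
import numpy as np

# A "generic" real square matrix has n complex eigenvalues:
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
print(np.linalg.eigvals(B))        # 4 complex numbers (some may be real)

# A 90-degree rotation has no real eigenvalue, but +/- i over C:
R = np.array([[0., -1.],
              [1.,  0.]])
print(np.linalg.eigvals(R))        # [0.+1.j 0.-1.j]
```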
Proposition 1. If \(\lambda\) is a complex eigenvalue of \(A\in\mathbb{R}^{n\times n}\), and \(\lambda\in\mathbb{R}\), then \(\lambda\) is an eigenvalue of \(A\).
Proof. By the definition of a complex eigenvalue there is some nonzero vector \(v\in\mathbb{C}^{n}\) such that \(Av=\lambda v\). If \(v\in\mathbb{R}^{n}\), then we're done, so we may assume \(v\notin\mathbb{R}^{n}\). This means that \(w=i(v-\overline{v})\) is a nonzero vector in \(\mathbb{R}^{n}\). Finally,
\[Aw=i(Av-A\overline{v}) =i(\lambda v - \overline{Av}) = i(\lambda v - \overline{\lambda v}) = i(\lambda v-\lambda\overline{v}) = \lambda w.\ \Box\]
(Here we used that \(A\) is real, so \(A\overline{v} = \overline{Av}\), and that \(\lambda\) is real, so \(\overline{\lambda v} = \lambda\overline{v}\).)
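The construction in the proof can be illustrated numerically; in this sketch the matrix and the complex eigenvector are chosen only for illustration:

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])                 # symmetric; eigenvalue 3 on (1, 1)

# A genuinely complex eigenvector for the real eigenvalue 3:
v = (1 + 2j) * np.array([1., 1.])
print(np.allclose(A @ v, 3 * v))         # True

# The proof's construction w = i(v - conj(v)) is real, nonzero,
# and still an eigenvector for the same eigenvalue:
w = 1j * (v - np.conj(v))
print(w)                                 # [-4.+0.j -4.+0.j], i.e. real entries
print(np.allclose(A @ w, 3 * w))         # True
```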
Proposition 2. If \(A\) is symmetric and \(\lambda\) is a complex eigenvalue of \(A\), then \(\lambda\in\mathbb{R}\).
Proof. Since \(\lambda\) is a (possibly complex) eigenvalue, there is a (possibly complex) nonzero vector \(v\) such that \(Av=\lambda v.\) Since \(v\neq 0\) we can set \(w = \frac{1}{\|v\|}v\); then \(\|w\|=1\) and \(Aw = \lambda w\). Now
\[\lambda = \lambda\|w\|^2 = \lambda(w^{\ast} w) = w^{\ast}(\lambda w) = w^{\ast} Aw = w^{\ast}A^{\top} w = (Aw)^{\ast}w = (\lambda w)^{\ast}w = \overline{\lambda}\, w^{\ast} w = \overline{\lambda}\|w\|^2 = \overline{\lambda}.\]
If \(\lambda = a+bi\), then \(\overline{\lambda} = a-bi\), and hence we have \(a+bi = a-bi\). This implies \(2bi=0\), and thus \(b=0\). Therefore, \(\lambda\in\mathbb{R}\). \(\Box\)
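A quick numerical illustration (a sketch; the random symmetric matrix is an arbitrary choice): the eigenvalues of a symmetric matrix come out real, which is exactly the fact that NumPy's specialized routine `eigvalsh` relies on.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = B + B.T                          # symmetric by construction

lam = np.linalg.eigvals(A)           # general-purpose (complex) routine
print(np.allclose(lam.imag, 0.0))    # True: every eigenvalue is real

# eigvalsh is specialized to symmetric/Hermitian matrices and returns
# the eigenvalues as real numbers, sorted in ascending order:
print(np.linalg.eigvalsh(A))
```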
Theorem (The spectral theorem part I). If \(A\) is an \(n\times n\) symmetric matrix, and \(A\neq 0\), then \(A\) has a nonzero eigenvalue.
Proof. First assume \(N(A)=\{0\}\); then \(0\) is not an eigenvalue of \(A\). By the previous theorem \(A\) has a complex eigenvalue \(\lambda\). Since \(A\) is symmetric, \(\lambda\) is real, and by Proposition 1 \(\lambda\) is an eigenvalue of \(A\). Since \(0\) is not an eigenvalue, \(\lambda\neq 0\), and we are done in this case.
Now, assume \(N(A)\) is nontrivial. Let \(\{v_{1},\ldots,v_{k}\}\) be an orthonormal basis for \(N(A)\). Let \(\{v_{k+1},\ldots,v_{n}\}\) be an orthonormal basis for \(N(A)^{\bot}.\) Let \(X\) be the matrix whose columns are \(v_{1},v_{2},\ldots,v_{n}\). Note that \(X\) is an orthogonal matrix, that is, \(X^{-1}=X^{\top}\). From this we can see that \(X^{-1}AX\) is symmetric, indeed,
\[(X^{-1}AX)^{\top} =(X^{\top}AX)^{\top} = (AX)^{\top}(X^{\top})^{\top} = X^{\top}A^{\top}X = X^{-1}AX.\]
\[X^{-1}AX = X^{\top}AX = X^{\top}A\begin{bmatrix} | & | & & |\\ v_{1} & v_{2} & \cdots & v_{n}\\ | & | & & |\end{bmatrix}\]
\[=\begin{bmatrix} - & v_{1}^{\top} & -\\ - & v_{2}^{\top} & -\\ & \vdots & \\ - & v_{n}^{\top} & -\\\end{bmatrix}\begin{bmatrix} | & | & & |\\ Av_{1} & Av_{2} & \cdots & Av_{n}\\ | & | & & |\end{bmatrix}\]
\[ = \begin{bmatrix} v_{1}^{\top}Av_{1} & v_{1}^{\top}Av_{2} & \cdots & v_{1}^{\top}Av_{n}\\ v_{2}^{\top}Av_{1} & v_{2}^{\top}Av_{2} & \cdots & v_{2}^{\top}Av_{n}\\ \vdots & \vdots & \ddots & \vdots\\ v_{n}^{\top}Av_{1} & v_{n}^{\top}Av_{2} & \cdots & v_{n}^{\top}Av_{n}\end{bmatrix} = \begin{bmatrix} \mathbf{0} & \mathbf{0}\\ \mathbf{0} & A'\end{bmatrix}\]
This last equality follows from the fact that \(Av_{j}=0\) for \(j=1,2,\ldots,k\) (so the first \(k\) columns vanish), and the first \(k\) rows vanish as well since, for \(i=1,2,\ldots,k\),
\[v_{i}^{\top}Av_{j} = v_{i}^{\top}A^{\top}v_{j} = (Av_{i})^{\top}v_{j} = 0.\]
Hence
\[X^{-1}AX = \begin{bmatrix} \mathbf{0} & \mathbf{0}\\ \mathbf{0} & A' \end{bmatrix},\]
where \(A'\) is an \((n-k)\times (n-k)\) matrix, and \(A'\) is symmetric since \(X^{-1}AX\) is.
If \(v'\in N(A')\), then we set \[v =\begin{bmatrix}\mathbf{0}\\ v'\end{bmatrix},\]
and it is clear that \(X^{-1}AXv = 0\). However, by the definition of \(X\) we see that \(Xv\in N(A)^{\bot}\). In particular, either \(Xv=0\) or \(Xv\notin N(A)\). If \(Xv\notin N(A)\), then \(AXv\neq 0\), and since \(X^{-1}\) is invertible, \(X^{-1}AXv\neq 0\), a contradiction. Hence \(Xv=0\), and since \(X\) is invertible, \(v=0\), so \(v'=0\). This shows that the only vector in \(N(A')\) is the zero vector.
Since \(A'\) is symmetric and \(N(A')=\{0\}\), the first case of the proof shows that \(A'\) has a real eigenvalue \(\lambda\neq 0\) with an eigenvector \(w'\). To complete the proof, we set
\[w=X\begin{bmatrix}\mathbf{0}\\ w'\end{bmatrix}\]
(note that \(w\neq 0\), since \(X\) is invertible and \(w'\neq 0\)) and we compute
\[Aw=XX^{-1}AX\begin{bmatrix}\mathbf{0}\\ w'\end{bmatrix} = X \begin{bmatrix} \mathbf{0} & \mathbf{0}\\ \mathbf{0} & A' \end{bmatrix} \begin{bmatrix}\mathbf{0}\\ w'\end{bmatrix} = X\begin{bmatrix}\mathbf{0}\\ \lambda w'\end{bmatrix} = \lambda w.\ \Box\]
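The change of basis in the proof can be reproduced numerically. Below is a sketch; the matrix \(A\) and the use of the SVD to produce orthonormal bases for \(N(A)\) and \(N(A)^{\bot}\) are choices made for illustration:

```python
import numpy as np

# A symmetric 3x3 matrix with a one-dimensional null space:
A = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [0., 0., 2.]])

# The right singular vectors (columns of Vt.T) form an orthonormal
# basis of R^3; those with singular value ~0 span N(A), and the
# remaining ones span N(A)-perp.
U, s, Vt = np.linalg.svd(A)
null_cols  = Vt.T[:, s < 1e-10]          # orthonormal basis of N(A)
range_cols = Vt.T[:, s >= 1e-10]         # orthonormal basis of N(A)-perp
X = np.hstack([null_cols, range_cols])   # the matrix X from the proof

print(np.allclose(X.T @ X, np.eye(3)))   # True: X is orthogonal
print(np.round(X.T @ A @ X, 10))         # block form [[0, 0], [0, A']]
```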
Proposition. Let \(A\) be a symmetric matrix. If \(v\) and \(w\) are eigenvectors of \(A\) with eigenvalues \(\lambda\) and \(\mu\) respectively, and \(\lambda\neq \mu\), then \(v\cdot w = 0\).
Proof. First, we calculate
\[\lambda (v^{\top}w) = v^{\top}(\lambda w) = v^{\top}Aw=v^{\top}A^{\top} w= (Av)^{\top}w=(\mu v)^{\top}w = \mu(v^{\top}w).\]
This implies
\[(\lambda - \mu)(v^{\top}w)=0.\]
Since \(\lambda-\mu\neq 0\) we conclude that \(v\cdot w=v^{\top}w=0.\) \(\Box\)
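A short numerical illustration (the random symmetric matrix is an arbitrary choice; its eigenvalues are distinct with probability one, so the proposition applies to every pair of eigenvectors):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = B + B.T                              # symmetric

lam, V = np.linalg.eigh(A)               # columns of V are unit eigenvectors
print(lam)                               # distinct eigenvalues

# Eigenvectors for distinct eigenvalues are pairwise orthogonal:
print(np.allclose(V.T @ V, np.eye(4)))   # True
```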
Lemma. Assume \(A\) is a symmetric matrix. If \(v\) is a unit norm eigenvector of \(A\) with eigenvalue \(\lambda\neq 0\), then
\[A_{1} = A - \lambda vv^{\top}\]
is a symmetric matrix with \(\text{rank}(A_{1}) \leq \text{rank} (A)-1.\)
Proof. That \(A_{1}\) is symmetric follows from the fact that \((vv^{\top})^{\top} = (v^{\top})^{\top}v^{\top} = vv^{\top}\) and the fact that taking the transpose is linear.
Note that \[A_{1}v = Av-\lambda vv^{\top}v = \lambda v-\lambda v\|v\|^2 = \lambda v-\lambda v= 0.\] Therefore, \(v\in N(A_{1})\).
Let \(\{e_{1},e_{2},\ldots,e_{k}\}\) be an orthonormal basis for \(N(A)\). Note that every nonzero element of \(N(A)\) is an eigenvector of \(A\) with eigenvalue \(0\). Since \(\lambda\neq 0\), the previous proposition gives \[v\cdot e_{i} = 0\text{ for }i=1,2,\ldots,k.\] From this we deduce that each \(e_{1},\ldots,e_{k}\) is in \(N(A_{1})\); indeed, \(A_{1}e_{i} = Ae_{i}-\lambda v(v^{\top}e_{i}) = 0\). Moreover, \(\{e_{1},\ldots,e_{k},v\}\) is an orthonormal, hence linearly independent, set in \(N(A_{1})\). Therefore \(\dim N(A_{1})\geq k+1\), and by the rank-nullity theorem \[\text{rank}(A_{1}) = n-\dim N(A_{1}) \leq n-(k+1) = \text{rank}(A)-1.\ \Box\]
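Here is a numerical sketch of the lemma (the rank-2 symmetric matrix is chosen for illustration): subtracting \(\lambda vv^{\top}\) for a unit norm eigenvector \(v\) with \(\lambda\neq 0\) drops the rank by one.

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 2., 0.],
              [0., 0., 0.]])          # symmetric, rank 2

lam, V = np.linalg.eigh(A)            # eigh returns unit-norm eigenvectors
i = int(np.argmax(np.abs(lam)))       # index of a nonzero eigenvalue
v = V[:, i]

A1 = A - lam[i] * np.outer(v, v)      # the deflated matrix from the lemma
print(np.linalg.matrix_rank(A))       # 2
print(np.linalg.matrix_rank(A1))      # 1
```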
Example. Consider the symmetric matrix
\[A = \begin{bmatrix} -1 & -3 & 7 & 5\\ -3 & -1 & 5 & 7\\ 7 & 5 & -1 & -3\\ 5 & 7 & -3 & -1 \end{bmatrix}\]
\[\text{rref}(A-4I) = \begin{bmatrix} 1 & 0 & 0 & 1\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & 1\\ 0 & 0 & 0 & 0\end{bmatrix}\]
From this we can see that
\[v = \frac{1}{2}\begin{bmatrix}-1\\ 1\\ -1\\ 1\end{bmatrix}\]
is a unit norm eigenvector of \(A\) with eigenvalue \(4\). (The lemma above requires an eigenvector with norm \(1\), which is why we include the factor \(\frac{1}{2}\).)
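For those following along in software, the row reduction can be reproduced with SymPy (a sketch; `Matrix.rref` returns the reduced row echelon form together with the pivot columns):

```python
import sympy as sp

A = sp.Matrix([[-1, -3,  7,  5],
               [-3, -1,  5,  7],
               [ 7,  5, -1, -3],
               [ 5,  7, -3, -1]])

R, pivots = (A - 4 * sp.eye(4)).rref()
print(R)                       # the rref displayed above

v = sp.Rational(1, 2) * sp.Matrix([-1, 1, -1, 1])
print(A * v == 4 * v)          # True: eigenvector with eigenvalue 4
print(v.norm())                # 1: unit norm, as the lemma requires
```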
Example continued.
\[A - 4 vv^{\top}= \begin{bmatrix} -1 & -3 & 7 & 5\\ -3 & -1 & 5 & 7\\ 7 & 5 & -1 & -3\\ 5 & 7 & -3 & -1 \end{bmatrix} - \begin{bmatrix} 1 & -1 & 1 & -1\\ -1 & 1 & -1 & 1\\ 1 & -1 & 1 & -1\\ -1 & 1 & -1 & 1\end{bmatrix}\]
\[= \begin{bmatrix} -2 & -2 & 6 & 6\\ -2 & -2 & 6 & 6\\ 6 & 6 & -2 & -2\\ 6 & 6 & -2 & -2 \end{bmatrix}\]
\[\text{rref}(A) = \begin{bmatrix} 1 & 0 & 0 & -1\\ 0 & 1 & 0 & 1\\ 0 & 0 & 1 & 1\\ 0 & 0 & 0 & 0\end{bmatrix}\text{ and }\text{rref}(A-4vv^{\top}) = \begin{bmatrix} 1 & 1 & 0 & 0\\ 0 & 0 & 1 & 1\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\end{bmatrix}\]
\(\text{rank}(A) = 3\) and \(\text{rank}(A-4vv^{\top})=2\), in agreement with the lemma.
Example continued.
Note that \[A_{1}:=A-4vv^{\top}= \begin{bmatrix} -2 & -2 & 6 & 6\\ -2 & -2 & 6 & 6\\ 6 & 6 & -2 & -2\\ 6 & 6 & -2 & -2 \end{bmatrix}\]
is a nonzero symmetric matrix. Hence, by the spectral theorem (part I), it has a nonzero eigenvalue. In particular \(8\) is an eigenvalue, since
\[\text{rref}(A_{1} - 8I) = \begin{bmatrix} 1 & 0 & 0 & -1\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & -1\\ 0 & 0 & 0 & 0\end{bmatrix}\]
From this we see that
\[w=\frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1\end{bmatrix}^{\top}\]
is a unit norm eigenvector of \(A_{1}\) with eigenvalue \(8.\)
Example continued.
\[A_{1} - 8 ww^{\top}= \begin{bmatrix} -2 & -2 & 6 & 6\\ -2 & -2 & 6 & 6\\ 6 & 6 & -2 & -2\\ 6 & 6 & -2 & -2 \end{bmatrix} - \begin{bmatrix} 2 & 2 & 2 & 2\\ 2 & 2 & 2 & 2\\ 2 & 2 & 2 & 2\\ 2 & 2 & 2 & 2 \end{bmatrix}\]
\[= \left[\begin{array}{rrrr} -4 & -4 & 4 & 4\\ -4 & -4 & 4 & 4\\ 4 & 4 & -4 & -4\\ 4 & 4 & -4 & -4 \end{array}\right]\]
\[A_{2}: = A - 4vv^{\top} - 8ww^{\top}= \begin{bmatrix} -4 & -4 & 4 & 4\\ -4 & -4 & 4 & 4\\ 4 & 4 & -4 & -4\\ 4 & 4 & -4 & -4 \end{bmatrix}\]
\[\text{rank}(A_{2}) = 1\]
Example continued.
\[A_{2}= \begin{bmatrix} -4 & -4 & 4 & 4\\ -4 & -4 & 4 & 4\\ 4 & 4 & -4 & -4\\ 4 & 4 & -4 & -4 \end{bmatrix}\]
Finally, \(y=\frac{1}{2}\begin{bmatrix} 1 & 1 & -1 & -1\end{bmatrix}^{\top}\) is a unit norm eigenvector of \(A_{2}\) with eigenvalue \(-16\). By the lemma, \(A_{2} - (-16)yy^{\top}\) has rank zero, so it must be the zero matrix!
\[0 = A_{2} - (-16)yy^{\top} = A_{1} - 8ww^{\top} - (-16)yy^{\top} = A - 4vv^{\top} - 8ww^{\top} - (-16)yy^{\top},\]
and hence
\[A = 4vv^{\top} + 8ww^{\top} + (-16)yy^{\top}.\]
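As a final check, the decomposition can be verified numerically (a short NumPy sketch):

```python
import numpy as np

A = np.array([[-1., -3.,  7.,  5.],
              [-3., -1.,  5.,  7.],
              [ 7.,  5., -1., -3.],
              [ 5.,  7., -3., -1.]])

v = 0.5 * np.array([-1.,  1., -1.,  1.])
w = 0.5 * np.array([ 1.,  1.,  1.,  1.])
y = 0.5 * np.array([ 1.,  1., -1., -1.])

# Reassemble A from the three deflation steps:
S = 4 * np.outer(v, v) + 8 * np.outer(w, w) + (-16) * np.outer(y, y)
print(np.allclose(A, S))    # True
```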