Day 9:

Rank Nullity

Theorem. The following quantities are equal:

  1. The rank of \(A\)
  2. The number of pivots in \(\text{rref}(A)\)
  3. The number of pivots in \(\text{rref}(A^{\top})\)
  4. The rank of \(A^{\top}\)

Proof. The only thing we need to show is that \(\text{rref}(A)\) and \(\text{rref}(A^{\top})\) have the same number of pivots. But we'll actually show that \(\dim C(A^{\top})\) equals the number of pivots in \(\operatorname{rref}(A)\).

Let's recall the following useful theorem:

Theorem. If \(A\) is any matrix, then \(C(A^{\top}) = C(\operatorname{rref}(A)^{\top})\).

Example. \[A = \left[\begin{array}{rrrr} 2 & \phantom{-}4 & -2 & 8\\ 1 & 2 & 2 & -5\\ 1 & 2 & 0 & 1\end{array}\right]\]

\[\operatorname{rref}(A) = \left[\begin{array}{rrrr} 1 & \phantom{-}2 & \phantom{-}0 & 1\\ 0 & 0 & 1 & -3\\ 0 & 0 & 0 & 0\end{array}\right]\]

Since \(\left\{\begin{bmatrix} 2\\ 1\\ 1\end{bmatrix},\begin{bmatrix}-2\\ 2\\ 0\end{bmatrix}\right\}\) is a basis for \(C(A)\) we see that

\[\#\text{ pivots in }\operatorname{rref}(A) = \dim C(A)\]

Since \(C(A^{\top}) = C(\operatorname{rref}(A)^{\top})\), the set \(\left\{\begin{bmatrix} 1\\ 2\\ 0\\ 1\end{bmatrix},\begin{bmatrix} 0\\ 0\\ 1\\ -3\end{bmatrix}\right\}\) spans \(C(A^{\top})\).

This set is also clearly independent, and hence is a basis for \(C(A^{\top})\). Hence

\[\dim C(A^{\top}) = \#\text{ pivots in }\operatorname{rref}(A).\]
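The pivot counts in this example can be checked by machine. The following is a minimal sketch, not part of the lecture: the helper names `rref` and `transpose` are our own, and exact arithmetic with `fractions.Fraction` is an assumption made to avoid rounding error.

```python
from fractions import Fraction

def rref(matrix):
    """Return (R, pivot_cols): the reduced row echelon form and its pivot columns."""
    R = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(R), len(R[0])
    pivot_cols = []
    r = 0  # index of the next pivot row
    for c in range(cols):
        if r == rows:
            break
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if p is None:
            continue  # column c has no pivot
        R[r], R[p] = R[p], R[r]
        pivot = R[r][c]
        R[r] = [x / pivot for x in R[r]]  # scale so the pivot equals 1
        for i in range(rows):
            if i != r:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivot_cols.append(c)
        r += 1
    return R, pivot_cols

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

# The matrix A from the example above.
A = [[2, 4, -2, 8],
     [1, 2, 2, -5],
     [1, 2, 0, 1]]

R, pivots = rref(A)
Rt, pivots_t = rref(transpose(A))
print("pivot columns of rref(A):", pivots)
print("number of pivots in rref(A) and rref(A^T):", len(pivots), len(pivots_t))
```

Both pivot counts come out equal, as the theorem predicts, even though \(A\) and \(A^{\top}\) have different shapes.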

Proof. Let \(A\) be a matrix, and let \(w_{1}^{\top},w_{2}^{\top},\ldots,w_{\ell}^{\top}\) denote the nonzero rows of \(\operatorname{rref}(A)\), that is,

\[\operatorname{rref}(A) = \begin{bmatrix} - & w_{1}^{\top} & -\\ - & w_{2}^{\top} & -\\ & \vdots & \\ - & w_{\ell}^{\top} & -\\ - & 0 & -\\ & \vdots & \\ - & 0 & -\end{bmatrix}.\]

Each nonzero row contains a unique pivot, hence

\[\ell = \#\text{ pivots in }\operatorname{rref}(A).\]

Since \(C(A^{\top}) = C(\operatorname{rref}(A)^{\top})\), we deduce that \(\{w_{1},w_{2},\ldots,w_{\ell}\}\) spans \(C(A^{\top})\). Moreover, looking at the pivots we see that \(\{w_{1},w_{2},\ldots,w_{\ell}\}\) is independent: each \(w_{i}\) has a 1 in its pivot position, where every other \(w_{j}\) has a 0. Hence this set is a basis for \(C(A^{\top})\), that is, \(\ell = \dim C(A^{\top})\). \(\Box\)


Corollary. The subspaces \(C(A)\) and \(C(A^{\top})\) have the same dimension.

 

Caution: \(C(A)\) and \(C(A^{\top})\) are almost never the same subspace. Indeed, if \(A\) is \(m\times n\), then \(C(A)\) is a subspace of \(\mathbb{R}^{m}\) and \(C(A^{\top})\) is a subspace of \(\mathbb{R}^{n}\).

Definition. Given a matrix \(A\in\mathbb{R}^{m\times n}\), the dimension of the null space of \(A\) is called the nullity of \(A\), and is denoted \(\operatorname{nullity}(A).\)

Hence, we now have at least three symbols for the same quantity:

\[\dim N(A) = \dim\operatorname{ker}(A) = \operatorname{nullity}(A).\]

We have already claimed that the nullity of \(A\) is equal to the number of non-pivot columns in \(\operatorname{rref}(A)\). The rank-nullity theorem below shows that this is true.

Lemma. If \(A\in\mathbb{R}^{m\times n}\), then \(C(A^{\top}) \cap N(A) = \{0\}.\)

 

Proof. Suppose \(x\in C(A^{\top})\) and \(x\in N(A)\). By the definition of the row space there is some \(y\in\mathbb{R}^{m}\) such that \(x=A^{\top}y\). Multiplying by \(A\) on both sides we have

\[0 = Ax = AA^{\top}y.\]

Multiplying on the left by \(y^{\top}\) we obtain

\[ 0 = y^{\top} 0 = y^{\top}AA^{\top}y = (A^{\top}y)^{\top}(A^{\top}y) = x^{\top}x.\]

Now, suppose \(x = [x_{1}\ \ x_{2}\ \ \cdots\ \ x_{n}]^{\top}\). This last equality shows that

\[x_{1}^{2} + x_{2}^{2} + \cdots + x_{n}^{2} = 0,\]

which clearly implies \(x=0\). \(\Box\)
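As a worked check of the lemma (an illustration of ours, reusing the \(3\times 4\) matrix \(A\) from the earlier example), take \(w_{1} = [1\ \ 2\ \ 0\ \ 1]^{\top}\) and \(w_{2} = [0\ \ 0\ \ 1\ \ {-3}]^{\top}\), which span \(C(A^{\top})\). Suppose \(x = aw_{1} + bw_{2}\) also lies in \(N(A)\). Since \(N(A) = N(\operatorname{rref}(A))\) and the nonzero rows of \(\operatorname{rref}(A)\) are \(w_{1}^{\top}\) and \(w_{2}^{\top}\), this means \(w_{1}^{\top}x = 0\) and \(w_{2}^{\top}x = 0\), that is,

\[\begin{aligned} 6a - 3b &= 0,\\ -3a + 10b &= 0.\end{aligned}\]

The first equation gives \(b = 2a\), and substituting into the second gives \(17a = 0\). Hence \(a = b = 0\) and \(x = 0\), as the lemma predicts.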

Theorem (The Rank-nullity theorem). If \(A\) is an \(m\times n\) matrix, then \[\text{rank}(A) + \operatorname{nullity}(A) = n.\]

Proof.  

  • Let \(W:=\{w_{1},w_{2},\ldots,w_{\ell}\}\) be the nonzero columns of \(\operatorname{rref}(A)^{\top}\).
  • We have already seen that \(W\) is a basis for \(C(A^{\top})\), and hence \(\ell = \dim C(A^{\top}) = \dim C(A) = \operatorname{rank}(A)\).
  • Let \(\{v_{1},\ldots,v_{k}\}\) be a basis for \(N(A)\)  (hence \(k=\operatorname{nullity}(A)\))
  • Note that \(\operatorname{rref}(A)v_{i} = 0\) for all \(i\in\{1,2,\ldots,k\}\).
  • I claim that \(\{w_{1},w_{2},\ldots,w_{\ell},v_{1},v_{2},\ldots,v_{k}\}\) is an independent set in \(\mathbb{R}^{n}\).
  • Suppose there are scalars \(a_{1},\ldots,a_{\ell}\) and \(b_{1},\ldots,b_{k}\) such that \[\sum_{i=1}^{\ell}a_{i}w_{i} + \sum_{j=1}^{k}b_{j}v_{j} = \boldsymbol{0}\]

  • This implies that \[\sum_{i=1}^{\ell}a_{i}w_{i} = - \sum_{j=1}^{k}b_{j}v_{j} = \begin{bmatrix}c_{1}\\ c_{2}\\ \vdots\\ c_{n}\end{bmatrix} = \boldsymbol{c}\]
  • Hence \[ \boldsymbol{c} = \sum_{i=1}^{\ell}a_{i}w_{i} \in C(A^{\top}) \quad\text{and}\quad \boldsymbol{c} = - \sum_{j=1}^{k}b_{j}v_{j} \in N(A).\]
  • By the lemma, \(C(A^{\top})\cap N(A) = \{0\}\), so this implies \[\boldsymbol{c} = \sum_{i=1}^{\ell}a_{i}w_{i} = - \sum_{j=1}^{k}b_{j}v_{j} = \boldsymbol{0}\]

  • Since both sets \(\{w_{1},\ldots,w_{\ell}\}\) and \(\{v_{1},\ldots,v_{k}\}\) are independent, we conclude that \(a_{1}=a_{2}=\cdots =a_{\ell} = 0 = b_{1} = b_{2} = \cdots = b_{k}\).
  • Hence \(\{w_{1},w_{2},\ldots,w_{\ell},v_{1},v_{2},\ldots,v_{k}\}\) is an independent set in \(\mathbb{R}^{n}\).
  • An independent set in \(\mathbb{R}^{n}\) contains at most \(n\) vectors. Therefore, \( k+\ell \leq n\).
  • We have already seen how to construct an independent set \(\{u_{1},\ldots,u_{r}\}\subset N(A)\) where \[r = (\#\text{ of non-pivot columns in }\text{rref}(A)) = n-\ell.\]
  • This implies \(\ k = \operatorname{nullity}(A) \geq r = n-\ell\).
  • Thus \(n\leq k+\ell\leq n\), that is \(k+\ell = n\). \(\Box\)   
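The counting in this proof can be checked numerically. The sketch below is ours rather than the lecture's: it assumes the independent set \(\{u_{1},\ldots,u_{r}\}\subset N(A)\) is built in the usual way (one vector per non-pivot column, with that free variable set to 1), and the function names `rref` and `null_space_basis` are hypothetical.

```python
from fractions import Fraction

def rref(matrix):
    """Reduced row echelon form with exact fractions; returns (R, pivot_cols)."""
    R = [[Fraction(x) for x in row] for row in matrix]
    pivot_cols, r = [], 0
    for c in range(len(R[0])):
        if r == len(R):
            break
        p = next((i for i in range(r, len(R)) if R[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        R[r], R[p] = R[p], R[r]
        pivot = R[r][c]
        R[r] = [x / pivot for x in R[r]]
        for i in range(len(R)):
            if i != r:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivot_cols.append(c)
        r += 1
    return R, pivot_cols

def null_space_basis(matrix):
    """Return (basis, rank): one vector of N(A) per non-pivot column of rref(A)."""
    R, pivot_cols = rref(matrix)
    n = len(matrix[0])
    basis = []
    for f in (c for c in range(n) if c not in pivot_cols):
        u = [Fraction(0)] * n
        u[f] = Fraction(1)              # set this free variable to 1
        for row, p in enumerate(pivot_cols):
            u[p] = -R[row][f]           # solve for the pivot variables
        basis.append(u)
    return basis, len(pivot_cols)

# The 3 x 4 matrix from the earlier example.
A = [[2, 4, -2, 8],
     [1, 2, 2, -5],
     [1, 2, 0, 1]]

basis, rank = null_space_basis(A)
nullity = len(basis)
n = len(A[0])
print(rank, nullity, rank + nullity == n)  # prints: 2 2 True
```

Here \(\ell = 2\) and \(k = 2\), so \(k+\ell = 4 = n\), matching the theorem.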

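Fill in each blank with always, sometimes, or never.

Example 1. For matrices \(A\in\mathbb{R}^{3\times 4}\) such that \(\operatorname{rank}(A) = 3\), it _____________ holds that \(\operatorname{nullity}(A) = 1\).

Answer: always. By rank-nullity, \(\operatorname{nullity}(A) = 4 - 3 = 1\).

Example 2. For matrices \(A\in\mathbb{R}^{m\times n}\) such that \(\operatorname{rank}(A) = n\), it _____________ holds that \(\operatorname{nullity}(A) = 0\).

Answer: always. By rank-nullity, \(\operatorname{nullity}(A) = n - n = 0\).

Example 3. For matrices \(A\in\mathbb{R}^{m\times n}\) such that \(\operatorname{rank}(A) = m\) and \(n\leq m\), it _____________ holds that \(\operatorname{nullity}(A) = 0\).

Answer: always. Since \(\operatorname{rank}(A)\leq n\leq m = \operatorname{rank}(A)\), we get \(\operatorname{rank}(A) = n\), and hence \(\operatorname{nullity}(A) = 0\).

Example 4. For matrices \(A\in\mathbb{R}^{m\times n}\) such that \(m<n\), it _____________ holds that \(\operatorname{nullity}(A) = \operatorname{nullity}(A^{\top})\).

Answer: never. Since \(\operatorname{nullity}(A) = n - \operatorname{rank}(A)\) and \(\operatorname{nullity}(A^{\top}) = m - \operatorname{rank}(A)\), the two differ by \(n-m>0\).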

Rank theorems

Theorem 1. \(\text{rank}(AB)\leq \min\{\text{rank}(A),\text{rank}(B)\}\).

Proof.  

  • Note that the columns of \(AB\) are all in \(C(A)\), hence the dimension of \(C(AB)\) is at most the dimension of \(C(A)\), that is, \[\text{rank}(AB)\leq \text{rank}(A).\]
  • \(\text{rank}(AB)=\text{rank}((AB)^{\top}) = \text{rank}(B^{\top}A^{\top})\)
  • By the first bullet \(\text{rank}(B^{\top}A^{\top})\leq \text{rank}(B^{\top}).\)
  • Since \(\text{rank}(B^{\top})=\text{rank}(B)\) we have \[\text{rank}(AB) = \text{rank}(B^{\top}A^{\top})\leq \text{rank}(B^{\top}) = \text{rank}(B).\]
  • Hence, \[\text{rank}(AB)\leq \text{rank}(A)\quad\text{ and }\quad\text{rank}(AB)\leq \text{rank}(B).\]
  • This is the same as \[\text{rank}(AB)\leq \min\{\text{rank}(A),\text{rank}(B)\}.\]
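To see Theorem 1 on concrete numbers, here is a small sketch. The matrices are illustrative choices of ours, and the helpers `rank` and `matmul` are hypothetical names, written with exact fractions so that no rounding is involved.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via forward elimination: count the pivots found."""
    M = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # number of pivots found so far
    for c in range(len(M[0])):
        if r == len(M):
            break
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(A, B):
    """Matrix product of A (m x n) and B (n x p) as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Case 1: the bound is attained.
A1 = [[1, 2], [2, 4], [0, 1]]   # rank 2
B1 = [[1, 1, 0], [1, 1, 0]]     # rank 1
# Case 2: the inequality is strict, since the product is the zero matrix.
A2 = [[1, 0], [0, 0]]           # rank 1
B2 = [[0, 0], [0, 1]]           # rank 1

for A, B in [(A1, B1), (A2, B2)]:
    print(rank(matmul(A, B)), "<=", min(rank(A), rank(B)))
```

The second case shows that \(\text{rank}(AB)\) can be strictly smaller than both \(\text{rank}(A)\) and \(\text{rank}(B)\).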

Theorem 2. \(\text{rank}(A+B)\leq \text{rank}(A) + \text{rank}(B)\).

Proof.  

  • Let \(v_{1},\ldots,v_{k}\) be a basis for \(C(A)\), where \(k=\text{rank}(A)\)
  • Let \(w_{1},\ldots,w_{\ell}\) be a basis for \(C(B)\), where \(\ell=\text{rank}(B)\)
  • Each column of \(A+B\) is the sum of a column of \(A\) and a column of \(B\), so the columns of \(A+B\) lie in \(\text{span}\{v_{1},\ldots,v_{k},w_{1},\ldots,w_{\ell}\}\)
  • So, \(C(A+B)\subset\text{span}\{v_{1},\ldots,v_{k},w_{1},\ldots,w_{\ell}\}\) 
  • Hence \[\text{rank}(A+B)\leq k+\ell = \text{rank}(A)+\text{rank}(B)\]
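As a quick check of Theorem 2 (the matrices here are illustrative choices of ours, not taken from the lecture), the bound can hold with equality or strictly. With

\[A = \begin{bmatrix} 1 & 0\\ 0 & 0\end{bmatrix},\qquad B = \begin{bmatrix} 0 & 0\\ 0 & 1\end{bmatrix},\qquad A+B = \begin{bmatrix} 1 & 0\\ 0 & 1\end{bmatrix},\]

we get \(\text{rank}(A+B) = 2 = \text{rank}(A)+\text{rank}(B)\). Replacing \(B\) by \(-A\) gives \(A+B = 0\), so \(\text{rank}(A+B) = 0 < 2 = \text{rank}(A)+\text{rank}(-A)\).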

Lemma. \(N(A)=N(A^{\top}A).\)

Proof.  First we show that \(N(A)\subset N(A^{\top}A)\).

  • \(x\in N(A)\) \(\Rightarrow\) \(Ax=0\) \(\Rightarrow\) \(A^{\top}Ax=0\) \(\Rightarrow\) \(x\in N(A^{\top}A)\).

Next, we show the reverse inclusion \(N(A)\supset N(A^{\top}A).\)

  • Let \(x\in N(A^{\top}A)\)
  • Then \(A^{\top}Ax=0\).
  • Hence \(Ax\in N(A^{\top})\).
  • But \(Ax\in C(A)\) as well.
  • We have already shown that \(C(A)\) and \(N(A^{\top})\) are orthogonal. 
  • Since \(Ax\) lies in both subspaces, \((Ax)\cdot(Ax) = 0\).
  • This only happens if \(Ax=0\), that is, \(x\in N(A)\). \(\Box\)

Theorem 3. \(\text{rank}(A^{\top}A)=\text{rank}(AA^{\top}) = \text{rank}(A) = \text{rank}(A^{\top})\).

Proof.  

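  • By the lemma \(N(A) = N(A^{\top}A)\).
  • Let \(v_{1},\ldots,v_{k}\) be a basis for \(C(A)\), so that \(k = \text{rank}(A)\).
  • Then, \(A^{\top}v_{1},\ldots,A^{\top}v_{k}\) spans \(C(A^{\top}A)\).
  • We claim that this set of vectors is independent, and hence a basis.
  • Assume we have scalars \(\alpha_{1},\ldots,\alpha_{k}\) so that \[0 = \alpha_{1}A^{\top}v_{1} + \cdots + \alpha_{k}A^{\top}v_{k} = A^{\top}(\alpha_{1}v_{1} + \cdots + \alpha_{k}v_{k})\]
  • Set \(x = \alpha_{1}v_{1} + \cdots + \alpha_{k}v_{k}\) and note that \(x\in C(A)\) and \(A^{\top}x = 0\).
  • Since \(x\in C(A)\), there is some vector \(y\) so that \(x=Ay\).
  • Then \(A^{\top}Ay=A^{\top}x=0\).
  • By the lemma, \(A^{\top}Ay=0\) \(\Rightarrow\) \(Ay=0\), that is, \(x=0\).
  • Since \(v_{1},\ldots,v_{k}\) is independent, \(x=0\) \(\Rightarrow\) the \(\alpha_{i}\)'s are all zero.
  • Hence \(\text{rank}(A^{\top}A) = k = \text{rank}(A)\). Applying this to \(A^{\top}\) in place of \(A\) gives \(\text{rank}(AA^{\top}) = \text{rank}(A^{\top})\), and we have already seen that \(\text{rank}(A^{\top}) = \text{rank}(A)\). \(\Box\)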
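A small worked instance of Theorem 3 (the matrix is an illustrative choice of ours): take

\[A = \begin{bmatrix} 1\\ 2\end{bmatrix},\qquad A^{\top}A = \begin{bmatrix} 5\end{bmatrix},\qquad AA^{\top} = \begin{bmatrix} 1 & 2\\ 2 & 4\end{bmatrix}.\]

The matrices \(A^{\top}A\) and \(AA^{\top}\) have different sizes (\(1\times 1\) and \(2\times 2\)), yet each has rank 1: the second row of \(AA^{\top}\) is twice the first. This agrees with \(\text{rank}(A) = \text{rank}(A^{\top}) = 1\).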