Inverse of a matrix
Abstract vector spaces
Theorem. Let \(A\) be an \(m\times n\) matrix. Let \(I\) be an \(m\times m\) identity matrix. If
\[\text{rref}([A\ |\ I]) = [D\ |\ B]\]
then \(BA=\text{rref}(A)\).
Proof. Note that the left-hand block of \(\text{rref}([A\ \vert\ I])\) is \(\text{rref}(A)\), so \[\text{rref}([A\ |\ I]) = [\text{rref}(A)\ |\ B].\] Let \(C\) be the product of elementary matrices such that
\[C\cdot [A\ \vert\ I] = \text{rref}([A\ \vert\ I])= [\text{rref}(A)\ \vert\ B].\]
Finally, note that \[C\cdot [A\ \vert\ I] = [CA\ \vert\ CI] = [CA\ \vert\ C],\] and hence \(C=B\) and \(BA=\text{rref}(A)\). \(\Box\)
Example.
\[\text{rref}\left(\left[\begin{array}{rrr|rr} 2 & 3 & 1 & 1 & 0\\ 2 & 3 & -2 & 0 & 1 \end{array}\right]\right) = \left[\begin{array}{rrr|rr} 1 & 3/2 & 0 & 1/3 & 1/6\\ 0 & 0 & 1 & 1/3 & -1/3 \end{array}\right]\]
\[\left[\begin{array}{rr} 1/3 & 1/6\\ 1/3 & -1/3 \end{array}\right]\left[\begin{array}{rrr} 2 & 3 & 1\\ 2 & 3 & -2 \end{array}\right] = \left[\begin{array}{rrr} 1 & 3/2 & 0\\ 0 & 0 & 1\end{array}\right]\]
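This computation is easy to check by machine; a quick sketch using sympy (assuming it is installed) reproduces the row reduction and verifies the identity \(BA=\text{rref}(A)\):

```python
from sympy import Matrix, Rational

# A and the augmented matrix [A | I] from the example above
A = Matrix([[2, 3, 1],
            [2, 3, -2]])
aug = A.row_join(Matrix.eye(2))

# rref() returns (reduced matrix, pivot column indices)
R, pivots = aug.rref()

# The right-hand 2x2 block is B; the theorem says B*A = rref(A)
B = R[:, 3:]
assert B == Matrix([[Rational(1, 3), Rational(1, 6)],
                    [Rational(1, 3), Rational(-1, 3)]])
assert B * A == A.rref()[0]
```

Because the entries start as integers, sympy keeps everything as exact rationals, so the comparison with \(\text{rref}(A)\) is exact rather than approximate.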
Definition. Given a square matrix \(A\), a square matrix \(B\) such that \(AB=BA=I\) is called the inverse of \(A\). The inverse of \(A\) is denoted \(A^{-1}\). If a matrix \(A\) has an inverse, then we say that \(A\) is invertible.
Theorem. A matrix \(A\) is invertible if and only if \(A\) is square and \(\text{rref}(A)=I\).
Proof. Suppose \(A\) is square and \(\text{rref}(A)=I\). By the previous theorem there exists a matrix \(B\), a product of elementary matrices, so that \(BA=\text{rref}(A)=I\). For each elementary matrix \(E\) there is an elementary matrix \(F\) such that \(FE=I\). Writing \[B= E_{k}\cdot E_{k-1}\cdots E_{1},\]
we take \(F_{i}\) such that \(F_{i}E_{i}=I\) for each \(i\), and set \[C = F_{1}\cdot F_{2}\cdots F_{k},\]
so that \(CB = F_{1}\cdots F_{k}\cdot E_{k}\cdots E_{1} = I\). Finally, we have \[AB = (CB)(AB) = C(BA)B =C I B = CB = I,\] so \(B=A^{-1}\) and \(A\) is invertible.
Now, suppose \(A\) is invertible. By definition \(A\) is square. If \(x\) is a vector such that \(Ax=0\), then \(x = A^{-1}Ax=A^{-1}0 = 0\). Thus, \(x=0\) is the only solution to \(Ax=0\). Hence, \(\{0\}=N(A) = N(\operatorname{rref}(A))\). This implies that every column of \(\operatorname{rref}(A)\) must contain a pivot, since a pivot-free column would produce a free variable and hence a nonzero solution. Since \(A\) is square, every row of \(\operatorname{rref}(A)\) must then also contain a pivot. This implies \(\operatorname{rref}(A)=I\). \(\Box\)
Example. Let \[A = \begin{bmatrix} 2 & 3 & 4\\ 3 & 4 & 0\\ 5 & 7 & 4\end{bmatrix}\]
Is \(A\) invertible? If it is, find \(A^{-1}\).
Note that
\[\text{rref}\left(\left[\begin{array}{ccc|ccc} 2 & 3 & 4 & 1 & 0 & 0\\ 3 & 4 & 0 & 0 & 1 & 0\\ 5 & 7 & 4 & 0 & 0 & 1\end{array}\right]\right) = \left[\begin{array}{ccc|ccc} 1 & 0 &-16 & 0 & 7 & -4\\ 0 & 1 & 12 & 0 & -5 & 3\\ 0 & 0 & 0 & 1 & 1 & -1\end{array}\right]\]
The left-hand block is \(\text{rref}(A)\), which has a row of zeros, so \(\text{rref}(A)\neq I\) and hence \(A\) is not invertible.
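As a sanity check, the singularity of this \(A\) can be confirmed with sympy (assuming it is available):

```python
from sympy import Matrix

A = Matrix([[2, 3, 4],
            [3, 4, 0],
            [5, 7, 4]])

# rref() returns (reduced matrix, pivot column indices)
R, pivots = A.rref()

# rref(A) has a zero row, so rref(A) != I and A is not invertible
assert R != Matrix.eye(3)
assert R[2, :] == Matrix.zeros(1, 3)
assert A.det() == 0
```

Here the third row of \(A\) is the sum of the first two, which is why the determinant vanishes and the reduction produces a zero row.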
Example. Let \[A = \begin{bmatrix} 0 & -1 & -2\\ 1 & 1 & 1\\ -1 & -1 & 0\end{bmatrix}\]
Is \(A\) invertible? If it is, find \(A^{-1}\).
Note that
\[\text{rref}\left(\left[\begin{array}{ccc|ccc} 0 & -1 & -2 & 1 & 0 & 0\\ 1 & 1 & 1 & 0 & 1 & 0\\ -1 & -1 & 0& 0 & 0 & 1\end{array}\right]\right) = \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 2 & 1\\ 0 & 1 & 0 & -1 & -2 & -2\\ 0 & 0 & 1 & 0 & 1 & 1\end{array}\right]\]
Since the left-hand block shows \(\text{rref}(A)= I\), we see that \(A\) is invertible. Moreover, reading off the right-hand block, \[A^{-1} = \begin{bmatrix} 1 & 2 & 1\\ -1 & -2 & -2\\ 0 & 1 & 1\end{bmatrix}\]
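The same augmented-matrix procedure can be sketched in sympy (assuming it is available), reading \(A^{-1}\) off the right-hand block of \(\text{rref}([A\ \vert\ I])\):

```python
from sympy import Matrix

A = Matrix([[0, -1, -2],
            [1, 1, 1],
            [-1, -1, 0]])

# Row reduce [A | I]; rref() returns (reduced matrix, pivot columns)
R, _ = A.row_join(Matrix.eye(3)).rref()

# Left block is rref(A) = I, so A is invertible;
# the right block is A^{-1}
assert R[:, :3] == Matrix.eye(3)
Ainv = R[:, 3:]
assert Ainv == Matrix([[1, 2, 1], [-1, -2, -2], [0, 1, 1]])
assert Ainv * A == Matrix.eye(3) and A * Ainv == Matrix.eye(3)
```

The final assertion checks both defining equations \(AB=BA=I\) from the definition of the inverse.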
Definition. A vector space (over \(\mathbb{R}\)) is a set \(V\) together with two operations:
(i) vector addition, which assigns to each pair \(x,y\in V\) an element \(x+y\in V\), and
(ii) scalar multiplication, which assigns to each \(a\in\mathbb{R}\) and \(x\in V\) an element \(ax\in V\),
and an element \(0_{V}\in V\) called the zero vector such that for each \(x,y,z\in V\) and \(a,b\in\mathbb{R}\), the following hold:
(a) \(x+y=y+x\)
(b) \((x+y)+z = x + (y+z)\)
(c) \(0x=0_{V}\)
(d) \(1x=x\)
(e) \((ab)x = a(bx)\)
(f) \(a(x+y)=ax+ay\)
(g) \((a+b)x = ax+bx\)
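Note that an axiom asserting \(x + 0_{V} = x\) does not appear on the list; it follows from (c), (d), and (g):
\[x + 0_{V} = 1x + 0x = (1+0)x = 1x = x.\]
Similarly, every \(x\) has an additive inverse, namely \((-1)x\), since
\[x + (-1)x = 1x + (-1)x = (1 + (-1))x = 0x = 0_{V}.\]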
Definition. Let \(V\) be a vector space. A subset \(W\subset V\) is called a subspace of \(V\) if \(0_{V}\in W\) and, for each \(x,y\in W\) and \(a\in\mathbb{R}\), we have \(x+y\in W\) and \(ax\in W\).
Definition continued. Let \(V\) be a vector space. Given \(x_{1},\ldots,x_{k}\in V\), the span \(\operatorname{span}\{x_{i}\}_{i=1}^{k}\) is the set of all linear combinations \(a_{1}x_{1}+\cdots+a_{k}x_{k}\) with \(a_{1},\ldots,a_{k}\in\mathbb{R}\). The vectors \(x_{1},\ldots,x_{k}\) are linearly independent if \(a_{1}x_{1}+\cdots+a_{k}x_{k}=0_{V}\) only when \(a_{1}=\cdots=a_{k}=0\). A set of vectors whose span is \(V\) is called a spanning set for \(V\), and a linearly independent spanning set is called a basis for \(V\).
Note that if \(W\subset V\) is a subspace, then \(W\) is a vector space with the same vector addition, scalar multiplication, and zero vector as \(V\). Hence, it also makes sense to talk about a spanning set and a basis for \(W\).
Note that if \(x_{1},x_{2},\ldots,x_{k}\in V\), then the set \(\operatorname{span}\{x_{i}\}_{i=1}^{k}\) is a subspace of \(V\).
Definition. Suppose \(V\) and \(W\) are vector spaces. A function \(L:V\to W\) is called linear if for every \(x,y\in V\) and \(a\in\mathbb{R}\) the following two properties hold:
(a) \(L(x+y) = L(x) + L(y)\)
(b) \(L(ax) = aL(x)\)
Example 1. Let \[A = \begin{bmatrix} 1 & 2 & 3\\ 0 & 1 & 0\end{bmatrix}.\] Define the function \(L:\mathbb{R}^{3}\to\mathbb{R}^{2}\) by \(L(x) = Ax.\) Using the properties of matrix multiplication we proved in Week 1, for any \(x,y\in\mathbb{R}^{3}\) and \(a\in\mathbb{R}\) we have \[L(x+y) = A(x+y) = Ax+Ay = L(x) + L(y)\] and \[L(ax) = A(ax) = aAx = aL(x).\] Thus, \(L\) is a linear function.
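The linearity check in this example can even be carried out symbolically; the sketch below (using sympy, an assumption) verifies both properties for arbitrary symbolic vectors rather than specific numerical ones:

```python
from sympy import Matrix, symbols

A = Matrix([[1, 2, 3],
            [0, 1, 0]])
L = lambda v: A * v

# Arbitrary symbolic vectors in R^3 and an arbitrary scalar
x1, x2, x3, y1, y2, y3, a = symbols('x1 x2 x3 y1 y2 y3 a')
x = Matrix([x1, x2, x3])
y = Matrix([y1, y2, y3])

# Both defining properties of linearity hold identically
assert (L(x + y) - (L(x) + L(y))).expand() == Matrix.zeros(2, 1)
assert (L(a * x) - a * L(x)).expand() == Matrix.zeros(2, 1)
```

Because the entries are symbols, the two assertions amount to the same algebraic identities \(A(x+y)=Ax+Ay\) and \(A(ax)=aAx\) used in the example.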
Example 2. Define the function \(D:\mathbb{P}_{2}\to\mathbb{P}_{1}\) by \(Df(x) = f'(x)\) for \(f(x)\in\mathbb{P}_{2}\). For example,
\[D(2x^2+x-1) = 4x+1.\]
Using some properties of the derivative from calculus, if \(f(x),g(x)\in\mathbb{P}_{2}\) and \(c\in\mathbb{R}\), then
\[D(f(x) + g(x)) = f'(x) + g'(x) = D(f(x)) + D(g(x))\]
and
\[D(cf(x)) = cf'(x) = cD(f(x)).\]
Therefore, \(D\) is linear.
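Although \(D\) acts on polynomials rather than column vectors, once we fix coordinates it is represented by a matrix, just as in Example 1. In the sketch below the coordinate convention \(a_{0}+a_{1}x+a_{2}x^{2}\leftrightarrow(a_{0},a_{1},a_{2})\) is an illustrative choice, not something fixed by the notes:

```python
from sympy import Matrix

# Coordinates: a0 + a1*x + a2*x^2  <->  (a0, a1, a2)
# D(a0 + a1*x + a2*x^2) = a1 + 2*a2*x  <->  (a1, 2*a2)
D = Matrix([[0, 1, 0],
            [0, 0, 2]])

# D(2x^2 + x - 1) = 4x + 1, matching the worked example:
f = Matrix([-1, 1, 2])   # coordinates of 2x^2 + x - 1
assert D * f == Matrix([1, 4])  # coordinates of 4x + 1
```

Linearity of \(D\) is then inherited from linearity of matrix multiplication.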
Example 3. The function \(F:\mathbb{R}\to\mathbb{R}\) given by \(F(x) = 2x+1\) is not linear, since
\[F(1+1)=F(2) = 2(2)+1=5\neq 6 = F(1)+F(1).\]
Definition. Suppose \(V\) and \(W\) are vector spaces, and \(L:V\to W\) is linear. The image of \(L\) is the set
\[\operatorname{im}(L) := \{Lv : v\in V\}.\]
The kernel of \(L\) is the set
\[\operatorname{ker}(L):= \{v : L(v) = 0\}.\]
If \(A\in\mathbb{R}^{m\times n}\), and we set \(L(x) = Ax\) for \(x\in\mathbb{R}^{n}\), then \(L:\mathbb{R}^{n}\to\mathbb{R}^{m}\) is linear; moreover,
\[\operatorname{im}(L) = C(A)\quad\text{and}\quad \operatorname{ker}(L) = N(A).\]
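For a concrete matrix, both sets can be computed; a sketch using sympy (an assumption), whose `nullspace` and `columnspace` methods return lists of basis vectors:

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [0, 1, 0]])

# ker(L) = N(A): every basis vector of the null space is sent to 0
for v in A.nullspace():
    assert A * v == Matrix.zeros(2, 1)

# im(L) = C(A): columnspace() returns a basis (the pivot columns),
# so its size equals the rank of A
basis = A.columnspace()
assert len(basis) == A.rank()
```

For this \(A\) the null space is spanned by a single vector, while the column space is all of \(\mathbb{R}^{2}\) since \(A\) has rank \(2\).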