CS6015: Linear Algebra and Random Processes
Lecture 13: Orthonormal vectors, orthonormal basis, Gram-Schmidt orthogonalization, QR factorisation
Learning Objectives (for today's lecture)
What are orthonormal vectors?
What is an orthonormal basis?
How do you create an orthonormal basis (Gram-Schmidt process)?
What is QR factorisation?
Orthonormal vectors
Vectors \(q_1, q_2, \dots, q_n\) are said to be orthonormal if
q_i^\top q_j =
\begin{cases}
0 & \text{if } i \neq j\\
1 & \text{if } i = j
\end{cases}
If \(Q\) is a matrix whose columns are orthonormal then
Q = \begin{bmatrix}
\uparrow&\uparrow&&\uparrow\\
q_1&q_2&\dots&q_n\\
\downarrow&\downarrow&&\downarrow\\
\end{bmatrix}
Q^\top Q = \begin{bmatrix}
\leftarrow&q_1^\top&\rightarrow\\
\leftarrow&q_2^\top&\rightarrow\\
&\vdots&\\
\leftarrow&q_n^\top&\rightarrow\\
\end{bmatrix}
\begin{bmatrix}
\uparrow&\uparrow&&\uparrow\\
q_1&q_2&\dots&q_n\\
\downarrow&\downarrow&&\downarrow\\
\end{bmatrix}
= I
Why?
because the \(i,j\)-th entry of \(Q^\top Q \) will be \(q_i^\top q_j \)
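A quick numeric sanity check (a sketch, not from the slides; the two orthonormal vectors are hand-picked for illustration):

```python
# Verify Q^T Q = I for a Q whose columns are orthonormal.
import numpy as np

q1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)   # unit length
q2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)  # unit length, q1^T q2 = 0

Q = np.column_stack([q1, q2])                 # 3x2 matrix [q1 q2]

# (i, j) entry of Q^T Q is q_i^T q_j: 1 on the diagonal, 0 elsewhere
print(np.allclose(Q.T @ Q, np.eye(2)))        # True
```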
Orthogonal matrix
If \(Q\) is a square matrix whose columns are orthonormal, then it is called an orthogonal matrix
Q^\top Q = QQ^\top = I
(for a square matrix the left inverse is equal to the right inverse)
Examples:
Permutation:
\begin{bmatrix}
0&1&0\\
1&0&0\\
0&0&1\\
\end{bmatrix}
Rotation:
\begin{bmatrix}
\cos \theta&-\sin \theta\\
\sin \theta& \cos \theta
\end{bmatrix}
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 & 1\\
1 & -1
\end{bmatrix}
\frac{1}{3}
\begin{bmatrix}
1 & 2 & -2\\
2 & 1 & 2\\
2 & -2 & -1
\end{bmatrix}
Rectangular Matrix (orthonormal columns):
Q = \begin{bmatrix}
0&1&0\\
1&0&0\\
0&0&1\\
0&0&0\\
\end{bmatrix}
Q^\top = \begin{bmatrix}
0&1&0&0\\
1&0&0&0\\
0&0&1&0\\
\end{bmatrix}
Q^\top Q = I
but
QQ^\top = \begin{bmatrix}
1&0&0&0\\
0&1&0&0\\
0&0&1&0\\
0&0&0&0\\
\end{bmatrix}
\neq I
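The rectangular case is easy to confirm numerically; a minimal sketch using the same 4×3 matrix as above:

```python
# Q^T Q = I holds for orthonormal columns, but Q Q^T != I when Q is rectangular.
import numpy as np

Q = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])                  # the 4x3 example above

print(np.allclose(Q.T @ Q, np.eye(3)))        # True
print(Q @ Q.T)                                # diag(1, 1, 1, 0), not I_4
```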
Why do we care? (about orthogonal matrices)
Recap:
A^\top A\mathbf{\hat{x}} = A^\top\mathbf{b}
\mathbf{\hat{x}} = (A^\top A)^{-1}A^\top\mathbf{b}
If \(A = Q \) (a matrix with orthonormal columns)
Q^\top Q\mathbf{\hat{x}} = Q^\top\mathbf{b}
\mathbf{\hat{x}} = Q^\top\mathbf{b}
\begin{bmatrix}
\hat{x}_1\\
\hat{x}_2\\
\vdots\\
\hat{x}_n\\
\end{bmatrix} =
\begin{bmatrix}
\leftarrow&\mathbf{q}_1^\top&\rightarrow\\
\leftarrow&\mathbf{q}_2^\top&\rightarrow\\
&\vdots&\\
\leftarrow&\mathbf{q}_n^\top&\rightarrow\\
\end{bmatrix}
\mathbf{b}
\hat{x}_i = \mathbf{q}_i^\top\mathbf{b}
(Figure: \(\mathbf{b}\in \mathbb{R}^6\) projected onto the column space of \(A\), spanned by \(\mathbf{q}_1, \mathbf{q}_2 \in \mathbb{R}^6\); \(\mathbf{p}\) is the projection)
\mathbf{p} = \hat{x}_1\mathbf{q}_1 + \hat{x}_2\mathbf{q}_2 + \dots + \hat{x}_n\mathbf{q}_n
The co-ordinate of the projection of \(\mathbf{b} \) along each basis vector is simply the dot product of that basis vector with \(\mathbf{b} \) (as opposed to the complicated formula \(\mathbf{\hat{x}} = (A^\top A)^{-1}A^\top\mathbf{b}\))
\(\rightarrow\) An orthonormal basis is the best basis you can hope for!
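To see the computational pay-off, a hedged sketch (hypothetical random data, echoing the figure with \(\mathbf{q}_1, \mathbf{q}_2, \mathbf{b} \in \mathbb{R}^6\)): with orthonormal columns, the least-squares solution needs no matrix inversion.

```python
# Least squares with orthonormal columns: x_hat = Q^T b, no inverse needed.
import numpy as np

rng = np.random.default_rng(0)                # hypothetical data
q1 = rng.standard_normal(6)
q1 /= np.linalg.norm(q1)                      # normalise
v = rng.standard_normal(6)
v -= (q1 @ v) * q1                            # remove component along q1
q2 = v / np.linalg.norm(v)
Q = np.column_stack([q1, q2])                 # q1, q2 in R^6, orthonormal
b = rng.standard_normal(6)

x_hat = Q.T @ b                               # x_hat_i = q_i^T b
x_hat_normal_eq = np.linalg.solve(Q.T @ Q, Q.T @ b)  # the general formula
print(np.allclose(x_hat, x_hat_normal_eq))    # True
p = Q @ x_hat                                 # projection of b onto C(Q)
```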
What if the basis is not orthonormal?
Issue: The columns of \(A\) may not be orthonormal
Consequence: The basis vectors for the column space that we get from the pivot columns may not be orthonormal
Observation: We know that multiple bases exist for the same subspace
Wishlist: We want an orthonormal basis!
Question: Can we start from some non-orthonormal basis and derive an orthonormal one?
Answer: Yes, by using the Gram-Schmidt process
Gram-Schmidt Process
Given: non-orthonormal vectors \(\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n\)
Step 1: get orthogonal vectors \(\hat{\mathbf{a}}_1, \hat{\mathbf{a}}_2, \dots \hat{\mathbf{a}}_n\)
Step 2: get orthonormal vectors \(\mathbf{q}_1, \mathbf{q}_2, \dots \mathbf{q}_n\)
Step 2 is easy (we will not focus too much on it)
\mathbf{q}_i = \frac{\hat{\mathbf{a}}_i}{||\hat{\mathbf{a}}_i||_2}
(Figure: \(\mathbf{a}_1, \mathbf{a}_2\) are first made orthogonal, giving \(\hat{\mathbf{a}}_1, \hat{\mathbf{a}}_2\), and then normalised, giving \(\mathbf{q}_1, \mathbf{q}_2\))
Gram-Schmidt Process
(Figure: \(\mathbf{a}_2\) decomposed into \(\mathbf{p}\), its projection onto \(\mathbf{a}_1\), and \(\mathbf{e} = \mathbf{a}_2 - \mathbf{p}\))
\(\mathbf{p} \) is the component of \(\mathbf{a}_2\) along \(\mathbf{a}_1\) (this is what we want to get rid of)
\(\mathbf{e} \) is the component of \(\mathbf{a}_2\) orthogonal to \(\mathbf{a}_1\) (this is what we want to retain)
We will retain \(\mathbf{a}_1 \) as the first basis vector
\hat{\mathbf{a}}_1= \mathbf{a}_1
\hat{\mathbf{a}}_2= \mathbf{e} = \mathbf{a}_2 - {\mathbf{p}}
Recall that \(\mathbf{p} = \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_2}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1\), so
\therefore \hat{\mathbf{a}}_2 = \mathbf{a}_2 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_2}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1
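This single step is easy to check numerically; a minimal sketch, borrowing \(\mathbf{a}_1, \mathbf{a}_2\) from the worked example that appears later:

```python
# One Gram-Schmidt step: split a2 into p (along a1) and e (orthogonal to a1).
import numpy as np

a1 = np.array([1.0, -1.0, 0.0])
a2 = np.array([2.0, 0.0, -2.0])

p = (a1 @ a2) / (a1 @ a1) * a1                # component of a2 along a1
e = a2 - p                                    # component orthogonal to a1
print(e)                                      # [ 1.  1. -2.]
print(a1 @ e)                                 # ~0, so e is orthogonal to a1
```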
Gram-Schmidt Process
(first basis vector)
\hat{\mathbf{a}}_1= \mathbf{a}_1
(second basis vector)
\hat{\mathbf{a}}_2 = \mathbf{a}_2 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_2}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1
We want to get rid of the component of \(\mathbf{a}_3\) along \(\hat{\mathbf{a}}_1\) (just as we did for \(\mathbf{a}_2\)), and also the component of \(\mathbf{a}_3\) along \(\hat{\mathbf{a}}_2\) (because we want \(\hat{\mathbf{a}}_3\) to be orthogonal to \(\hat{\mathbf{a}}_2\) also)
\hat{\mathbf{a}}_3 = \mathbf{a}_3 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_3}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1 - \frac{\hat{\mathbf{a}}_2^\top\mathbf{a}_3}{\hat{\mathbf{a}}_2^\top\hat{\mathbf{a}}_2}\hat{\mathbf{a}}_2
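The general pattern (subtract the component along every previously built vector, then normalise) translates directly into code. A sketch of classical Gram-Schmidt, assuming linearly independent columns; normalising as we go is equivalent to the formulas above, since \((\mathbf{q}_i^\top\mathbf{a}_j)\mathbf{q}_i = \frac{\hat{\mathbf{a}}_i^\top\mathbf{a}_j}{\hat{\mathbf{a}}_i^\top\hat{\mathbf{a}}_i}\hat{\mathbf{a}}_i\):

```python
# Classical Gram-Schmidt: columns of A -> orthonormal columns of Q.
import numpy as np

def gram_schmidt(A):
    A = np.asarray(A, dtype=float)
    Q = np.zeros_like(A)
    for j in range(A.shape[1]):
        v = A[:, j].copy()
        for i in range(j):
            # Step 1: subtract the component of a_j along each earlier q_i
            v -= (Q[:, i] @ A[:, j]) * Q[:, i]
        # Step 2: normalise (assumes v != 0, i.e. independent columns)
        Q[:, j] = v / np.linalg.norm(v)
    return Q
```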
Gram-Schmidt Process (Example)
\mathbf{a}_1=\begin{bmatrix} 1\\ -1\\ 0 \end{bmatrix}
\quad
\mathbf{a}_2=\begin{bmatrix} 2\\ 0\\ -2 \end{bmatrix}
\quad
\mathbf{a}_3=\begin{bmatrix} 3\\ -3\\ 3 \end{bmatrix}
\hat{\mathbf{a}}_1 = \mathbf{a}_1=\begin{bmatrix} 1\\ -1\\ 0 \end{bmatrix}
\hat{\mathbf{a}}_2 = \mathbf{a}_2 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_2}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1
=\begin{bmatrix} 2\\ 0\\ -2 \end{bmatrix}
-\frac{2}{2}\begin{bmatrix} 1\\ -1\\ 0 \end{bmatrix}
=\begin{bmatrix} 1\\ 1\\ -2 \end{bmatrix}
\hat{\mathbf{a}}_3 = \mathbf{a}_3 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_3}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1 - \frac{\hat{\mathbf{a}}_2^\top\mathbf{a}_3}{\hat{\mathbf{a}}_2^\top\hat{\mathbf{a}}_2}\hat{\mathbf{a}}_2
=\begin{bmatrix} 3\\ -3\\ 3 \end{bmatrix}
-\frac{6}{2}\begin{bmatrix} 1\\ -1\\ 0 \end{bmatrix}
-\frac{-6}{6}\begin{bmatrix} 1\\ 1\\ -2 \end{bmatrix}
=\begin{bmatrix} 1\\1\\1 \end{bmatrix}
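A quick numeric check of the example (a sketch mirroring the arithmetic above):

```python
# Reproduce the worked example and confirm pairwise orthogonality.
import numpy as np

a1 = np.array([1., -1., 0.])
a2 = np.array([2., 0., -2.])
a3 = np.array([3., -3., 3.])

a1h = a1
a2h = a2 - (a1h @ a2) / (a1h @ a1h) * a1h     # -> [1, 1, -2]
a3h = (a3 - (a1h @ a3) / (a1h @ a1h) * a1h
          - (a2h @ a3) / (a2h @ a2h) * a2h)   # -> [1, 1, 1]

print(a2h, a3h)
print(a1h @ a2h, a1h @ a3h, a2h @ a3h)        # 0.0 0.0 0.0
```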
Are we sure they are orthogonal?
\hat{\mathbf{a}}_2 = \mathbf{a}_2 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_2}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1
Multiply by \(\hat{\mathbf{a}}_1 ^\top\) on both sides:
\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_2 = \hat{\mathbf{a}}_1^\top\mathbf{a}_2 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_2}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1 = 0
\therefore \hat{\mathbf{a}}_1 \perp \hat{\mathbf{a}}_2
\hat{\mathbf{a}}_3 = \mathbf{a}_3 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_3}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1 - \frac{\hat{\mathbf{a}}_2^\top\mathbf{a}_3}{\hat{\mathbf{a}}_2^\top\hat{\mathbf{a}}_2}\hat{\mathbf{a}}_2
Multiply by \(\hat{\mathbf{a}}_1 ^\top\) on both sides:
\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_3 = \hat{\mathbf{a}}_1^\top\mathbf{a}_3 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_3}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1 - \frac{\hat{\mathbf{a}}_2^\top\mathbf{a}_3}{\hat{\mathbf{a}}_2^\top\hat{\mathbf{a}}_2}\underbrace{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_2}_{=\,0} = 0
\therefore \hat{\mathbf{a}}_1 \perp \hat{\mathbf{a}}_3
Multiply by \(\hat{\mathbf{a}}_2 ^\top\) on both sides:
\hat{\mathbf{a}}_2^\top\hat{\mathbf{a}}_3 = \hat{\mathbf{a}}_2^\top\mathbf{a}_3 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_3}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\underbrace{\hat{\mathbf{a}}_2^\top\hat{\mathbf{a}}_1}_{=\,0} - \frac{\hat{\mathbf{a}}_2^\top\mathbf{a}_3}{\hat{\mathbf{a}}_2^\top\hat{\mathbf{a}}_2}\hat{\mathbf{a}}_2^\top\hat{\mathbf{a}}_2 = 0
\therefore \hat{\mathbf{a}}_2 \perp \hat{\mathbf{a}}_3
QR factorisation
(Figure: \(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\) and the orthonormal basis \(\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3\) obtained from them by Gram-Schmidt)
Recap: The co-ordinate of the projection of \(\mathbf{a}_1 \) along each orthonormal basis vector is simply the dot product of that basis vector with \(\mathbf{a}_1 \)
\mathbf{a}_1 = z_{11}\mathbf{q}_1 + z_{12}\mathbf{q}_2 + z_{13}\mathbf{q}_3
z_{11} = \mathbf{q}_1^\top\mathbf{a}_1 \quad z_{12} = \mathbf{q}_2^\top\mathbf{a}_1 = 0 \quad z_{13} = \mathbf{q}_3^\top\mathbf{a}_1 = 0
\mathbf{a}_2 = z_{21}\mathbf{q}_1 + z_{22}\mathbf{q}_2 + z_{23}\mathbf{q}_3
z_{21} = \mathbf{q}_1^\top\mathbf{a}_2 \quad z_{22} = \mathbf{q}_2^\top\mathbf{a}_2 \quad z_{23} = \mathbf{q}_3^\top\mathbf{a}_2 = 0
\mathbf{a}_3 = z_{31}\mathbf{q}_1 + z_{32}\mathbf{q}_2 + z_{33}\mathbf{q}_3
z_{31} = \mathbf{q}_1^\top\mathbf{a}_3 \quad z_{32} = \mathbf{q}_2^\top\mathbf{a}_3 \quad z_{33} = \mathbf{q}_3^\top\mathbf{a}_3
The zeros follow from the Gram-Schmidt construction: \(\mathbf{a}_1\) is parallel to \(\mathbf{q}_1\), and \(\mathbf{a}_2\) lies in the span of \(\mathbf{q}_1, \mathbf{q}_2\), as can be seen from
\hat{\mathbf{a}}_2 = \mathbf{a}_2 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_2}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1
\hat{\mathbf{a}}_3 = \mathbf{a}_3 - \frac{\hat{\mathbf{a}}_1^\top\mathbf{a}_3}{\hat{\mathbf{a}}_1^\top\hat{\mathbf{a}}_1}\hat{\mathbf{a}}_1- \frac{\hat{\mathbf{a}}_2^\top\mathbf{a}_3}{\hat{\mathbf{a}}_2^\top\hat{\mathbf{a}}_2}\hat{\mathbf{a}}_2
In matrix form:
\begin{bmatrix}
\uparrow&\uparrow&\uparrow\\
\mathbf{a}_1&\mathbf{a}_2&\mathbf{a}_3\\
\downarrow&\downarrow&\downarrow\\
\end{bmatrix} =
\begin{bmatrix}
\uparrow&\uparrow&\uparrow\\
\mathbf{q}_1&\mathbf{q}_2&\mathbf{q}_3\\
\downarrow&\downarrow&\downarrow\\
\end{bmatrix}
\begin{bmatrix}
\mathbf{q}_1^\top\mathbf{a}_1&\mathbf{q}_1^\top\mathbf{a}_2&\mathbf{q}_1^\top\mathbf{a}_3\\
0&\mathbf{q}_2^\top\mathbf{a}_2&\mathbf{q}_2^\top\mathbf{a}_3\\
0&0&\mathbf{q}_3^\top\mathbf{a}_3
\end{bmatrix}
A = QR
(\(Q\) has orthonormal columns and \(R\) is upper triangular)
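In practice one rarely hand-rolls this; numpy's built-in QR (np.linalg.qr, which uses Householder reflections and is numerically more stable than Gram-Schmidt) gives the same factorisation, possibly with some column signs flipped. A sketch using the example vectors from earlier:

```python
# QR factorisation of the earlier example with numpy.
import numpy as np

A = np.column_stack([[1., -1., 0.],
                     [2., 0., -2.],
                     [3., -3., 3.]])

Q, R = np.linalg.qr(A)                        # Q orthonormal, R upper triangular
print(np.allclose(Q @ R, A))                  # True: A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))        # True: orthonormal columns
print(R)                                      # upper triangular
```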
Learning Objectives (achieved)
What are orthonormal vectors?
What is an orthonormal basis?
How do you create an orthonormal basis (Gram-Schmidt process)?
What is QR factorisation?