# Day 33:

The Vandermonde Matrix and

fitting functions to data

Definition. Given a collection of numbers

$x_{1},x_{2},x_{3},\ldots,x_{m}$

and a number $$n\in\N$$, the $$m\times n$$ matrix

$\begin{bmatrix}1 & x_{1} & x_{1}^{2} & \cdots & x_{1}^{n-1}\\[1ex] 1 & x_{2} & x_{2}^{2} & \cdots & x_{2}^{n-1}\\[1ex] 1 & x_{3} & x_{3}^{2} & \cdots & x_{3}^{n-1}\\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{m} & x_{m}^{2} & \cdots & x_{m}^{n-1}\\\end{bmatrix}$

is called a Vandermonde matrix.

Theorem. Given a collection of distinct numbers

$x_{1},x_{2},x_{3},\ldots,x_{m}$

and a number $$n\leq m$$, the $$m\times n$$ matrix

$V = \begin{bmatrix}1 & x_{1} & x_{1}^{2} & \cdots & x_{1}^{n-1}\\[1ex] 1 & x_{2} & x_{2}^{2} & \cdots & x_{2}^{n-1}\\[1ex] 1 & x_{3} & x_{3}^{2} & \cdots & x_{3}^{n-1}\\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{m} & x_{m}^{2} & \cdots & x_{m}^{n-1}\\\end{bmatrix}$

has a trivial nullspace, that is $$N(V)=\{0\}$$.

Proof. Suppose there is a nonzero vector $x = \begin{bmatrix} a_{0} & a_{1} & a_{2} & \cdots & a_{n-1}\end{bmatrix}^{\top}\in N(V).$ This means that $$Vx=0$$, that is, for each $$i=1,2,\ldots m$$ we have

$a_{0}+a_{1}x_{i}+a_{2}x_{i}^{2}+a_{3}x_{i}^{3}+\cdots+a_{n-1}x_{i}^{n-1}=0.$

Proof continued. Define the polynomial

$f(x) = a_{0}+a_{1}x+a_{2}x^{2}+a_{3}x^{3}+\cdots+a_{n-1}x^{n-1},$

then we see that $$f$$ has $$m$$ distinct roots $$x_{1},x_{2},\ldots,x_{m}$$. Hence, we can factor $$f(x)$$ as follows

$f(x) = (x-x_{1})(x-x_{2})(x-x_{3})\cdots (x-x_{m})g(x)$

where $$g(x)$$ is some polynomial. From this we see that the degree of $$f(x)$$ is at least $$m$$, that is, $$n-1\geq m$$. This contradicts the assumption that $$n\leq m$$. Hence, our assumption at the beginning of the proof must be false, that is, there is no nonzero vector in $$N(V)$$. $$\Box$$

Corollary. If $$(x_{1},y_{1}),(x_{2},y_{2}),\ldots,(x_{n},y_{n})$$ are points in $$\R^{2}$$ with distinct first coordinates, then there is a unique degree $$n-1$$ polynomial $$p(x)$$ such that $p(x_{i}) = y_{i}\quad\text{for }i=1,2,\ldots,n.$

Proof. By the previous theorem, the $$n\times n$$ matrix

$V = \begin{bmatrix}1 & x_{1} & x_{1}^{2} & \cdots & x_{1}^{n-1}\\[1ex] 1 & x_{2} & x_{2}^{2} & \cdots & x_{2}^{n-1}\\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n} & x_{n}^{2} & \cdots & x_{n}^{n-1}\\\end{bmatrix}$

has a trivial nullspace, that is $$N(V)=\{0\}$$. Since this matrix is square, it must be full rank, that is, the columns of $$V$$ are a basis for $$\R^{n}$$. Thus, there is a unique vector$x=\begin{bmatrix} a_{0} & a_{1} & a_{2} & \cdots & a_{n-1}\end{bmatrix}^{\top}$ such that $$Vx=y$$, where $y=\begin{bmatrix} y_{1} & y_{2} & \cdots & y_{n}\end{bmatrix}^{\top}.$

Thus $$p(x)=a_{0}+a_{1}x+\cdots+a_{n-1}x^{n-1}$$ is the desired polynomial. $$\Box$$

Example. Find the unique $$3$$rd degree polynomial $$p(x)$$ such that $\qquad p(1)=2,\quad p(2)=2,\quad p(3)=-1\quad\text{and}\quad p(4)=6.$

$A = \left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 1 & 2 & 4 & 8\\ 1 & 3 & 9 & 27\\ 1 & 4 & 16 & 64\end{array}\right]\quad\text{and}\quad b=\begin{bmatrix}2\\ 2\\ -1\\ 6\end{bmatrix}$

From the previous theorem the columns of $$A$$ span $$\R^{4}$$ and hence $$Ax=b$$ has a solution. But, $$A$$ is invertible, so there is a unique solution

$x=A^{-1}b = \begin{bmatrix} -14 & \frac{85}{3} & -\frac{29}{2} & \frac{13}{6}\end{bmatrix}^{\top}$

Hence, the unique polynomial is

$p(x) = -14 + \frac{85}{3}x - \frac{29}{2}x^2 + \frac{13}{6}x^3.$

$A = \left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 1 & 1 & 1 & 1\\ 1 & 2 & 4 & 8\\ 1 & 3 & 9 & 27\\ 1 & 4 & 16 & 64\end{array}\right]\quad\text{and}\quad b=\begin{bmatrix}-14\\ 2\\ 2\\ -1\\ 6\end{bmatrix}$

Note that the columns of $$A$$ do not span $$\R^{5}$$ (since there are only $$4$$ of them!) but $$Ax=b$$ still has a solution:

$x=\begin{bmatrix} -14 & \frac{85}{3} & -\frac{29}{2} & \frac{13}{6}\end{bmatrix}^{\top}.$

Hence, the desired polynomial exists:

$p(x) = -14 + \frac{85}{3}x - \frac{29}{2}x^2 + \frac{13}{6}x^3.$

Example. Find a $$3$$rd degree polynomial $$p(x)$$  such that $\qquad p(1)=2,\quad p(2)=2,\quad p(3)=-1\quad\text{and}\quad p(4)=6.$

$$p(0)=-14,$$

$A = \left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 1 & 1 & 1 & 1\\ 1 & 2 & 4 & 8\\ 1 & 3 & 9 & 27\\ 1 & 4 & 16 & 64\end{array}\right]\quad\text{and}\quad b=\begin{bmatrix}-13\\ 2\\ 2\\ -1\\ 6\end{bmatrix}$

Example. Find a $$3$$rd degree polynomial $$p(x)$$  such that $\qquad p(1)=2,\quad p(2)=2,\quad p(3)=-1\quad\text{and}\quad p(4)=6.$

$$p(0)=-13,$$

Now, $$Ax=b$$ has no solution! So, there is no $$3$$rd degree polynomial $$p(x)$$ with all of the desired properties.

$A = \left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 1 & 1 & 1 & 1\\ 1 & 2 & 4 & 8\\ 1 & 3 & 9 & 27\\ 1 & 4 & 16 & 64\end{array}\right]\quad\text{and}\quad b=\begin{bmatrix}-14.01\\ 1.98\\ 2.03\\ -0.99\\ 6\end{bmatrix}$

Example. Find a $$3$$rd degree polynomial $$p(x)$$  such $p(0)=-14.01,\ p(1)=1.98,\ p(2)=2.03,\ p(3)=-0.99,\text{ and } p(4)=6.$

We still see that $$Ax=b$$ has no solution! So, there is no $$3$$rd degree polynomial $$p(x)$$ with all of the desired properties.

But, this is only because of "noise" in $$b$$. If we find the least squares solution to $$Ax=b$$,

$\hat{x} = A^{+}b = \begin{bmatrix} -14.013 & 28.3225 & -14.4800 & 2.1625\end{bmatrix}^{\top}$

Then we see that the "closest" $$3$$rd degree polynomial is $p(x) = -14.013+28.3225x-14.4800x^2+2.1625x^3,$ where  $p(0)=-14.014,\ p(1) = 1.992,\ p(2)=2.012,\ p(3)=-0.978,\ p(4)=5.997$

### What degree to choose?

$-0.0367+0.2921x-0.6828 x^2+ 0.5434x^3+ 0.3656x^4$

### What if polynomials don't fit?

Consider the collection of points in $$\R^{2}$$:

$(x_{1},y_{1}),(x_{2},y_{2}),\ldots,(x_{n},y_{n})$

We want to find the function $$f$$ of the form $f(x) = a_{0}+\sum_{k=1}^{m}a_{k}\cos(2k\pi x)+\sum_{k=1}^{m}b_{k}\sin(2k\pi x)$ so that $$\displaystyle{\sum_{i=1}^{n}|f(x_{i})-y_{i}|^{2}}$$ is as small as possible.

Set

We wish to find the least squares solution to $$Ax=b$$.

$A= \begin{bmatrix}1 & \cos(2\pi x_{1}) & \sin(2\pi x_{1}) & \cos(4\pi x_{1}) & \sin(4\pi x_{1}) & \cdots & \cos(2m\pi x_{1}) & \sin(2m\pi x_{1})\\1 & \cos(2\pi x_{2}) & \sin(2\pi x_{2}) & \cos(4\pi x_{2}) & \sin(4\pi x_{2}) & \cdots & \cos(2m\pi x_{2}) & \sin(2m\pi x_{2})\\1 & \cos(2\pi x_{3}) & \sin(2\pi x_{3}) & \cos(4\pi x_{3}) & \sin(4\pi x_{3}) & \cdots & \cos(2m\pi x_{3}) & \sin(2m\pi x_{3})\\ \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\ 1 & \cos(2\pi x_{n}) & \sin(2\pi x_{n}) & \cos(4\pi x_{n}) & \sin(4\pi x_{n}) & \cdots & \cos(2m\pi x_{n}) & \sin(2m\pi x_{n})\end{bmatrix}$

$b= \begin{bmatrix}y_{1} & y_{2} & y_{3} & \cdots & y_{n}\end{bmatrix}^{\top}$

$f(x)=0.0515-0.9225\cos(2\pi x)+2.0311\sin(2\pi x)$

$f(x) = -0.0058-1.0120\cos(2\pi x)+2.0076\sin(2\pi x)$

$-0.0074\cos(4\pi x)+0.9959\sin(4\pi x)$

$f(x) = -0.0055-1.0111\cos(2\pi x)+2.0075\sin(2\pi x)-0.0073\cos(4\pi x)$

$+0.9968\sin(4\pi x) + 0.0081\cos(6\pi x) - 0.0043\sin(6\pi x)$

### Fitting trigonometric polynomials

Consider the collection of points in $$\R^{2}$$:

$(x_{1},y_{1}),(x_{2},y_{2}),\ldots,(x_{n},y_{n})$

Given a collection of functions $$f_{1},f_{2},\ldots,f_{m},$$ we want to find the function $$f$$ of the form $f(x) = \sum_{k=1}^{m}a_{k}f_{k}(x)$ so that $$\displaystyle{\sum_{i=1}^{n}|f(x_{i})-y_{i}|^{2}}$$ is as small as possible.

Set

We wish to find the least squares solution to $$Ax=b$$.

$A= \begin{bmatrix}f_{1}(x_{1}) & f_{2}(x_{1}) & f_{3}(x_{1}) & \cdots & f_{m}(x_{1})\\ f_{1}(x_{2}) & f_{2}(x_{2}) & f_{3}(x_{2}) & \cdots & f_{m}(x_{2})\\ f_{1}(x_{3}) & f_{2}(x_{3}) & f_{3}(x_{3}) & \cdots & f_{m}(x_{3})\\ \vdots & \vdots & \vdots & & \vdots\\ f_{1}(x_{n}) & f_{2}(x_{n}) & f_{3}(x_{n}) & \cdots & f_{m}(x_{n})\end{bmatrix} \quad\text{and}\quad b=\begin{bmatrix}y_{1}\\ y_{2}\\ y_{3}\\ \vdots\\ y_{n}\end{bmatrix}$

### Polynomials and trig alone don't work:

Best fit function of the form:

$$\displaystyle{f(x) = a_{0}+a_{1}x}$$

### Polynomials and trig alone don't work:

Best fit function of the form:

$$\displaystyle{f(x) = a_{0}+a_{1}x+a_{2}x^2}$$

### Polynomials and trig alone don't work:

Best fit function of the form:

$$\displaystyle{f(x) = a_{0}+a_{1}x+a_{2}x^2+a_{3}x^{3}}$$

### Polynomials and trig alone don't work:

Best fit function of the form:

$$\displaystyle{f(x) = a_{0}+a_{1}x+a_{2}x^2+a_{3}x^{3}+a_{4}x^4}$$

### Polynomials and trig alone don't work:

Best fit function of the form:

$$\displaystyle{f(x) = \sum_{k=0}^{5}a_{k}x^k}$$

### Polynomials and trig alone don't work:

Best fit function of the form:

$$\displaystyle{f(x) = \sum_{k=0}^{10}a_{k}x^k}$$

### Polynomials and trig alone don't work:

Best fit function of the form:

$$\displaystyle{f(x) = \sum_{k=0}^{20}a_{k}x^k}$$

### Polynomials and trig alone don't work:

Best fit function of the form:

$f(x) = a_{0} + a_{1}\cos(2\pi x)+b_{1}\sin(2\pi x)$

### Polynomials and trig alone don't work:

Best fit function of the form:

$f(x) = a_{0} + a_{1}\cos(2\pi x)+b_{1}\sin(2\pi x)+ a_{2}\cos(4\pi x)+b_{2}\sin(4\pi x)$

### Polynomials and trig alone don't work:

Best fit function of the form:

$f(x) = a_{0} + \sum_{k=1}^{3}\big(a_{k}\cos(2k\pi x)+b_{k}\sin(2k\pi x)\big)$

### Polynomials and trig alone don't work:

Best fit function of the form:

$f(x) = a_{0} + \sum_{k=1}^{4}\big(a_{k}\cos(2k\pi x)+b_{k}\sin(2k\pi x)\big)$

### Polynomials and trig alone don't work:

Best fit function of the form:

$f(x) = a_{0} + \sum_{k=1}^{5}\big(a_{k}\cos(2k\pi x)+b_{k}\sin(2k\pi x)\big)$

### Polynomials and trig alone don't work:

Best fit function of the form:

$f(x) = a_{0} + \sum_{k=1}^{10}\big(a_{k}\cos(2k\pi x)+b_{k}\sin(2k\pi x)\big)$

### Polynomials and trig alone don't work:

Best fit function of the form:

$f(x) = a_{0} + c_{1}x + a_{1}\cos(2\pi x)+b_{1}\sin(2\pi x)+ a_{2}\cos(4\pi x)+b_{2}\sin(4\pi x)$

### Orthonormal bases

Thus, if $$\{v_{1},v_{2},\ldots,v_{n}\}$$ is an ONB for $$\R^{n}$$, then

$v = \sum_{i=1}^{n}(v_{i}\cdot v) v_{i}.$

If $$I\subset \{1,2,\ldots,n\}$$, then we define

$v_{I} = \sum_{i\in I}(v_{i}\cdot v) v_{i}$

and

$\|v-v_{I}\|^{2} = \left\|\sum_{i=1}^{n}(v_{i}\cdot v)v_{i} - \sum_{i\in I}(v_{i}\cdot v)v_{i}\right\|^{2} = \Bigg\|\sum_{i\notin I}(v_{i}\cdot v)v_{i}\Bigg\|^{2} =\sum_{i\notin I}|v_{i}\cdot v|^{2}$

In particular, let $$I\subset \{1,2,\ldots,n\}$$ be the $$k$$ indices $$i$$ such that $$|v_{i}\cdot v|$$ are largest. Then $$\|v-v_{I}\|$$ is as small as possible among all sets $$I$$.

### Orthonormal bases

The set $$\{v_{1},v_{2},v_{3},v_{4}\}$$ where

$v_{1} = \frac{1}{2}\begin{bmatrix} 1\\ 1\\ 1\\ 1\end{bmatrix},\ v_{2} = \frac{1}{2}\begin{bmatrix} \phantom{-}1\\ -1\\ \phantom{-}1\\ -1\end{bmatrix},\ v_{3} = \frac{1}{2}\begin{bmatrix} \phantom{-}1\\ \phantom{-}1\\ -1\\ -1\end{bmatrix},\ v_{4} = \frac{1}{2}\begin{bmatrix} \phantom{-}1\\ -1\\ -1\\ \phantom{-}1\end{bmatrix},$

is an orthonormal basis for $$\R^{4}$$. We see that

$$v = \begin{bmatrix} 1\\ 1\\ 2\\ 4 \end{bmatrix} = 4v_{1} - v_{2}-2v_{3}+v_{4}$$

$= 4\left(\frac{1}{2}\begin{bmatrix} 1\\ 1\\ 1\\ 1\end{bmatrix}\right) + (-1)\left(\frac{1}{2}\begin{bmatrix} \phantom{-}1\\ -1\\ \phantom{-}1\\ -1\end{bmatrix}\right) + (-2)\left(\frac{1}{2} \begin{bmatrix} \phantom{-}1\\ \phantom{-}1\\ -1\\ -1\end{bmatrix}\right) + (1)\left(\frac{1}{2}\begin{bmatrix} \phantom{-}1\\ -1\\ -1\\ \phantom{-}1\end{bmatrix}\right)$

Take $$I=\{1,3\}$$ and we have $$\|v - v_{I}\| = \sqrt{2}$$.

### Orthonormal bases

To store the vector

$v = \begin{bmatrix} 1\\ 1\\ 2\\ 4 \end{bmatrix} = 4v_{1} - v_{2}-2v_{3}+v_{4}$

we need to store 4 numbers.

$v_{I} = 4v_{1} - 2v_{3} = \begin{bmatrix} 1\\ 1\\ 3\\ 3 \end{bmatrix}.$

So, we only need to store TWO numbers $$4$$ and $$-2$$.

This is not quite true, why?

### Image compression using a "nice" ONB

Thinking of an image as a large set of vectors

$p_{1} = \begin{bmatrix} p_{11}\\ p_{21}\\ p_{31}\\ \vdots\\ p_{m1} \end{bmatrix},\ p_{2} = \begin{bmatrix} p_{12}\\ p_{22}\\ p_{32}\\ \vdots\\ p_{m2} \end{bmatrix},\ \ldots,\ p_{n} = \begin{bmatrix} p_{1n}\\ p_{2n}\\ p_{3n}\\ \vdots\\ p_{mn} \end{bmatrix}.$

(Think of each "patch" as a vector.)

For each $$p_{j}$$ we expand with respect to an ONB $$\{v_{1},\ldots,v_{m}\}$$ and we have

Now, we throw away all but the $$k$$ term with the largest coefficients (in absolute value), and we get our approximations:

$p_{j}^{(k)} = \sum_{i\in I_{j}}(v_{i}\cdot p_{j})v_{i}$

$p_{j} = \sum_{i=1}^{m}(v_{i}\cdot p_{j}) v_{i}$

### Image compression using a "nice" ONB

In order to store

$p_{j}^{(k)} = \sum_{i\in I_{j}}(v_{i}\cdot p_{j})v_{i}$

We store the pairs $$(v_{i}\cdot p_{j}, i)$$ where $$i\in I$$. Since $$I$$ has $$k$$ elements, this means that we are storing $$2k$$ numbers.

Example. If

$v = 10v_{1} - v_{2} + 6v_{3} +\frac{1}{3}v_{4} + \frac{1}{10}v_{5} - 14v_{6}+\frac{1}{2}v_{7} + v_{8} - \frac{2}{3}v_{9}+\frac{2}{5}v_{10}$

Then, keeping the thee largest coefficients we get

$v^{(3)} = 10v_{1}+ 6v_{3}- 14v_{6}$

We store the pairs $$(10,1), (6,3)$$ and $$(-14,6)$$ and we can recover the approximation $$v^{(3)}$$.

By John Jasper

• 340