\subsect{IP}{Interpolating Polynomials}

{\sc\large This Section is Incomplete}

\bigskip Given two points in the plane, there is a unique line through them. Given three points in the plane with different $x$-coordinates, not all on a line, there is a unique parabola through them. Given four points in the plane with different $x$-coordinates, there is a unique polynomial of degree 3 or less passing through them. And so on. We can prove this result, and give a procedure for finding the polynomial, with the help of Vandermonde matrices (Section VM).
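For instance, given three points $(x_1,\,y_1)$, $(x_2,\,y_2)$, $(x_3,\,y_3)$ with different $x$-coordinates, a polynomial $p(x)=c_0+c_1x+c_2x^2$ passes through all three exactly when
\begin{align*}
\begin{bmatrix}
1&x_1&x_1^2\\
1&x_2&x_2^2\\
1&x_3&x_3^2
\end{bmatrix}
\begin{bmatrix}
c_0\\c_1\\c_2
\end{bmatrix}
&=
\begin{bmatrix}
y_1\\y_2\\y_3
\end{bmatrix}
\end{align*}
and the coefficient matrix of this system is exactly a Vandermonde matrix.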
Theorem IP Interpolating Polynomial
Suppose $\setparts{(x_i,\,y_i)}{1\leq i\leq n+1}$ is a set of $n+1$ points in the plane where the $x$-coordinates are all different. Then there is a unique polynomial of degree $n$ or less, $p(x)$, such that $p(x_i)=y_i$, $1\leq i\leq n+1$.

Proof

Write $p(x)=c_0+c_1x+c_2x^2+\cdots+c_nx^n$. The $n+1$ conditions $p(x_i)=y_i$, $1\leq i\leq n+1$, form a system of $n+1$ linear equations in the $n+1$ unknown coefficients $c_0,\,c_1,\,c_2,\,\dots,\,c_n$, and the coefficient matrix of this system is the Vandermonde matrix determined by $\scalarlist{x}{n+1}$. Because the $x_i$ are all different, this matrix is nonsingular (Section VM), so the system has a unique solution, and that solution provides the coefficients of the unique interpolating polynomial of degree $n$ or less.

Example PTFP Polynomial through five points
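As a computational sketch of such an example: choose five hypothetical points with different $x$-coordinates, build the $5\times 5$ Vandermonde matrix (Section VM), and solve for the coefficients of the interpolating polynomial of degree 4 or less. The point values below are invented for illustration, and Python with \verb|numpy| stands in for hand computation.
\begin{verbatim}
import numpy as np

# Five hypothetical points with distinct x-coordinates
xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
ys = np.array([3.0, 1.0, 2.0, 0.0, 5.0])

# Vandermonde matrix: row i is (1, x_i, x_i^2, x_i^3, x_i^4)
V = np.vander(xs, N=5, increasing=True)

# Solve V c = ys for the coefficients c_0, c_1, ..., c_4
c = np.linalg.solve(V, ys)

# Check: the polynomial passes through every point
assert np.allclose(np.polynomial.polynomial.polyval(xs, c), ys)
print(c)
\end{verbatim}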
The unique polynomial passing through a set of points is known as the interpolating polynomial and it has many uses. Unfortunately, when confronted with data from an experiment, the situation may not be so simple or clear-cut. Read on.

\subsect{DF}{Data Fitting}

Suppose that we have $n$ real variables, $\scalarlist{x}{n}$, that we can measure in an experiment. We believe that these variables combine, in a linear fashion, to equal another real variable, $y$. In other words, we have reason to believe from our understanding of the experiment that \begin{align*} y&=a_1x_1+a_2x_2+a_3x_3+\cdots+a_nx_n \end{align*} where the scalars $\scalarlist{a}{n}$ are not known to us, but are instead what we wish to determine. We would call this our model of the situation. Then we run the experiment $m$ times, collecting sets of values for the variables of the experiment. For run number $k$ we might denote these values as $y_k$, $x_{k1}$, $x_{k2}$, $x_{k3}$, \dots, $x_{kn}$. If we substitute these values into the model equation, we get $m$ linear equations in the unknown coefficients $\scalarlist{a}{n}$, as written out below. If $m=n$, then we have a square coefficient matrix for the system, which might happen to be nonsingular, and then there would be a unique solution.
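Explicitly, run number $k$ of the experiment contributes the equation
\begin{align*}
y_k&=a_1x_{k1}+a_2x_{k2}+a_3x_{k3}+\cdots+a_nx_{kn}&&1\leq k\leq m
\end{align*}
a linear equation whose coefficients $x_{k1},\,x_{k2},\,\dots,\,x_{kn}$ and constant term $y_k$ are known, and whose unknowns are $\scalarlist{a}{n}$.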

However, it is more likely that $m>n$ (the more data we collect, the greater our confidence in the results) and that the resulting system is inconsistent. It may be that our model is only an approximation of the true relationship between the $x_i$ and $y$, or that our measurements are not completely accurate. Still, we would like to understand the situation we are studying, and would like a best possible answer for $\scalarlist{a}{n}$.

Let $\vect{y}$ denote the vector with $\vectorentry{\vect{y}}{i}=y_i$, $1\leq i\leq m$, let $\vect{a}$ denote the vector with $\vectorentry{\vect{a}}{j}=a_j$, $1\leq j\leq n$, and let $X$ denote the $m\times n$ matrix with $\matrixentry{X}{ij}=x_{ij}$, $1\leq i\leq m$, $1\leq j\leq n$. Then the model equation, evaluated with each run of the experiment, translates to $X\vect{a}=\vect{y}$. With the presumption that this system has no solution, we can try to minimize the difference between the two sides of the equation, the residual vector $\vect{y}-X\vect{a}$. It is not clear what it would mean to minimize a vector, so we instead minimize the square of its norm \begin{align*} S=\transpose{\left(\vect{y}-X\vect{a}\right)}\left(\vect{y}-X\vect{a}\right) \end{align*} To keep the logical flow accurate, we will define the minimizing value and then give the proof that it behaves as desired.
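As a small numerical sketch of the quantity $S$, with hypothetical data from $m=4$ runs of an experiment measuring $n=2$ variables:
\begin{verbatim}
import numpy as np

# Hypothetical data: m = 4 runs, n = 2 measured variables
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
y = np.array([3.1, 2.9, 7.2, 6.8])

def S(a):
    """Squared norm of the residual vector y - Xa."""
    r = y - X @ a
    return r @ r

# Evaluate S at a candidate vector a
print(S(np.array([1.0, 1.0])))
\end{verbatim}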
Definition LSS Least Squares Solution

Given the equation $X\vect{a}=\vect{y}$, where $X$ is an $m\times n$ matrix of rank $n$, the least squares solution for $\vect{a}$ is $\inverse{\left(\transpose{X}X\right)}\transpose{X}\vect{y}$. (Since $X$ has rank $n$, the $n\times n$ matrix $\transpose{X}X$ is nonsingular, so this expression is defined.)

$\square$
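Continuing the hypothetical data from the sketch above, the least squares solution can be computed directly from the definition; in practice, a library routine such as \verb|numpy.linalg.lstsq| computes the same vector in a more numerically stable way.
\begin{verbatim}
import numpy as np

X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
y = np.array([3.1, 2.9, 7.2, 6.8])

# Straight from the definition: a' = (X^T X)^{-1} X^T y
a_prime = np.linalg.inv(X.T @ X) @ X.T @ y

# The same vector from numpy's built-in least squares routine
a_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(a_prime, a_lstsq)
print(a_prime)
\end{verbatim}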

Theorem LSMR Least Squares Minimizes Residuals
Suppose that $X$ is an $m\times n$ matrix of rank $n$. The least squares solution of $X\vect{a}=\vect{y}$, $\vect{a}^\prime=\inverse{\left(\transpose{X}X\right)}\transpose{X}\vect{y}$, minimizes the expression \begin{align*} S=\transpose{\left(\vect{y}-X\vect{a}\right)}\left(\vect{y}-X\vect{a}\right) \end{align*}

Proof

For any vector $\vect{a}$, decompose the residual as $\vect{y}-X\vect{a}=\left(\vect{y}-X\vect{a}^\prime\right)+X\left(\vect{a}^\prime-\vect{a}\right)$. Expanding $S$ with this decomposition, each cross term contains the factor \begin{align*} \transpose{X}\left(\vect{y}-X\vect{a}^\prime\right) &=\transpose{X}\vect{y}-\transpose{X}X\inverse{\left(\transpose{X}X\right)}\transpose{X}\vect{y} =\zerovector \end{align*} so the cross terms vanish and \begin{align*} S&=\transpose{\left(\vect{y}-X\vect{a}^\prime\right)}\left(\vect{y}-X\vect{a}^\prime\right)+\transpose{\left(X\left(\vect{a}^\prime-\vect{a}\right)\right)}X\left(\vect{a}^\prime-\vect{a}\right) \end{align*} The second summand is the square of the norm of $X\left(\vect{a}^\prime-\vect{a}\right)$, so it is never negative, and $S$ is smallest when this summand is zero. Since $X$ has rank $n$, $X\left(\vect{a}^\prime-\vect{a}\right)=\zerovector$ only when $\vect{a}=\vect{a}^\prime$, so $\vect{a}^\prime$ is the unique minimizer.
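A quick numerical check of the theorem, again with the hypothetical data from the earlier sketches: the value of $S$ at the least squares solution never exceeds its value at randomly perturbed vectors.
\begin{verbatim}
import numpy as np

X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
y = np.array([3.1, 2.9, 7.2, 6.8])

def S(a):
    r = y - X @ a
    return r @ r

a_prime = np.linalg.inv(X.T @ X) @ X.T @ y

# S is minimized at a': every random perturbation does at least as badly
rng = np.random.default_rng(0)
for _ in range(1000):
    assert S(a_prime) <= S(a_prime + rng.normal(scale=0.5, size=2))
\end{verbatim}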