Given two points in the plane, there is a unique line through them. Given three points in the plane, not all on a line, there is a unique parabola through them. Given four points in the plane, there is a unique polynomial of degree 3 or less passing through them. And so on. We can prove this result, and give a procedure for finding the polynomial, with the help of Vandermonde matrices (Section VM).
Theorem IP Interpolating Polynomial Suppose $\{(x_i,\,y_i)\mid 1\le i\le n+1\}$ is a set of $n+1$ points in the plane where the $x$-coordinates are all different. Then there is a unique polynomial of degree $n$ or less, $p(x)$, such that $p(x_i)=y_i$, $1\le i\le n+1$.
□
Proof Write $p(x)=a_0+a_1x+a_2x^2+\cdots+a_nx^n$. To meet the conclusion of the theorem, we desire
\[ y_i=p(x_i)=a_0+a_1x_i+a_2x_i^2+\cdots+a_nx_i^n \qquad 1\le i\le n+1 \]
This is a system of $n+1$ linear equations in the $n+1$ variables $a_0,\,a_1,\,a_2,\,\dots,\,a_n$. The vector of constants in this system is the vector containing the $y$-coordinates of the points. More importantly, the coefficient matrix is a Vandermonde matrix (Definition VM) built from the $x$-coordinates $x_1,\,x_2,\,x_3,\,\dots,\,x_{n+1}$. Since we have required that these scalars all be different, Theorem NVM tells us that the coefficient matrix is nonsingular, and Theorem NMUS says the solution for the coefficients of the polynomial exists and is unique. As a practical matter, Theorem SNCM provides an expression for the solution.
■
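For reference, the system in the proof can be written out in matrix form; this is simply Definition VM applied to the $x$-coordinates, with the coefficient vector and the vector of constants displayed explicitly:
\[
\begin{bmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^n\\
1 & x_2 & x_2^2 & \cdots & x_2^n\\
\vdots & \vdots & \vdots &  & \vdots\\
1 & x_{n+1} & x_{n+1}^2 & \cdots & x_{n+1}^n
\end{bmatrix}
\begin{bmatrix} a_0\\ a_1\\ \vdots\\ a_n \end{bmatrix}
=
\begin{bmatrix} y_1\\ y_2\\ \vdots\\ y_{n+1} \end{bmatrix}
\]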
Example PTFP Polynomial through five points Suppose we have the following 5 points in the plane and we wish to pass a degree
4 polynomial through them.
 i   |   1    2    3    4     5
-----+-------------------------
x_i  |  -3   -1    2    3     6
y_i  | 276   16   31  144  2319
The required system of equations has a coefficient matrix that is the $5\times 5$ Vandermonde matrix whose row $i$ contains the successive powers of $x_i$,
\[
\begin{bmatrix}
1 & -3 & 9 & -27 & 81\\
1 & -1 & 1 & -1 & 1\\
1 & 2 & 4 & 8 & 16\\
1 & 3 & 9 & 27 & 81\\
1 & 6 & 36 & 216 & 1296
\end{bmatrix}
\]
Since the $x$-coordinates are all different, this matrix is nonsingular, and solving the system with the $y$-coordinates as the vector of constants yields the coefficients of the unique interpolating polynomial,
\[ p(x)=3-4x+5x^2-2x^3+2x^4 \]
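As an illustration (my own sketch, not part of the original example), the following NumPy code builds this Vandermonde matrix, solves for the coefficients, and confirms that the resulting polynomial passes through all five points; the variable names are arbitrary.

```python
import numpy as np

# The five data points from Example PTFP.
x = np.array([-3, -1, 2, 3, 6], dtype=float)
y = np.array([276, 16, 31, 144, 2319], dtype=float)

# Vandermonde matrix with row i = [1, x_i, x_i^2, x_i^3, x_i^4].
# np.vander defaults to decreasing powers, so we ask for increasing powers.
V = np.vander(x, N=5, increasing=True)

# Solve V a = y for the coefficients a_0, ..., a_4.
a = np.linalg.solve(V, y)
print(a)  # expected, up to rounding: [ 3. -4.  5. -2.  2.]

# Check: the polynomial reproduces every y_i.
assert np.allclose(V @ a, y)
```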
The unique polynomial passing through a set of points is known as the interpolating polynomial, and it has many uses. Unfortunately, when we are confronted with data from an experiment, the situation may not be so simple or clear-cut. Read on.
Subsection DF: Data Fitting
Suppose that we have $n$ real variables, $x_1,\,x_2,\,x_3,\,\dots,\,x_n$, that we can measure in an experiment. We believe that these variables combine, in a linear fashion, to equal another real variable, $y$. In other words, we have reason to believe from our understanding of the experiment that
\[ y=a_1x_1+a_2x_2+a_3x_3+\cdots+a_nx_n \]
where the scalars $a_1,\,a_2,\,a_3,\,\dots,\,a_n$ are not known to us, but are instead what we would like to determine. We would call this our model of the situation. Then we run the experiment $m$ times, collecting sets of values for the variables of the experiment. For run number $k$ we might denote these values as $y_k$, $x_{k1}$, $x_{k2}$, $x_{k3}$, …, $x_{kn}$. If we substitute these values into the model equation, we get $m$ linear equations in the unknown coefficients $a_1,\,a_2,\,a_3,\,\dots,\,a_n$. If $m=n$, then we have a square coefficient matrix of the system, which might happen to be nonsingular, in which case there would be a unique solution. However, more likely $m>n$ (the more data we collect, the greater our confidence in the results) and the resulting system is inconsistent. It may be that our model is only an approximate understanding of the relationship between the $x_i$ and $y$, or our measurements are not completely accurate. Still, we would like to understand the situation we are studying, and would like some best answer for $a_1,\,a_2,\,a_3,\,\dots,\,a_n$.
Let $\mathbf{y}$ denote the vector with $[\mathbf{y}]_i=y_i$, $1\le i\le m$, let $\mathbf{a}$ denote the vector with $[\mathbf{a}]_j=a_j$, $1\le j\le n$, and let $X$ denote the $m\times n$ matrix with $[X]_{ij}=x_{ij}$, $1\le i\le m$, $1\le j\le n$. Then the model equation, evaluated with each run of the experiment, translates to $X\mathbf{a}=\mathbf{y}$.
With the presumption that this system has no solution, we can try to minimize the difference between the two sides of the equation, $\mathbf{y}-X\mathbf{a}$. As a vector, it is hard to imagine what the minimum might be, so we instead minimize the square of its norm,
\[ S=\left(\mathbf{y}-X\mathbf{a}\right)^t\left(\mathbf{y}-X\mathbf{a}\right) \]
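Written entry by entry, this quantity is the familiar sum of squared residuals (a direct expansion of the expression above, using the notation just defined):
\[ S=\sum_{i=1}^{m}\left(y_i-\sum_{j=1}^{n}x_{ij}a_j\right)^{2} \]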
To keep the logical flow accurate, we will define the minimizing value and then
give the proof that it behaves as desired.
Definition LSS Least Squares Solution Given the equation $X\mathbf{a}=\mathbf{y}$, where $X$ is an $m\times n$ matrix of rank $n$, the least squares solution for $\mathbf{a}$ is $\left(X^tX\right)^{-1}X^t\mathbf{y}$.
△
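As a quick illustration of Definition LSS (a sketch of my own, with made-up data, not drawn from the text), the least squares solution can be computed directly from the formula, or with a standard library routine:

```python
import numpy as np

# Hypothetical data: m = 6 runs of an experiment with n = 2 measured variables.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0],
              [6.0, 5.0]])
y = np.array([5.1, 3.9, 11.2, 9.8, 17.1, 15.9])

# Least squares solution per Definition LSS: a' = (X^t X)^{-1} X^t y.
a_formula = np.linalg.inv(X.T @ X) @ (X.T @ y)

# The same solution via a library least-squares routine.
a_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

assert np.allclose(a_formula, a_lstsq)
print(a_formula)
```

Forming $\left(X^tX\right)^{-1}$ explicitly mirrors the definition; in practice, routines such as np.linalg.lstsq are preferred for numerical stability.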
Theorem LSMR Least Squares Minimizes Residuals Suppose that $X$ is an $m\times n$ matrix of rank $n$. The least squares solution of $X\mathbf{a}=\mathbf{y}$, $\mathbf{a}'=\left(X^tX\right)^{-1}X^t\mathbf{y}$, minimizes the expression
\[ S=\left(\mathbf{y}-X\mathbf{a}\right)^t\left(\mathbf{y}-X\mathbf{a}\right) \]
□
Proof We begin by finding the critical points of $S$. In preparation, let $X_j$ denote column $j$ of $X$, for $1\le j\le n$, and compute partial derivatives with respect to $a_j$, $1\le j\le n$. A matrix product of the form $\mathbf{x}^t\mathbf{y}$ is a sum of products, so a derivative is a sum of applications of the product rule. Applied to $S=\left(\mathbf{y}-X\mathbf{a}\right)^t\left(\mathbf{y}-X\mathbf{a}\right)$, this gives, for $1\le j\le n$,
\[ \frac{\partial}{\partial a_j}S = -2X_j^t\left(\mathbf{y}-X\mathbf{a}\right) = -2X_j^t\mathbf{y}+2X_j^tX\mathbf{a} \]
The first partial derivatives will allow us to find critical points, while second partial derivatives will be needed to confirm that a critical point will yield a minimum. For $1\le j\le n$, set $\frac{\partial}{\partial a_j}S=0$. This results in the $n$ scalar equations
\[ X_j^tX\mathbf{a}=X_j^t\mathbf{y} \qquad 1\le j\le n \]
These $n$ scalar equations can be summarized in the single vector equation,
\[ X^tX\mathbf{a}=X^t\mathbf{y} \]
$X^tX$ is an $n\times n$ matrix and, since we have assumed that $X$ has rank $n$, $X^tX$ will also have rank $n$. Since $X^tX$ is invertible, we have a critical point at
\[ \mathbf{a}'=\left(X^tX\right)^{-1}X^t\mathbf{y} \]
Is this lone critical point really a minimum? The matrix of second partial derivatives is constant, and a positive multiple of $X^tX$. Theorem CPSM tells us that this matrix is positive semi-definite. In an advanced course on multivariable calculus, it is shown that a minimum occurs exactly where the matrix of second partial derivatives is positive semi-definite. You may have seen this in the two-variable case, where a check on the positive semi-definiteness is disguised with a determinant of the $2\times 2$ matrix of second partial derivatives. ■
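As a small numerical sanity check of Theorem LSMR (my own illustration with randomly generated data, not part of the proof), no perturbation of the least squares solution should produce a smaller value of $S$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up overdetermined system: m = 8 runs, n = 3 variables; X has rank 3.
X = rng.normal(size=(8, 3))
y = rng.normal(size=8)

def S(a):
    """Sum of squared residuals, S = (y - Xa)^t (y - Xa)."""
    r = y - X @ a
    return r @ r

# Least squares solution a' = (X^t X)^{-1} X^t y.
a_prime = np.linalg.inv(X.T @ X) @ (X.T @ y)

# Random perturbations never yield a smaller value of S.
for _ in range(1000):
    a_other = a_prime + rng.normal(scale=0.5, size=3)
    assert S(a_other) >= S(a_prime)
```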
Subsection EXC: Exercises
T20 Theorem IP constructs a unique polynomial through a set of $n+1$ points in the plane, $\{(x_i,\,y_i)\mid 1\le i\le n+1\}$, where the $x$-coordinates are all different. Prove that the expression below is the same polynomial, and include an explanation of the necessity of the hypothesis that the $x$-coordinates are all different.
\[ p(x)=\sum_{i=1}^{n+1}y_i\prod_{\substack{j=1\\ j\ne i}}^{n+1}\frac{x-x_j}{x_i-x_j} \]
This is known as the Lagrange form of the interpolating polynomial.
Contributed by Robert Beezer
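For intuition only (this does not substitute for the proof requested in T20), here is a short sketch, reusing the five points of Example PTFP, that evaluates the Lagrange form numerically and checks that it agrees with the polynomial obtained from the Vandermonde system; all function and variable names are my own.

```python
import numpy as np

# The five points from Example PTFP.
xs = np.array([-3, -1, 2, 3, 6], dtype=float)
ys = np.array([276, 16, 31, 144, 2319], dtype=float)

def lagrange_eval(x, xs, ys):
    """Evaluate the Lagrange form of the interpolating polynomial at x."""
    total = 0.0
    for i in range(len(xs)):
        term = ys[i]
        for j in range(len(xs)):
            if j != i:
                term *= (x - xs[j]) / (xs[i] - xs[j])
        total += term
    return total

# Coefficients a_0, ..., a_4 from the Vandermonde system, as in Example PTFP.
a = np.linalg.solve(np.vander(xs, increasing=True), ys)

# The two forms agree at a handful of test values.
for x in [-2.5, 0.0, 1.5, 4.0, 7.0]:
    assert np.isclose(lagrange_eval(x, xs, ys), np.polyval(a[::-1], x))
```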