Section VM Vandermonde Matrix

Alexandre-Théophile Vandermonde was a French mathematician in the 1700’s who was among the first to write about basic properties of the determinant (such as the effect of swapping two rows). However, the determinant that bears his name (Theorem DVM) does not appear in any of his four published mathematical papers.

Definition VM
Vandermonde Matrix
An square matrix of size n, A, is a Vandermonde matrix if there are scalars, {x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n} such that {\left [A\right ]}_{ij} = {x}_{i}^{j−1}, 1 ≤ i ≤ n, 1 ≤ j ≤ n. △

\eqalignno{ A & = \left [\array{ 1& 2 & 4 & 8\cr 1&−3 & 9 &−27 \cr 1& 1 & 1 & 1\cr 1& 4 &16 & 64 } \right ] & & }

is a Vandermonde matrix since it meets the definition with {x}_{1} = 2, {x}_{2} = −3, {x}_{3} = 1, {x}_{4} = 4. ⊠

Vandermonde matrices are not very interesting as numerical matrices, but instead appear more often in proofs and applications where the scalars {x}_{i} are carried as symbols. Two such applications are in the sections on secret-sharing (Section SAS) and curve-fitting (Section CF). Principally, we would like to know when Vandermonde matrices are nonsingular, and the most convenient way to check this is by determining when the determinant is nonzero (Theorem SMZD). As a bonus, the determinant of a Vandermonde matrix has an especially pleasing formula.

Theorem DVM
Determinant of a Vandermonde Matrix
Suppose that A is a Vandermonde matrix of size n built with the scalars {x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n}. Then

\eqalignno{ \mathop{ det} \left (A\right ) & ={ \mathop{∏ }}_{1≤i<j≤n}\left ({x}_{j} − {x}_{i}\right ) & & }

Proof The proof is by induction (Technique I) on n, the size of the matrix. An empty product for a 1 × 1 matrix might make a good base case, but we’ll start at n = 2 instead. For a 2 × 2 Vandermonde matrix, we have

\eqalignno{ \mathop{ det} \left (A\right ) & = \left \vert \array{ 1&{x}_{1} \cr 1&{x}_{2}} \right \vert = {x}_{2} − {x}_{1} ={ \mathop{∏ }}_{1≤i<j≤2}\left ({x}_{j} − {x}_{i}\right ) & & }

For the induction step we will perform row operations on A to obtain the determinant of A as multiple of the determinant of an (n − 1) × (n − 1) Vandermonde matrix. the notation in this theorem tens to obscure your intuition about the changes effected by various row and column manipulations. Construct a 4 × 4 Vandermonde matrix with four symbols as the scalars ({x}_{1}, {x}_{2}, {x}_{2}, {x}_{4}, or perhaps a, b, c, d) and play along with the example as you study the proof.

First we convert most of the first column to zeros. Subtract row n from each of the other n − 1 rows to form a matrix B. By Theorem DRCMA, B has the same determinant as A. The entries of B, in the first n − 1 rows, i.e. for 1 ≤ i ≤ n − 1, 1 ≤ j ≤ n − 1, are

\eqalignno{ {\left [B\right ]}_{ij} & = {x}_{i}^{j−1} − {x}_{ n}^{j−1} = \left ({x}_{ i} − {x}_{n}\right ){ \mathop{∑ }}_{k=0}^{j−2}{x}_{ i}^{j−2−k}{x}_{ n}^{k} & & }

As the elements of row i, 1 ≤ i ≤ n − 1, have the common factor \left ({x}_{i} − {x}_{n}\right ), we form the new matrix C that differs from B by the removal of this factor from each of the first n − 1 rows. This will change the determinant, as we will track carefully in a moment. We also have a first column with zeros in each location, except row n, so we can use it for a column expansion computation of the determinant. We now know,

\eqalignno{ \mathop{ det} \left (A\right )& =\mathop{ det} \left (B\right ) &&\text{@(a href="fcla-jsmath-2.22li45.html#theorem.DRCMA")Theorem DRCMA@(/a)}&&&& \cr & = ({x}_{1} − {x}_{n})({x}_{2} − {x}_{n})\mathrel{⋯}({x}_{n−1} − {x}_{n})\mathop{ det} \left (C\right ) &&\text{@(a href="fcla-jsmath-2.22li45.html#theorem.DRCM")Theorem DRCM@(/a)} &&&& \cr & = ({x}_{1} − {x}_{n})({x}_{2} − {x}_{n})\mathrel{⋯}({x}_{n−1} − {x}_{n})(1){(−1)}^{n+1}\mathop{ det} \left (C\left (n − 1|1\right )\right )&&\text{@(a href="fcla-jsmath-2.22li44.html#theorem.DEC")Theorem DEC@(/a)} &&&& \cr & = ({x}_{1} − {x}_{n})({x}_{2} − {x}_{n})\mathrel{⋯}({x}_{n−1} − {x}_{n}){(−1)}^{n−1}\mathop{ det} \left (C\left (n − 1|1\right )\right ) && && \cr & = ({x}_{n} − {x}_{1})({x}_{n} − {x}_{2})\mathrel{⋯}({x}_{n} − {x}_{n−1})\mathop{ det} \left (C\left (n − 1|1\right )\right ) && && }

For convenience, denote D = C\left (n − 1|1\right ). Entries of this matrix are similar to those of B, but the factors used to build C are gone, and since the first column is gone, there is a slight re-indexing relative to the columns. For 1 ≤ i ≤ n − 1, 1 ≤ j ≤ n − 1,

\eqalignno{ {\left [D\right ]}_{ij} & ={ \mathop{∑ }}_{k=0}^{j−1}{x}_{ i}^{j−1−k}{x}_{ n}^{k} & & }

We will perform many column operations on the matrix D, always of the type where we multiply a column by a scalar and add the result to another column. As such, Theorem DRCM insures that the determinant will remain constant. We will work column by column, left to right, to convert D into a Vandermonde matrix with scalars {x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n−1}. More precisely, we will build a sequence of matrices D = {D}_{1}, {D}_{2}, …, {D}_{n−1}, where each obtainable from the previous by a sequence of determinant-preserving column operations and the first ℓ columns of {D}_{ℓ} are the first ℓ columns of a Vandermonde matrix with scalars {x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n−1}. We could establish this claim by induction (Technique I) on ℓ if we were to expand the claim to specify the exact values of the final n − 1 − ℓ columns as well. Since the claim is that matrices with certain properties exist, we will instead establish the claim by constructing the desired matrices one-by-one procedurally. The extension to an inductive proof should be clear, but not especially illuminating.

Set {D}_{1} = D to begin, and note that the entries of the first column of {D}_{1} are, for 1 ≤ i ≤ n − 1,

\eqalignno{ {\left [{D}_{1}\right ]}_{i1} & ={ \mathop{∑ }}_{k=0}^{1−1}{x}_{ i}^{1−1−k}{x}_{ n}^{k} = 1 = {x}_{ i}^{1−1} & & }

So the first column of {D}_{1} has the properties we desire. We will use this column of all 1’s to remove the highest power of {x}_{n} from each of the remaining columns and so build {D}_{2}. Precisely, perform the n − 2 column operations where column 1 is multiplied by {x}_{n}^{j−1} and subtracted from column j, for 2 ≤ j ≤ n − 1. Call the result {D}_{2}, and examine its entries in columns 2 through n − 1. For 1 ≤ i ≤ n − 1, 2 ≤ j ≤ n − 1,

\eqalignno{ {\left [{D}_{2}\right ]}_{ij} & = −{x}_{n}^{j−1}{\left [{D}_{ 1}\right ]}_{i1} +{ \left [{D}_{1}\right ]}_{ij} & & \cr & = −{x}_{n}^{j−1}(1) +{ \mathop{∑ }}_{k=0}^{j−1}{x}_{ i}^{j−1−k}{x}_{ n}^{k} & & \cr & = −{x}_{n}^{j−1} + {x}_{ i}^{j−1−(j−1)}{x}_{ n}^{j−1} +{ \mathop{∑ }}_{k=0}^{j−2}{x}_{ i}^{j−1−k}{x}_{ n}^{k} & & \cr & ={ \mathop{∑ }}_{k=0}^{j−2}{x}_{ i}^{j−1−k}{x}_{ n}^{k} & & }

\eqalignno{ {\left [{D}_{2}\right ]}_{i2} & ={ \mathop{∑ }}_{k=0}^{2−2}{x}_{ i}^{2−1−k}{x}_{ n}^{k} = {x}_{ i}^{1} = {x}_{ i}^{2−1} & & }

Now, form {D}_{3}. Perform the n − 3 column operations where column 2 of {D}_{2} is multiplied by {x}_{n}^{j−2} and subtracted from column j, for 3 ≤ j ≤ n − 1. The result is {D}_{3}, whose entries we now compute. For 1 ≤ i ≤ n − 1,

\eqalignno{ {\left [{D}_{3}\right ]}_{ij} & = −{x}_{n}^{j−2}{\left [{D}_{ 2}\right ]}_{i2} +{ \left [{D}_{2}\right ]}_{ij} & & \cr & = −{x}_{n}^{j−2}{x}_{ i}^{1} +{ \mathop{∑ }}_{k=0}^{j−2}{x}_{ i}^{j−1−k}{x}_{ n}^{k} & & \cr & = −{x}_{n}^{j−2}{x}_{ i}^{1} + {x}_{ i}^{j−1−(j−2)}{x}_{ n}^{j−2} +{ \mathop{∑ }}_{k=0}^{j−3}{x}_{ i}^{j−1−k}{x}_{ n}^{k} & & \cr & ={ \mathop{∑ }}_{k=0}^{j−3}{x}_{ i}^{j−1−k}{x}_{ n}^{k} & & }

\eqalignno{ {\left [{D}_{3}\right ]}_{i3} & ={ \mathop{∑ }}_{k=0}^{3−3}{x}_{ i}^{3−1−k}{x}_{ n}^{k} = {x}_{ i}^{2} = {x}_{ i}^{3−1} & & }

We could continue this procedure n − 4 more times, eventually totaling {1\over 2}\left ({n}^{2} − 3n + 2\right ) column operations, and arriving at {D}_{n−1}, the Vandermonde matrix of size n − 1 built from the scalars {x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n−1}. Informally, we chop off the last term of every sum, until a single term is left in a column, and it is of the right form for the Vandermonde matrix. This desired column is then used in the next iteration to chop off some more final terms for columns to the right. Now we can apply our induction hypothesis to the determinant of {D}_{n−1} and arrive at an expression for \mathop{ det} A,

\eqalignno{ \mathop{ det} \left (A\right ) & =\mathop{ det} \left (C\right ) & & \cr & ={ \mathop{∏ }}_{k=1}^{n−1}\left ({x}_{ n} − {x}_{k}\right )\mathop{ det} \left (D\right ) & & \cr & ={ \mathop{∏ }}_{k=1}^{n−1}\left ({x}_{ n} − {x}_{k}\right )\mathop{ det} \left ({D}_{n−1}\right ) & & \cr & ={ \mathop{∏ }}_{k=1}^{n−1}\left ({x}_{ n} − {x}_{k}\right ){ \mathop{∏ }}_{1≤i<j≤n−1}\left ({x}_{j} − {x}_{i}\right ) & & \cr & ={ \mathop{∏ }}_{1≤i<j≤n}\left ({x}_{j} − {x}_{i}\right ) & & }

Before we had Theorem DVM we could see that if two of the scalar values were equal, then the Vandermonde matrix would have two equal rows and hence be singular (Theorem DERC, Theorem SMZD). But with this expression for the determinant, we can establish the converse.

Theorem NVM
Nonsingular Vandermonde Matrix
A Vandermonde matrix of size n with scalars {x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n} is nonsingular if and only if the scalars are all different. □

Proof Let A denote the Vandermonde matrix with scalars {x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n}. By Theorem SMZD, A is nonsingular if and only if the determinant of A is nonzero. The determinant is given by Theorem DVM, and this product is nonzero if and only if each term of the product is nonzero. This condition translates to {x}_{i} − {x}_{j}\mathrel{≠}0 whenever i\mathrel{≠}j. In other words, the matrix is nonsingular if and only if the scalars are all different. ■