Section ROD Rank One Decomposition

Our first decomposition applies only to diagonalizable (Definition DZM) matrices, and yields a decomposition into a sum of very simple matrices.

Theorem ROD
Rank One Decomposition
Suppose that A is a diagonalizable matrix of size n and rank r. Then there are r square matrices {A}_{1},\kern 1.95872pt {A}_{2},\kern 1.95872pt {A}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {A}_{r}, each of size n and rank 1 such that

\eqalignno{ A & = {A}_{1} + {A}_{2} + {A}_{3} + \mathrel{⋯} + {A}_{r} & & }

Furthermore, if {λ}_{1},\kern 1.95872pt {λ}_{2},\kern 1.95872pt {λ}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {λ}_{r} are the nonzero eigenvalues of A, then there are two sets of r linearly independent vectors from {ℂ}^{n},

\eqalignno{ X & = \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{r}\right \} &Y & = \left \{{y}_{1},\kern 1.95872pt {y}_{2},\kern 1.95872pt {y}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {y}_{r}\right \} & & & & }

Proof The proof is constructive. Generally, we will diagonalize A, creating a nonsingular matrix S and a diagonal matrix D. Then we split up the diagonal matrix into a sum of matrices with a single nonzero entry (on the diagonal). This fundamentally creates the decomposition in the statement of the theorem, the remainder is just bookkeeping. The vectors in X and Y will result from the columns of S and the rows of {S}^{−1}.

Let {λ}_{1},\kern 1.95872pt {λ}_{2},\kern 1.95872pt {λ}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {λ}_{n} be the eigenvalues of A (repeated according to their algebraic multiplicity). If A has rank r, then \mathop{ dim}\nolimits \left (N\kern -1.95872pt \left (A\right )\right ) = n − r (Theorem RPNC). The null space of A is the eigenspace of the eigenvalue λ = 0 (Theorem EMNS), so it follows that the algebraic multiplicity of λ = 0 is n − r, {α}_{A}\left (0\right ) = n − r. Presume that the complete list of eigenvalues is ordered so that {λ}_{k} = 0 for r + 1 ≤ k ≤ n.

Since A is hypothesized to be diagonalizable, there exists a diagonal matrix D and an invertible matrix S, such that D = {S}^{−1}AS. We can rearrange tis equation to read, A = SD{S}^{−1}. Also, the proof of Theorem DC says that the diagonal elements of D are the eigenvalues of A and we have the flexibility to assume they lie on the diagonal in the same order as we have specified above. Now, let {X}^{∗} = \left \{{x}_{ 1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n}\right \} be the columns of S, and let {Y }^{∗} = \left \{{y}_{ 1},\kern 1.95872pt {y}_{2},\kern 1.95872pt {y}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {y}_{n}\right \} be the rows of {S}^{−1} converted to column vectors. With little motivation other than the statement of the theorem, define size n matrices {A}_{k}, 1 ≤ k ≤ n by {A}_{k} = {λ}_{k}{x}_{k}{y}_{k}^{t}. Finally, let {D}_{k} be the size n matrix that is totally zero, other than having {λ}_{k} in row k and column k.

\eqalignno{ {\left [A\right ]}_{ij} & ={ \left [SD{S}^{−1}\right ]}_{ ij} & &\text{@(a href="fcla-jsmath-2.23li49.html#definition.DZM")Definition DZM@(/a)} & & & & \cr & ={ \left [S\left ({\mathop{∑ }}_{k=1}^{n}{D}_{ k}\right ){S}^{−1}\right ]}_{ ij} & &\text{@(a href="fcla-jsmath-2.23li30.html#definition.MA")Definition MA@(/a)} & & & & \cr & ={ \left [S\left ({\mathop{∑ }}_{k=1}^{n}{D}_{ k}{S}^{−1}\right )\right ]}_{ ij} & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMDAA")Theorem MMDAA@(/a)} & & & & \cr & ={ \left [{\mathop{∑ }}_{k=1}^{n}S{D}_{ k}{S}^{−1}\right ]}_{ ij} & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMDAA")Theorem MMDAA@(/a)} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{\left [S{D}_{ k}{S}^{−1}\right ]}_{ ij} & &\text{@(a href="fcla-jsmath-2.23li30.html#definition.MA")Definition MA@(/a)} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{ \mathop{∑ }}_{ℓ=1}^{n}{\left [S{D}_{ k}\right ]}_{iℓ}{\left [{S}^{−1}\right ]}_{ ℓj} & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.EMP")Theorem EMP@(/a)} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{ \mathop{∑ }}_{ℓ=1}^{n}{ \mathop{∑ }}_{p=1}^{n}{\left [S\right ]}_{ ip}{\left [{D}_{k}\right ]}_{pℓ}{\left [{S}^{−1}\right ]}_{ ℓj} & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.EMP")Theorem EMP@(/a)} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{\left [S\right ]}_{ ik}{\left [{D}_{k}\right ]}_{kk}{\left [{S}^{−1}\right ]}_{ kj} & &\text{${\left [{D}_{k}\right ]}_{pℓ} = 0$ if $p\mathrel{≠}k$, or $ℓ\mathrel{≠}k$} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{\left [S\right ]}_{ ik}{λ}_{k}{\left [{S}^{−1}\right ]}_{ kj} & &{\left [{D}_{k}\right ]}_{kk} = {λ}_{k} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{λ}_{ k}{\left [S\right ]}_{ik}{\left [{S}^{−1}\right ]}_{ kj} & &\text{@(a href="fcla-jsmath-2.23li69.html#property.CMCN")Property CMCN@(/a)} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{λ}_{ k}{\left [{x}_{k}\right ]}_{i1}{\left [{y}_{k}^{t}\right ]}_{ 1j} & &\text{Definition of ${X}^{∗}$, ${Y }^{∗}$} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{λ}_{ k}{ \mathop{∑ }}_{q=1}^{1}{\left [{x}_{ k}\right ]}_{iq}{\left [{y}_{k}^{t}\right ]}_{ qj} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{λ}_{ k}{\left [{x}_{k}{y}_{k}^{t}\right ]}_{ ij} & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.EMP")Theorem EMP@(/a)} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{\left [{λ}_{ k}{x}_{k}{y}_{k}^{t}\right ]}_{ ij} & &\text{@(a href="fcla-jsmath-2.23li30.html#definition.MSM")Definition MSM@(/a)} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{n}{\left [{A}_{ k}\right ]}_{ij} & &\text{Definition of ${A}_{k}$} & & & & \cr & ={ \left [{\mathop{∑ }}_{k=1}^{n}{A}_{ k}\right ]}_{ij} & &\text{@(a href="fcla-jsmath-2.23li30.html#definition.MA")Definition MA@(/a)} & & & & \cr & & & & }

So by Definition ME we have the desired equality of matrices. The careful reader will have noted that {A}_{k} = O, r + 1 ≤ k ≤ n, since {λ}_{k} = 0 in these instances. To get the sets X and Y from {X}^{∗} and {Y }^{∗}, simply discard the last n − r vectors. We can safely ignore (or remove) {A}_{r+1},\kern 1.95872pt {A}_{r+2},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {A}_{n} from the summation just derived.

One last assertion to check. What is the rank of {A}_{k}, 1 ≤ k ≤ r? Every row of {A}_{k} is a scalar multiple of {y}_{k}^{t}, row k of the nonsingular matrix {S}^{−1} (Theorem MIMI). As a row of a nonsingular matrix, {y}_{k}^{t} cannot be all zeros. In particular, row i of {A}_{k} is obtained as a scalar multiple of {y}_{k}^{t} by the scalar {α}_{k}{\left [{x}_{k}\right ]}_{i}. We have restricted ourselves to the nonzero eigenvalues of A, and as S is nonsingular, some entry of {x}_{k} is nonzero. This all implies that some row of {A}_{k} will be nonzero. Now consider row-reducing {A}_{k}. Swap the nonzero row up into row 1. Use scalar multiples of this row to zero out every other row. This leaves a single nonzero row in the reduced row-echelon form, so {A}_{k} has rank one. ■

We record two observations that was not stated in our theorem above. First, the vectors in X, chosen as columns of S, are eigenvectors of A. Second, the product of two vectors from X and Y in the opposite order, by which we mean {y}_{i}^{t}{x}_{ j}, is the entry in row i and column j of the matrix product {S}^{−1}S = {I}_{ n} (Theorem EMP). In particular,

\eqalignno{ {y}_{i}^{t}{x}_{ j} & = \left \{\array{ 1\quad &\text{if $i = j$} \cr 0\quad &\text{if $i\mathrel{≠}j$} } \right . & & }

\eqalignno{ A & = \left [\array{ −16&−6\cr 45 & 17 } \right ] & & }

\eqalignno{ {λ}_{1} & = 2 &{ℰ}_{A}\left (2\right ) & = \left \langle \left \{\left [\array{ −1\cr 3 } \right ]\right \}\right \rangle &{λ}_{2} & = −1 &{ℰ}_{A}\left (−1\right ) & = \left \langle \left \{\left [\array{ −2\cr 5 } \right ]\right \}\right \rangle & & & & & & & & }

With n = 2 distinct eigenvalues, Theorem DED tells us that A is diagonalizable, and with no zero eigenvalues we see that A has full rank. Theorem DC says we can construct the nonsingular matrix S with eigenvectors of A as columns, so we have

\eqalignno{ S & = \left [\array{ −1&−2\cr 3 & 5 } \right ] &{S}^{−1} & = \left [\array{ 5 & 2 \cr −3&−1 } \right ] & & & & }

\eqalignno{ X & = \left \{\left [\array{ −1\cr 3 } \right ],\kern 1.95872pt \left [\array{ −2\cr 5 } \right ]\right \} &Y & = \left \{\left [\array{ 5\cr 2 } \right ],\kern 1.95872pt \left [\array{ −3\cr −1 } \right ]\right \} & & & & }

\eqalignno{ {A}_{1} & = 2\left [\array{ −1\cr 3 } \right ]{\left [\array{ 5\cr 2 } \right ]}^{t} = 2\left [\array{ −5&−2 \cr 15& 6 } \right ] = \left [\array{ −10&−4\cr 30 & 12 } \right ] & & \cr {A}_{2} & = (−1)\left [\array{ −2\cr 5 } \right ]{\left [\array{ −3\cr −1 } \right ]}^{t} = (−1)\left [\array{ 6 & 2 \cr −15&−5 } \right ] = \left [\array{ −6&−2\cr 15 & 5 } \right ] & & }

\eqalignno{ B & = \left [\array{ 34 & 18 &−1&−6\cr −44 &−24 &−1 & 9 \cr 36 & 18 &−3&−6\cr 36 & 18 &−6 &−3 } \right ] & & }

\eqalignno{ {λ}_{1} & = 3 &{ℰ}_{B}\left (3\right ) & = \left \langle \left \{\left [\array{ 1\cr −2 \cr 1\cr −1 } \right ],\kern 1.95872pt \left [\array{ 1\cr −1 \cr 1\cr 2 } \right ]\right \}\right \rangle & & & & \cr {λ}_{2} & = −2 &{ℰ}_{B}\left (−2\right ) & = \left \langle \left \{\left [\array{ −1\cr 2 \cr 0\cr 0 } \right ]\right \}\right \rangle & & & & \cr {λ}_{3} & = 0 &{ℰ}_{A}\left (0\right ) & = \left \langle \left \{\left [\array{ 2\cr −3 \cr 2\cr 2 } \right ]\right \}\right \rangle & & & & }

The algebraic and geometric multiplicities of each eigenvalue are equal, so Theorem DMFE tells us that A is diagonalizable. With a single zero eigenvalue we see that A has rank 4 − 1 = 3. Theorem DC says we can construct the nonsingular matrix S with eigenvectors of A as columns, so we have

\eqalignno{ S & = \left [\array{ 1 & 1 &−1& 2\cr −2 &−1 & 2 &−3 \cr 1 & 1 & 0 & 2\cr −1 & 2 & 0 & 2 } \right ] &{S}^{−1} & = \left [\array{ 4 & 2 & 0 &−1\cr 8 & 4 &−1 &−1 \cr −1& 0 & 1 & 0\cr −6 &−3 & 1 & 1 } \right ] & & & & }

\eqalignno{ X & = \left \{\left [\array{ 1\cr −2 \cr 1\cr −1 } \right ],\kern 1.95872pt \left [\array{ 1\cr −1 \cr 1\cr 2 } \right ],\kern 1.95872pt \left [\array{ −1\cr 2 \cr 0\cr 0 } \right ]\right \} &Y & = \left \{\left [\array{ 4\cr 2 \cr 0\cr −1 } \right ],\kern 1.95872pt \left [\array{ 8\cr 4 \cr −1\cr −1 } \right ],\kern 1.95872pt \left [\array{ −1\cr 0 \cr 1\cr 0 } \right ]\right \} & & & & }

\eqalignno{ {B}_{1} & = 3\left [\array{ 1\cr −2 \cr 1\cr −1 } \right ]{\left [\array{ 4\cr 2 \cr 0\cr −1 } \right ]}^{t} = 3\left [\array{ 4 & 2 &0&−1\cr −8 &−4 &0 & 2 \cr 4 & 2 &0&−1\cr −4 &−2 &0 & 1 } \right ] = \left [\array{ 12 & 6 &0&−3\cr −24 &−12 &0 & 6 \cr 12 & 6 &0&−3\cr −12 & −6 &0 & 3 } \right ] & & \cr {B}_{2} & = 3\left [\array{ 1\cr −1 \cr 1\cr 2 } \right ]{\left [\array{ 8\cr 4 \cr −1\cr −1 } \right ]}^{t} = 3\left [\array{ 8 & 4 &−1&−1\cr −8 &−4 & 1 & 1 \cr 8 & 4 &−1&−1\cr 16 & 8 &−2 &−2 } \right ] = \left [\array{ 24 & 12 &−3&−3\cr −24 &−12 & 3 & 3 \cr 24 & 12 &−3&−3\cr 48 & 24 &−6 &−6 } \right ] & & \cr {B}_{3} & = (−2)\left [\array{ −1\cr 2 \cr 0\cr 0 } \right ]{\left [\array{ −1\cr 0 \cr 1\cr 0 } \right ]}^{t} = (−2)\left [\array{ 1 &0&−1&0\cr −2 &0 & 2 &0 \cr 0 &0& 0 &0\cr 0 &0 & 0 &0 } \right ] = \left [\array{ −2&0& 2 &0\cr 4 &0 &−4 &0 \cr 0 &0& 0 &0\cr 0 &0 & 0 &0 } \right ] & & }