Section SR Square Roots

With all our results about Hermitian matrices, their eigenvalues and their diagonalizations, it will be a nearly trivial matter to now construct a “square root” of a positive semi-definite matrix. We will describe the square root of a matrix A as a matrix S such that A=S2. In general, a matrix A might have many such square roots. But with a few results in hand we will be able to impose an extra condition on S that will make a unique S such that A=S2. At that point we can define the square root of A formally.

Subsection SRM: Square Root of a Matrix

Theorem PSMSR
Positive Semi-Definite Matrices and Square Roots
Suppose A is a square matrix. There is a positive semi-definite matrix S such that A=S2 if and only if A is positive semi-definite. □

(⇐) Suppose that A is positive semi-definite. Since A is Hermitian (Definition PSM) we know A is normal (Definition NRML) and so by Theorem OD there is a unitary matrix U and a diagonal matrix D, whose diagonal entries are the eigenvalues of A, such that D=U∗AU. The eigenvalues of A are all non-negative (Theorem EPSM), which allows us to define a diagonal matrix E whose diagonal entries are the positive square roots of the eigenvalues of A, in the same order as they appear in D. More precisely, define E to be the diagonal matrix with non-negative diagonal entries such that {E}^{2} = D. Set S = UE{U}^{∗}, and compute

\eqalignno{ {S}^{2} & = UE{U}^{∗}UE{U}^{∗} & & & & \cr & = UE{I}_{n}E{U}^{∗} & &\text{@(a href="fcla-jsmath-2.23li33.html#definition.UM")Definition UM@(/a)} & & & & \cr & = UEE{U}^{∗} & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMIM")Theorem MMIM@(/a)} & & & & \cr & = UD{U}^{∗} & & & & \cr & = U{U}^{∗}AU{U}^{∗} & &\text{@(a href="fcla-jsmath-2.23li59.html#theorem.OD")Theorem OD@(/a)} & & & & \cr & = {I}_{n}A{I}_{n} & &\text{@(a href="fcla-jsmath-2.23li33.html#definition.UM")Definition UM@(/a)} & & & & \cr & = A & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMIM")Theorem MMIM@(/a)} & & & & }

\eqalignno{ {S}^{∗} & ={ \left (UE{U}^{∗}\right )}^{∗} & & & & \cr & ={ \left (UE{U}^{∗}\right )}^{∗} & & & & \cr & ={ \left ({U}^{∗}\right )}^{∗}{E}^{∗}{U}^{∗} & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMAD")Theorem MMAD@(/a)} & & & & \cr & = U{E}^{∗}{U}^{∗} & &\text{@(a href="fcla-jsmath-2.23li30.html#theorem.AA")Theorem AA@(/a)} & & & & \cr & = U{\left (\overline{E}\right )}^{t}{U}^{∗} & &\text{@(a href="fcla-jsmath-2.23li30.html#definition.A")Definition A@(/a)} & & & & \cr & = U{E}^{t}{U}^{∗} & &\text{@(a href="fcla-jsmath-2.23li48.html#theorem.HMRE")Theorem HMRE@(/a)} & & & & \cr & = UE{U}^{∗} & &\text{Diagonal matrix} & & & & \cr & = S & & & & }

And finally, we want to check the use of S in an inner product. Notice that E is Hermitian since it is a diagonal matrix with real entries. Furthermore, as a diagonal matrix, the eigenvalues of E are precisely the diagonal entries, and since these were chosen to be positive, an application of Theorem EPSM tells us that E is positive semi-definite. Now, for any x ∈ {ℂ}^{n},

\eqalignno{ \left \langle Sx,\kern 1.95872pt x\right \rangle & = \left \langle UE{U}^{∗}x,\kern 1.95872pt x\right \rangle & & & & \cr & = \left \langle E{U}^{∗}x,\kern 1.95872pt {U}^{∗}x\right \rangle & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.AIP")Theorem AIP@(/a)} & & & & \cr & = \left \langle E\left ({U}^{∗}x\right ),\kern 1.95872pt {U}^{∗}x\right \rangle & & & & \cr & ≥ 0 & &\text{@(a href="fcla-jsmath-2.23li103.html#definition.PSM")Definition PSM@(/a)} & & & & }

( ⇒) Assume that A = {S}^{2}, with S positive semi-definite. Then S is Hermitian, and we check that A is Hermitian.

\eqalignno{ {A}^{∗} & ={ \left (SS\right )}^{∗} & & & & \cr & = {S}^{∗}{S}^{∗} & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMAD")Theorem MMAD@(/a)} & & & & \cr & = SS & &\text{@(a href="fcla-jsmath-2.23li31.html#definition.HM")Definition HM@(/a)} & & & & \cr & = A & & & & }

\eqalignno{ \left \langle Ax,\kern 1.95872pt x\right \rangle & = \left \langle {S}^{2}x,\kern 1.95872pt x\right \rangle & & & & \cr & = \left \langle Sx,\kern 1.95872pt {S}^{∗}x\right \rangle & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.AIP")Theorem AIP@(/a)} & & & & \cr & = \left \langle Sx,\kern 1.95872pt Sx\right \rangle & &\text{@(a href="fcla-jsmath-2.23li31.html#definition.HM")Definition HM@(/a)} & & & & \cr & ≥ 0 & &\text{@(a href="fcla-jsmath-2.23li28.html#theorem.PIP")Theorem PIP@(/a)} & & & & }

There is a very close relationship between the eigenvalues and eigenspaces of a positive semi-definite matrix and its positive semi-definite square root. The next theorem is interesting in its own right, but is also an important technical step in some other important results, such as the upcoming uniqueness of the square root (Theorem USR).

Theorem EESR
Eigenvalues and Eigenspaces of a Square Root
Suppose that A is a positive semi-definite matrix and S is a positive semi-definite matrix such that A = {S}^{2}. If {λ}_{1},\kern 1.95872pt {λ}_{2},\kern 1.95872pt {λ}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {λ}_{p} are the distinct eigenvalues of A, then the distinct eigenvalues of S are \sqrt{{λ}_{1}},\kern 1.95872pt \sqrt{{λ}_{2}},\kern 1.95872pt \sqrt{{λ}_{3}},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt \sqrt{{λ}_{p}}, and {ℰ}_{S}\left (\sqrt{{λ}_{i}}\right ) = {ℰ}_{A}\left ({λ}_{i}\right ) for 1 ≤ i ≤ p. □

Proof Let x be an eigenvector of S for an eigenvalue ρ. Then, in the style of Theorem EPM,

\eqalignno{ Ax & = {S}^{2}x = S\left (Sx\right ) = S\left (ρx\right ) = ρSx = {ρ}^{2}x & & }

so {ρ}^{2} is an eigenvalue of A and must equal some {λ}_{i}. Furthermore, because S is positive semi-definite, Theorem EPSM tells us that ρ ≥ 0. The impact for us here is that we cannot have two different eigenvalues of S whose squares equal the same eigenvalue of A, so we can pair each eigenvalue of S with a different eigenvalue of A, equal to its square. (A good exercise is to track through the rest of this proof in the situation where S is not assumed to be positive semi-definite and we do not have this condition on the eigenvalues. Where does the proof then break down?) Let {ρ}_{i}, 1 ≤ i ≤ q denote the q distinct eigenvalues of S. The discussion above implies that we can order the eigenvalues of A and S so that {λ}_{i} = {ρ}_{i}^{2} for 1 ≤ i ≤ q. Notice that at this point we know that q ≤ p, though we will be showing that q = p.

Additionally, the equation above tells us that every eigenvector of S for {ρ}_{i} is again an eigenvector of A for {ρ}_{i}^{2}. So for 1 ≤ i ≤ q, the relevant eigenspaces are related by

\eqalignno{ {ℰ}_{S}\left (\sqrt{{λ}_{i}}\right ) = {ℰ}_{S}\left ({ρ}_{i}\right ) ⊆{ℰ}_{A}\left ({ρ}_{i}^{2}\right ) = {ℰ}_{ A}\left ({λ}_{i}\right ) & & }

So the eigenspaces of S are subsets of the eigenspaces of A, for the related eigenvalues. However, we will be showing that these sets are indeed equal to each other.

Both A and S are positive semi-definite, hence Hermitian and therefore normal. Theorem OD then tells us that each is diagonalizable (Definition DZM). Then Theorem DMFE says that the algebraic multiplicity and geometric multiplicity of each eigenvalue are equal. Then, if we let n denote the size of A,

\eqalignno{ n & ={ \mathop{∑ }}_{i=1}^{q}{α}_{ S}\left (\sqrt{{λ}_{i}}\right ) & &\text{@(a href="fcla-jsmath-2.23li48.html#theorem.NEM")Theorem NEM@(/a)} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{q}{γ}_{ S}\left (\sqrt{{λ}_{i}}\right ) & &\text{@(a href="fcla-jsmath-2.23li49.html#theorem.DMFE")Theorem DMFE@(/a)} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{q}\mathop{ dim}\nolimits \left ({ℰ}_{ S}\left (\sqrt{{λ}_{i}}\right )\right ) & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.GME")Definition GME@(/a)} & & & & \cr & ≤{\mathop{∑ }}_{i=1}^{q}\mathop{ dim}\nolimits \left ({ℰ}_{ A}\left ({λ}_{i}\right )\right ) & &\text{@(a href="fcla-jsmath-2.23li42.html#theorem.PSSD")Theorem PSSD@(/a)} & & & & \cr & ≤{\mathop{∑ }}_{i=1}^{p}\mathop{ dim}\nolimits \left ({ℰ}_{ A}\left ({λ}_{i}\right )\right ) & &\text{@(a href="fcla-jsmath-2.23li41.html#definition.D")Definition D@(/a)} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{p}{γ}_{ A}\left ({λ}_{i}\right ) & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.GME")Definition GME@(/a)} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{p}{α}_{ A}\left ({λ}_{i}\right ) & &\text{@(a href="fcla-jsmath-2.23li49.html#theorem.DMFE")Theorem DMFE@(/a)} & & & & \cr & = n & &\text{@(a href="fcla-jsmath-2.23li48.html#theorem.NEM")Theorem NEM@(/a)} & & & & }

With equal values at the two ends of this chain of equalities and inequalities, we know that the two inequalities are forced to actually be equalities. In particular, the second inequality implies that p = q and the first, in conjunction with Theorem EDYES, implies that {ℰ}_{S}\left (\sqrt{{λ}_{i}}\right ) = {ℰ}_{A}\left ({λ}_{i}\right ) for 1 ≤ i ≤ p. ■

Notice that we defined the singular values of a matrix A as the square roots of the eigenvalues of {A}^{∗}A (Definition SV). With Theorem EESR in hand we recognize the singular values of A as simply the eigenvalues of {A}^{∗}{A}^{1∕2}. Indeed, many authors take this as the definition of singular values, since it is equivalent to our definition. We have chosen not to wait for a discussion of square roots before making a definition of singular values, allowing us to present the singular value decomposition (Theorem SVD) all the sooner.

In the first half of the proof of Theorem PSMSR we could have chosen the matrix E (which was the essential component of the desired matrix S) in a variety of ways. Any collection of diagonal entries of E could be replaced by their negatives and we would maintain the property that {E}^{2} = D. However, if we decide to enforce the entries of E as non-negative quantities then E is positive semi-definite, and then S follows along as a positive semi-definite matrix. We now show that of all the possible square roots of a positive semi-definite matrix, only one is itself again positive semi-definite. In other words, the S of Theorem PSMSR is unique.

Theorem USR
Unique Square Root
Suppose A is a positive semi-definite matrix. Then there is a unique positive semi-definite matrix S such that A = {S}^{2}. □

Proof Theorem PSMSR gives us the existence of at least one positive semi-definite matrix S such that A = {S}^{2}. As usual, we will assume that {S}_{1} and {S}_{2} are positive semi-definite matrices such that A = {S}_{1}^{2} = {S}_{ 2}^{2} (Technique U).

As A is diagonalizable, there is a basis of {ℂ}^{n} composed entirely of eigenvectors of A (Theorem DC), say B = \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n}\right \}. Let {δ}_{1},\kern 1.95872pt {δ}_{2},\kern 1.95872pt {δ}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {δ}_{n} denote the associated eigenvalues. Theorem EESR allows to conclude that {ℰ}_{A}\left ({δ}_{i}\right ) = {ℰ}_{{S}_{1}}\left (\sqrt{{δ}_{i}}\right ) = {ℰ}_{{S}_{2}}\left (\sqrt{{δ}_{i}}\right ). So {S}_{1}{x}_{i} = \sqrt{{δ}_{i}}{x}_{i} = {S}_{2}{x}_{i} for 1 ≤ i ≤ n.

Choose any x ∈ {ℂ}^{n}. The spanning property of B allows us to conclude the existence of a set of scalars, {a}_{1},\kern 1.95872pt {a}_{2},\kern 1.95872pt {a}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {a}_{n}, yielding x as a linear combination of the vectors in B. So,

\eqalignno{ {S}_{1}x = {S}_{1}{ \mathop{∑ }}_{i=1}^{n}{a}_{ i}{x}_{i} ={ \mathop{∑ }}_{i=1}^{n}{a}_{ i}{S}_{1}{x}_{i} ={ \mathop{∑ }}_{i=1}^{n}{a}_{ i}\sqrt{{δ}_{i}}{x}_{i} ={ \mathop{∑ }}_{i=1}^{n}{a}_{ i}{S}_{2}{x}_{i} = {S}_{2}{ \mathop{∑ }}_{i=1}^{n}{a}_{ i}{x}_{i} = {S}_{2}x&& }

Since {S}_{1} and {S}_{2} have the same action on every vector, Theorem EMMVP yields the conclusion that {S}_{1} = {S}_{2}. ■

With a criteria that distinguishes one square root from all the rest (positive semi-definiteness) we can now define the square root of a positive semi-definite matrix.

Definition SRM
Square Root of a Matrix
Suppose A is a positive semi-definite matrix and S is the positive semi-definite matrix such that {S}^{2} = SS = A. Then S is the square root of A and we write S = {A}^{1∕2}.