Section 1.5 Reflectors
ΒΆWhen we decompose matrices into products of other matrices, we often wish to create matrices with many zero entries. A Householder matrix is a unitary matrix which transforms a vector into a vector of the same size that is a nonzero multiple of the first column of an identity matrix, thus creating a vector with just a single nonzero entry. A typical application is to βzero outβ entries of a matrix below the diagonal, column-by-column, in order to achieve a triangular matrix.
Definition 1.5.1.
Given a nonzero vector vβCn, the Householder matrix for v is
P=Inβ(2β¨v,vβ©)vvβ.
The vector v is called the Householder vector.
A Householder matrix is both Hermitian and unitary.
Lemma 1.5.2.
The Householder matrix for vβCn is Hermitian.
Proof.
\begin{align*}
\adjoint{P}&=\adjoint{\left(I_n-\left(\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}\right)}\\
&=\adjoint{I_n}-\left(\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\right)\adjoint{\left(\vect{v}\adjoint{\vect{v}}\right)}\\
&=I_n- \left(\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\right) \adjoint{\left(\adjoint{\vect{v}}\right)}\adjoint{\vect{v}}\\
&=I_n-\left(\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}\\
&=P
\end{align*}
Lemma 1.5.3.
The Householder matrix for vβCn is unitary.
Proof.
\begin{align*}
\adjoint{P}P&=PP\\
&=\left(I_n-\left(\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}\right)\left(I_n-\left(\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}\right)\\
&=I_n^2
-\left(\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}
-\left(\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}
+\left(\frac{4}{\innerproduct{\vect{v}}{\vect{v}}^2}\right)\vect{v}\adjoint{\vect{v}}\vect{v}\adjoint{\vect{v}}\\
&=I_n
-\left(\frac{4}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}
+\left(\frac{4}{\innerproduct{\vect{v}}{\vect{v}}^2}\right)\vect{v}\innerproduct{\vect{v}}{\vect{v}}\adjoint{\vect{v}}\\
&=I_n
-\left(\frac{4}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}
+\left(\frac{4}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}\\
&=I_n
\end{align*}
Our aim with a Householder matrix is to convert a vector x into a scalar multiple of the first column of the identity matrix, e1. Which Householder vector should we choose for constructing such a matrix, and which multiple will we get? It is an instructive exercise to reverse-engineer the choice by setting Px=Ξ±e1 (Exercise Checkpoint 1.5.9). Instead, we give the answer and prove that it does the desired job.
Theorem 1.5.4.
Given a vector xβRn, define v=xΒ±βxβe1 and let P be the Householder matrix for the Householder vector v. Then Px=ββxβe1.
Proof.
We first establish an unmotivated identity.
\begin{align*}
\innerproduct{\vect{v}}{\vect{v}}
&=\adjoint{\left(\vect{x}\pm\norm{\vect{x}}\vect{e}_1\right)}\left(\vect{x}\pm\norm{\vect{x}}\vect{e}_1\right)\\
&=\adjoint{\vect{x}}\vect{x}
\pm\adjoint{\vect{x}}\norm{\vect{x}}\vect{e}_1
\pm\adjoint{\left(\norm{\vect{x}}\vect{e}_1\right)}\vect{x}
+\adjoint{\left(\norm{\vect{x}}\vect{e}_1\right)}\left(\norm{\vect{x}}\vect{e}_1\right)\\
&=\adjoint{\vect{x}}\vect{x}
\pm\norm{\vect{x}}\adjoint{\vect{e}_1}\vect{x}
\pm\norm{\vect{x}}\adjoint{\vect{e}_1}\vect{x}
+\adjoint{\vect{x}}\vect{x}\adjoint{\vect{e}_1}\vect{e}_1\\
&=2\left(\adjoint{\vect{x}}\vect{x}\pm\norm{\vect{x}}\adjoint{\vect{e}_1}\vect{x}\right)\\
&=2\left(\adjoint{\vect{x}}\pm\norm{\vect{x}}\adjoint{\vect{e}_1}\right)\vect{x}\\
&=2\adjoint{\left(\vect{x}\pm\norm{\vect{x}}\vect{e}_1\right)}\vect{x}\\
&=2\adjoint{\vect{v}}\vect{x}
\end{align*}
Then
\begin{align*}
P\vect{x}
&=\left(I_n-\left(\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}\right)\vect{x}\\
&=I_n\vect{x}-\left(\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\right)\vect{v}\adjoint{\vect{v}}\vect{x}\\
&=\vect{x}-\left(\frac{2}{2\adjoint{\vect{v}}\vect{x}}\right)\vect{v}\adjoint{\vect{v}}\vect{x}\\
&=\vect{x}-\vect{v}\\
&=\vect{x}-\left(\vect{x}\pm\norm{\vect{x}}\vect{e}_1\right)\\
&=\mp\norm{\vect{x}}\vect{e}_1
\end{align*}
Example 1.5.5.
Consider the vector \(\vect{x}\) and construct the Householder vector \(\vect{v}\text{.}\)
\begin{align*}
\vect{x}&=\colvector{4\\4\\7}&\vect{v}&=\vect{x}-9\vect{e}_1=\colvector{-5\\4\\7}
\end{align*}
Then the Householder matrix for \(\vect{v}\) is
\begin{equation*}
P = \begin{bmatrix}
\frac{4}{9} & \frac{4}{9} & \frac{7}{9} \\
\frac{4}{9} & \frac{29}{45} & -\frac{28}{45} \\
\frac{7}{9} & -\frac{28}{45} & -\frac{4}{45}
\end{bmatrix}
\end{equation*}
We can check that the matrix behaves as we expect.
\begin{equation*}
P\vect{x}=\colvector{9\\0\\0}
\end{equation*}
A Householder matrix is often called a Householder reflection. Why? Any Householder matrix, when thought of as a mapping of points in a physical space, will fix the elements of a hyperplane and reflect any other points about that hyperplane. To see this, consider any vector w and compare it with its image, Pw
Pwβw=Inwβ(2β¨v,vβ©)vvβwβw=β2β¨v,vβ©vβ¨v,wβ©=β2β¨v,wβ©β¨v,vβ©v
So this difference is always a scalar multiple of the Householder vector v, and thus every point gets moved in the same direction, the direction described by v. Which points are fixed by P? The difference above is the zero vector precisely when the scalar multiple is zero, which will happen when β¨v,wβ©=0. So the set of points/vectors which are orthogonal to v will be unmoved. This is a subspace of dimension one less than the full space, which are typically described by the term hyperplanes.
To be more specific, consider the specific situation of Example Example 1.5.5, viewed in R3. The hyperplane is the subspace orthogonal to v, or the two-dimensional plane with v as its normal vector, and equation β5x+4y+7z=0. The points (4,4,7) and (9,0,0) lie on either side of the plane and are a reflection of each other in the plane, by which we mean the vector (4,4,7)β(9,0,0)=(β5,4,7) is perpendicular (orthogonal, normal) to the plane.
Our choice of v can be replaced by v=x+βxβe1, so in the previous example we would have v=[1347], and then P would take x to [β900]. This would be a reflection across the plane with equation 13x+4y+7z=0. Notice that the normal vector of this plane is orthogonal to the normal vector of the previous plane, which is not an accident (Exercise Checkpoint 1.5.6).
As a practical matter, we would choose the Householder vector which moves x the furthest, to get better numerical behavior. So in our example above, the second choice would be better, since x will be moved a distance 2βvβ and the second v has a larger norm.
Checkpoint 1.5.6.
In the real case, we have two choices for a Householder vector which will βzero outβ most of a vector. Show that these two vectors, \(\vect{x}+\norm{\vect{x}}\vect{e}_1\) and \(\vect{x}-\norm{\vect{x}}\vect{e}_1\text{,}\) are orthogonal to each other.
Checkpoint 1.5.7.
Prove the following generalization of Theorem Theorem 1.5.4. Given a vector \(\vect{x}\in\complex{n}\text{,}\) define \(\rho=\vectorentry{\vect{x}}{1}/\modulus{\vectorentry{\vect{x}}{1}}\) and \(\vect{v}=\vect{x}\pm\rho\norm{\vect{x}}\vect{e}_1\) and let \(P\) be the Householder matrix for the Householder vector \(\vect{v}\text{.}\) Then \(P\vect{x}=\mp\rho\norm{\vect{x}}\vect{e}_1\text{.}\)
HintYou can establish the same identity as in the first part of the proof of Theorem Theorem 1.5.4.
Checkpoint 1.5.8.
Suppose that \(P\) is a Householder matrix of size \(n\) and \(\vect{b}\in\complex{n}\) is any vector. Find an expression for the matrix-vector product \(P\vect{b}\) which will suggest a way to compute this vector with fewer than the \(\orderof{2n^2}\) operations required for a general matrix-vector product.
Checkpoint 1.5.9.
Begin with the condition that a Householder matrix will accomplish \(P\vect{x}=\alpha\vect{e}_1\) and βdiscoverβ the choice for the Householder vector described in Theorem Theorem 1.5.4. Hint: The condition implies that the Householder vector \(\vect{v}\) is in the span of \(\set{\vect{x}, \vect{e}_1}\text{.}\)