
The QR Method

For reasonably-sized matrices, this is the most efficient and widely used method. There is very good code to do this in EISPACK, available on NETLIB. For large matrices, people are turning to Arnoldi methods (see Lehoucq). The method appeared in 1961 (Francis). Here's an elementary introduction. Given $A$, there's a factorization

\begin{displaymath}
A=QR\quad ,
\end{displaymath}

$R$ is upper triangular and $Q$ orthogonal. With $A$ real, both $Q$ and $R$ can be chosen real. The motivation is that we'll construct an iterative technique to find the eigenvalues using similarity transformations. Orthogonal matrices accomplish the transformations in such a way that they will not worsen the condition number or stability of the eigenvalue problem of a non-symmetric matrix. A special class of orthogonal matrices is known as ``Householder matrices'' and we'll concentrate on these.

Note: The term ``orthogonal matrix'' should strictly be restricted to real matrices, but usage of the term has extended to complex matrices. In the complex case the matrices should be understood as being ``unitary.''

Let $w\in {\mathbb{C}}^n$ with $\vert\vert w\vert\vert _2=\sqrt{w^{\star}w}=1$. Define

(86) \begin{displaymath}
U=I-2ww^{\star}
\end{displaymath}

This is the general form of a ``Householder matrix.''

Example) For $n=3$

\begin{displaymath}
w=[w_1, w_2, w_3]^T \quad \vert w_1\vert^2 +\vert w_2\vert^2+\vert w_3\vert^2=1
\end{displaymath}


\begin{displaymath}
U=\left[\begin{array}{ccc}
1-2\vert w_1\vert^2 & -2w_1\overline{w}_2 & -2w_1\overline{w}_3\\
-2\overline{w}_1 w_2 & 1-2\vert w_2\vert^2 & -2w_2\overline{w}_3\\
-2\overline{w}_1 w_3 & -2\overline{w}_2 w_3 & 1-2\vert w_3\vert^2
\end{array}\right]
\end{displaymath}

$\Box$

\begin{eqnarray*}
U\mbox { is Hermitian: } U^\dagger &=& (I-2ww^{\star})^{\dagger}\\
&=& I-2(w^{\star})^{\dagger}w^{\dagger}\\
&=& I-2ww^{\star}=U
\end{eqnarray*}



$U$ is orthogonal: $U^{\dagger}U=U^2=(I-2ww^{\star})^2=I-4ww^{\star}+4(ww^{\star})
(ww^{\star})=I$ because $w^{\star}w=1$ and, by the associativity of matrix multiplication,

\begin{displaymath}
(ww^{\star})(ww^{\star})=w(w^\star w)w^{\star}=ww^{\star}
\end{displaymath}

$\Box$

The $QR$ Factorization of a Matrix

\begin{displaymath}
A=QR.
\end{displaymath}

Let $P_r=I-2w^{(r)}w^{(r)T}\quad r=1, \cdots , n-1$

with

(87) $\displaystyle w^{(r)}=[0, 0, \cdots, 0, w_r, \cdots, w_n]^T\equiv [O_{r-1}, \widehat w^T]^T
\mbox {with } \widehat w\in {\mathbb{C}}^{n-r+1}$    

Writing $A$ in terms of its columns $A_{\star 1},\ldots, A_{\star n}$, we have

\begin{displaymath}
P_1 A = [P_1 A_{\star 1}, \ldots, P_1 A_{\star n}].
\end{displaymath}

Pick $P_1$ and $w^{(1)}$ using the following construction: we want to use the Householder matrix to transform a nonzero vector into a new vector containing mainly zeros. Let ${\bf b}\ne {\bf0}$ be given, ${\bf b}\in
{\mathbb{R}}^n$, and suppose we want to produce $U$ of the form (86) such that $U {\bf b}$ contains zeros in positions $r+1$ through $n$, for some $r\ge 1$. Choose $w$ as in (88). Then the first $r-1$ elements of ${\bf b}$ and $U {\bf b}$ are the same.

Let's write $m\equiv n-r+1$

(88) $\displaystyle \mbox {so we take } w=\left[\begin{array}{c}
O_{r-1}\\
v
\end{array}\right]
\mbox{ and}$    
(89) $\displaystyle {\bf b}=A_{\star 1}=\left[\begin{array}{c}
c\\
d
\end{array}\right]$

$c\in {\mathbb{R}}^{r-1}$ and $v,d\in {\mathbb{R}}^m$. Then our restriction on the form of $U {\bf b}$ requires that the first $r-1$ components of $U {\bf b}$ and $c$ be the same, and
(90) \begin{displaymath}
(I-2vv^{T})d=[\alpha, 0, 0\ldots 0]^T\quad \vert\vert v\vert\vert _2=1
\end{displaymath}

for some $\alpha$. Since $I-2vv^T$ is orthogonal, the length of $d$ is preserved, $\therefore$

\begin{displaymath}
\vert\alpha\vert=\vert\vert d\vert\vert _2\equiv S
\end{displaymath}


(91) \begin{displaymath}
\alpha=\pm S=\pm\sqrt{d^2_1+d^2_2\cdots + d^2_m}
\end{displaymath}

Let

\begin{displaymath}
p=v^Td
\end{displaymath}

from (90)
(92) \begin{displaymath}
d-2pv=[\alpha, 0, \ldots ,0]^T
\end{displaymath}

Multiplication by $v^T$ and use of $\vert\vert v\vert\vert _2=1$ implies
(93) \begin{displaymath}
p=-\alpha v_1
\end{displaymath}

Substituting this into the first component of (92) gives

\begin{eqnarray*}
d_1+2\alpha v^2_1 &=& \alpha\\
\therefore v^2_1 &=& \frac12\left[1-\frac{d_1}{\alpha}\right]
\end{eqnarray*}



Choose the sign of $\alpha$ in (91) by

\begin{displaymath}
\mbox {sign }(\alpha)=-\mbox {sign}(d_1)
\end{displaymath}

This choice maximizes $v^2_1$ and avoids loss-of-significance errors in the calculation of $v_1$. Having $v_1$, obtain $p$ from (93). Return to (92) and, using components $2$ through $m$,

\begin{displaymath}
v_j=\frac{d_j}{2p}\quad j=2, 3, \cdots , m.
\end{displaymath}

Once $v$ is obtained $\Rightarrow w$ and $U$ are obtained. The operation count can be shown to be roughly $2m+2$ operations. $\Box$
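Here is a minimal Python sketch of this construction, following (90)-(93); real $d$ is assumed, and the helper name householder_vector is ours:

\begin{verbatim}
import numpy as np

# Build v with ||v||_2 = 1 so that (I - 2 v v^T) d = [alpha, 0, ..., 0]^T.
def householder_vector(d):
    S = np.linalg.norm(d)                            # |alpha| = ||d||_2
    alpha = S if d[0] == 0 else -np.sign(d[0]) * S   # sign(alpha) = -sign(d_1)
    v = np.empty_like(d, dtype=float)
    v[0] = np.sqrt(0.5 * (1.0 - d[0] / alpha))       # v_1^2 = [1 - d_1/alpha]/2
    p = -alpha * v[0]                                # (93): p = -alpha v_1
    v[1:] = d[1:] / (2.0 * p)                        # components 2..m of (92)
    return v, alpha

d = np.array([3.0, 1.0, 2.0])
v, alpha = householder_vector(d)
print(np.allclose(d - 2.0 * (v @ d) * v, [alpha, 0.0, 0.0]))  # True
\end{verbatim}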

Now $P_1A$ contains zeros below the diagonal in its first column. Choose $P_2$ similarly, so that $P_2P_1A$ contains zeros in its second column below the diagonal. Note that $P_1A$ and $P_2P_1A$ have the same elements in row 1 and column 1. So to get $P_2$ and $w^{(2)}$, replace $A_{\star 1}$ in (89) by the second column of $P_1A$. Continue the procedure until we obtain an upper triangular matrix

\begin{displaymath}
R\equiv P_{n-1}P_{n-2}\cdots P_1 A.
\end{displaymath}

If at the $r$-th step of the procedure all elements below the diagonal of column $r$ are already zero, then choose $P_r=I$ and go to the next step. To complete the construction let

\begin{displaymath}
Q^T=P_{n-1}P_{n-2}\cdots P_1
\end{displaymath}

which is orthogonal. Then $A=QR$. $\Box$
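A compact (and unoptimized) Python sketch of the whole factorization follows; it uses the equivalent unnormalized form $v\propto d-\alpha e_1$ of the vector constructed above, and the function name is ours:

\begin{verbatim}
import numpy as np

# Compute A = QR via R = P_{n-1} ... P_1 A, Q^T = P_{n-1} ... P_1.
def householder_qr(A):
    n = A.shape[0]
    R = A.astype(float)
    QT = np.eye(n)
    for r in range(n - 1):
        d = R[r:, r].copy()
        S = np.linalg.norm(d)
        if S == 0.0:                  # nothing to eliminate: take P_r = I
            continue
        alpha = S if d[0] == 0 else -np.sign(d[0]) * S
        d[0] -= alpha                 # unnormalized v = d - alpha e_1
        v = d / np.linalg.norm(d)
        R[r:, :] -= 2.0 * np.outer(v, v @ R[r:, :])    # apply P_r
        QT[r:, :] -= 2.0 * np.outer(v, v @ QT[r:, :])  # accumulate Q^T
    return QT.T, R

A = np.array([[4.0, 1.0, 2.0], [2.0, 3.0, 0.0], [1.0, 1.0, 5.0]])
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A), np.allclose(np.tril(R, -1), 0.0))  # True True
\end{verbatim}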

Now that we know how to obtain a $QR$ factorization we return to getting the eigenvalues:

Assume $A$ is real $\Rightarrow Q, R$ can be chosen real. Let $A_1 = A$ and define a sequence

\begin{eqnarray*}
A_m &=& Q_m R_m\\
A_{m+1}&=& R_m Q_m\\
m &=& 1, 2, \cdots
\end{eqnarray*}



Since $R_m=Q_m^T A_m\Rightarrow$

\begin{displaymath}
A_{m+1}=Q_m^TA_mQ_m
\end{displaymath}

$A_{m+1}$ is orthogonally similar to $A_m, A_{m-1}, \ldots, A_1$. The sequence $\{A_m\}$ will converge either to a triangular matrix with the eigenvalues of $A$ on its diagonal or to a near-triangular matrix from which the eigenvalues can easily be calculated. This is slow, but a ``shifting'' technique accelerates things considerably (not discussed here).

$\Box$
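A minimal Python sketch of the iteration, using numpy's built-in QR factorization in place of the Householder construction above (the sweep cap of 200 is an arbitrary illustrative choice):

\begin{verbatim}
import numpy as np

# Unshifted QR iteration: A_{m+1} = R_m Q_m = Q_m^T A_m Q_m.
def qr_iteration(A, sweeps=200):
    Am = A.astype(float)
    for _ in range(sweeps):
        Qm, Rm = np.linalg.qr(Am)   # A_m = Q_m R_m
        Am = Rm @ Qm                # A_{m+1} = R_m Q_m
    return Am

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
print(np.sort(np.diag(qr_iteration(A))))  # approximate eigenvalues
print(np.linalg.eigvalsh(A))              # reference values (ascending)
\end{verbatim}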

Remark: $QR$ is very expensive even with shifting. So the first step is to ``prepare'' the matrix by making it simpler. If $A$ is symmetric, it can be reduced to a similar symmetric tridiagonal matrix. If not symmetric, it can be reduced to an equivalent ``Hessenberg matrix'' by unitary similarity transformations. An ``upper Hessenberg'' matrix has the form:

\begin{displaymath}
H=\left[\begin{array}{ccccc}
\star & \star & \cdots & \star & \star\\
\star & \star & \cdots & \star & \star\\
0 & \star & \ddots & \vdots & \vdots\\
\vdots & \ddots & \ddots & \star & \star\\
0 & \cdots & 0 & \star & \star
\end{array}\right]\quad \mbox {i.e. } h_{ij}=0\mbox { when } i>j+1.
\end{displaymath}

Note that if $A$ is tridiagonal it is also Hessenberg. However, if $A$ is tridiagonal, an efficient way to calculate its eigenvalues is by using ``Sturm sequences.'' (See Golub and Van Loan.)
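For illustration, SciPy provides this reduction directly: scipy.linalg.hessenberg reduces $A$ by orthogonal similarity transformations (the random test matrix below is arbitrary):

\begin{verbatim}
import numpy as np
from scipy.linalg import hessenberg

A = np.random.default_rng(1).standard_normal((5, 5))
H, Q = hessenberg(A, calc_q=True)       # A = Q H Q^T

print(np.allclose(Q @ H @ Q.T, A))      # similarity is preserved
print(np.allclose(np.tril(H, -2), 0))   # h_ij = 0 when i > j+1
\end{verbatim}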

Convergence of QR: Let $A$ be real $n\times n$ with eigenvalues $\{\lambda_i\}$ satisfying

(94) \begin{displaymath}
\vert\lambda_1\vert>\vert\lambda_2\vert>\cdots >\vert\lambda_n\vert>0
\end{displaymath}

then the iterates $A_m$ of the $QR$ method will converge to an upper triangular matrix $D$ with $\{\lambda_i\}$ as diagonal elements. If $A$ is symmetric, the sequence $\{A_m\}$ converges to a diagonal matrix. For the rate of convergence,

\begin{displaymath}
\vert\vert D-A_m\vert\vert\le c\left(\max_i\left\vert\frac{\lambda_{i+1}}{\lambda_i}\right\vert\right)^m
\end{displaymath}
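A rough numerical check of this linear rate, reusing the symmetric tridiagonal example from the sketch above (purely illustrative):

\begin{verbatim}
import numpy as np

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
lam = np.sort(np.abs(np.linalg.eigvalsh(A)))[::-1]
print(np.max(lam[1:] / lam[:-1]))   # predicted reduction factor per sweep

Am, errs = A.copy(), []
for _ in range(30):
    Q, R = np.linalg.qr(Am)
    Am = R @ Q
    errs.append(np.linalg.norm(np.tril(Am, -1)))  # off-diagonal residue
print(errs[-1] / errs[-2])          # observed factor, close to predicted
\end{verbatim}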

For matrices with eigenvalues not satisfying (94), the iterates $A_m$ may not converge to a triangular matrix. For $A$ symmetric, the sequence $\{A_m\}$ converges to a block diagonal matrix.

\begin{displaymath}
A_m\rightarrow D=
\left[\begin{array}{ccccc}
B_1 & & & 0\\
& B_2& & \\
& & \ddots & \\
0 & & & B_r
\end{array}\right]
\end{displaymath}

in which the $B_i$ are scalars or $2\times 2$ matrices. Thus the eigenvalues of $A$ can be easily computed from those of $D$. For $A$ non-symmetric, the situation is more complicated but still manageable.
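A tiny Python illustration: the symmetric matrix below has eigenvalues $+1$ and $-1$ of equal modulus, so (94) fails; the iteration stalls on a $2\times 2$ block $B_1$, and the eigenvalues must be read off the block itself.

\begin{verbatim}
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])  # eigenvalues +1 and -1
Q, R = np.linalg.qr(A)
print(R @ Q)                 # reproduces A: the 2x2 block never shrinks
print(np.linalg.eigvals(A))  # eigenvalues of the block: +1, -1
\end{verbatim}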

$\Box$


Juan Restrepo 2003-04-12