The Power Method

The Power method is an iterative technique to compute the dominant eigenvalue of a matrix A. The eigenvalue \lambda_1 of an n \times n matrix A is said to be dominant if

|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_n|.

Note that not all matrices have a dominant eigenvalue, but we will assume here that we are dealing with such a matrix. We also assume that

A = SDS^{-1}

for an invertible matrix S and a diagonal matrix

D = \begin{bmatrix} \lambda_1 \\ & \lambda_2 \\ && \ddots \\ &&& \lambda_n \end{bmatrix}.

In other words, we assume that A is diagonalizable.

Let x be a non-zero vector in \mathbb{R}^n and consider the sequence

x, Ax, A^2x, A^3x, \ldots, A^kx, \ldots.

Computing these powers:

A = SDS^{-1}, \quad A^2 = SDS^{-1}SDS^{-1} = SD^2S^{-1}, \quad A^3 = SD^2S^{-1}SDS^{-1} = SD^3S^{-1}, \quad \ldots, \quad A^k = SD^{k-1}S^{-1}SDS^{-1} = SD^kS^{-1}.

We write

A^k x = \lambda_1^k S \begin{bmatrix} 1 \\ & (\lambda_2/\lambda_1)^k \\ && \ddots \\ &&& (\lambda_n/\lambda_1)^k \end{bmatrix} S^{-1} x.

Because \lambda_1 is the dominant eigenvalue, (\lambda_j/\lambda_1)^k \to 0 as k \to \infty for all j > 1.

For large k we can write

A^k x \approx \lambda_1^k S \begin{bmatrix} 1 \\ & 0 \\ && \ddots \\ &&& 0 \end{bmatrix} S^{-1} x.

But this will only converge to something non-zero if \lambda_1 = 1.

So, we take a different approach and renormalize with each power of A. Note that A^k x grows (or decays) like \lambda_1^k and A^{k-1}x like \lambda_1^{k-1}. So, if we take a ratio of these two, somehow, it should isolate \lambda_1. We use the \ell^\infty norm to do this.
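To see this numerically, here is a minimal sketch (assuming NumPy; the 2×2 matrix is just an illustrative choice, not from the notes). The entries of A^kx grow like \lambda_1^k, so the entrywise ratio of successive iterates approaches \lambda_1:

```python
import numpy as np

# An arbitrary symmetric 2x2 example; its dominant eigenvalue
# is (5 + sqrt(5))/2 ≈ 3.618.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = np.array([1.0, 1.0])

for k in range(1, 9):
    y = A @ x                 # y = A^k x
    print(k, y[0] / x[0])     # ratio of first entries -> lambda_1
    x = y
```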

Define x^{(k)} = A^k x. Let p = p_{k-1} be the smallest integer such that |x^{(k-1)}_p| = \|x^{(k-1)}\|_\infty. Define

\lambda^{(k)} = \frac{x^{(k)}_p}{x^{(k-1)}_p}.

If A is diagonalizable with \lambda_1 being its dominant eigenvalue, then (typically) \lambda^{(k)} \to \lambda_1 as k \to \infty.

Define

y = S \begin{bmatrix} 1 \\ & 0 \\ && \ddots \\&&& 0 \end{bmatrix}S^{-1}x.

Then

A^k x = \lambda_1^k \left( y + O(|\lambda_2/\lambda_1|^k) \right).

We then express

x^{(k)}_{p_{k-1}} = \lambda_1^k y_{p_{k-1}}\left( 1 + O(|\lambda_2/\lambda_1|^k) \right),

and therefore

\lim_{k \to \infty} \frac{x^{(k)}_{p_{k-1}}}{x^{(k-1)}_{p_{k-1}}} = \lim_{k\to \infty} \frac{\lambda_1^k y_{p_{k-1}}\left( 1 + O(|\lambda_2/\lambda_1|^k) \right)}{\lambda_1^{k-1} y_{p_{k-1}} \left( 1 + O(|\lambda_2/\lambda_1|^{k-1}) \right)} = \lambda_1.

This can only fail if y = 0, i.e., if x has no component in the direction of the dominant eigenvector (which can happen!). In other words:

Theorem

Let \lambda^{(k)} be the approximation of the dominant eigenvalue \lambda_1 of a diagonalizable matrix using the Power method, then (typically)

\lambda^{(k)} = \lambda_1 + O(|\lambda_2/\lambda_1|^{k}).
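For instance, if |\lambda_2/\lambda_1| = 1/2, the error is roughly halved at each iteration, so every step gains about one binary digit of accuracy.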

In practice, one does not want to just compute A^kx because if |\lambda_1| \neq 1, overflow or underflow can affect the computation. Instead, we do the following (a code sketch follows the list):

  1. Generate a random vector x, set x = x/\|x\|_\infty.
  2. Let p be the smallest integer such that |x_p| = \|x\|_\infty.
  3. For k =1,2,\ldots,N_{\mathrm{max}} do
    1. Set y = Ax.
    2. Set \lambda^{(k)} = y_p/x_p.
    3. If k > 1 and |\lambda^{(k)} - \lambda^{(k-1)}| < \epsilon, return \lambda^{(k)}.
    4. Set x = y/\|y\|_\infty.
    5. Let p be the smallest integer such that |x_p| = \|x\|_\infty.
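One way the algorithm above might look in NumPy (a sketch; the function name, tolerance, and random seed are my own choices, not part of the notes):

```python
import numpy as np

def power_method(A, tol=1e-10, max_iter=500, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])        # step 1: random start
    x = x / np.linalg.norm(x, np.inf)
    p = np.argmax(np.abs(x))                   # step 2: smallest p with |x_p| = ||x||_inf
    lam_old = None
    for k in range(1, max_iter + 1):
        y = A @ x                              # step 3.1
        lam = y[p] / x[p]                      # step 3.2: eigenvalue estimate
        if lam_old is not None and abs(lam - lam_old) < tol:
            return lam, x                      # step 3.3: converged
        x = y / np.linalg.norm(y, np.inf)      # step 3.4: renormalize
        p = np.argmax(np.abs(x))               # step 3.5
        lam_old = lam
    return lam, x                              # no convergence within max_iter

A = np.array([[2.0, 1.0], [1.0, 3.0]])
print(power_method(A)[0])                      # ≈ 3.618, the dominant eigenvalue
```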

The symmetric Power method

When the matrix A is symmetric, the error term can be reduced further with the help of the fact that \|Ax\|_2^2 = x^T A^T A x = x^T A^2 x. For the symmetric Power method we perform the following (see the sketch after the list):

  1. Generate a random vector x, set x = x/\|x\|_2.
  2. For k =1,2,\ldots,N_{\mathrm{max}} do
    1. Set y = Ax.
    2. Set \displaystyle \lambda^{(k)} = {x^Ty}.
    3. If k > 1 and |\lambda^{(k)} - \lambda^{(k-1)}| < \epsilon, return \lambda^{(k)}.
    4. Set x = y/\|y\|_2.
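A corresponding sketch of the symmetric variant (same caveats as before). Since \|x\|_2 = 1, the estimate \lambda^{(k)} = x^T y equals the Rayleigh quotient x^T A x:

```python
import numpy as np

def symmetric_power_method(A, tol=1e-10, max_iter=500, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])    # step 1: random start
    x = x / np.linalg.norm(x)              # normalize in the 2-norm
    lam_old = None
    for k in range(1, max_iter + 1):
        y = A @ x                          # step 2.1
        lam = x @ y                        # step 2.2: Rayleigh quotient x^T A x
        if lam_old is not None and abs(lam - lam_old) < tol:
            return lam, x                  # step 2.3: converged
        x = y / np.linalg.norm(y)          # step 2.4: renormalize
        lam_old = lam
    return lam, x
```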

To analyze the error for the symmetric Power method, we define x^{(k)} to be the value of x that is computed at iteration k, with x^{(0)} being the initial unit vector computed in step 1. By induction it follows that x^{(k)} = \alpha A^{k} x^{(0)} where \alpha is chosen so that \|x^{(k)}\|_2 = 1, or \alpha = \|A^{k}x^{(0)}\|_2^{-1}. It then follows that

\lambda^{(k+1)} = (x^{(k)})^T (A x^{(k)}) = \frac{ (x^{(0)})^T (A^{k})^T A^{k+1} x^{(0)} }{(x^{(0)})^T (A^{k})^T A^{k} x^{(0)}} = \frac{ (x^{(0)})^T A^{2k+1} x^{(0)} }{(x^{(0)})^T A^{2k} x^{(0)}}, \quad k \geq 0.

Because A is symmetric we can apply the Spectral Theorem: A = Q^T D Q for an orthogonal matrix Q and a diagonal matrix D that contains the eigenvalues. Letting c = Q x^{(0)}, we have

\lambda^{(k+1)} = \frac{c^T D^{2k+1} c}{c^T D^{2k} c} = \frac{\displaystyle \sum_{j=1}^n c_j^2 \lambda_j^{2k+1}}{\displaystyle \sum_{j=1}^n c_j^2 \lambda_j^{2k}}.

Dividing numerator and denominator by c_1^2\lambda_1^{2k}, where \lambda_1 is assumed to be the dominant eigenvalue and (typically) c_1 \neq 0, we have

\lambda^{(k+1)} = \lambda_1 \frac{\displaystyle \sum_{j=1}^n c_j^2 (\lambda_j/\lambda_1)^{2k+1}}{\displaystyle \sum_{j=1}^n c_j^2 (\lambda_j/\lambda_1)^{2k}} = \lambda_1 \frac{\displaystyle 1 + \sum_{j=2}^n (c_j/c_1)^2 (\lambda_j/\lambda_1)^{2k+1}}{\displaystyle 1+ \sum_{j=2}^n (c_j/c_1)^2 (\lambda_j/\lambda_1)^{2k}} = \lambda_1 \frac{1 + O(|\lambda_2/\lambda_1|^{2k+1})}{1 +O(|\lambda_2/\lambda_1|^{2k})}.

From this it follows that:

Theorem

Let \lambda^{(k)} be the approximation of the dominant eigenvalue \lambda_1 of a symmetric matrix using the symmetric Power method, then (typically)

\lambda^{(k)} = \lambda_1 + O(|\lambda_2/\lambda_1|^{2k}).
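As a quick numerical check (with an arbitrary symmetric test matrix, not from the notes), one can run both estimates side by side and watch the symmetric method's error shrink roughly twice as fast, matching the O(|\lambda_2/\lambda_1|^{2k}) versus O(|\lambda_2/\lambda_1|^{k}) rates:

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])            # symmetric test matrix
lam1 = np.max(np.linalg.eigvalsh(A))       # reference dominant eigenvalue

rng = np.random.default_rng(1)
x0 = rng.standard_normal(3)
x_inf = x0 / np.linalg.norm(x0, np.inf)    # iterate for the standard method
x_two = x0 / np.linalg.norm(x0)            # iterate for the symmetric method
p = np.argmax(np.abs(x_inf))

for k in range(1, 11):
    y = A @ x_inf
    err_std = abs(y[p] / x_inf[p] - lam1)  # standard (ratio) estimate error
    x_inf = y / np.linalg.norm(y, np.inf)
    p = np.argmax(np.abs(x_inf))

    z = A @ x_two
    err_sym = abs(x_two @ z - lam1)        # Rayleigh-quotient estimate error
    x_two = z / np.linalg.norm(z)

    print(f"k={k:2d}  standard: {err_std:.2e}  symmetric: {err_sym:.2e}")
```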