The Power Method

The Power method is an iterative technique to compute the dominant eigenvalue of a matrix A. The eigenvalue \lambda_1 of an n \times n matrix A is said to be dominant if

|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_n|.

Note that not all matrices have a dominant eigenvalue, but we will assume here that we are dealing with such a matrix. We also assume that

A = SDS^{-1}

for an invertible matrix S and a diagonal matrix

D = \begin{bmatrix} \lambda_1 \\ & \lambda_2 \\ && \ddots \\ &&& \lambda_n \end{bmatrix}.

In other words, we assume that A is diagonalizable.

Let x be a non-zero vector in \mathbb{R}^n and consider the sequence

x, Ax, A^2x, A^3x, \ldots, A^kx, \ldots.

Computing these powers:

A = SDS^{-1}, \quad A^2 = SDS^{-1}SDS^{-1} = SD^2S^{-1}, \quad A^3 = SD^2S^{-1}SDS^{-1} = SD^3S^{-1}, \quad \ldots, \quad A^k = SD^{k-1}S^{-1}SDS^{-1} = SD^kS^{-1}.

We write

A^k x = \lambda_1^k S \begin{bmatrix} 1 \\ & (\lambda_2/\lambda_1)^k \\ && \ddots \\ &&& (\lambda_n/\lambda_1)^k \end{bmatrix} S^{-1} x.

Because \lambda_1 is the dominant eigenvalue, (\lambda_j/\lambda_1)^k \to 0 as k \to \infty for all j > 1.

For large k we can write

A^k x \approx \lambda_1^k S \begin{bmatrix} 1 \\ & 0 \\ && \ddots \\ &&& 0 \end{bmatrix} S^{-1} x.

But this will only converge to something non-zero if \lambda_1 = 1.

So, we take a different approach and renormalize with each power of A. Note that A^k x grows (or decays) like \lambda_1^k and A^{k-1}x like \lambda_1^{k-1}. So, if we take a ratio of these two, somehow, it should isolate \lambda_1. We use the \ell^\infty norm to do this.
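To see this numerically, here is a minimal sketch (assuming NumPy; the 2×2 matrix is just an illustrative choice, not from the notes). The entries of A^kx grow like \lambda_1^k, so the entrywise ratio of successive iterates approaches \lambda_1:

```python
import numpy as np

# An arbitrary symmetric 2x2 example; its dominant eigenvalue
# is (5 + sqrt(5))/2 ≈ 3.618.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = np.array([1.0, 1.0])

for k in range(1, 9):
    y = A @ x                 # y = A^k x
    print(k, y[0] / x[0])     # ratio of first entries -> lambda_1
    x = y
```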

Define x^{(k)} = A^k x. Let p = p_{k-1} be the smallest integer such that |x^{(k-1)}_p| = \|x^{(k-1)}\|_\infty. Define

\lambda^{(k)} = \frac{x^{(k)}_p}{x^{(k-1)}_p}.

If A is diagonalizable with \lambda_1 being its dominant eigenvalue, then (typically) \lambda^{(k)} \to \lambda_1 as k \to \infty.

Define

y = S \begin{bmatrix} 1 \\ & 0 \\ && \ddots \\&&& 0 \end{bmatrix}S^{-1}x.

Then

A^k x = \lambda_1^k \left( y + O(|\lambda_2/\lambda_1|^k) \right).

We then express

x^{(k)}_{p_{k-1}} = \lambda_1^k y_{p_{k-1}}\left( 1 + O(|\lambda_2/\lambda_1|^k) \right),

and therefore

\lim_{k \to \infty} \frac{x^{(k)}_{p_{k-1}}}{x^{(k-1)}_{p_{k-1}}} = \lim_{k\to \infty} \frac{\lambda_1^k y_{p_{k-1}}\left( 1 + O(|\lambda_2/\lambda_1|^k) \right)}{\lambda_1^{k-1} y_{p_{k-1}} \left( 1 + O(|\lambda_2/\lambda_1|^{k-1}) \right)} = \lambda_1.

This can only fail if y = 0, i.e., if x has no component in the direction of the dominant eigenvector (which can happen!). In other words:

Theorem

Let \lambda^{(k)} be the approximation of the dominant eigenvalue \lambda_1 of a diagonalizable matrix using the Power method, then (typically)

\lambda^{(k)} = \lambda_1 + O(|\lambda_2/\lambda_1|^{k}).
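For instance, if |\lambda_2/\lambda_1| = 1/2, the error is roughly halved at each iteration, so every step gains about one binary digit of accuracy.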

In practice, one does not want to just compute A^kx because if |\lambda_1| \neq 1, overflow or underflow can affect the computation. Instead, we do the following (a code sketch follows the list):

  1. Generate a random vector x, set x = x/\|x\|_\infty.
  2. Let p be the smallest integer such that |x_p| = \|x\|_\infty.
  3. For k =1,2,\ldots,N_{\mathrm{max}} do
    1. Set y = Ax.
    2. Set \lambda^{(k)} = y_p/x_p.
    3. If k > 1 and |\lambda^{(k)} - \lambda^{(k-1)}| < \epsilon, return \lambda^{(k)}.
    4. Set x = y/\|y\|_\infty.
    5. Let p be the smallest integer such that |x_p| = \|x\|_\infty.
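One way the algorithm above might look in NumPy (a sketch; the function name, tolerance, and random seed are my own choices, not part of the notes):

```python
import numpy as np

def power_method(A, tol=1e-10, max_iter=500, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])        # step 1: random start
    x = x / np.linalg.norm(x, np.inf)
    p = np.argmax(np.abs(x))                   # step 2: smallest p with |x_p| = ||x||_inf
    lam_old = None
    for k in range(1, max_iter + 1):
        y = A @ x                              # step 3.1
        lam = y[p] / x[p]                      # step 3.2: eigenvalue estimate
        if lam_old is not None and abs(lam - lam_old) < tol:
            return lam, x                      # step 3.3: converged
        x = y / np.linalg.norm(y, np.inf)      # step 3.4: renormalize
        p = np.argmax(np.abs(x))               # step 3.5
        lam_old = lam
    return lam, x                              # no convergence within max_iter

A = np.array([[2.0, 1.0], [1.0, 3.0]])
print(power_method(A)[0])                      # ≈ 3.618, the dominant eigenvalue
```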

The symmetric Power method

When the matrix A is symmetric, the error term can be reduced further with the help of the fact that \|Ax\|_2^2 = x^T A^T A x = x^T A^2 x. For the symmetric Power method we perform the following (see the sketch after the list):

  1. Generate a random vector x, set x = x/\|x\|_2.
  2. For k =1,2,\ldots,N_{\mathrm{max}} do
    1. Set y = Ax.
    2. Set \displaystyle \lambda^{(k)} = {x^Ty}.
    3. If k > 1 and |\lambda^{(k)} - \lambda^{(k-1)}| < \epsilon, return \lambda^{(k)}.
    4. Set x = y/\|y\|_2.
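A corresponding sketch of the symmetric variant (same caveats as before). Since \|x\|_2 = 1, the estimate \lambda^{(k)} = x^T y equals the Rayleigh quotient x^T A x:

```python
import numpy as np

def symmetric_power_method(A, tol=1e-10, max_iter=500, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])    # step 1: random start
    x = x / np.linalg.norm(x)              # normalize in the 2-norm
    lam_old = None
    for k in range(1, max_iter + 1):
        y = A @ x                          # step 2.1
        lam = x @ y                        # step 2.2: Rayleigh quotient x^T A x
        if lam_old is not None and abs(lam - lam_old) < tol:
            return lam, x                  # step 2.3: converged
        x = y / np.linalg.norm(y)          # step 2.4: renormalize
        lam_old = lam
    return lam, x
```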

To analyze the error for the symmetric Power method, we define x^{(k)} to be the value of x that is computed at iteration k, with x^{(0)} being the initial unit vector computed in step 1. By induction it follows that x^{(k)} = \alpha A^{k} x^{(0)} where \alpha is chosen so that \|x^{(k)}\|_2 = 1, or \alpha = \|A^{k}x^{(0)}\|_2^{-1}. It then follows that

\lambda^{(k+1)} = (x^{(k)})^T (A x^{(k)}) = \frac{ (x^{(0)})^T (A^{k})^T A^{k+1} x^{(0)} }{(x^{(0)})^T (A^{k})^T A^{k} x^{(0)}} = \frac{ (x^{(0)})^T A^{2k+1} x^{(0)} }{(x^{(0)})^T A^{2k} x^{(0)}}, \quad k \geq 0.

Because A is symmetric we can apply the Spectral Theorem: A = Q^T D Q for an orthogonal matrix Q and a diagonal matrix D that contains the eigenvalues. Letting c = Q x^{(0)}, we have

\lambda^{(k+1)} = \frac{c^T D^{2k+1} c}{c^T D^{2k} c} = \frac{\displaystyle \sum_{j=1}^n c_j^2 \lambda_j^{2k+1}}{\displaystyle \sum_{j=1}^n c_j^2 \lambda_j^{2k}}.

Dividing numerator and denominator by c_1^2\lambda_1^{2k}, where \lambda_1 is assumed to be the dominant eigenvalue and (typically) c_1 \neq 0, we have

\lambda^{(k+1)} = \lambda_1 \frac{\displaystyle \sum_{j=1}^n c_j^2 (\lambda_j/\lambda_1)^{2k+1}}{\displaystyle \sum_{j=1}^n c_j^2 (\lambda_j/\lambda_1)^{2k}} = \lambda_1 \frac{\displaystyle 1 + \sum_{j=2}^n (c_j/c_1)^2 (\lambda_j/\lambda_1)^{2k+1}}{\displaystyle 1+ \sum_{j=2}^n (c_j/c_1)^2 (\lambda_j/\lambda_1)^{2k}} = \lambda_1 \frac{1 + O(|\lambda_2/\lambda_1|^{2k+1})}{1 +O(|\lambda_2/\lambda_1|^{2k})}.

From this it follows that:

Theorem

Let \lambda^{(k)} be the approximation of the dominant eigenvalue \lambda_1 of a symmetric matrix using the symmetric Power method, then (typically)

\lambda^{(k)} = \lambda_1 + O(|\lambda_2/\lambda_1|^{2k}).
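As a quick numerical check (with an arbitrary symmetric test matrix, not from the notes), one can run both estimates side by side and watch the symmetric method's error shrink roughly twice as fast, matching the O(|\lambda_2/\lambda_1|^{2k}) versus O(|\lambda_2/\lambda_1|^{k}) rates:

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])            # symmetric test matrix
lam1 = np.max(np.linalg.eigvalsh(A))       # reference dominant eigenvalue

rng = np.random.default_rng(1)
x0 = rng.standard_normal(3)
x_inf = x0 / np.linalg.norm(x0, np.inf)    # iterate for the standard method
x_two = x0 / np.linalg.norm(x0)            # iterate for the symmetric method
p = np.argmax(np.abs(x_inf))

for k in range(1, 11):
    y = A @ x_inf
    err_std = abs(y[p] / x_inf[p] - lam1)  # standard (ratio) estimate error
    x_inf = y / np.linalg.norm(y, np.inf)
    p = np.argmax(np.abs(x_inf))

    z = A @ x_two
    err_sym = abs(x_two @ z - lam1)        # Rayleigh-quotient estimate error
    x_two = z / np.linalg.norm(z)

    print(f"k={k:2d}  standard: {err_std:.2e}  symmetric: {err_sym:.2e}")
```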