
Convergence of iterative methods

Theorem

The fixed point method given by

g(x) = Tx + b, \quad x^{(k)} = g(x^{(k-1)}), \quad x^{(0)} \in \mathbb R^{n}, \quad k \geq 1,

converges if \rho(T) < 1 to the solution of (I-T)x = b.

Proof

The iterates of the fixed-point method are

x^{(0)} = x^{(0)},
x^{(1)} = Tx^{(0)} + b,
x^{(2)} = T^2 x^{(0)} + Tb + b,
x^{(3)} = T^3 x^{(0)} + T^2 b + Tb + b,
\vdots
x^{(k)} = T^k x^{(0)} + \sum_{j=0}^{k-1} T^j b.

As k \to \infty, T^k x^{(0)} \to 0 (since \rho(T) < 1) and

\sum_{j=0}^{k-1} T^jb \to (I-T)^{-1} b.

Therefore

x = \lim_{k \to \infty} x^{(k)} = (I-T)^{-1} b, \quad\text{so that}\quad (I-T)x = b.

Corollary (Error estimate)

The fixed point method given by

g(x) = Tx + b, \quad x^{(k)}= g(x^{(k-1)}), \quad x^{(0)} \in \mathbb R^{n}, \quad k \geq 1,

with \rho(T) < 1 has an error that satisfies

\|x^{(k)}- x\| \leq \|T\|^k \|x^{(0)} - x\|,\quad x = (I-T)^{-1}b,

in any vector norm and its induced matrix norm. Additionally, for any \delta > 0,

\|x^{(k)}- x\| = O( (\rho(T) + \delta)^k ).

Proof

From the proof of the previous theorem,

x^{(k)} - x = T^k x^{(0)} + \sum_{j=0}^{k-1} T^jb - \sum_{j=0}^{\infty}T^j b = T^k x^{(0)} - \sum_{j=k}^{\infty}T^j b = T^k\left( x^{(0)} - \sum_{j=0}^{\infty}T^j b \right) = T^k (x^{(0)} -x).

Therefore

\|x^{(k)} - x \| \leq \|T^k\| \|x^{(0)}-x\| \leq \|T\|^k \|x^{(0)}-x\|.

The proof of the last fact requires the following two results that we will not prove:

Theorem A

For any n\times n matrix A and \delta > 0, there exists an induced matrix norm \|\cdot\|_A (depending on A) such that

\|A\|_A \leq \rho(A) + \delta.

Theorem B

Given any two induced matrix norms \|\cdot\|_\alpha and \|\cdot\|_\beta on n \times n matrices, there exist constants 0 < C_1 \leq C_2 such that

C_1 \|A\|_\beta \leq \|A\|_\alpha \leq C_2 \|A\|_\beta,

for all matrices A.

To finish the proof of the corollary, given \delta > 0, we choose a matrix norm \|\cdot\|_T such that \|T\|_T \leq \rho(T) + \delta by Theorem A. Then from Theorem B we find a constant C > 0 such that

\|x^{(k)} - x \| \leq \|T^k\| \|x^{(0)}-x\| \leq C \|T^k\|_T \|x^{(0)}-x\| \leq C \|T\|^k_T \|x^{(0)}-x\| \leq C (\rho(T) + \delta)^k \|x^{(0)}-x\|.

Therefore

\|x^{(k)}- x\| = O( (\rho(T) + \delta)^k ).
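To see this rate in practice, here is a small numerical sketch (the 2 \times 2 matrix T below is an arbitrary illustration, not from the text): the error ratio \|x^{(k)} - x\| / \|x^{(k-1)} - x\| settles near \rho(T).

T = [0.5, 0.2; 0.1, 0.4];             % arbitrary example with rho(T) = 0.6 < 1
b = [1; 1];
x = (eye(2) - T) \ b;                 % exact solution of (I - T) x = b
xk = [0; 0];                          % initial guess x^{(0)}
err = norm(xk - x);
for k = 1:25
    xk = T * xk + b;                  % x^{(k)} = T x^{(k-1)} + b
    ratio = norm(xk - x) / err;       % tends to rho(T) as k grows
    err = norm(xk - x);
end
[ratio, max(abs(eig(T)))]             % the two values should be close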

Jacobi's Method

We want to solve the equation Ax=b. If A happened to be a diagonal matrix with non-zero diagonal entries, computing A^{-1} = \mathrm{diag}(a_{11}^{-1},a_{22}^{-1},\ldots,a_{nn}^{-1}) would be simple. But, of course, most linear systems are not diagonal. The Jacobi iterative method is given by the following iteration formula:

\begin{align*} x^{(0)} &= \text{ an initial guess}\\ x^{(k)}_{i} &= \sum_{j=1, ~~ j\neq i}^n \left( - \frac{a_{ij} x_j^{(k-1)}}{a_{ii}} \right) + \frac{b_i}{a_{ii}}, \quad {i = 1,2,\ldots,n}, ~~ k > 0. \end{align*}
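The run below calls a routine Jacobi(A,b,tol,Nmax) whose source is not shown in these notes. A minimal sketch of what such a routine could look like is given here; the zero initial guess, the update-based stopping test, and the printed message are assumptions based on the calls and output below.

function x = Jacobi(A, b, tol, Nmax)
% Sketch of the Jacobi iteration: tol is a stopping tolerance on the
% update, Nmax an iteration cap (signature assumed from the calls below).
n = length(b);
x = zeros(n, 1);                      % initial guess x^{(0)} = 0
for k = 1:Nmax
    xold = x;
    for i = 1:n
        j = [1:i-1, i+1:n];           % all indices except i
        x(i) = (b(i) - A(i, j) * xold(j)) / A(i, i);
    end
    if norm(x - xold, inf) < tol      % stop when the update is small
        return
    end
end
fprintf('Maximum number of iterations reached, Nmax = %d\n', Nmax)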
In [11]:
A = [10, -1, 2, 0; -1, 11, -1, 3; 2, -1, 10, -1; 0, 3, -1, 8];
b = [1,1,1,1]'; n = 4; x = zeros(n,1);
x = Jacobi(A,b,.000001,40)
norm(A*x-b)
x =

    0.0876
    0.0786
    0.1011
    0.1082


ans =

   4.9674e-07

Jacobi's method is easier to understand when written in matrix form, even though it should be coded as written above. Let A = D - L - U where L is strictly lower-triangular, D is diagonal and U is strictly upper-triangular. Then define

x^{(k)} = D^{-1}(L + U) x^{(k-1)} + D^{-1} b.

You should check that this is the same as the above equations for Jacobi's method. A fixed-point of this iteration would satisfy

x = D^{-1}(L + U) x + D^{-1} b \quad\Longleftrightarrow\quad (D-L-U)x = b \quad\Longleftrightarrow\quad Ax = b.
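As a quick numerical check of this equivalence (a sketch using the matrix from the run above, with an arbitrary nonzero starting vector), one matrix-form step reproduces the componentwise update:

A = [10, -1, 2, 0; -1, 11, -1, 3; 2, -1, 10, -1; 0, 3, -1, 8];
b = [1, 1, 1, 1]'; n = 4;
D = diag(diag(A)); L = -tril(A, -1); U = -triu(A, 1);   % A = D - L - U
xold = ones(n, 1);
xmat = D \ ((L + U) * xold + b);      % one matrix-form Jacobi step
xcmp = zeros(n, 1);
for i = 1:n                           % one componentwise Jacobi step
    j = [1:i-1, i+1:n];
    xcmp(i) = (b(i) - A(i, j) * xold(j)) / A(i, i);
end
norm(xmat - xcmp)                     % zero (up to rounding)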

Theorem

Jacobi's method converges if \rho(D^{-1}(L + U))< 1.

In particular, this holds whenever A is strictly diagonally dominant, i.e., whenever each diagonal entry of D exceeds in magnitude the sum of the magnitudes of the off-diagonal entries (those of L and U) in its row.
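For the matrix A used above, this condition is easy to verify numerically (a sketch; max(abs(eig(T))) computes \rho(T)):

A = [10, -1, 2, 0; -1, 11, -1, 3; 2, -1, 10, -1; 0, 3, -1, 8];
D = diag(diag(A)); L = -tril(A, -1); U = -triu(A, 1);
T = D \ (L + U);                      % Jacobi iteration matrix D^{-1}(L + U)
rho = max(abs(eig(T)))                % < 1 here, so Jacobi converges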

Gauss-Seidel Method

The Gauss-Seidel iterative method is a slight modification of Jacobi's method. Recall Jacobi's method

\begin{align*} x^{(0)} &= \text{ an initial guess}\\ x^{(k)}_{i} &= \sum_{j=1, ~~ j\neq i}^n \left( - \frac{a_{ij} x_j^{(k-1)}}{a_{ii}} \right) + \frac{b_i}{a_{ii}}, \quad {i = 1,2,\ldots,n}, ~~ k > 0. \end{align*}

We can modify this by replacing the "k-1" on the right-hand side with k when j < i, since x_j^{(k)} has already been computed for j < i:

\begin{align*} x^{(0)} &= \text{ an initial guess}\\ x^{(k)}_{i} &= -\frac{1}{a_{ii}}\left[\sum_{j=1}^{i-1} {a_{ij} x_j^{(k)}} + \sum_{j=i+1}^n {a_{ij} x_j^{(k-1)}} \right] + \frac{b_i}{a_{ii}}, \quad {i = 1,2,\ldots,n}, ~~ k > 0. \end{align*}
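As with Jacobi, the run below calls a routine GS(A,b,tol,Nmax) whose source is not shown; here is a minimal sketch of this update, under the same assumed signature and stopping rule as the Jacobi sketch above.

function x = GS(A, b, tol, Nmax)
% Sketch of the Gauss-Seidel iteration (signature assumed to match Jacobi).
n = length(b);
x = zeros(n, 1);                      % initial guess x^{(0)} = 0
for k = 1:Nmax
    xold = x;
    for i = 1:n
        % entries x(1:i-1) already belong to iterate k
        x(i) = (b(i) - A(i, 1:i-1) * x(1:i-1) - A(i, i+1:n) * xold(i+1:n)) / A(i, i);
    end
    if norm(x - xold, inf) < tol      % stop when the update is small
        return
    end
end
fprintf('Maximum number of iterations reached, Nmax = %d\n', Nmax)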
In [5]:
A = [10, -1, 2, 0; -1, 11, -1, 3; 2, -1, 10, -1; 0, 3, -1, 8];
b = [1,1,1,1]'; n = 4; x = zeros(n,1);
x = Jacobi(A,b,0,10);
norm(A*x-b)
x = GS(A,b,0,10);
norm(A*x-b)
Maximum number of iterations reached, Nmax = 10
ans =

   8.2524e-05

Maximum number of iterations reached, Nmax = 10
ans =

   1.4308e-10

Let A = D - L - U where L is strictly lower-triangular, D is diagonal and U is strictly upper-triangular. Then define

x^{(k)} = (D-L)^{-1}U x^{(k-1)} + (D-L)^{-1} b.

You should check that this gives the same iteration as the formula for Gauss-Seidel. A fixed-point of this iteration would satisfy

x = (D-L)^{-1}U x + (D-L)^{-1}b \quad\Longleftrightarrow\quad (D-L-U)x = b \quad\Longleftrightarrow\quad Ax = b.

Theorem

  1. The Gauss-Seidel method converges if \rho((D-L)^{-1}U)< 1.
  2. If the entries of L and U in A = D-L-U are non-negative and \rho(D^{-1}(L +U)) < 1 (Jacobi converges) then
\rho((D-L)^{-1}U) < \rho(D^{-1}(L +U)) \quad \text{(Gauss-Seidel converges faster)}.
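For the matrix A used in the runs above, both spectral radii can be computed directly (a sketch; note that item 2 does not strictly apply here, since L and U have some negative entries, yet Gauss-Seidel is still the faster method for this A):

A = [10, -1, 2, 0; -1, 11, -1, 3; 2, -1, 10, -1; 0, 3, -1, 8];
D = diag(diag(A)); L = -tril(A, -1); U = -triu(A, 1);
rhoJ  = max(abs(eig(D \ (L + U))))    % Jacobi iteration matrix
rhoGS = max(abs(eig((D - L) \ U)))    % Gauss-Seidel iteration matrix
% rhoGS < rhoJ here, consistent with the smaller GS residual observed above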