American Journal of Computational and Applied Mathematics
p-ISSN: 2165-8935 e-ISSN: 2165-8943
2013; 3(5): 251-258
doi:10.5923/j.ajcam.20130305.01
Stopping Criteria of the Extended Iterative Refinement Algorithm Used to Solve an Ill Conditioned Linear System

Abdramane Sermé

Department of Mathematics, the City University of New York, New York, NY, 10007, USA

Correspondence to: Abdramane Sermé, Department of Mathematics, the City University of New York, New York, NY, 10007, USA.
Copyright © 2012 Scientific & Academic Publishing. All Rights Reserved.
Abstract  We proved in [1] that our extended iterative refinement or improvement algorithm for accurately solving an ill conditioned linear system converges, by providing a theorem that we called the convergence theorem. To compute an accurate solution to an ill conditioned linear system $Ay = b$, we first use additive preconditioning to decrease the condition number of the input ill conditioned matrix $A$. Then we use the technique of Schur aggregation to reduce the computation of $A^{-1}$ to the computation of the Schur aggregate $S$. This second step is done using the Sherman-Morrison-Woodbury (SMW) formula. The extended iterative refinement or improvement algorithm should stop when $C^{-1}U$ and the Schur aggregate $S$ are computed with higher precision. We provide in this paper, in a proposition, the condition under which the extended iterative refinement or improvement algorithm stops.
Keywords: Stopping criteria, Condition number, Ill conditioned linear system, Sherman-Morrison-Woodbury (SMW) formula, Preconditioning, Schur aggregation, Iterative refinement or improvement, Algorithm, Rounding errors, Singular value decomposition
Cite this paper: Abdramane Sermé, Stopping Criteria of the Extended Iterative Refinement Algorithm Used to Solve an Ill Conditioned Linear System, American Journal of Computational and Applied Mathematics, Vol. 3 No. 5, 2013, pp. 251-258. doi: 10.5923/j.ajcam.20130305.01.
1. Introduction

In this paper we give the condition under which the extended iterative refinement or improvement algorithm used to compute the matrix $C^{-1}U$ and the Schur aggregate $S$ with higher precision should stop. Using this algorithm the residuals decrease dynamically, which is a must for us. We represent the output value as the sum of matrices with fixed-precision numbers. The extended iterative refinement or improvement algorithm should stop when $C^{-1}U$ and the Schur aggregate $S$ are computed with higher precision. We use $C^{-1}U$ to compute the Schur aggregate $S$. An accurate solution $y$ of an ill conditioned linear system $Ay = b$ is achieved by transforming it using the additive preconditioning $A \to C = A + UV^H$ for a preconditioner $UV^H$ of smaller rank $r$ and the Schur aggregation. We use the Sherman-Morrison-Woodbury (SMW) formula
$$A^{-1} = (C - UV^H)^{-1} = C^{-1} + C^{-1}US^{-1}V^HC^{-1},$$
where $C = A + UV^H$ is an invertible square matrix, to get new linear systems of smaller sizes with well conditioned coefficient matrices $C$, $C^H$, and $S$. The technique of iterative refinement or improvement for computing the Schur aggregate [15] and its application to solving linear systems of equations have been studied in a number of papers [3, 15, 18].

This paper is divided into three sections. The first section covers the concepts of rounding errors, floating-point summation, matrix norms and convergence. The second section is devoted to the additive preconditioning, the Schur aggregation and how the iterative refinement or improvement technique is used with the SMW formula to transform the original linear system $Ay = b$ into better conditioned linear systems. In the third section of this paper we give, in a proposition, the stopping criteria of the extended iterative refinement or improvement algorithm.
2. Rounding Errors, Floating-Point Summation, Matrix Norms and Convergence

Definition 2.1.1 Let $\hat{x}$ be an approximation of the scalar $x$. The absolute error in $\hat{x}$ approximating $x$ is the number $|x - \hat{x}|$.

Definition 2.1.2 Let $\hat{x}$ be an approximation of a scalar $x$. The absolute and relative errors of this approximation are the numbers $E_{abs}(\hat{x}) = |x - \hat{x}|$ and $E_{rel}(\hat{x}) = \frac{|x - \hat{x}|}{|x|}$, respectively. If $\hat{x}$ is an approximation to $x$ with relative error $\epsilon$, then there is a number $\delta$ such that 1) $\hat{x} = x(1 + \delta)$ and 2) $|\delta| = \epsilon$.

Remark 2.1.1 The relative error $E_{rel}(\hat{x})$ is independent of scaling, that is, the scalings $x \to \alpha x$ and $\hat{x} \to \alpha \hat{x}$ leave $E_{rel}(\hat{x})$ unchanged.

Theorem 2.1 [6] Assume that $\hat{x}$ approximates $x$ with relative error $\epsilon < 1$. Then $\hat{x}$ is nonzero and $\frac{|x - \hat{x}|}{|\hat{x}|} \le \frac{\epsilon}{1 - \epsilon}$.

Remark 2.1.2 If the relative error of $\hat{x}$ with respect to $x$ is of order $10^{-p}$, then $\hat{x}$ and $x$ agree to roughly $p$ correct significant digits. For the binary system, if $\hat{x}$ and $x$ have relative error of approximately $2^{-t}$, then $\hat{x}$ and $x$ agree to about $t$ bits.

Definition 2.1.3 The componentwise relative error is defined as $\max_i \frac{|x_i - \hat{x}_i|}{|x_i|}$ for $x_i \neq 0$ and is widely used in error analysis and perturbation theory.

Remark 2.1.3 In numerical computation, one has three main sources of errors:
1. Rounding errors, which are unavoidable consequences of working in finite precision arithmetic.
2. Uncertainty in the input data, which is always a possibility when we are solving practical problems.
3. Truncation errors, which are introduced by omitted terms.
Rounding errors and truncation errors are closely related to forward errors.

Definition 2.1.4 Precision is the number of digits in the representation of a real number. It defines the accuracy with which the computations, and in particular the basic arithmetic operations $+, -, \times, /$, are performed. For floating point arithmetic, precision is measured by the unit roundoff or machine precision, which we denote $u$ in single precision and $u_d$ in double precision. The values of the unit roundoff are given in Table 1.1 in Section 2.3.

Remark 2.1.4 Accuracy refers to the absolute or relative error of an approximation.

Definition 2.1.5 Let $\hat{y}$ be an approximation of $y = f(x)$ computed with a precision $u$, where $f$ is a real function of a real scalar variable. The smallest $|\Delta x|$ for which $\hat{y} = f(x + \Delta x)$ is called the (absolute) backward error, whereas the absolute or relative errors of $\hat{y}$ are called forward errors.
Theorem 2.2 [8] (Rigal and Gaches). The normwise backward error
$$\eta(\hat{x}) = \min\{\epsilon : (A + \Delta A)\hat{x} = b + \Delta b, \ \|\Delta A\| \le \epsilon\|A\|, \ \|\Delta b\| \le \epsilon\|b\|\}$$
is given by $\eta(\hat{x}) = \frac{\|r\|}{\|A\|\,\|\hat{x}\| + \|b\|}$, where $r = b - A\hat{x}$.

Definition 2.1.6 For an approximation $\hat{x}$ to a solution of a linear system $Ax = b$ with $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$, the forward error is the ratio $\frac{\|x - \hat{x}\|}{\|x\|}$. The process of bounding the forward error of a computed solution in terms of the condition number and the backward error is called forward error analysis. Here $x - \hat{x}$ is the perturbation of $x$.

Definition 2.1.7 An algorithm is called forward stable if it produces answers with forward errors of similar magnitude to those produced by a backward stable method.

Definition 2.1.8 A mixed forward-backward error is defined by the equation $\hat{y} + \Delta y = f(x + \Delta x)$, where $|\Delta y| \le \epsilon|\hat{y}|$ and $|\Delta x| \le \eta|x|$, with $\epsilon$ and $\eta$ small constants.

Remark 2.1.5 This definition implies that the computed value $\hat{y}$ differs just a little from the value $\hat{y} + \Delta y$ that would have been produced by an input $x + \Delta x$ which is a little bit different from the actual input $x$. This means that $\hat{y}$ is almost the right answer for almost the right data.

Definition 2.1.9 An algorithm is called numerically stable if it is stable in the mixed forward-backward error sense.

Remark 2.1.6 Backward stability implies forward stability, but the converse is not true.

Remark 2.1.7 One may use the following rule of thumb:
forward error $\lesssim$ condition number $\times$ backward error,
with approximate equality possible. Therefore the computed solution to an ill conditioned problem can have a large forward error even if it has a small backward error: the error can be amplified by the condition number in the transition to the forward error. This is one of our motivations for reducing the condition number of the matrix $A$ using the additive preconditioning method.

Definition 2.1.10 For a system of linear equations $Ax = b$, the quantity $\frac{\|b - A\hat{x}\|}{\|A\|\,\|\hat{x}\|}$ is called the relative residual. The relative residual gives us an indication of how closely $A\hat{x}$ represents $b$ and is scale independent.
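These notions translate directly into a few lines of NumPy. The sketch below is our own illustration, not taken from the paper; the name `backward_error` and the Vandermonde test matrix are our choices. It evaluates the Rigal-Gaches formula of Theorem 2.2 and exhibits the rule of thumb of Remark 2.1.7.

```python
import numpy as np

def backward_error(A, x_hat, b):
    """Normwise backward error of Rigal and Gaches (Theorem 2.2):
    eta(x_hat) = ||b - A x_hat|| / (||A|| ||x_hat|| + ||b||)."""
    r = b - A @ x_hat
    return np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(x_hat)
                                + np.linalg.norm(b))

# Rule of thumb (Remark 2.1.7): forward error <~ cond(A) * backward error.
A = np.vander(np.linspace(1.0, 2.0, 8), increasing=True)  # ill conditioned
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
b = A @ x
x_hat = np.linalg.solve(A, b)

print(f"cond(A)        = {np.linalg.cond(A):.2e}")
print(f"backward error = {backward_error(A, x_hat, b):.2e}")   # ~1e-16
print(f"forward error  = {np.linalg.norm(x - x_hat) / np.linalg.norm(x):.2e}")
```

The backward error lands near the unit roundoff while the forward error is larger by roughly the factor $\kappa(A)$, which is the amplification the paper sets out to suppress.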
The set $F$ of floating-point numbers is a subset of the real numbers whose elements have the form $y = \pm m \times \beta^{e - t}$. The range of the nonzero floating-point numbers in $F$ is given by $\beta^{e_{min} - 1} \le |y| \le \beta^{e_{max}}(1 - \beta^{-t})$. Any floating-point number $y \in F$ can be written in the form
$$y = \pm \beta^e \times \left(\frac{d_1}{\beta} + \frac{d_2}{\beta^2} + \cdots + \frac{d_t}{\beta^t}\right),$$
where each digit $d_i$ satisfies $0 \le d_i \le \beta - 1$, and $d_1 \neq 0$ for normalized numbers. $d_1$ is called the most significant digit and $d_t$ the least significant digit.
The algorithm transforms two input floating-point numbers $a$ and $b$ into two output floating-point numbers $s$ and $e$ such that $s = \mathrm{fl}(a + b)$ and $a + b = s + e$. The same result is achieved using the Kahan-Babuška [11] and Dekker [12] classical algorithm, provided that $|a| \ge |b|$. It uses fewer ops but includes branches, which slows down the code optimization outputs.
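A minimal sketch of these two error-free transformations, assuming the usual branch-free formulation attributed to Knuth and the branching variant attributed to Dekker; the function names are ours.

```python
def two_sum(a: float, b: float):
    """Branch-free error-free transformation (Knuth's TwoSum):
    returns (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e

def fast_two_sum(a: float, b: float):
    """Dekker's variant: fewer operations, but only valid when
    |a| >= |b|, which in general costs a branch to check."""
    s = a + b
    e = b - (s - a)
    return s, e

s, e = two_sum(1e16, 1.0)   # the 1.0 is lost in s = fl(a + b) ...
print(s, e)                  # ... but recovered exactly in e: s + e == a + b
```

The printed pair shows a summand that is invisible in the rounded sum yet recovered exactly in the correction term, which is the property the paper's error-free summation relies on.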
Definition 2.3.1 [8] The unit roundoff error $u$ is the quantity $u = \frac{1}{2}\beta^{1 - t}$. We write $\mathrm{fl}(\cdot)$ and $\mathrm{fl}_d(\cdot)$ to denote the operations performed in single precision and in double precision, respectively.

Table 1.1. Values of the unit roundoff in IEEE arithmetic

| Precision | Unit roundoff |
|---|---|
| IEEE single | $u = 2^{-24} \approx 5.96 \times 10^{-8}$ |
| IEEE double | $u_d = 2^{-53} \approx 1.11 \times 10^{-16}$ |

Every real number $x$ lying in the range of $F$ can be approximated by an element of $F$ with a relative error no larger than $u$.
Theorem 2.3 If $x$ lies in the range of $F$, then $\mathrm{fl}(x) = x(1 + \delta)$ with $|\delta| < u$. Theorem 2.3 says that $\mathrm{fl}(x)$ is equal to $x$ multiplied by a factor very close to 1.

Definition 2.3.2 From now on $\mathrm{fl}(\cdot)$, for an argument that is an arithmetic expression, denotes the computed value of that expression. op represents a floating-point operation in $\{+, -, \times, /\}$.
Definition 2.4.1 The singular value decomposition (SVD) of an $m \times n$ matrix $A$ of a rank $\rho$ is the decomposition
$$A = S_A \Sigma_A T_A^H = \sum_{j=1}^{\rho} \sigma_j s_j t_j^H,$$
where $S_A$ and $T_A$ are unitary matrices, that is, $S_A^H S_A = I_m$, $T_A^H T_A = I_n$, $\Sigma_A = \operatorname{diag}(\sigma_1, \ldots, \sigma_\rho)$ is a diagonal matrix, $s_j$ and $t_j$ are m- and n-dimensional vectors, respectively, and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_\rho > 0$. Here $\sigma_j = \sigma_j(A)$ or $\sigma_j$ for $j = 1, \ldots, \rho$ is the $j$th largest singular value of the matrix $A$.

Definition 2.4.2 The condition number $\kappa(A)$ of a matrix $A$ of a rank $\rho$ is $\kappa(A) = \frac{\sigma_1(A)}{\sigma_\rho(A)}$. A matrix is said to be ill conditioned if its condition number is large, that is, if $\kappa(A) \gg 1$, and is called well conditioned otherwise.

Definition 2.4.3 [6] The matrix 2-norm $\|A\|_2 = \max_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2} = \sigma_1(A)$ is also called the spectral norm. $\sigma_1(A)$ denotes the largest singular value of the matrix $A$.

Remark 2.4.1 The matrix 2-norm satisfies the relations
1. $\|PAQ\|_2 = \|A\|_2$, where $P$ and $Q$ are unitary matrices, $P^HP = I_m$ and $Q^HQ = I_n$, for any $m \times n$ matrix $A$.
2. $\|A\|_2 = \|A^H\|_2$.

Lemma 2.4 For a vector norm $\|\cdot\|$ and the matrix norm it induces, suppose that $\|E\| < 1$. Then $\|(I - E)^{-1}\| \le \frac{1}{1 - \|E\|}$.
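As a small illustration of these definitions (our own example; the Hilbert matrix is merely a standard ill conditioned test case):

```python
import numpy as np

# Spectral norm and condition number from the SVD
# (Definitions 2.4.1-2.4.3): ||A||_2 = sigma_1, kappa(A) = sigma_1/sigma_rho.
n = 12
A = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)  # Hilbert matrix

sigma = np.linalg.svd(A, compute_uv=False)    # sigma_1 >= ... >= sigma_n > 0
print(f"||A||_2  = {sigma[0]:.3e}")
print(f"kappa(A) = {sigma[0] / sigma[-1]:.3e}")   # ~1e16: ill conditioned
```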
The nullity of an $n \times n$ matrix $A$, $\operatorname{nul}(A) = n - \operatorname{rank}(A)$, is the smallest integer $r$ for which a rank $r$ APC $UV^H$ can define a nonsingular A-modification $C = A + UV^H$. The nullity of $A$, which is defined as the dimension of the null space, can also be defined as the largest integer $r$ for which we have $AC^{-1}U = 0$ or $V^HC^{-1}A = 0$, provided $C = A + UV^H$ is a nonsingular matrix. In this case, $C^{-1}U$ and $C^{-H}V$ are the right and left null matrix bases for the matrix $A$.
We now define what it means for a sequence of vectors in $\mathbb{C}^n$ to converge to a limit $x$.

Definition 2.6.1 Let $\{x^{(k)}\}$ be a sequence of vectors in $\mathbb{C}^n$ and let $x \in \mathbb{C}^n$. The sequence $\{x^{(k)}\}$ converges componentwise to $x$, and we write $\lim_{k \to \infty} x^{(k)} = x$, if $\lim_{k \to \infty} x_i^{(k)} = x_i$ for $i = 1, \ldots, n$.

Here is another way to define convergence in $\mathbb{C}^n$.

Definition 2.6.2 Let $\{x^{(k)}\}$ be a sequence of vectors in $\mathbb{C}^n$ and let $x \in \mathbb{C}^n$. The sequence $\{x^{(k)}\}$ converges normwise to $x$, that is $x^{(k)} \to x$, if and only if $\lim_{k \to \infty} \|x^{(k)} - x\| = 0$.

There is no compelling reason to expect the two notions of convergence to be equivalent. In fact, for infinite dimensional vector spaces they are not.

Theorem 2.5 [6] Let $A \in \mathbb{C}^{n \times n}$ and suppose that $\lim_{k \to \infty} A^k = 0$. Then $I - A$ is nonsingular and $(I - A)^{-1} = \sum_{k=0}^{\infty} A^k$ (this sum is called the Neumann sum). A sufficient condition for $\lim_{k \to \infty} A^k = 0$ is that $\|A\| < 1$ in some consistent norm, in which case $\|(I - A)^{-1}\| \le \frac{1}{1 - \|A\|}$.

Corollary 2.6 If $A \ge 0$ and $\|A\| < 1$, then $(I - A)^{-1}$ is nonnegative and $(I - A)^{-1} \ge I$.

Theorem 2.7 [6] Let $\|\cdot\|$ be a matrix norm on $\mathbb{C}^{n \times n}$ consistent with a vector norm (also denoted $\|\cdot\|$) and let $E \in \mathbb{C}^{n \times n}$ be a square matrix such that $\|E\| < 1$. Then (i) the matrix $I - E$ is nonsingular, (ii) $\|(I - E)^{-1}\| \le \frac{1}{1 - \|E\|}$, and (iii) $\|(I - E)^{-1} - I\| \le \frac{\|E\|}{1 - \|E\|}$.

The following corollary extends Theorem 2.7.

Corollary 2.8 [6] If $\|A^{-1}E\| < 1$, then $A + E$ is nonsingular and
$$\|(A + E)^{-1}\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}E\|}.$$
Moreover, $\|(A + E)^{-1} - A^{-1}\| \le \frac{\|A^{-1}E\|\,\|A^{-1}\|}{1 - \|A^{-1}E\|}$, so that
$$\frac{\|(A + E)^{-1} - A^{-1}\|}{\|A^{-1}\|} \le \frac{\|A^{-1}E\|}{1 - \|A^{-1}E\|}.$$
The corollary remains valid if all occurrences of $A^{-1}E$ are replaced by $EA^{-1}$.

Theorem 2.9 [8, 17] Let $\|\cdot\|$ denote a matrix norm and a consistent vector norm. If the matrix $A$ is nonsingular, $Ax = b$, and (i) $r = b - A\tilde{x}$, where $\tilde{x}$ is an approximated value of $x$, then (ii) $\|x - \tilde{x}\| \le \|A^{-1}\|\,\|r\|$. In addition, if $\|A^{-1}\Delta A\| < 1$, then $A + \Delta A$ is nonsingular and (iii) $\|(A + \Delta A)^{-1}\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\Delta A\|}$.
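A quick numerical check of these bounds (our own sketch; the truncation length of 60 terms is arbitrary):

```python
import numpy as np

# Numerical check of Theorems 2.5 and 2.7: if ||E|| < 1 then I - E is
# nonsingular, (I - E)^{-1} equals the Neumann sum of E^k, and
# ||(I - E)^{-1}|| <= 1 / (1 - ||E||).
rng = np.random.default_rng(2)
n = 50
E = rng.standard_normal((n, n))
E *= 0.5 / np.linalg.norm(E, 2)          # rescale so that ||E||_2 = 0.5 < 1

inv = np.linalg.inv(np.eye(n) - E)

neumann, power = np.eye(n), np.eye(n)    # truncated sum I + E + E^2 + ...
for _ in range(60):                      # 60 terms: truncation error ~0.5**61
    power = power @ E
    neumann = neumann + power

print(np.linalg.norm(inv - neumann, 2))               # negligible difference
print(np.linalg.norm(inv, 2), "<=", 1 / (1 - 0.5))    # bound of Theorem 2.7
```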
3. Additive Preconditioning and the Schur Aggregation

For matrices $U$ of size $n \times r$ and $V$ of size $n \times r$, both having full rank $r > 0$, the matrix $UV^H$ of rank $r$ is an additive preprocessor (APP) of rank $r$ for any $n \times n$ matrix $A$. The matrix $C = A + UV^H$ is the A-modification. The matrices $U$ and $V$ are the generators of the APP, and the transition $A \to C = A + UV^H$ is an A-preprocessing of rank $r$ for the matrix $A$. An APP $UV^H$ for a matrix $A$ is an additive preconditioner (APC) and an A-preprocessing is an A-preconditioning if $\kappa(C) \ll \kappa(A)$. An APP is an additive compressor (AC) and an A-preprocessing is an A-complementation if the matrix $A$ is rank deficient, whereas the A-modification $C$ has full rank. An APP $UV^H$ is unitary if the matrices $U$ and $V$ are unitary.

Remark 3.1.1 Suppose the APC $UV^H$ has rank $r$. Then [3, 18] we expect $\kappa(C)$ to be of the order of $\frac{\sigma_1(A)}{\sigma_{n-r}(A)}$, therefore small, if the additive preconditioner $UV^H$ is i) random, ii) well conditioned, and iii) properly scaled, that is, the ratio $\frac{\|UV^H\|}{\|A\|}$ is not large and not small.

The additive preconditioning consists in adding a matrix $UV^H$ of a small rank $r$ to the input matrix $A$ to decrease its condition number. The A-modification is supposed to generate a well conditioned matrix $C = A + UV^H$. In practice, to compute the A-modification $C = A + UV^H$ error-free, we fill the generators $U$ and $V$ with short binary numbers.
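The following sketch illustrates this recipe under our own assumptions: we plant $r$ singular values at roundoff level to mimic an input of numerical nullity $r$, and we scale a random APC so that $\|UV^T\|$ matches $\|A\|$, one reasonable reading of "properly scaled".

```python
import numpy as np

# Sketch of Remark 3.1.1: a random, well conditioned, properly scaled
# APC of rank r is expected to yield a well conditioned C = A + UV^T.
rng = np.random.default_rng(3)
n, r = 100, 2

Q1, _ = np.linalg.qr(rng.standard_normal((n, n)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.geomspace(1.0, 0.1, n)
s[-r:] = 1e-15                          # plant r tiny singular values
A = Q1 @ np.diag(s) @ Q2.T

U = rng.standard_normal((n, r))
V = rng.standard_normal((n, r))
scale = np.sqrt(np.linalg.norm(A, 2) / np.linalg.norm(U @ V.T, 2))
U *= scale                              # now ||U V^T|| is neither large
V *= scale                              # nor small relative to ||A||

C = A + U @ V.T                         # the A-modification
print(f"cond(A) = {np.linalg.cond(A):.2e}")   # ~1e15: ill conditioned
print(f"cond(C) = {np.linalg.cond(C):.2e}")   # modest, ~sigma_1/sigma_{n-r}
```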
The Schur aggregation transforms the original ill conditioned linear system $Ay = b$ into linear systems of smaller sizes with well conditioned coefficient matrices $C$, $C^H$, and $S$. The aggregation method is a well known technique [3, 14, 15, 18], but the aggregation used here both decreases the size of the input matrix and improves its conditioning. One may remark that the aggregation can be applied recursively until no ill conditioned matrix appears in the computation.

Definition 3.2.1 The Schur aggregation is the process of reducing the linear system $Ay = b$ by using the SMW (Sherman-Morrison-Woodbury) formula
$$A^{-1} = (C - UV^H)^{-1} = C^{-1} + C^{-1}US^{-1}V^HC^{-1}.$$
The matrix $S = I_r - V^HC^{-1}U$, which is the Schur complement (Gauss transform) of the block $C$ in the block matrix $\begin{pmatrix} C & U \\ V^H & I_r \end{pmatrix}$, is called the Schur aggregate. The A-modification $C$ and the Schur aggregate $S$ are well conditioned, therefore the numerical problems in the inversion of the matrix $A$ are confined to the computation of the Schur aggregate $S$.
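The next sketch (our own illustration) forms the Schur aggregate and verifies the SMW identity numerically; the explicit inverses appear only for checking, not as a recommended computation.

```python
import numpy as np

# Form the Schur aggregate S = I_r - V^H C^{-1} U of Definition 3.2.1 and
# check the SMW identity A^{-1} = C^{-1} + C^{-1} U S^{-1} V^H C^{-1}
# for C = A + U V^H (real matrices here, so V^H is V.T).
rng = np.random.default_rng(4)
n, r = 8, 2
A = rng.standard_normal((n, n))     # any nonsingular matrix works for the identity
U = rng.standard_normal((n, r))
V = rng.standard_normal((n, r))

C = A + U @ V.T                     # the A-modification
X = np.linalg.solve(C, U)           # X = C^{-1} U
S = np.eye(r) - V.T @ X             # the r x r Schur aggregate

Cinv = np.linalg.inv(C)             # explicit inverses: for checking only
A_inv_smw = Cinv + X @ np.linalg.solve(S, V.T @ Cinv)
print(np.linalg.norm(A_inv_smw - np.linalg.inv(A), 2))   # ~1e-13: identity holds
```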
We first compute the A-modification $C = A + UV^H$. Then, we apply the SMW formula to the original linear system $Ay = b$ and transform it into better conditioned linear systems of smaller sizes, with well conditioned matrices $C$, $C^H$ and $S$. We solve the original linear system by post-multiplying $A^{-1} = C^{-1} + C^{-1}US^{-1}V^HC^{-1}$ by the vector $b$. We consider the case where the matrices $C$ and $S$ are well conditioned, whereas the matrices $U$ and $V$ have small rank $r$, so that we can solve the above linear systems with the matrices $C$ and $S$ faster and more accurately than the system with the matrix $A$. In this case the original conditioning problems for a linear system $Ay = b$ are restricted to the computation of the Schur aggregate $S$.

To compute the Schur aggregate $S$ with precision, we begin with computing $C^{-1}U$ using the iterative refinement or improvement algorithm. We prove that we can get very close to the solution $X = C^{-1}U$ of the linear system $CX = U$. We closely approximate it by working with numbers rounded to the IEEE standard double precision and using error-free summation. All norms used in this section are the 2-norm.

The iterative refinement or improvement algorithm is a technique for improving the computed approximate solution $\hat{x}$ of a linear system $Ax = b$. The iterative refinement or improvement for Gaussian Elimination (GE) was used in the 1940s on desk calculators, but the first thorough analysis of the method was given by Wilkinson in 1963. The process consists of three steps ([6], [7], [15]).

Algorithm 2. Basic iterative refinement or improvement algorithm
Input: An $n \times n$ matrix $A$, a computed solution $x_0$ to $Ax = b$ and a vector $b$.
Output: A solution vector $x_i$ approximating the solution $x$ of $Ax = b$ and an error bound.
Initialize: $i = 0$.
Computations:
1) Compute the residual $r_i = b - Ax_i$ in double precision ($u_d$).
2) Solve $Ad_i = r_i$ in single precision ($u$) using the GEPP.
3) Update $x_{i+1} = x_i + d_i$ in double precision ($u_d$).
Repeat stages 1-3 until $x_i$ is accurate enough. Output $x_i$ and an error bound.
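A runnable sketch of Algorithm 2, with NumPy's float32 standing in for single precision and float64 for double precision (our own stand-ins; a real implementation would factor $A$ once with GEPP and reuse the factors):

```python
import numpy as np

def iterative_refinement(A, b, iters=10):
    """Sketch of Algorithm 2. float32 plays the role of the 'single
    precision' (u) GEPP solves, float64 the role of the 'double
    precision' (u_d) residual and update."""
    A32 = A.astype(np.float32)
    solve32 = lambda r: np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
    x = solve32(b)                  # initial computed solution x_0
    for _ in range(iters):
        r = b - A @ x               # 1) residual in higher precision
        d = solve32(r)              # 2) correction from low precision solve
        x = x + d                   # 3) update in higher precision
    return x

rng = np.random.default_rng(5)
n = 50
A = rng.standard_normal((n, n))
x_true = rng.standard_normal(n)
b = A @ x_true
x = iterative_refinement(A, b)
print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))  # ~1e-16
```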
The iterative refinement or improvement algorithm can be rewritten as follows [6]. Here $e_i$ is the error in the computation of the residual $r_i$, $f_i$ is the error in the computation of the update $x_{i+1}$, and $(A + \Delta A_i)d_i = r_i$, where $\Delta A_i$ is the perturbation to the matrix $A$; $x_0$ is a computed solution of the linear system $Ax = b$.
1) $r_i = b - Ax_i + e_i$
2) $(A + \Delta A_i)\,d_i = r_i$
3) $x_{i+1} = x_i + d_i + f_i$
Repeat stages 1-3. The algorithm yields a sequence of approximate solutions $x_1, x_2, \ldots$, which converge to $x$.
We use an extension of Wilkinson's iterative refinement or improvement to compute the matrix $X = C^{-1}U$ with extended precision. In its classical form above, the algorithm is applied to a single system $Ax = b$, where $x$ and $b$ are $n$-dimensional vectors. We applied it to the matrix equation $CX = U$, where the solution we are seeking is the matrix $X = C^{-1}U$. In its classical version also the refinement stops when the matrix $X$ is computed with at most double precision. In order to achieve the high precision in computing $X$, we apply a variant of the extended iterative refinement or improvement where the residuals dynamically decrease, which is a must for us. We represent the output value as the sum of matrices with fixed-precision numbers.

Suppose $A$ is an ill conditioned non-singular $n \times n$ matrix with numerical nullity $r$, where $r$ is the numerical nullity of the matrix $A$, $UV^H$ is a random, well conditioned and properly scaled APC of rank $r < n$, and the A-modification $C = A + UV^H$ is well conditioned. We use matrices $U$ and $V$ whose entries can be rounded to a fixed (small) number of bits to control or avoid rounding errors in computing the matrix $C$. Surely, small norm perturbations of the generators $U$ and $V$, caused by truncation of their entries, keep the matrix $C$ well conditioned. We write the extended iterative refinement or improvement algorithm to solve the linear system $CX = U$ with $X_0 = 0$ and $R_0 = U$ as follows.

Algorithm 3. For $i = 1, 2, \ldots$:
$R_{i-1} = U - CX_{i-1}$  (0.1)
$CD_i = R_{i-1}$  (0.2)
$X_i = X_{i-1} + D_i$  (0.3)

Algorithm 4 solves the linear system $CX = U$, derived from the ill conditioned linear system $Ay = b$, by applying the following extended iterative refinement or improvement algorithm. For $i = 1, 2, \ldots$:
$D_i = \mathrm{fl}(C^{-1}R_{i-1})$  (0.4)
$R_i = \mathrm{fl}_d(R_{i-1} - CD_i)$  (0.5)
$X_i = \sum_{j=1}^{i} D_j$  (0.6)
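The following sketch mimics the structure of Algorithm 4 under the same stand-in precisions; plain float64 summation replaces the error-free summation required by (0.6), so it is an illustration rather than a faithful implementation.

```python
import numpy as np

def extended_refinement(C, U, iters=20):
    """Sketch of Algorithm 4 (equations (0.4)-(0.6)). float32 mimics
    the working precision u, float64 mimics u_d; ordinary float64
    summation stands in for the error-free summation of (0.6)."""
    C32 = C.astype(np.float32)
    R = U.copy()                                   # R_0 = U
    corrections = []                               # X_i kept as a sum of matrices
    for _ in range(iters):
        D = np.linalg.solve(C32, R.astype(np.float32)).astype(np.float64)  # (0.4)
        R = R - C @ D                              # (0.5) residual, higher precision
        corrections.append(D)                      # summands of (0.6)
    X = sum(corrections)                           # (0.6) X_i = sum_j D_j
    return X, R

rng = np.random.default_rng(6)
n, r = 60, 2
C = rng.standard_normal((n, n)) + n * np.eye(n)    # a well conditioned C
U = rng.standard_normal((n, r))
X, R = extended_refinement(C, U)
print(np.linalg.norm(U - C @ X, 2))                # residual near roundoff level
```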
We have the following theorem.

Theorem 3.1 [1] If $q = c(n)\,u\,\kappa(C) < 1$ and $\|\Delta_i\| \le c(n)\,u\,\|C\|$ for $i = 1, 2, \ldots$, then $\lim_{i \to \infty} \|R_i\| = 0$. In other words, $\|R_i\|$ is bounded by $q^i\,\|U\|$ for each integer $i$.
The solution $D_i$ of the linear system $CD_i = R_{i-1}$ is computed by means of Gaussian Elimination with Partial Pivoting (hereafter GEPP), which is a backward stable process. It is corrupted by the rounding errors of the computation of $D_i$ in (0.4), so that the computed matrix $D_i$ (computed in single precision arithmetic $\mathrm{fl}$) turns into $\hat{D}_i = (C + \Delta_i)^{-1}R_{i-1}$. Here $\Delta_i$ is the perturbation to the matrix $C$. We can also say equivalently that there exists an error matrix $\Delta_i$ such that $(C + \Delta_i)\hat{D}_i = R_{i-1}$, where $\|\Delta_i\| \le c(n)\,u\,\|C\|$. That is, $\hat{D}_i$ is an exact solution of the approximated problem, and $c(n)$ is a constant function of order $n$.
We recall that the summation (0.6) is done error free [7].

4. Stopping Criteria of the Extended Iterative Refinement or Improvement Algorithm

Proposition 4.1 (Stopping criteria of the extended iterative refinement or improvement algorithm). If Algorithm 4 is the linear equation solver of the linear system $CX = U$ derived from the ill conditioned linear system $Ay = b$, then the extended iterative refinement or improvement should stop when the error $E_i$ of the residual $R_i$ is bounded by
$$\|E_i\| \le u_d\,(\|C\|\,\|X_i\| + \|U\|).$$
In other words, the extended iterative refinement or improvement should stop when
$$\|R_i\| \le u_d\,(\|C\|\,\|X_i\| + \|U\|).$$

Proof. The backward error and the condition number of the matrix help determine when to stop an iterative refinement or improvement algorithm. Stopping an iterative method is related to the backward (or forward) error being of the order of the machine roundoff $u$ or $u_d$. If $\|\Delta C\| \le c(n)\,u\,\|C\|$, where $c(n)$ is a constant of order $n$ and $\Delta C$ is the perturbation to the matrix $C$, then the extended iterative refinement or improvement algorithm cannot distinguish between $C$ and $C + \Delta C$. There is no need at that point to get a smaller backward error, and the extended iterative refinement or improvement shall stop.

From Theorem 2.2 and (0.5),
$$\|E_i\| \le u_d\,(\|C\|\,\|X_i\| + \|U\|),$$
since (0.5) is computed in double precision ($u_d$) and (0.6) is computed error free [7]. Dividing both sides by $\|C\|\,\|X_i\| + \|U\|$ yields
$$\frac{\|E_i\|}{\|C\|\,\|X_i\| + \|U\|} \le u_d.$$
Using Theorem 2.2 we conclude that $\frac{\|R_i\|}{\|C\|\,\|X_i\| + \|U\|}$ is a bound of the normwise backward error. Therefore,
$$\|R_i\| \le u_d\,(\|C\|\,\|X_i\| + \|U\|)$$
is equivalent to $\eta(X_i) \le u_d$. That is, the extended iterative refinement or improvement algorithm should stop when $\|R_i\| \le u_d\,(\|C\|\,\|X_i\| + \|U\|)$, which means when the normwise backward error is of the order of $u_d$, at which point the Schur aggregate $S$ and $C^{-1}U$ are computed with higher precision. The error $E_i$ of the residual $R_i$, which dynamically decreases, is such that $\|E_i\| \le u_d\,(\|C\|\,\|X_i\| + \|U\|)$, where $u_d$ is the double precision unit roundoff. □
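In code, the stopping test of the proposition is a one-line check; the helper below (our own naming, with the float64 roundoff $2^{-53}$ standing in for $u_d$) could replace the fixed iteration count in the Algorithm 4 sketch shown earlier.

```python
import numpy as np

def should_stop(C, X_i, U, R_i, u_d=2.0**-53):
    """Stopping criterion of Proposition 4.1: stop once
    ||R_i|| <= u_d * (||C|| ||X_i|| + ||U||),
    i.e. once the normwise backward error of X_i (Theorem 2.2)
    has reached the double precision unit roundoff u_d."""
    threshold = u_d * (np.linalg.norm(C, 2) * np.linalg.norm(X_i, 2)
                       + np.linalg.norm(U, 2))
    return np.linalg.norm(R_i, 2) <= threshold
```

Because the residuals of Algorithm 4 decrease dynamically, this test is eventually met whenever the convergence condition of Theorem 3.1 holds.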
5. Conclusion

With $C^{-1}U$ and the Schur aggregate $S$ computed in higher precision, we generate a more accurate solution $y$ of the linear system $Ay = b$. To compute an accurate solution to an ill conditioned linear system we used a variant of Wilkinson's iterative refinement or improvement algorithm called the extended iterative refinement or improvement algorithm. We reduced the computation of $A^{-1}$ to the computation of the Schur aggregate $S$, which needed to be computed with higher precision. We used the technique of the additive preconditioning with a preconditioner $UV^H$ of smaller rank $r$ to decrease the condition number of the input matrix $A$. The Schur aggregation method was also used to transform the original linear system $Ay = b$ into linear systems of smaller sizes with well conditioned coefficient matrices. Our extended iterative refinement or improvement algorithm makes the residual decrease dynamically. We showed, using the residual and the backward error, that our extended iterative solver should stop when $\|R_i\| \le u_d\,(\|C\|\,\|X_i\| + \|U\|)$, that is, when the normwise backward error of $X_i$ is of the order of the double precision unit roundoff $u_d$.