CM0368 Scientific Computing Spring 2009 Professor David Walker david@cs.cf.ac.uk
Schedule • Weeks 1 & 2 (6): Numerical linear algebra (DWW) • Solving Ax = b by Gaussian elimination and Gauss-Seidel. • Algebraic eigenvalue problem. Power method and QR method. • Weeks 3 & 4 (6): Numerical solution of differential equations (BMB) • Ordinary differential equations (finite difference and Runge-Kutta methods). • Partial differential equations (finite difference method). • Weeks 5 & 6 (5): Applications in Physics (BMB) • Laplace and Helmholtz equations, Schrödinger problem for the hydrogen atom, etc. • Weeks 6 & 7 (4): Numerical optimisation (DWW) • Weeks 2 – 11: Tutorials on Wednesdays at 1pm in T2.07. These will be given by Kieran Robert.
Text Book • “Numerical Computing with MATLAB” by Cleve B. Moler, SIAM Press, 2004. ISBN 0898715601. • http://www.readinglists.co.uk/rsl/student/sviewlist.dfp?id=20248 • Web edition at http://www.mathworks.com/moler/
Web Site • Lecture notes and other module material can be found at: http://users.cs.cf.ac.uk/David.W.Walker/CM0368
Numerical Linear Algebra • A system of n simultaneous linear equations can be represented in matrix notation as: Ax = b where A is an n×n matrix, and x and b are vectors of length n. • Can write the solution as x = A⁻¹b where A⁻¹ is the inverse of A. • A square matrix is said to be non-singular if its inverse exists. If the inverse does not exist then the matrix is singular.
Geometrical Interpretation • If A is a 2×2 matrix, for example A = [2 -1; 3 4] with b = (3, -1)ᵀ, then 2x1 - x2 = 3 and 3x1 + 4x2 = -1. Each equation represents a straight line and the solution of the system is given by their intersection. • If A is a 3×3 matrix each of the three equations represents a plane, and the solution is the point lying at the intersection of the three planes.
MATLAB Solution • In MATLAB we can find the solution to Ax = b by writing: x = A\b • Write: A = [10 -7 0; -3 2 6; 5 -1 5] • Then: b = [7; 4; 6] • Then: x = A\b, which returns the solution x = (0, -1, 1)ᵀ.
Gaussian Elimination • Eliminate x1 from all the equations after the first. • Then eliminate x2 from all the equations after the second. • Then eliminate x3 from all the equations after the third. • And so on, until after n-1 steps we have eliminated xj from all the equations after the jth, for j = 1, 2, …, n-1. • These steps are referred to as the forward elimination stage of Gaussian elimination.
Example The running system is: 10x1 - 7x2 = 7, -3x1 + 2x2 + 6x3 = 4, 5x1 - x2 + 5x3 = 6. Subtract -3/10 times equation 1 from equation 2, and 5/10 times equation 1 from equation 3, giving: 10x1 - 7x2 = 7, -0.1x2 + 6x3 = 6.1, 2.5x2 + 5x3 = 2.5. Next we swap equations 2 and 3. This is called partial pivoting. It is done to get the largest absolute value on or below the diagonal in column 2 onto the diagonal. This makes the algorithm more stable with respect to round-off errors (see later).
Example (continued) Now subtract -0.1/2.5 times equation 2 from equation 3, giving the upper triangular system: 10x1 - 7x2 = 7, 2.5x2 + 5x3 = 2.5, 6.2x3 = 6.2. This completes the forward elimination stage of the Gaussian elimination algorithm.
Pseudocode for Forward Elimination
make b column n+1 of A
for k=1 to n-1
  find pivot A(m,k) in column k and row m ≥ k
  swap rows k and m
  for i=k+1 to n
    mult = A(i,k)/A(k,k)
    for j=k+1 to n+1
      A(i,j) = A(i,j) - mult*A(k,j)
    end
  end
end
(The inner loop starts at j=k+1: every entry to the right of the pivot column is updated, while A(i,k) itself is known to become zero and need not be stored.)
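A direct MATLAB transcription of this pseudocode might look as follows (a sketch; the function name and interface are my own choices, not from the notes):

function Ab = forward_elim(A, b)
% Forward elimination with partial pivoting on the augmented matrix [A b].
% Returns the augmented matrix with A reduced to upper triangular form.
n = size(A, 1);
Ab = [A b];                         % make b column n+1 of A
for k = 1:n-1
    [~, m] = max(abs(Ab(k:n, k)));  % pivot: largest |entry| in column k, rows k..n
    m = m + k - 1;
    Ab([k m], :) = Ab([m k], :);    % swap rows k and m (partial pivoting)
    for i = k+1:n
        mult = Ab(i,k) / Ab(k,k);
        Ab(i, k:n+1) = Ab(i, k:n+1) - mult * Ab(k, k:n+1);
    end
end
end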
Back Substitution • After the forward elimination phase, the matrix has been transformed into upper triangular form. • Equation n just involves xn and so can now be solved immediately. • Equation n-1 just involves xn-1 and xn, and since we already know xn we can find xn-1. • Working our way backwards through the equations we can find xn, xn-1,…, x1. • This is called the back substitution phase of the Gaussian elimination algorithm.
The Example Again Equation 3 is 6.2x3 = 6.2, so x3 = 1. This value is substituted into equation 2: 2.5x2 + (5)(1) = 2.5 so x2 = -1. Substituting for x2 and x3 in equation 1: 10x1 + (-7)(-1) = 7 so x1 = 0.
Pseudocode for Back Substitution
x(n) = b(n)/U(n,n)
for k=n-1 to 1
  sum = 0
  for j=k+1 to n
    sum = sum + U(k,j)*x(j)
  end
  x(k) = (b(k) - sum)/U(k,k)
end
This solves Ux = b.
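In MATLAB this might be written as below (again a sketch with invented names); chaining it with the forward elimination sketch above reproduces the hand-worked solution x = (0, -1, 1)ᵀ.

function x = back_sub(U, b)
% Solve the upper triangular system U*x = b by back substitution.
n = length(b);
x = zeros(n, 1);
x(n) = b(n) / U(n,n);
for k = n-1:-1:1
    s = U(k, k+1:n) * x(k+1:n);   % sum of the already-known terms in equation k
    x(k) = (b(k) - s) / U(k,k);
end
end

For the running example:

A = [10 -7 0; -3 2 6; 5 -1 5];  b = [7; 4; 6];
Ab = forward_elim(A, b);
x = back_sub(Ab(:, 1:3), Ab(:, 4))   % gives (0, -1, 1)'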
LU Factorisation • The Gaussian elimination process can be expressed in terms of three matrices. • The first matrix has 1's on the main diagonal and the multipliers used in the forward elimination below the diagonal. This is a lower triangular matrix with unit diagonal, and is usually denoted by L. • The second matrix, denoted by U, is the upper triangular matrix obtained at the end of the forward elimination. • The third matrix, denoted by P, is a permutation matrix representing the row interchanges performed in pivoting.
L, U, and P The original matrix can be expressed as: LU = PA The permutation matrix is the identity matrix with its rows reordered. If Pij = 1 then row i of the permuted matrix is row j of the original matrix. The same information can be represented in a vector, the ith entry of which gives the number of the column containing the 1 in row i. For example, P = [0 0 1; 1 0 0; 0 1 0] corresponds to the vector (3, 1, 2)ᵀ.
Some MATLAB Code • L, U, and P can be found in MATLAB as follows: [L,U,P] = lu(A) • Solution of the system Ax=b is equivalent to solving the two triangular systems Ly = Pb and Ux = y • Once you have L, U, and P it is simple to solve the original system of equations: y = L\(P*b) and x = U\y, or just x = U\(L\(P*b)) • This should give the same answer as: x = A\b
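A quick sanity check of these identities, using the example system from earlier:

A = [10 -7 0; -3 2 6; 5 -1 5];
b = [7; 4; 6];
[L, U, P] = lu(A);
norm(P*A - L*U)      % zero up to round-off: LU = PA
y = L \ (P*b);       % forward substitution solves Ly = Pb
x = U \ y            % back substitution solves Ux = y; x = (0, -1, 1)'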
LDU Factorisation • It is sometimes useful to explicitly separate out the diagonal of U, which contains the pivots. • We write U = DU′ where U′ is the same as U except that it has 1's on the diagonal, and D is a diagonal matrix containing the diagonal elements of U. • With this form of the factorisation we have LDU′ = PA, usually written LDU = PA with U now denoting the unit upper triangular factor.
Pseudocode for LU Factorisation
make b column n+1 of A
for k=1 to n-1
  find pivot A(m,k) in column k and row m ≥ k
  swap rows k and m
  for i=k+1 to n
    A(i,k) = A(i,k)/A(k,k)
    for j=k+1 to n+1
      A(i,j) = A(i,j) - A(i,k)*A(k,j)
    end
  end
end
(The multiplier is stored in place in A(i,k), building up L below the diagonal, so the inner loop runs over j=k+1 to n+1.)
Explanation of LU • At stage i of forward elimination we do pivoting to find the largest absolute value in column i on or below the diagonal, and then exchange rows to bring it onto the diagonal. • Then we subtract multiples of row i from rows i+1,…,n. • Each of these operations can be represented by a matrix multiplication.
Elementary Matrices • An elementary matrix, M, is one that has 1's along the main diagonal and 0's everywhere else, except for one non-zero value (-m, say) in row i and column j. • Multiplying A by M has the effect of subtracting m times row j of matrix A from row i. • Ignoring pivoting, the GE algorithm applies a series of elementary matrices to A to get U: U = Mn-1…M2M1A • If Li = Mi⁻¹ then L1L2…Ln-1U = A, so taking L = L1L2…Ln-1 we have LU = A. • Li is the same as Mi except the sign of the non-zero value is changed. • With pivoting U = Mn-1Pn-1…M2P2M1P1A, and it can be shown in a similar way that LU = PA, where P = Pn-1…P2P1.
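A small MATLAB illustration of one elimination step as a matrix product (the matrices below are constructed by hand for the running example; pivoting is ignored):

A = [10 -7 0; -3 2 6; 5 -1 5];
% Multipliers for column 1: m21 = -3/10, m31 = 5/10.
% The elementary matrix M1 carries -m in the corresponding positions:
M1 = [1 0 0; 3/10 1 0; -5/10 0 1];
M1 * A                 % column 1 is now zero below the diagonal
% L1 = inv(M1) is M1 with the signs of the multiplier entries flipped:
L1 = [1 0 0; -3/10 1 0; 5/10 0 1];
norm(M1*L1 - eye(3))   % confirms L1 is the inverse of M1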
The Need for Pivoting • Suppose we change our problem slightly to: 10x1 - 7x2 = 7, -3x1 + 2.099x2 + 6x3 = 3.901, 5x1 - x2 + 5x3 = 6, where the solution is still (0, -1, 1)ᵀ. • Now suppose we solve the problem on a computer that does arithmetic to five significant digits.
Pivoting (continued) • The first step of the elimination gives: 10x1 - 7x2 = 7, -0.001x2 + 6x3 = 6.001, 2.5x2 + 5x3 = 2.5. • The (2,2) element is quite small. Now we continue without pivoting. • The next step is to subtract -2.5×10³ times equation 2 from equation 3: (5 - (-2.5×10³)(6))x3 = 2.5 - (-2.5×10³)(6.001) • The righthand side is 2.5 + 1.50025×10⁴. The second term is rounded to 1.5002×10⁴. When we add the 2.5 the result is rounded to 1.5004×10⁴.
Pivoting (continued) • So the equation for x3 becomes: 1.5005×10⁴ x3 = 1.5004×10⁴ which gives x3 = 0.99993 (instead of 1). • Then x2 is found from: -0.001x2 + (6)(0.99993) = 6.001 which gives: -0.001x2 = 1.5×10⁻³ so x2 = -1.5 (instead of -1). • Finally, x1 is found from: 10x1 + (-7)(-1.5) = 7 which gives x1 = -0.35 (instead of 0). • The problem arises from using a small pivot, which leads to a large multiplier. • Partial pivoting ensures that the multipliers are always less than or equal to 1 in magnitude, and results in a satisfactory solution.
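The same disaster is easy to reproduce in double precision by shrinking the pivot further; the 2×2 system below is a made-up illustration, not one from the notes:

% Without pivoting, a tiny pivot produces a multiplier of 1e20 that
% swamps all the other information in the system.
eps0 = 1e-20;
A = [eps0 1; 1 1];  b = [1; 2];      % exact solution is close to (1, 1)'
mult = A(2,1) / A(1,1);              % huge multiplier: 1e20
u22 = A(2,2) - mult*A(1,2);          % 1 - 1e20 rounds to -1e20
c2  = b(2) - mult*b(1);              % 2 - 1e20 rounds to -1e20
x2 = c2 / u22;                       % = 1
x1 = (b(1) - A(1,2)*x2) / A(1,1);    % = 0: completely wrong
[x1 x2]
A \ b                                % MATLAB pivots and returns ~(1, 1)'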
Measuring Errors • The discrepancy due to rounding error in a solution can be measured by the error: e = x - x* and by the residual: r = b - Ax* where x is the exact solution and x* is the computed solution. • Since r = b - Ax* = A(x - x*) = Ae, we have e = 0 if and only if r = 0 (for non-singular A); however, e and r are not necessarily both small.
Error Example • Consider an example in which the matrix is almost singular. • If GE is used to compute the solution with low precision we get a large error but a small residual. • Geometrically, the lines represented by the equations are almost parallel, so a point far from the exact solution can still lie very close to both lines. • GE with partial pivoting will always produce small residuals, but not necessarily small errors. (Figure: two nearly parallel lines, with the computed solution far from the exact solution.)
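A concrete instance (this specific matrix is Moler's classic nearly singular example from the module's textbook, not necessarily the one shown on the slide):

A = [0.780 0.563; 0.913 0.659];   % nearly singular: det(A) = 1e-6
b = [0.217; 0.254];               % exact solution is x = (1, -1)'
xstar = [0.341; -0.087];          % a plausible low-precision computed solution
e = [1; -1] - xstar               % error (0.659, -0.913)': large
r = b - A*xstar                   % residual (1e-6, 0)': tiny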
Sensitivity • We want to know how sensitive the solution to Ax=b is to perturbations in A and b. • To do this we first have to come up with some measure of how close to singular a matrix is. • If A is singular, a solution will not exist for some b’s (inconsistency), while for other b’s it is not unique. • So if A is nearly singular we would expect small changes in A and b to cause large changes in x. • If A is the identity matrix then x=b, so if A is close to the identity matrix we expect small changes in A and b to result in small changes in x.
Singularity and Pivots • In GE all the pivots are non-zero if and only if A is non-singular, provided exact arithmetic is used. • So if the pivots are small we expect the matrix to be close to singular. • However, with finite precision arithmetic, a matrix might still be close to singular even if none of the pivots are small.
Norms • The size of the pivots is not adequate to decide how close a matrix is to being singular. • Define the lp family of norms (1 ≤ p ≤ ∞): ||x||p = (Σi |xi|^p)^(1/p), with the limiting case ||x||∞ = maxi |xi|. • Most common norms use p = 1, 2, and ∞.
Properties of Norms • All these norms have the following properties: ||x|| > 0 for x ≠ 0, and ||x|| = 0 only for x = 0; ||cx|| = |c| ||x|| for all scalars c; ||x + y|| ≤ ||x|| + ||y|| (the triangle inequality). • In MATLAB use norm(x,p) to find a norm:
norm1 = norm(x,1)
norm2 = norm(x)
norminf = norm(x,inf)
Condition Number • Ax may have a very different norm from x, and this change in norm is related to the sensitivity of the solution to changes in A. Denote: M = max ||Ax||/||x|| and m = min ||Ax||/||x|| where the max and min are taken over all non-zero vectors x. Note, if A is singular, m = 0. • The condition number, κ(A), of A is defined as the ratio M/m. Usually we are only interested in the order of magnitude of κ(A), so it doesn't matter which norm is used.
Relative Errors • Suppose we have an error δb in b, which results in an error δx in x. Then Ax = b becomes: A(x + δx) = b + δb, so Aδx = δb. From the definitions of M and m we have: ||b|| ≤ M||x|| and ||δx|| ≤ ||δb||/m. Then if m ≠ 0 we have the following relationship between the relative error in b and in x: ||δx||/||x|| ≤ κ(A) ||δb||/||b||. κ(A) measures the amplification of the relative error!
Uses of Condition Number • As a measure of the amplification of relative error due to changes in the righthand side b. • As a measure of the amplification of relative error due to changes in the matrix A. • As a measure of how close a matrix is to being singular (hard maths omitted): if κ(A) is large then A is close to singular.
Some Properties of κ(A) • κ(A) ≥ 1 since M ≥ m. • κ(P) = 1 if P is a permutation matrix. • κ(cA) = κ(A) for any non-zero scalar c. • For D a diagonal matrix, κ(D) is the ratio of the largest absolute diagonal value to the smallest.
Relative Error Example • In this example we use the l1-norm. The solution is x = (1, 0)ᵀ, with ||b|| = 13.8 and ||x|| = 1. Change b slightly by δb = (0.01, 0)ᵀ; then x becomes (0.34, 0.97)ᵀ. A small change in b gives a big change in x. • The perturbations are ||δb|| = 0.01 and ||δx|| = 1.63, so the relative errors are: ||δx||/||x|| = 1.63 and ||δb||/||b|| = 0.0007246. Their ratio, 1.63/0.0007246 = 2249.4, shows that κ(A) is large.
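The slide's matrix is not reproduced above, but every number matches the worked example in Moler's "Numerical Computing with MATLAB" (A = [4.1 2.8; 9.7 6.6]); assuming that matrix, the figures can be checked directly:

A = [4.1 2.8; 9.7 6.6];
b = [4.1; 9.7];                 % ||b||_1 = 13.8; solution x = (1, 0)'
x = A \ b;
db = [0.01; 0];                 % small perturbation of b
dx = A \ (b + db) - x;          % resulting change in x: (-0.66, 0.97)'
rel_dx = norm(dx,1)/norm(x,1)   % 1.63
rel_db = norm(db,1)/norm(b,1)   % 7.246e-4
cond(A,1)                       % 2249.4, exactly the ratio rel_dx/rel_db here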
Relative Error and Residual • The condition number is also important in the round-off error in GE. It can be shown that: ||b - Ax*||/(||A|| ||x*||) ≤ ρε and ||x - x*||/||x*|| ≤ ρκ(A)ε where x* is the numerical solution of Ax = b, ρ is a constant smaller than about 10, and ε is the machine precision. • The first inequality says that the relative residual will be about the same size as the round-off error no matter how ill-conditioned A is. • The second inequality says that the relative error is small if κ(A) is small but might be large if κ(A) is large.
Matrix Norms • The norm of matrix A is ||A|| = M, i.e., ||A|| = max ||Ax||/||x|| over all non-zero x. • Since ||A⁻¹|| = 1/m it follows that the condition number can also be defined as: κ(A) = ||A|| ||A⁻¹|| • In MATLAB, cond(A,p) computes the condition number of A relative to the lp-norm.
Iterative methods for Ax=b • Gaussian elimination is a direct method that computes the solution of Ax = b in about n³/3 operations. • If n is large we might want a faster, less accurate, method. • With an iterative method we can stop the iteration whenever we think we have a sufficiently accurate solution.
Iterative methods • Suppose we split the matrix as A = S - T; then Ax = b becomes Sx = Tx + b. • We can turn this into an iteration: Sxk+1 = Txk + b or xk+1 = S⁻¹Txk + S⁻¹b • So if this sequence converges, then we can start the iteration with a guess at the solution x0 and get an approximate solution. • We need S to be easily invertible.
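Convergence of such a splitting iteration is governed by the spectral radius of S⁻¹T (a standard result, not stated on the slide): the iterates converge for every starting guess exactly when all eigenvalues of S⁻¹T lie inside the unit circle. A quick MATLAB check, with a test matrix of my own choosing:

A = [4 -1 0; -1 4 -1; 0 -1 4];   % a strictly diagonally dominant test matrix
S = diag(diag(A));               % the Jacobi choice of S (see the next slide)
T = S - A;                       % so that A = S - T
rho = max(abs(eig(S \ T)))       % spectral radius ~0.354 < 1: iteration converges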
Common Iterative Methods • S = diagonal part of A (Jacobi's method) • S = lower triangular part of A, including the diagonal (Gauss-Seidel method) • S = a combination of the two above (successive over-relaxation, or SOR) • S is called a pre-conditioner. The choice of S affects the convergence properties of the iteration.
Jacobi Method • S = diag(A), so the formula for iteration k+1 becomes: xk+1(i) = (b(i) - Σj≠i A(i,j) xk(j)) / A(i,i) • Expressed in matrices: Dxk+1 = -(L+U)xk + b where D, L, and U are the diagonal, strictly lower triangular, and strictly upper triangular parts of A, respectively. Note: these are not the D, L, and U of the LDU factorisation.
Jacobi Example • (Worked example: the slide shows the splitting of a specific matrix A into D and L+U and the resulting Jacobi iteration.)
Pseudocode for Jacobi Method
choose an initial guess x0
for k=0,1,2,...
  for i=1 to n
    sum = 0.0
    for j=1 to i-1 and j=i+1 to n
      sum = sum + A(i,j)*xk(j)
    end
    xk+1(i) = (b(i) - sum)/A(i,i)
  end
  check convergence and continue if needed
end
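In MATLAB the matrix form of the update makes each sweep a one-liner; the sketch below (names and convergence test are my own) assumes A has a non-zero diagonal:

function x = jacobi(A, b, x, tol, maxit)
% Jacobi iteration: D*x_{k+1} = b - (L+U)*x_k.
D = diag(diag(A));
LpU = A - D;                    % strictly lower + strictly upper parts
for k = 1:maxit
    xnew = D \ (b - LpU*x);     % one Jacobi sweep
    if norm(xnew - x, inf) < tol
        x = xnew;  return
    end
    x = xnew;
end
end

For instance, x = jacobi(A, b, zeros(size(b)), 1e-10, 1000) converges quickly for the diagonally dominant test matrix shown earlier.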
Gauss-Seidel Method • A problem with the Jacobi method is that we have to store all of xk until we have finished computing xk+1. • In the Gauss-Seidel method the components of xk+1 are used as soon as they are computed, and replace the corresponding components of xk on the righthand side. • This uses half the storage of the Jacobi method.
Gauss-Seidel in Matrix Notation • Expressed in matrices: (D+L)xk+1 = -Uxk + b where D+L is the lower triangular part of A, and U is the strictly upper triangular part of A.
Gauss-Seidel Example • (Worked example: the slide applies (D+L)xk+1 = -Uxk + b to the same system as the Jacobi example.) • Gauss-Seidel is usually preferred to Jacobi because it uses half the storage and often converges faster.
Pseudocode for Gauss-Seidel
choose an initial guess x0
for k=0,1,2,...
  for i=1 to n
    sum = 0.0
    for j=1 to i-1
      sum = sum + A(i,j)*xk+1(j)
    end
    for j=i+1 to n
      sum = sum + A(i,j)*xk(j)
    end
    xk+1(i) = (b(i) - sum)/A(i,i)
  end
  check convergence and continue if needed
end
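A matching MATLAB sketch (again with invented names); the single vector x is overwritten in place, which is exactly where the storage saving comes from:

function x = gauss_seidel(A, b, x, tol, maxit)
% Gauss-Seidel iteration: (D+L)*x_{k+1} = b - U*x_k, with x updated in place.
n = length(b);
for k = 1:maxit
    xold = x;
    for i = 1:n
        s = A(i, 1:i-1)*x(1:i-1) + A(i, i+1:n)*x(i+1:n);
        x(i) = (b(i) - s) / A(i,i);   % entries j < i are already updated
    end
    if norm(x - xold, inf) < tol, return, end
end
end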