Iterative Methods • Gaussian elimination is a direct method for solving a system. • An indirect (iterative) method produces a sequence of values that converges to the solution of the system. • Computation in an indirect algorithm is halted once a specified accuracy is reached.
Why Iterative Methods? • Sometimes we don't need an exact answer • the input itself carries inaccuracy, etc. • Often only a few iterations are required • If the system is sparse, the matrix can be stored in a compact format. • Iterative methods are usually stable • errors are dampened rather than amplified
Iterative Methods • Consider the following system:

 7 x1 - 6 x2 =  3
-8 x1 + 9 x2 = -4

• Now write out the equations • solve for the ith unknown in the ith equation:

x1 = (6/7) x2 + 3/7
x2 = (8/9) x1 - 4/9
Iterative Methods • Come up with an initial guess for each xi • Hopefully, each pass through the equations produces better values, which can then be fed back in to produce better values still, converging to the answer.
Iterative Methods • Jacobi iteration • Use all old values to compute new values

k     x1        x2
0     0.00000   0.00000
10    0.14865  -0.19820
20    0.18682  -0.24908
30    0.19662  -0.26215
40    0.19913  -0.26551
50    0.19977  -0.26637
Jacobi Iteration • The ith equation has the form

sum_{j=0}^{n-1} A[i,j] * x[j] = b[i]

• Which can be rewritten as:

x[i] = (1 / A[i,i]) * (b[i] - sum_{j != i} A[i,j] * x[j])
Jacobi Iteration • The vector (b - Ax) is zero when and only when we have the exact answer. • Define this vector to be the residual r = b - Ax • Now rewrite the solution equation:

x[i] = x[i] + r[i] / A[i,i]
void Jacobi(float A[n][n], float b[n], float x[n], float epsilon)
{
    float x1[n];          // next iterate
    float r[n];           // residual
    float r0norm;

    // Randomly select an initial x vector
    r = b - A*x;          // matrix-vector multiply, then vector subtract
    r0norm = ||r||2;      // 2-norm (magnitude) of the initial residual
    while (||r||2 > epsilon * r0norm) {
        for (j = 0; j < n; j++)
            x1[j] = x[j] + r[j] / A[j][j];
        x = x1;           // adopt the new iterate before recomputing r
        r = b - A*x;
    }
}
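As a concrete check, the Jacobi pseudocode can be sketched in runnable form (plain Python, no libraries) and applied to the 2x2 system used earlier; the function and variable names here are illustrative, not part of the original slides.

```python
# Minimal runnable sketch of the Jacobi iteration with a relative-residual
# stopping test, applied to: 7*x1 - 6*x2 = 3, -8*x1 + 9*x2 = -4.

def matvec(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]

def norm2(v):
    return sum(vi * vi for vi in v) ** 0.5

def jacobi(A, b, x, epsilon):
    r = [bi - axi for bi, axi in zip(b, matvec(A, x))]   # residual r = b - Ax
    r0norm = norm2(r)                                    # initial residual norm
    while norm2(r) > epsilon * r0norm:
        # all old values are used to compute the new values
        x = [x[j] + r[j] / A[j][j] for j in range(len(x))]
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
    return x

A = [[7.0, -6.0], [-8.0, 9.0]]
b = [3.0, -4.0]
x = jacobi(A, b, [0.0, 0.0], 1e-10)
# x approaches the exact solution (0.2, -4/15)
```

The slow approach toward (0.2, -0.26667) matches the iterate table shown above.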
Parallelization of Jacobi • 3 main computations per iteration • Inner product (2-norm) • Loop calculating the x[j]s • Matrix-vector mult. to calculate r • If A[j][j], b[j], & r[j] are on the same proc. • the loop requires no communication • Inner product and matrix-vector mult. require communication.
Inner Product (2 norm of residual) • Suppose data is distributed row-wise • The squared 2-norm is simply the dot product of r with itself • IP = Sum(r[j] * r[j]) • This only requires a global sum collapse • O(log p)

[Figure: A x = b, rows distributed across P0 - P3]
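The O(log p) global-sum collapse can be sketched by simulating the reduction tree on one process; each list entry stands in for one processor's local contribution (the names and the pairwise pairing scheme are illustrative assumptions, not from the slides).

```python
# Simulated tree reduction: partial sums are combined pairwise, one tree
# level per "communication round", giving ceil(log2(p)) rounds for p parts.

def tree_sum(partials):
    vals = list(partials)
    while len(vals) > 1:                        # one round per tree level
        paired = []
        for i in range(0, len(vals) - 1, 2):
            paired.append(vals[i] + vals[i + 1])  # partner sends, owner adds
        if len(vals) % 2 == 1:
            paired.append(vals[-1])             # odd one out waits a round
        vals = paired
    return vals[0]

r = [1.0, 2.0, 3.0, 4.0]           # residual, one entry per "processor"
partials = [ri * ri for ri in r]   # local contributions to r . r
total = tree_sum(partials)         # 30.0, so ||r||2 = sqrt(30)
```

In a real code this round structure is what an MPI all-reduce performs internally.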
Matrix-Vector Multiplication • Again data is distributed row-wise • Each proc. requires all of the elements in the vector to calculate its part of the resulting answer. • This results in an all-to-all gather • O(p log p)
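The row-wise multiply plus all-gather can also be sketched sequentially; here each "processor" owns a block of rows of A and the matching slice of x, and the gather step assembles the full vector before the local multiplies (all names are illustrative).

```python
# Row-wise distributed matrix-vector multiply, simulated on one process.

def allgather(pieces):
    full = []
    for piece in pieces:
        full.extend(piece)       # every processor ends up with all of x
    return full

def rowwise_matvec(row_blocks, x_blocks):
    x = allgather(x_blocks)      # the communication step: all-to-all gather
    y_blocks = []
    for rows in row_blocks:      # each processor multiplies only its rows
        y_blocks.append([sum(a * xi for a, xi in zip(row, x)) for row in rows])
    return y_blocks

# 4x4 identity split over 2 "processors", x = [1, 2, 3, 4]
row_blocks = [[[1, 0, 0, 0], [0, 1, 0, 0]],
              [[0, 0, 1, 0], [0, 0, 0, 1]]]
x_blocks = [[1, 2], [3, 4]]
y = rowwise_matvec(row_blocks, x_blocks)   # [[1, 2], [3, 4]]
```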
Jacobi Iteration • Resulting cost for float (4 bytes)

Tcomm = #iterations * (TIP + TMVM)
TIP   = log p * (ts + tw * 4)
TMVM  = p log p * (ts + tw * (nrows/p) * 4)
Iterative Methods • Gauss-Seidel • Use the new values as soon as they are available

k     x1        x2
0     0.00000   0.00000
10    0.21977  -0.24909
20    0.20130  -0.26531
30    0.20008  -0.26659
40    0.20000  -0.26666
Gauss-Seidel Iteration • The basic Gauss-Seidel iteration is

x_k[i] = (1 / A[i,i]) * (b[i] - sum_{j=0}^{i-1} A[i,j] * x_k[j] - sum_{j=i+1}^{n-1} A[i,j] * x_{k-1}[j])
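The Gauss-Seidel update can be sketched on the same 2x2 system, counting sweeps so the faster convergence relative to Jacobi (visible in the two tables) can be checked; the function name and tolerance are illustrative.

```python
# Gauss-Seidel sketch: newest values are used as soon as they are computed,
# because x[i] is overwritten in place during the sweep.

def gauss_seidel(A, b, x, tol, max_iter=1000):
    n = len(b)
    for k in range(max_iter):
        x_old = list(x)
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]      # uses fresh x[j] for j < i
        if max(abs(x[i] - x_old[i]) for i in range(n)) < tol:
            return x, k + 1                  # sweeps needed to converge
    return x, max_iter

A = [[7.0, -6.0], [-8.0, 9.0]]
b = [3.0, -4.0]
x, sweeps = gauss_seidel(A, b, [0.0, 0.0], 1e-8)
# x approaches (0.2, -4/15) in far fewer sweeps than Jacobi requires
```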
Gauss-Seidel Iteration • Rule: Always use the newest values of the previously computed variables. • Problem: Sequential? • Gauss-Seidel is indeed sequential if the matrix is dense. • Parallelism is a function of the sparsity and ordering of the equation matrix.
Gauss-Seidel Iteration • We can increase the available parallelism by renumbering the unknowns of a system (for example, a red-black ordering of the grid).
Parallelizing Red-Black GSI • Partitioning? Block checkerboard. • Communication? 2 phases per iteration:
1. compute red cells using values from black cells
2. compute black cells using values from red cells
Communication is required for each phase.
Partitioning

[Figure: block-checkerboard partitioning of the grid across P0 - P3]
Communication

[Figure: neighboring processors P0 - P3 exchanging boundary values]
Procedure Gauss-SeidelRedBlack
    while ( error > limit )
        send black values to neighbors
        recv black values from neighbors
        compute red values
        send red values to neighbors
        recv red values from neighbors
        compute black values
        compute error    /* only do every so often */
    endwhile
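The two compute phases of the procedure can be sketched on one process using a 1D Laplace problem u[i] = (u[i-1] + u[i+1]) / 2 with fixed endpoints; the grid size (9 points) and sweep count are arbitrary illustrative choices. All red (odd-index) points depend only on black (even-index) neighbors, so each phase is fully parallel.

```python
# Red-black Gauss-Seidel for the 1D Laplace equation with fixed boundaries.
# Phase 1 updates all "red" points, phase 2 all "black" points, mirroring
# the two communication/compute phases of the parallel procedure.

def red_black_sweeps(u, sweeps):
    n = len(u)
    for _ in range(sweeps):
        for i in range(1, n - 1, 2):        # phase 1: red (odd) interior points
            u[i] = 0.5 * (u[i - 1] + u[i + 1])
        for i in range(2, n - 1, 2):        # phase 2: black (even) interior points
            u[i] = 0.5 * (u[i - 1] + u[i + 1])
    return u

u = red_black_sweeps([0.0] * 8 + [1.0], 500)   # u[0] = 0, u[8] = 1 fixed
# the converged solution is the straight line u[i] = i/8
```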
Extending Red-Black Coloring • Goal: Produce a graph coloring scheme such that no node has a neighbor of the same color. • Simple finite element and finite difference methods produce graphs in which each node has only 4 neighbors. • Two colors suffice • What about more complex graphs?
More complex graphs • Use graph coloring heuristics. • Number the nodes one color at a time.
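The heuristic on this slide can be sketched as a greedy coloring followed by a color-major numbering; the adjacency structure and visiting order below are illustrative assumptions.

```python
# Greedy graph coloring: visit nodes in a fixed order and give each the
# smallest color not already used by a colored neighbor, then number the
# nodes one color at a time.

def greedy_color(adj):
    color = {}
    for node in adj:                                   # simple fixed order
        used = {color[n] for n in adj[node] if n in color}
        c = 0
        while c in used:
            c += 1
        color[node] = c
    return color

# 4-cycle (a 2x2 grid patch): two colors suffice, as for red-black
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
colors = greedy_color(adj)
order = sorted(adj, key=lambda n: colors[n])   # number nodes color by color
```

Greedy coloring does not guarantee a minimum number of colors on arbitrary graphs, but it is cheap and works well on the sparse meshes these methods produce.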
Successive Overrelaxation • Devised to speed up the convergence of Gauss-Seidel • apply extrapolation using a weighted average between current & previous iterates • Choose weighting that will accelerate the rate of convergence
SOR • Start from the Gauss-Seidel iteration • Don't compute directly into x • Compute the Gauss-Seidel value into an intermediate ∂, then take a weighted average:

∂ = (1 / A[i,i]) * (b[i] - sum_{j=0}^{i-1} A[i,j] * x_k[j] - sum_{j=i+1}^{n-1} A[i,j] * x_{k-1}[j])
SOR • SOR Equation

x_k[i] = x_{k-1}[i] + w * (∂ - x_{k-1}[i])

• Choose w in the range (0, 2) • technically w < 1 is underrelaxation • for w outside (0, 2) the iteration will not converge • even within that range, a poorly chosen w can slow or prevent convergence
SOR Algorithm

Choose an initial guess for x[i]
for k = 1, 2, ...
    for i = 1, 2, ..., n
        ∂ = 0
        for j = 1, 2, ..., i-1
            ∂ = ∂ + a[i,j] * xk[j]
        end
        for j = i+1, ..., n
            ∂ = ∂ + a[i,j] * xk-1[j]
        end
        ∂ = (b[i] - ∂) / a[i,i]
        xk[i] = xk-1[i] + w * (∂ - xk-1[i])
    end
    check for convergence
end
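The SOR algorithm can be sketched in runnable form on the same 2x2 system; the relaxation weight w = 1.3 is an illustrative choice for this system, not a tuned optimum (here `d` plays the role of ∂).

```python
# SOR sketch: compute the Gauss-Seidel value into d, then take a weighted
# average of d with the previous iterate.

def sor(A, b, x, w, tol=1e-8, max_iter=1000):
    n = len(b)
    for k in range(max_iter):
        x_old = list(x)
        for i in range(n):
            d = sum(A[i][j] * x[j] for j in range(i))              # newest values
            d += sum(A[i][j] * x_old[j] for j in range(i + 1, n))  # previous values
            d = (b[i] - d) / A[i][i]               # plain Gauss-Seidel value
            x[i] = x_old[i] + w * (d - x_old[i])   # weighted average
        if max(abs(x[i] - x_old[i]) for i in range(n)) < tol:
            return x, k + 1
    return x, max_iter

A = [[7.0, -6.0], [-8.0, 9.0]]
b = [3.0, -4.0]
x, sweeps = sor(A, b, [0.0, 0.0], w=1.3)
# x approaches (0.2, -4/15) in fewer sweeps than plain Gauss-Seidel
```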
Parallelizing SOR • Just like Gauss-Seidel • Create the dependency graph • Color • Number by color • Phases • Communicate nodes of previous phase • Compute nodes of this phase • Move to next phase
Conclusion • Iterative methods are used when an exact answer is not computable or not needed. • Gauss-Seidel converges faster than Jacobi, but its parallelism is trickier. • Finite element codes ultimately reduce to systems of equations • Solve with either Jacobi or Gauss-Seidel
Consider • Are systems of equations and finite element methods related?