CUDA Lecture 6 Embarrassingly Parallel Computations. Prepared 7/28/2011 by T. O’Neil for 3460:677, Fall 2011, The University of Akron. Embarrassingly Parallel Computations.
CUDA Lecture 6: Embarrassingly Parallel Computations. Prepared 7/28/2011 by T. O’Neil for 3460:677, Fall 2011, The University of Akron.
Embarrassingly Parallel Computations • A computation that can obviously be divided into a number of completely independent parts, each of which can be executed by a separate process(or). • No communication or very little communication between processes. • Each process can do its tasks without any interaction with other processes. Embarrassingly Parallel Computations – Slide 2
Embarrassingly Parallel Computation Examples • Low level image processing • Mandelbrot set • Monte Carlo calculations Embarrassingly Parallel Computations – Slide 3
Topic 1: Low level image processing • Many low level image processing operations only involve local data with very limited if any communication between areas of interest. Embarrassingly Parallel Computations – Slide 4
Partitioning into regions for individual processes • Square region for each process • Can also use strips Embarrassingly Parallel Computations – Slide 5
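The strip partitioning mentioned on this slide can be sketched in a few lines of C. This is only an illustration under stated assumptions: the helper name `strip_bounds` and the rule for handing out leftover rows (lowest-numbered processes first) are not from the lecture.

```c
/* Hypothetical helper (not from the slides): compute which rows of an
   image with `height` rows belong to process `p` out of `nprocs` when
   the image is cut into horizontal strips. Leftover rows are given to
   the lowest-numbered processes, one each. */
void strip_bounds(int height, int nprocs, int p, int *first, int *count)
{
    int base  = height / nprocs;   /* rows that every process receives */
    int extra = height % nprocs;   /* leftover rows to distribute      */

    *count = base + (p < extra ? 1 : 0);
    *first = p * base + (p < extra ? p : extra);
}
```

Each process needs only its own index `p` to find its strip, so no communication is required to set up the partition.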
Some geometrical operations • Shifting • Object shifted by $\Delta x$ in the x-dimension and $\Delta y$ in the y-dimension: $x' = x + \Delta x$, $y' = y + \Delta y$, where $x$ and $y$ are the original and $x'$ and $y'$ are the new coordinates. • Scaling • Object scaled by a factor of $S_x$ in the x-direction and $S_y$ in the y-direction: $x' = x S_x$, $y' = y S_y$. Embarrassingly Parallel Computations – Slide 6
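Shifting and scaling touch each point independently, which is what makes them embarrassingly parallel. A minimal C sketch follows; the `Point` type and the function names are assumptions made for illustration, not code from the lecture.

```c
/* Illustrative type and names; not taken from the slides. */
typedef struct { double x, y; } Point;

/* Shift a point by (dx, dy). */
Point shift(Point p, double dx, double dy)
{
    Point q = { p.x + dx, p.y + dy };
    return q;
}

/* Scale a point by sx in the x-direction and sy in the y-direction. */
Point scale(Point p, double sx, double sy)
{
    Point q = { p.x * sx, p.y * sy };
    return q;
}

/* Shift every point of an object. No iteration reads or writes another
   iteration's data, so the loop could be split across processes freely. */
void shift_all(Point *pts, int n, double dx, double dy)
{
    for (int i = 0; i < n; i++)
        pts[i] = shift(pts[i], dx, dy);
}
```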
Some geometrical operations (cont.) • Rotation • Object rotated through an angle $\theta$ about the origin of the coordinate system: $x' = x\cos\theta + y\sin\theta$, $y' = -x\sin\theta + y\cos\theta$. • Clipping • Applies defined rectangular boundaries to a figure and deletes points outside the defined area. • Display the point $(x, y)$ only if $x_{low} \le x \le x_{high}$ and $y_{low} \le y \le y_{high}$; otherwise discard. Embarrassingly Parallel Computations – Slide 7
Notes • GPUs are sent starting row numbers • Since these are related to block and thread IDs, each GPU should be able to work them out for itself • Results are returned as single values • Not good programming, but done to simplify a first exposure to CUDA programs • No terminating code is shown Embarrassingly Parallel Computations – Slide 8
Topic 2: Mandelbrot Set • Set of points in a complex plane that are quasi-stable (will increase and decrease, but not exceed some limit) when computed by iterating the function $z_{k+1} = z_k^2 + c$, where $z_{k+1}$ is the $(k+1)$th iteration of the complex number $z = a + bi$ and $c$ is a complex number giving the position of the point in the complex plane. • The initial value for $z$ is zero. Embarrassingly Parallel Computations – Slide 9
Mandelbrot Set (cont.) • Iterations continue until the magnitude of $z$ is greater than 2 or the number of iterations reaches an arbitrary limit. • Magnitude of $z$ is the length of the vector, given by $|z| = \sqrt{a^2 + b^2}$. Embarrassingly Parallel Computations – Slide 10
Sequential routine computing value of one point returning number of iterations Embarrassingly Parallel Computations – Slide 11
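A routine of the kind this slide describes might look like the following C sketch. It is not the slide's own listing; the `Complex` type, the name `cal_pixel` and the `max_iter` parameter are assumptions. It iterates $z \leftarrow z^2 + c$ from $z = 0$ and compares the squared magnitude against 4, so no square root is needed.

```c
/* Illustrative type and names; not the lecture's original listing. */
typedef struct { float real, imag; } Complex;

/* Return the number of iterations performed for the point c, stopping
   once |z| exceeds 2 or max_iter iterations have been done. */
int cal_pixel(Complex c, int max_iter)
{
    float zr = 0.0f, zi = 0.0f;   /* z starts at zero */
    float lengthsq;
    int count = 0;

    do {
        float temp = zr * zr - zi * zi + c.real;  /* real part of z^2 + c */
        zi = 2.0f * zr * zi + c.imag;             /* imaginary part       */
        zr = temp;
        lengthsq = zr * zr + zi * zi;             /* |z|^2, no sqrt       */
        count++;
    } while (lengthsq <= 4.0f && count < max_iter);

    return count;
}
```

A point that never escapes returns `max_iter`; the returned count is typically mapped to a pixel color.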
Notes • Square of length compared to 4 (rather than comparing length to 2) to avoid a square root computation • Given the terminating conditions, all of the Mandelbrot points must be within a circle centered at the origin having radius 2 • Resolution is expanded at will to obtain interesting images Embarrassingly Parallel Computations – Slide 12
Mandelbrot set Embarrassingly Parallel Computations – Slide 13
Parallelizing Mandelbrot Set Computation • Dynamic Task Assignment • Have processor request regions after computing previous regions. • Doesn’t lend itself to a CUDA solution. • Static Task Assignment • Simply divide the region into a fixed number of parts, each computed by a separate processor. • Not very successful because different regions require different numbers of iterations and times. • …but what we’ll do in CUDA. Embarrassingly Parallel Computations – Slide 14
Dynamic Task Assignment: Work Pool/Processor Farms Embarrassingly Parallel Computations – Slide 15
Simplified CUDA Solution Embarrassingly Parallel Computations – Slide 16
Simplified CUDA Solution (cont.) Embarrassingly Parallel Computations – Slide 17
Simplified CUDA Solution (cont.) Embarrassingly Parallel Computations – Slide 18
Simplified CUDA Solution (cont.) Embarrassingly Parallel Computations – Slide 19
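The static assignment idea can be illustrated in plain C, with two loops standing in for the grid of blocks and threads. This is a sketch under stated assumptions, not the lecture's CUDA code: the image size, the mapping of one row per thread, and all names are hypothetical. In an actual CUDA kernel the two outer loops disappear and each thread derives its row from `blockIdx.x * blockDim.x + threadIdx.x`.

```c
#define WIDTH    64    /* assumed image width in pixels  */
#define HEIGHT   64    /* assumed image height in pixels */
#define MAX_ITER 256   /* assumed iteration limit        */

/* Iterate z <- z^2 + c from z = 0; return the iteration count. */
static int cal_pixel(float cr, float ci)
{
    float zr = 0.0f, zi = 0.0f, lengthsq;
    int count = 0;
    do {
        float temp = zr * zr - zi * zi + cr;
        zi = 2.0f * zr * zi + ci;
        zr = temp;
        lengthsq = zr * zr + zi * zi;
        count++;
    } while (lengthsq <= 4.0f && count < MAX_ITER);
    return count;
}

/* Work of one "thread": fill one row of the image. The region
   [-2, 2) x [-2, 2) of the complex plane is mapped onto the pixels. */
static void compute_row(int row, int *image)
{
    if (row >= HEIGHT)          /* guard for surplus threads */
        return;
    for (int col = 0; col < WIDTH; col++) {
        float cr = -2.0f + 4.0f * col / WIDTH;
        float ci = -2.0f + 4.0f * row / HEIGHT;
        image[row * WIDTH + col] = cal_pixel(cr, ci);
    }
}

/* Emulate a launch of num_blocks blocks of threads_per_block threads:
   the row is fixed by the block and thread indices alone. */
void mandelbrot_static(int *image, int num_blocks, int threads_per_block)
{
    for (int block = 0; block < num_blocks; block++)
        for (int thread = 0; thread < threads_per_block; thread++)
            compute_row(block * threads_per_block + thread, image);
}
```

Because rows near the set need many more iterations than rows far from it, the threads finish at different times, which is the load imbalance that static assignment accepts.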
Topic 3: Monte Carlo methods • An embarrassingly parallel computation named for Monaco’s gambling resort city; the method’s first important use was in the development of the atomic bomb during World War II. • Monte Carlo methods use random selections. • Given a very large set and a probability distribution over it • Draw a set of independent, identically distributed samples • Can then approximate the distribution using these samples Embarrassingly Parallel Computations – Slide 20
Applications of Monte Carlo Method • Evaluating integrals of arbitrary functions of 6+ dimensions • Predicting future values of stocks • Solving partial differential equations • Sharpening satellite images • Modeling cell populations • Finding approximate solutions to NP-hard problems Embarrassingly Parallel Computations – Slide 21
Monte Carlo Example: Calculating $\pi$ • Form a circle within a square, with unit radius so that the square has sides 2 × 2. Ratio of the area of the circle to the square given by $\frac{\pi (1)^2}{2 \times 2} = \frac{\pi}{4}$ • Randomly choose points within square. • Keep score of how many points happen to lie within circle. • Fraction of points within the circle will approach $\pi/4$ given a sufficient number of randomly selected samples. Embarrassingly Parallel Computations – Slide 22
Calculating $\pi$ (cont.) Embarrassingly Parallel Computations – Slide 23
Calculating $\pi$ (cont.) [Figure: randomly chosen points, each marked x, scattered over the square and its inscribed circle] Embarrassingly Parallel Computations – Slide 24
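The circle-in-a-square experiment can be sketched in C as follows. This is a minimal illustration, assuming the standard library's `rand()` is an adequate random source for a demonstration; the function name `estimate_pi` and its parameters are not from the lecture.

```c
#include <stdlib.h>

/* Estimate pi by sampling n points uniformly in the square
   [-1, 1] x [-1, 1] and counting those inside the unit circle.
   The name and signature are illustrative assumptions. */
double estimate_pi(long n, unsigned seed)
{
    long inside = 0;

    srand(seed);
    for (long i = 0; i < n; i++) {
        double x = 2.0 * rand() / RAND_MAX - 1.0;
        double y = 2.0 * rand() / RAND_MAX - 1.0;
        if (x * x + y * y <= 1.0)
            inside++;
    }
    /* The fraction inside approximates pi/4, so multiply by 4. */
    return 4.0 * inside / n;
}
```

Every sample is independent of the others, so the loop can be divided among processes, each with its own random stream, and the `inside` counts summed at the end.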
Relative Error • Relative error is a way to measure the quality of an estimate • The smaller the error, the better the estimate • a: actual value • e: estimated value • Relative error = |e-a|/a Embarrassingly Parallel Computations – Slide 25
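The relative error formula translates directly into C. The function name is an assumption, and the sketch assumes the actual value `a` is positive, as it is when estimating $\pi$.

```c
#include <math.h>

/* Relative error |e - a| / a of an estimate e against the actual
   value a. Assumes a > 0. */
double relative_error(double e, double a)
{
    return fabs(e - a) / a;
}
```

For example, an estimate of 3.0 for an actual value of 4.0 has a relative error of 0.25.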
Increasing Sample Size Reduces Error Embarrassingly Parallel Computations – Slide 26
Next Example: Computing an Integral • One quadrant of the construction can be described by the integral $\int_0^1 \sqrt{1 - x^2}\,dx = \frac{\pi}{4}$ • Random numbers $(x_r, y_r)$ generated, each between 0 and 1. • Counted as in circle if $x_r^2 + y_r^2 \le 1$. Embarrassingly Parallel Computations – Slide 27
Example • Computing the integral • Sequential code Routine randv(x1,x2) returns a pseudorandom number between x1 and x2. • Monte Carlo method very useful if the function cannot be integrated numerically (maybe having a large number of variables). Embarrassingly Parallel Computations – Slide 28
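A sequential Monte Carlo integration routine built around `randv(x1, x2)` might look like the sketch below. It averages the integrand at random points and multiplies by the interval length. The integrand $f(x) = x^2 - 3x$ is chosen here purely for illustration, and the body of `randv` is an assumed implementation on top of `rand()`.

```c
#include <stdlib.h>

/* Assumed implementation: pseudorandom number between x1 and x2. */
double randv(double x1, double x2)
{
    return x1 + (x2 - x1) * rand() / RAND_MAX;
}

/* Illustrative integrand; any function of x could be used here. */
static double f(double x)
{
    return x * x - 3.0 * x;
}

/* Estimate the integral of f over [x1, x2] from n random samples:
   (mean value of f) * (length of the interval). */
double mc_integrate(double x1, double x2, long n)
{
    double sum = 0.0;

    for (long i = 0; i < n; i++)
        sum += f(randv(x1, x2));
    return (sum / n) * (x2 - x1);
}
```

For this integrand the exact value over $[0, 3]$ is $-4.5$, which gives a convenient check on the estimate.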
Why Monte Carlo Methods Are Interesting • Error in a Monte Carlo estimate decreases by the factor $1/\sqrt{n}$, where $n$ is the number of samples • Rate of convergence independent of integrand’s dimension • Deterministic numerical integration methods do not share this property • Hence Monte Carlo superior when integrand has 6 or more dimensions • Furthermore Monte Carlo methods often amenable to parallelism • With $p$ processors, can find an estimate about $p$ times faster or reduce error of estimate by a factor of $\sqrt{p}$ Embarrassingly Parallel Computations – Slide 29
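The effect of parallelism on the error can be written out as a short worked equation. The standard error of a Monte Carlo estimate from $n$ independent samples scales as $1/\sqrt{n}$; the constant $C$ below is a placeholder that depends on the integrand's variance and is an assumption of this sketch.

$$\varepsilon(n) \approx \frac{C}{\sqrt{n}}, \qquad \varepsilon(pn) \approx \frac{C}{\sqrt{pn}} = \frac{\varepsilon(n)}{\sqrt{p}}$$

So $p$ processors that each draw $n$ samples in the same wall-clock time reduce the error by a factor of $\sqrt{p}$. Alternatively, they reach the original error $\varepsilon(n)$ in about $1/p$ of the time by drawing $n/p$ samples each.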
Summary • Embarrassingly Parallel Computation Examples • Low level image processing • Mandelbrot set • Monte Carlo calculations • Applications: numerical integration • Related topics: Metropolis algorithm, simulated annealing • For parallelizing such applications, need best way to generate random numbers in parallel. Embarrassingly Parallel Computations – Slide 30
End Credits • Based on original material from • The University of Akron • Tim O’Neil, Saranya Vinjarapu • The University of North Carolina at Charlotte • Barry Wilkinson, Michael Allen • Oregon State University: Michael Quinn • Revision history: last updated 7/28/2011. Embarrassingly Parallel Computations – Slide 31