• 130 likes • 527 Views
Sieve of Eratosthenes. by Fola Olagbemi. Outline. What is the sieve of Eratosthenes? Algorithm used Parallelizing the algorithm Data decomposition options for the parallel algorithm Block decomposition (to minimize inter-process communication) The program (returns count of primes)
E N D
Sieve of Eratosthenes by Fola Olagbemi
Outline • What is the sieve of Eratosthenes? • Algorithm used • Parallelizing the algorithm • Data decomposition options for the parallel algorithm • Block decomposition (to minimize inter-process communication) • The program (returns count of primes) • References
What is the sieve of Eratosthenes? • An algorithm designed by the Greek mathematician Eratosthenes: used to find the prime numbers between two and another specified number. • It works by gradually eliminating multiples of the smallest unmarked number (x) in the given interval, till > the specified number.
Algorithm • Create a list of natural numbers in the specified range e.g. 2, 3, 4, ….. , n, none of which is marked • Set k to 2 (or smallest number in list) where k is the smallest unmarked number in the list • Repeat • Mark all multiples of k between square of k and n • Find smallest number greater than k that is unmarked. Set k to the new value, until > n • The unmarked numbers are prime.
Parallelizing the algorithm • Step 3a – key parallel computation step • Reduction needed in each iteration to determine new value of k (process 0 – controller - may not have the next k) • Broadcast needed after reduction to inform all processes of new k. • The goal in parallelizing is to reduce inter-process communication as much as possible.
Data decomposition options • Goal of data decomposition approach chosen – each process should have roughly the same number of values to work on (optimal load balancing) • Decomposition options: • Interleaved data decomposition • Block data decomposition: • Grouped • Distributed
Data decomposition options (contd.) • Interleaved data decomposition (e.g.): • Process 0 processes numbers 2, 2+p, 2+2p, … • Process 1 processes numbers 3, 3+p, 3+2p, … • Advantage • easy to know which process is working on the value in any array index i • Disadvantages • It can lead to significant load imbalance among processes (depending on how array elements are divided up among processes) • May require a significant amount of reduction/broadcast (communication overheads)
Data decomposition options (contd.) • Block data decomposition: • Grouped: first few processes have more values than the last couple of processes • Distributed: First process has less, second has more, third has less etc. (this is the option used in the program) Task 0 Task 1 Task 3 Task 2 Grouped Task 0 Task 1 Task 3 Task 2 Distributed
Data decomposition (contd.) • Grouped data decomposition: • Let r = n mod p (n – number of values in the list, p – number of processes) • If r = 0, all processes get block of values of size n/p • Else, first r processes get blocks of size ceil(n/p), remaining processes blocks of size (floor(n/p) • Distributed data decomposition: • First element controlled by process i is floor(i*n/p) • Last element controlled by process I is floor((i+1)n/p) - 1
Data decomposition (Contd.) • To reduce communication (i.e. reduce number of MPI_Reduce calls to 1), all the primes used for sieving must be held by process 0. • This requirement places a limitation on the number of processes that can be used in this approach.
References • Michael J. Quinn, Parallel Programming in C with MPI and OpenMP;McGraw Hill; 2004