Lecture 2: Parallel computational models

Sequential computational models
• Turing machine
• RAM (Random Access Machine)
• Logic circuit model
[Figure: a RAM, consisting of a control unit (contains algorithms), a CPU, and a data memory (unlimited size)]
RAM (Random Access Machine)
Each of the following operations is assumed to be executed in one unit of time:
• (1) Control operations such as if, goto
• (for and while can be realized by if and goto.)
• (2) I/O operations such as print
• (3) Substitution operations such as a = b
• (4) Arithmetic and logic operations such as +, -, AND

O-notation for computing complexity
• Definition
• Assume that f(n) and g(n) are positive functions. If there are two positive constants c and n0 such that
• f(n) ≤ c g(n) for all n ≥ n0,
• then we say
• f(n) = O(g(n)).
• For example,
• 3n^2 - 5n = O(n^2)
• n log n + n = O(n log n)
• 45 = O(1)
• (Only the term that grows most quickly is kept.)

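The definition asks for witnesses c and n0. A minimal sketch, assuming a hypothetical helper `holds_on_range` that is not part of the lecture, can check candidate witnesses numerically. A finite check like this can only refute a bad choice of c and n0 or make a good one plausible; it is not a proof.

```python
import math

def holds_on_range(f, g, c, n0, n_max=10_000):
    """Return True if f(n) <= c * g(n) for every integer n in [n0, n_max].

    Hypothetical helper: it tests only finitely many n, so it illustrates
    the definition of O-notation rather than proving a bound.
    """
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))

# 3n^2 - 5n = O(n^2): try c = 3, n0 = 2
print(holds_on_range(lambda n: 3 * n * n - 5 * n, lambda n: n * n, c=3, n0=2))

# n log n + n = O(n log n): try c = 2, n0 = 2 (log taken to base 2)
print(holds_on_range(lambda n: n * math.log2(n) + n,
                     lambda n: n * math.log2(n), c=2, n0=2))

# n^2 is not O(n): the candidate c = 100 fails as soon as n > 100
print(holds_on_range(lambda n: n * n, lambda n: n, c=100, n0=1))
```

The last call shows why the constant c must be fixed before n grows: any fixed c is eventually exceeded when f grows faster than g.
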
Algorithm analysis for sequential and parallel algorithms
• Models: RAM (sequential algorithms); many types (parallel algorithms)
• Data division: not necessary (sequential algorithms); most important (parallel algorithms)
• Analysis of sequential algorithms: computing time, memory size
• Analysis of parallel algorithms: computing time, communicating time, number of processors

PRAM (Parallel RAM) model
[Figure: RAM 1, RAM 2, ..., RAM m connected to a shared memory]
• A PRAM consists of a number of RAMs (Random Access Machines) and a shared memory. Each RAM has a unique processor number.
• Processors act synchronously.
• All processors execute the same program.
• (By branching conditionally on the processor number, different processors can execute different operations.)
• Data communication between processors (RAMs) takes place through the shared memory.
• Each processor can write data to and read data from one memory cell in O(1) time.

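The synchronous behaviour can be illustrated with a small sequential simulation. This is a sketch under assumptions: `pram_step` and `rotate_left` are hypothetical names, the memory is a 0-based Python list, and no access restriction is modelled (if two processors write the same cell, the later one in the loop simply wins).

```python
def pram_step(memory, num_procs, program):
    """Simulate one synchronous PRAM step on a shared memory (a list).

    Every processor runs the same `program(pid, snapshot)` and may return
    one (address, value) write or None. All reads see the memory as it was
    before the step; all writes are applied afterwards.
    """
    snapshot = list(memory)              # state visible to every read
    writes = []
    for pid in range(1, num_procs + 1):  # processor numbers 1..m
        result = program(pid, snapshot)
        if result is not None:
            writes.append(result)
    for address, value in writes:        # write phase
        memory[address] = value
    return memory

def rotate_left(pid, mem):
    """Processor pid copies the cell to its right (cyclically) into cell pid-1."""
    return (pid - 1, mem[pid % len(mem)])

print(pram_step([10, 20, 30, 40], 4, rotate_left))  # [20, 30, 40, 10]
```

Separating the read phase from the write phase is what makes the step synchronous: processor 4 still reads the old value 10 from cell 0, although processor 1 overwrites that cell in the same step.
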
Features of PRAM
• Merits
• The parallelism inherent in a problem can be studied directly.
• Algorithms can be designed easily.
• Algorithms can easily be converted into ones for other parallel computational models.
• Demerits
• Communication cost is not considered. (It is not realistic to assume that a synchronized read or write can be done in one unit of time.)
• Distributed memory is not considered.
In the following, we use PRAM to discuss parallel algorithms.

Analysis of parallel algorithms on PRAM model
• Computing time: T(n)
• Number of processors: P(n)
• Cost: P(n) × T(n)
• Speed-up: Ts(n)/T(n)
• (Ts(n): computing time of the optimal sequential algorithm)
• 1. Cost optimal parallel algorithms
• The cost is of the same order as the computing time of the optimal sequential algorithm, i.e., the speed-up is of the same order as the number of processors.
• 2. Time optimal parallel algorithms
• Fastest when using a polynomial number of processors.
• 3. Optimal parallel algorithms
• Both cost optimal and time optimal.

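The two measures are simple arithmetic. As a sketch, with hypothetical function names and made-up step counts (not measurements from the lecture):

```python
def cost(num_processors, parallel_time):
    """Cost = P(n) x T(n)."""
    return num_processors * parallel_time

def speed_up(sequential_time, parallel_time):
    """Speed-up = Ts(n) / T(n)."""
    return sequential_time / parallel_time

# Illustrative numbers only: 512 processors, 10 parallel steps,
# against a sequential algorithm that needs 1000 steps.
p, t, ts = 512, 10, 1000
print(cost(p, t))       # 5120
print(speed_up(ts, t))  # 100.0
```

In this made-up case the cost (5120) is larger than the sequential time (1000) and the speed-up (100) is smaller than the number of processors (512), which is the situation a cost optimal algorithm avoids.
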
Analysis of parallel algorithms on PRAM model: NC-class and P-class
• World of sequential computation
• P problems: the class of problems which can be solved in polynomial time (O(n^k)).
• NP problems: the class of problems which can be solved nondeterministically in polynomial time.
• NP-complete problems: the NP problems to which every NP problem can be reduced in polynomial time.
• P = NP ?
• World of parallel computation
• NC problems: the class of problems which can be solved in polylogarithmic time (O(lg^k n)) using a polynomial number of processors.
• P-complete problems: the P problems to which every P problem can be reduced by an NC (or log-space) reduction; they are believed not to be NC problems.
• Similarly, NC = P ?

An Example of PRAM Algorithms
• Problem: Find the sum of n integers (x1, x2, ... , xn)
• - Assume that the n integers are stored in array A[1..n] in the shared memory.
• - To simplify the problem, let n = 2^k (k is an integer).
main () {
  for (h=1; h≤log n; h++)
  {
    if (index of processor i ≤ n/2^h) processor i do
    {
      a = A[2i-1];  /* Reading from the shared memory */
      b = A[2i];    /* Reading from the shared memory */
      c = a + b;
      A[i] = c;     /* Writing to the shared memory */
    }
  }
  if (index of processor == 1) printf("%d\n", c);
}

An Example of PRAM Algorithms
• Processor Pi reads A[2i-1] and A[2i] from the shared memory, then writes their sum to A[i] in the shared memory.
[Figure: processors P1, P2, ..., Pi above the shared array A[1], A[2], A[3], A[4], ..., A[n]; Pi is connected to A[2i-1], A[2i] and A[i]]

An Example of PRAM Algorithms
Find the sum of 8 integers (x1, x2, ... , x8).
[Figure: the sequential algorithm compared with the parallel algorithm. In the parallel algorithm the inputs x1, ..., x8 are combined by P1, P2, P3, P4 in Step 1, by P1, P2 in Step 2, and by P1 in Step 3, which produces the output.]

An Example of PRAM Algorithms
• Analysis of the algorithm
• Computing time: the for loop is repeated log n times, and each iteration can be executed in O(1) time → O(log n) time
• Number of processors: not larger than n/2 → n/2 processors
• Cost: O(n log n)
It is not cost optimal, since the optimal sequential algorithm runs in Θ(n) time.

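The summation example and its analysis can be checked with a sequential simulation. This is a minimal sketch: the function name `pram_sum` and the returned counters are assumptions for illustration, and the parallel rounds are imitated by separating each round into a read phase and a write phase.

```python
def pram_sum(values):
    """Simulate the tree summation; len(values) must be a power of two.

    Returns (sum, number of rounds, largest number of active processors).
    """
    n = len(values)
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    A = [None] + list(values)            # 1-based array, as on the slides
    rounds = 0
    max_active = 0
    for h in range(1, n.bit_length()):   # h = 1 .. log2(n)
        active = n >> h                  # processors 1 .. n/2^h work in round h
        # Read phase: every active processor reads before anyone writes.
        reads = [(A[2 * i - 1], A[2 * i]) for i in range(1, active + 1)]
        # Write phase: processor i stores its partial sum in A[i].
        for i, (a, b) in enumerate(reads, start=1):
            A[i] = a + b
        rounds += 1
        max_active = max(max_active, active)
    return A[1], rounds, max_active

print(pram_sum([1, 2, 3, 4, 5, 6, 7, 8]))  # (36, 3, 4)
```

For n = 8 the simulation uses log n = 3 rounds and at most n/2 = 4 processors, matching the analysis; the cost 4 × 3 = 12 exceeds the 7 additions of the sequential algorithm.
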
Classification of PRAM by the access restriction
• EREW (Exclusive Read Exclusive Write) PRAM
• Both concurrent reading and concurrent writing are prohibited.
• CREW (Concurrent Read Exclusive Write) PRAM
• Concurrent reading is allowed, but concurrent writing is prohibited.

Classification of PRAM by the access restriction
• CRCW (Concurrent Read Concurrent Write) PRAM
• Both concurrent reading and concurrent writing are allowed.
• It is further classified as follows:
• - common CRCW PRAM
• Concurrent writing is allowed only if all the data being written are the same.
• - arbitrary CRCW PRAM
• An arbitrary one of the competing writes succeeds.
• - priority CRCW PRAM
• The processor with the smallest number writes the data.

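The three CRCW variants differ only in how a write conflict on one cell is resolved. A sketch of that rule, assuming a hypothetical function `resolve_concurrent_write` and modelling the "arbitrary" choice with a random pick:

```python
import random

def resolve_concurrent_write(writes, policy, rng=random):
    """Decide which value ends up in one memory cell.

    writes: list of (processor_number, value) pairs aimed at the same cell.
    policy: "common", "arbitrary" or "priority".
    """
    if not writes:
        raise ValueError("no writes to resolve")
    if policy == "common":
        # Legal only when every processor writes the same value.
        if len({value for _, value in writes}) != 1:
            raise ValueError("common CRCW requires identical values")
        return writes[0][1]
    if policy == "arbitrary":
        # Any one of the competing writes may succeed.
        return rng.choice(writes)[1]
    if policy == "priority":
        # The processor with the smallest number wins.
        return min(writes, key=lambda w: w[0])[1]
    raise ValueError("unknown policy: " + policy)

writes = [(3, "c"), (1, "a"), (2, "b")]
print(resolve_concurrent_write(writes, "priority"))        # a
print(resolve_concurrent_write([(1, 7), (2, 7)], "common"))  # 7
```

An algorithm written for the common model also runs correctly under the other two policies, since with identical values every policy stores the same result.
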
Algorithms on different PRAM models
Algorithms for calculating the AND of n bits (the input is stored in array A[1..n]; the result is left in A[1])
• Algorithm on EREW PRAM model: O(log n) time, n/2 processors
main (){
  for (h=1; h≤log n; h++) {
    if (index of processor i ≤ n/2^h) processor i do {
      a = A[2i-1];
      b = A[2i];
      if ((a==1) and (b==1)) A[i] = 1;
      else A[i] = 0;
    }}}
• Algorithm on common CRCW PRAM model: O(1) time, n processors
main (){
  if (A[index of processor i] == 0) processor i do
    A[1] = 0;
}
Abilities of PRAM models: EREW < CREW < CRCW

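The difference between the two models can be seen in a sequential simulation of an AND over n bits. This is a sketch under assumptions: the names `and_erew` and `and_common_crcw` are hypothetical, the pairwise version stores 0 whenever a pair is not both 1, and in the one-step version every processor that holds a 0 writes 0 to A[1], so all concurrent writes carry the same value, as the common CRCW model requires.

```python
def and_erew(bits):
    """Pairwise AND in log2(n) rounds; len(bits) must be a power of two.

    Returns (result, number of rounds).
    """
    n = len(bits)
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    A = [None] + list(bits)              # 1-based array, as on the slides
    rounds = 0
    for h in range(1, n.bit_length()):   # h = 1 .. log2(n)
        active = n >> h
        # Each cell is read by exactly one processor: exclusive reads.
        reads = [(A[2 * i - 1], A[2 * i]) for i in range(1, active + 1)]
        # Each cell is written by exactly one processor: exclusive writes.
        for i, (a, b) in enumerate(reads, start=1):
            A[i] = 1 if (a == 1 and b == 1) else 0
        rounds += 1
    return A[1], rounds

def and_common_crcw(bits):
    """One-step AND with n processors on a common CRCW PRAM.

    Returns (result, number of steps).
    """
    A = [None] + list(bits)
    # Read phase: processor i inspects its own cell A[i].
    writers = [i for i in range(1, len(bits) + 1) if A[i] == 0]
    # Write phase: all writers store the same value 0 in A[1].
    if writers:
        A[1] = 0
    return A[1], 1

print(and_erew([1, 0, 1, 1]))         # (0, 2)
print(and_common_crcw([1, 0, 1, 1]))  # (0, 1)
```

The pairwise version needs a number of rounds that grows with log n, while the concurrent-write version always finishes in a single step at the price of n processors.
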
Exercise
• 1. Suppose that n×n matrices A and B are stored in two-dimensional arrays. Design PRAM algorithms for A+B using n processors and n×n processors, respectively. Answer the following questions:
• Which PRAM models do you use in your algorithms?
• What are the running times?
• Are your algorithms cost optimal?
• Are your algorithms time optimal?
• 2. Design a PRAM algorithm for A+B using k processors (k ≤ n×n). Answer the same questions.