Advanced Computer Networks Lecture 1 - Parallelization
Scale increases complexity

Single-core machine → Multicore server → Cluster → Wide-area network (large-scale distributed system)

More challenges at each step:
• Multicore server: true concurrency
• Cluster: network, message passing, more failure modes (faulty nodes, ...)
• Wide-area network: even more failure modes; incentives, laws, ...
Parallelization

void bubblesort(int[] nums) {
    boolean done = false;
    while (!done) {
        done = true;
        for (int i = 1; i < nums.length; i++) {
            if (nums[i-1] > nums[i]) {
                swap(nums, i-1, i);   // helper that exchanges the two adjacent elements in place
                done = false;
            }
        }
    }
}

• The algorithm works fine on one core
• Can we make it faster on multiple cores?
• Difficult - we need to find something for the other cores to do
• There are other sorting algorithms where this is much easier (see the sketch below)
• Not all algorithms are equally parallelizable
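As a rough illustration of a more parallelizable alternative (a sketch, not part of the original slides; the class and method names are illustrative), merge sort lets two cores each sort one half of the array before a single merge step:

import java.util.Arrays;

public class ParallelMergeSort {

    // Sort each half of the input on its own core, then merge the results.
    static int[] parallelSort(int[] nums) throws InterruptedException {
        int mid = nums.length / 2;
        int[] left  = Arrays.copyOfRange(nums, 0, mid);
        int[] right = Arrays.copyOfRange(nums, mid, nums.length);

        Thread helper = new Thread(() -> Arrays.sort(left));  // second core sorts the left half
        helper.start();
        Arrays.sort(right);                                    // this core sorts the right half
        helper.join();

        return merge(left, right);
    }

    // Standard two-way merge of two already-sorted arrays
    static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }
}

Unlike bubble sort, the two half-sorts are independent, so both cores have useful work to do until the final merge.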
Parallelization

• Speedup = (completion time with one core) / (completion time with n cores)
• If we increase the number of processors, will the speed also increase?
• Yes, but (in almost all cases) only up to a point

[Figure: numbers sorted per second vs. cores used, with an 'ideal' curve and an 'expected' curve]
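As a quick worked example (the numbers are illustrative, not from the slide): if sorting a given input takes 60 seconds on one core and 20 seconds on four cores, the speedup is 60 / 20 = 3×, somewhat short of the ideal 4×.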
Amdahl's law

• Usually, not all parts of the algorithm can be parallelized
• Let f be the fraction of the algorithm that can be parallelized, and let S be the corresponding speedup
• Then the overall speedup is at most 1 / ((1 − f) + f / S)
• More generally, if part i takes a fraction f_i of the running time and is sped up by a factor S_i, the new running time is Σ_i (f_i / S_i) of the original

[Figure: the sequential parts run on a single core, while the parallel part runs on cores #1-#6 at the same time]
Amdahl's law

• We are given a sequential task that is split into four consecutive parts P1, P2, P3, and P4, which account for 11%, 18%, 23%, and 48% of the running time, respectively.
• We are told that P1 does not speed up, so S1 = 1, while P2 speeds up 5×, P3 speeds up 20×, and P4 speeds up 1.6×.
• The new running time is then:
  0.11 / 1 + 0.18 / 5 + 0.23 / 20 + 0.48 / 1.6 = 0.11 + 0.036 + 0.0115 + 0.30 = 0.4575
Amdahl's law

• Or a little less than half the original running time
• The overall speedup is 1 / 0.4575 ≈ 2.186, or a little more than double the original speed
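The same computation can be done mechanically; the following is a small sketch (not from the slides, names are illustrative) that evaluates Amdahl's law for the example above:

public class AmdahlExample {

    // New running time, as a fraction of the original: sum over parts of f_i / S_i
    static double newRunningTime(double[] fractions, double[] speedups) {
        double t = 0.0;
        for (int i = 0; i < fractions.length; i++) {
            t += fractions[i] / speedups[i];
        }
        return t;
    }

    public static void main(String[] args) {
        double[] fractions = {0.11, 0.18, 0.23, 0.48};   // P1..P4 as fractions of the runtime
        double[] speedups  = {1.0, 5.0, 20.0, 1.6};      // per-part speedups
        double t = newRunningTime(fractions, speedups);  // 0.4575
        System.out.printf("New running time: %.4f of the original%n", t);
        System.out.printf("Overall speedup:  %.3f%n", 1.0 / t);   // about 2.186
    }
}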
Is more parallelism always better?

• Increasing parallelism beyond a certain point can cause performance to decrease!
• Example: we need to send a message to each core to tell it what to do, and the messages back and forth add overhead

[Figure: numbers sorted per second vs. cores, with 'ideal', 'expected', and 'reality (often)' curves; the sweet spot is where the 'reality' curve peaks]
Parallelization

• What size of task should we assign to each core?
• Frequent coordination creates overhead
  • Need to send messages back and forth, wait for other cores...
  • Result: cores spend most of their time communicating
• Bad: ask each core to sort three numbers
• Good: ask each core to sort a million numbers (see the sketch below)
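A minimal sketch of coarse-grained task assignment (illustrative, not from the slides; names and sizes are assumptions): each core gets one large chunk of the array to sort, so coordination is limited to starting and joining a handful of threads:

import java.util.Arrays;

public class ChunkedSort {
    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        int[] nums = new java.util.Random(42).ints(4_000_000).toArray();

        int chunkSize = (nums.length + cores - 1) / cores;
        Thread[] workers = new Thread[cores];
        for (int c = 0; c < cores; c++) {
            final int from = Math.min(nums.length, c * chunkSize);
            final int to   = Math.min(nums.length, from + chunkSize);
            workers[c] = new Thread(() -> Arrays.sort(nums, from, to));  // one big task per core
            workers[c].start();
        }
        for (Thread w : workers) w.join();

        // The sorted chunks would still have to be merged; the point here is only
        // that each core receives enough work to amortize the coordination cost.
    }
}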