Proving Correctness and Measuring Performance CET306 Harry R. Erwin University of Sunderland
Texts • Clay Breshears (2009) The Art of Concurrency: A Thread Monkey's Guide to Writing Parallel Applications, O'Reilly Media. • Mordechai Ben-Ari (2006) Principles of Concurrent and Distributed Programming, Second edition, Addison-Wesley.
Verification of Parallel Algorithms (Ben-Ari) • Programs are the execution of atomic statements • Concurrent programs are interleavings of atomic statements from two or more threads • To prove or verify a property of a concurrent program, we must show the property holds for all possible orders of execution. • All statements from any thread must eventually execute (fairness).
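The interleaving model above can be sketched in a few lines of Python. This is an illustrative sketch, not material from either text: the two-statement "load" and "store" threads and the names `interleavings` and `run` are assumptions chosen to show why a property has to be checked against every possible order of execution.

```python
from itertools import combinations

def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        chosen = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in chosen else next(ib) for i in range(n)]

def run(schedule):
    """Execute one interleaving of (thread, op) steps on a shared counter."""
    shared = 0
    local = {}
    for thread, op in schedule:
        if op == "load":
            local[thread] = shared        # atomic read into a thread-local copy
        else:
            shared = local[thread] + 1    # atomic write of the incremented copy
    return shared

# Hypothetical threads p and q, each incrementing the counter in two atomic steps.
p = [("p", "load"), ("p", "store")]
q = [("q", "load"), ("q", "store")]

schedules = list(interleavings(p, q))
results = {run(s) for s in schedules}
print(len(schedules), sorted(results))
```

Two of the six interleavings leave the counter at 2 and the other four leave it at 1, so the property "the counter ends at 2" does not hold for all orders of execution.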
The Critical Section Problem • Accesses (read or write) must be restricted so that only a single thread at a time executes code in the critical section. • A solution has to show that the code: • Enforces mutual exclusion. • Is free from deadlock. (If some processes are trying to enter their critical sections, one must eventually succeed.) • Is free from (individual) starvation. (If a process tries to enter its critical section, it must eventually succeed.) • We will explore Dekker’s algorithm for two threads. • It is presented as pseudocode.
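The pseudocode itself is not reproduced in this transcript. As a rough sketch only, the textbook form of Dekker's algorithm can be encoded as a small state machine and checked over every reachable interleaving. The program-counter numbering and the names `step` and `reachable` are assumptions made for this illustration.

```python
def step(state, i):
    """Return the state after thread i runs one atomic statement, or None if blocked."""
    pc, want, turn = list(state[0]), list(state[1]), state[2]
    j = 1 - i
    if pc[i] == 0:            # leave non-critical section: want[i] = true
        want[i] = True
        pc[i] = 1
    elif pc[i] == 1:          # while want[j] ... else enter critical section
        pc[i] = 2 if want[j] else 6
    elif pc[i] == 2:          # if turn == j, back off; otherwise re-test want[j]
        pc[i] = 3 if turn == j else 1
    elif pc[i] == 3:          # want[i] = false
        want[i] = False
        pc[i] = 4
    elif pc[i] == 4:          # await turn == i
        if turn != i:
            return None
        pc[i] = 5
    elif pc[i] == 5:          # want[i] = true, then re-test want[j]
        want[i] = True
        pc[i] = 1
    elif pc[i] == 6:          # in critical section; postprotocol: turn = j
        turn = j
        pc[i] = 7
    else:                     # pc 7: want[i] = false, back to non-critical section
        want[i] = False
        pc[i] = 0
    return (tuple(pc), tuple(want), turn)

def reachable(initial):
    """Explore every state reachable under any interleaving of the two threads."""
    seen, frontier = {initial}, [initial]
    while frontier:
        state = frontier.pop()
        for i in (0, 1):
            nxt = step(state, i)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

initial = ((0, 0), (False, False), 0)
states = reachable(initial)
violations = [s for s in states if s[0] == (6, 6)]   # both in the critical section
stuck = [s for s in states if step(s, 0) is None and step(s, 1) is None]
print("mutual exclusion violations:", len(violations))
print("states where both threads are blocked:", len(stuck))
```

The search covers mutual exclusion and the absence of a state in which both threads are blocked. It does not establish freedom from starvation, which needs a fairness argument about infinite executions.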
Approach • You need a synchronisation mechanism. This consists of a preprotocol before the critical section and a postprotocol after the critical section. • Variables used in the pre- and postprotocol are not used elsewhere. • No deadlock within the critical section. • No constraints on the non-critical code, which may terminate or loop forever. • Good solutions are efficient.
Ben-Ari Discussion of the Critical Section Problem • Instructor’s slides for this text are restricted and are not available to the students. • The material covered is available in both texts. • Look up Peterson’s Algorithm.
These Conditions Lead to Deadlock • Mutual exclusion condition • Individual resources are either available or held by no more than one thread at a time. • Hold and wait condition • Threads that are already holding some resources may request additional resources. • No preemption condition • Once a thread is holding a resource, that resource can only be removed when the holding thread voluntarily releases it. • Circular wait condition • A circular chain of threads can exist in which each thread requests a resource held by the next thread in the chain. • Deadlock can occur only when all four conditions hold, so to prevent it at least one of these conditions must not exist. A text on operating systems theory will have a discussion about deadlock.
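The circular wait condition can be illustrated with a wait-for graph. The sketch below assumes a simplified model in which each thread waits on at most one other thread. The function name `find_cycle` and the thread labels are hypothetical.

```python
def find_cycle(waits_for):
    """Return the threads forming a circular wait, or None if there is none.

    waits_for maps each waiting thread to the thread that holds the
    resource it is requesting (at most one per thread in this model).
    """
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            # current was seen before, so the chain loops back on itself
            return path[path.index(current):]
    return None

deadlocked = {"T1": "T2", "T2": "T3", "T3": "T1"}
safe = {"T1": "T2", "T2": "T3"}   # T3 is not waiting, so the chain ends

print(find_cycle(deadlocked))
print(find_cycle(safe))
```

Breaking the cycle, for example by having every thread acquire resources in one agreed order, removes the circular wait condition and with it the possibility of deadlock.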
Measuring Performance • You’re concerned about two things after correctness: • How fast? • How efficient? • Elapsed time measurements tell you whether your concurrent implementation is better than the serial one. • The ratio of the two is the speed-up factor. Report serial time divided by parallel time as a multiplier. • Don’t cheat: use the best version of the algorithm in each case.
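A minimal timing sketch for the speed-up factor follows. The helper names `elapsed` and `speedup` are assumptions, and taking the best of several runs is one common convention rather than something these slides prescribe.

```python
import time

def elapsed(fn, *args, repeats=3):
    """Return the best wall-clock time in seconds over several runs of fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

def speedup(serial_seconds, parallel_seconds):
    """Speed-up factor: serial elapsed time divided by parallel elapsed time."""
    return serial_seconds / parallel_seconds

# Hypothetical timings: 12 s for the best serial version, 4 s in parallel.
print(f"{speedup(12.0, 4.0):.1f}x")
```

In practice both arguments to `speedup` would come from `elapsed`, applied to the best serial version and the best parallel version of the same computation on the same input.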
Number of Cores • If you can control the number of cores in use, show how the speed-up ratio depends on that number. • There will usually be diminishing returns. That relates to scalability. • If you don’t get diminishing returns (for example, the speedup exceeds the number of cores), investigate why. It’s usually an error. • Where the effect is genuine, the usual cause is that the data fit into the available cache once they are shared among the cores.
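One way to tabulate the dependence on core count is sketched below. The function name `scaling_report` is an assumption, and the timings are made-up numbers used only to show the shape of the report.

```python
def scaling_report(serial_seconds, parallel_seconds_by_cores):
    """Return (cores, speedup, suspicious) rows, one per measured core count.

    A row is flagged as suspicious when the speedup exceeds the core count,
    which deserves investigation before it is reported.
    """
    rows = []
    for cores in sorted(parallel_seconds_by_cores):
        ratio = serial_seconds / parallel_seconds_by_cores[cores]
        rows.append((cores, ratio, ratio > cores))
    return rows

# Made-up timings in seconds, keyed by number of cores (illustration only).
timings = {1: 10.0, 2: 5.5, 4: 2.4, 8: 1.6}
report = scaling_report(10.0, timings)

for cores, ratio, suspicious in report:
    flag = "  <- investigate" if suspicious else ""
    print(f"{cores:2d} cores: {ratio:5.2f}x{flag}")
```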
Amdahl’s Law • Read http://csrl.unt.edu/~kavi/CSCE6610/AmdahlLawMulticore-computer-2008.pdf • Speedup = 1/((1-P)+(P/S)) • P is the proportion of the code that can be parallelised. • S is the speedup for the parallelisable code. • Only if P = 1.0 will the overall speedup equal S. • Even with an infinite number of cores, the run time does not fall to zero: the speedup is bounded by 1/(1-P).
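The formula can be evaluated directly. In this sketch the function name `amdahl_speedup` and the choice of P = 0.9 are illustrative assumptions.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the work is sped up by a factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# With 90% of the code parallelisable, the serial 10% caps the speedup at 10x.
for s in (2, 8, 64, 1024):
    print(f"S = {s:4d}: speedup = {amdahl_speedup(0.9, s):.2f}")
```

However large S becomes, the result stays below 1/(1-P), which is 10 for P = 0.9.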
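Gustafson-Barsis’s Law • Amdahl’s law assumes a fixed problem size. Gustafson-Barsis’s law instead assumes the problem size grows with the number of cores while the run time stays fixed. • Scaled speedup = N - s(N-1), where N is the number of cores and s is the serial fraction of the parallel run time. • In practice you may still not do as well as predicted, due to the data overhead per core. That moves some of the parallelised code into the category of unparallelised code.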
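Gustafson-Barsis's law is commonly stated as scaled speedup = N - s(N-1), where N is the number of cores and s is the serial fraction of the run time on the parallel machine. The sketch below evaluates that form. The function name and sample values are assumptions.

```python
def gustafson_speedup(serial_fraction, cores):
    """Scaled speedup when the problem size grows with the number of cores."""
    return cores - serial_fraction * (cores - 1)

# A serial fraction of 10%, measured on the parallel run, at several core counts.
for cores in (2, 8, 64):
    print(f"{cores:2d} cores: scaled speedup = {gustafson_speedup(0.1, cores):.1f}")
```

Any per-core overhead that increases the serial fraction lowers the result, which is why measured figures are worth comparing against this estimate.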
Efficiency • You can rarely use N>1 cores as efficiently as 1 core. • Report speedup/cores as the efficiency. • Replace cores with number of threads if that is larger.
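As a small sketch of the efficiency figure described above (the function and parameter names are assumptions):

```python
def efficiency(speedup, cores, threads=None):
    """Speedup divided by cores, or by threads when there are more threads than cores."""
    workers = cores if threads is None else max(cores, threads)
    return speedup / workers

print(efficiency(3.0, 4))             # 3x speedup on 4 cores
print(efficiency(3.0, 4, threads=8))  # same speedup, 8 threads on 4 cores
```

A value close to 1.0 means the extra cores are being used almost as efficiently as a single core.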
Next Week • Eight Simple Rules for Designing Multithreaded Applications