Learn the basics of concurrency, synchronization techniques, concurrent data structures, and more in concurrent programming. Understand the advantages and disadvantages of concurrent programs and explore Amdahl's Law.
Techniques and Structures in Concurrent Programming
Wilfredo Velazquez
Outline • Basics of Concurrency • Concepts and Terminology • Advantages and Disadvantages • Amdahl’s Law • Synchronization Techniques • Concurrent Data Structures • Parallel Correctness • Threading A.P.I.’s
Basics of Concurrency • A concurrent program is one in which two or more of its modules or sections run in separate processes or threads • Historically given relatively little attention • Concurrent programs are much more difficult to reason about and implement • Physical limits of modern processors are being reached; the free speed-ups of the Moore’s Law era no longer apply • Instead of faster processors, use more of them
Concepts and Terminology • Process • A ‘program’, which has its own memory space, stack, etc. • Difficult to communicate between processes – Message-Passing Communication • Thread • A ‘sub-program’ • Threads share their parent process’s memory space (heap, globals, open files), though each thread has its own stack and registers • Easy to communicate between threads – Shared-Memory Communication
Concepts and Terminology • Concurrent Program • Processes/threads execute tasks in an ordering relative to each other that is not defined • Essentially covers all multi-process/multi-threaded programs • Parallelism • Processes/threads that execute completely simultaneously • Parallelism is more readily applied to sections of a program • Impossible on single-core processors (those still exist?) • Increased parallelism = more processors used • Atomic action • An action (instruction) that either happens completely, without interruption, or not at all • For many purposes, an action that merely ‘looks’ atomic to other threads is enough to classify it as such
Advantages and Disadvantages • Advantages: • Concurrent Programs + More Processors = Faster Programs • Some problems are more easily described in parallel environments • General multitasking • Non-Determinism • Disadvantages: • Concurrent Programs + Few Processors = Slower Programs • Most problems are more difficult to implement in parallel environments • Non-Determinism
Amdahl’s Law • Relates the overall speed-up of a program to the number of processors and to the fraction of the program that can run in parallel (see the formula below) • Has very limiting implications
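The law itself can be stated as follows, where P is the fraction of the program that can be parallelized and N is the number of processors:

    S(N) = \frac{1}{(1 - P) + P/N}

As N grows without bound, S(N) approaches 1/(1 - P). For example, a program that is 95% parallelizable can never run more than 20x faster, no matter how many processors are added; that cap is the limiting implication.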
Outline • Basics of Concurrency • Synchronization Techniques • Mutual Exclusion and Locks • The Mighty C.A.S. • Lock-free and Wait-free Algorithms • Transactional Algorithms • Concurrent Data Structures • Threading A.P.I.’s
Synchronization Techniques • These are techniques that assure program correctness in areas where the non-determinism inherent in a concurrent environment would cause undesirable behavior • Example: Let T1 and T2 be threads, x a shared variable between them • x = 0; //initially • T1::x++; • T2::x++; • What is the final value of x?
Synchronization Techniques • x++ becomes: read x; add 1; write x; • So T1’s and T2’s instructions could occur in the following order:
T1::read x //reading 0
T2::read x //reading 0
T1::add 1 //0+1
T2::add 1 //0+1
T1::write x //writing 1
T2::write x //writing 1
• Final value of x: 1, not 2 – one increment is lost
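This lost-update race is easy to reproduce. A minimal sketch in C with pthreads (the loop count, function names, and thread count are illustrative choices, not from the slides):

#include <pthread.h>
#include <stdio.h>

int x = 0;                          /* shared between both threads */

void *increment(void *arg) {
    for (int i = 0; i < 100000; i++)
        x++;                        /* read x; add 1; write x */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("x = %d\n", x);          /* frequently prints less than 200000 */
    return 0;
}

Built with cc race.c -pthread, this often prints a value well below 200000 because increments from the two threads interleave exactly as shown above.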
Mutual Exclusion and Locks • Algorithm that allows only one thread to execute a certain ‘area’ of code at a time • It essentially ‘locks out’ all other threads from accessing the area, thus ‘mutex’ and ‘lock’ are typically used synonymously • Varying algorithms exist for implementation, differing in robustness and performance • Typically easy to reason about their use • High overhead compared to other synchronization techniques • Can cause problems such as Deadlock, Livelock, and Starvation
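As a sketch of the idea, the racy increment above can be protected with a pthreads mutex (x_lock and the lock placement are my choices for illustration):

#include <pthread.h>

int x = 0;
pthread_mutex_t x_lock = PTHREAD_MUTEX_INITIALIZER;

void *increment(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&x_lock);    /* all other threads locked out   */
        x++;                            /* read-add-write is now exclusive */
        pthread_mutex_unlock(&x_lock);  /* next waiting thread may enter  */
    }
    return NULL;
}

With the lock held across the read-modify-write, the final value is always 200000, at the cost of serializing the loop bodies.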
The Mighty C.A.S. • Compare-And-Swap • Native instruction on many modern multiprocessors • Widely used in synchronizing threads • Cheap compared to using locking algorithms • Expensive compared to plain loads and stores, as it uses a hardware lock on the memory location • Vulnerable to the ABA problem • Its semantics, executed atomically by the hardware:
boolean CAS(memoryLocation, old, new) {
    if (*memoryLocation == old) {
        *memoryLocation = new;
        return true;
    }
    return false;
}
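For illustration, here is the standard CAS retry-loop idiom written with C11’s <stdatomic.h> (an assumed example, not code from the slides). On failure, atomic_compare_exchange_weak reloads old with the current value, so the loop simply tries again:

#include <stdatomic.h>

atomic_int x = 0;

void increment(void) {
    int old = atomic_load(&x);
    /* retry until no other thread modified x between our read and write */
    while (!atomic_compare_exchange_weak(&x, &old, old + 1))
        ;   /* on failure, old now holds the current value of x */
}

Unlike the mutex version, no thread ever blocks here; a thread that loses the race just recomputes and retries.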
Lock-Free and Wait-Free Algorithms • Wait-Free Algorithm • An algorithm is defined to be ‘wait-free’ if it guarantees that for any number of threads, all of them will make progress in a finite number of steps • Deadlock-free, Livelock-free, Starvation-free • Lock-Free Algorithm • An algorithm is defined to be ‘lock-free’ if it guarantees that for any number of threads, at least one will make progress in a finite number of steps • Deadlock-free, Livelock-free • All wait-free algorithms are also lock-free, though not vice versa • Note that neither definition actually forbids the use of locks, thus a lock-free algorithm could be implemented with locks
Transactional Algorithms • Inspired by database systems • Gather data from memory locations (optional) • Make local changes to the locations • Commit changes to the actual locations as an atomic step • If commit fails (another transaction occurred), start again • Essentially a generalization of CAS, except that no prior knowledge of the data is needed (for CAS we needed an ‘expected’ value)
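As a rough sketch of the pattern (not the WSTM/OSTM API from the references), two locations can be updated as one atomic step by building a new state locally and committing it with a single CAS on a pointer; all names here are mine:

#include <stdatomic.h>
#include <stdlib.h>

typedef struct { int a, b; } State;          /* two locations updated together */

_Atomic(State *) current;                    /* latest committed state; assumed
                                                initialized at startup          */

void transfer(int amount) {
    for (;;) {
        State *old = atomic_load(&current);  /* 1. gather data                 */
        State *new_state = malloc(sizeof *new_state);
        new_state->a = old->a - amount;      /* 2. make local changes          */
        new_state->b = old->b + amount;
        /* 3. commit as one atomic step; fails if another commit intervened */
        if (atomic_compare_exchange_weak(&current, &old, new_state))
            return;                          /* committed                      */
        free(new_state);                     /* 4. commit failed: start again  */
    }
}

/* Reclaiming the replaced State safely requires a memory-reclamation
   scheme (hazard pointers, epochs, ...), omitted from this sketch. */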
Outline • Basics of Concurrency • Synchronization Techniques • Concurrent Data Structures • Safety and Liveness Properties • Differing Semantics • Threading A.P.I.’s
Concurrent Data Structures • In sequential programming, data structures are invaluable as programming abstractions as they: • Provide abstraction of the inner workings via interfaces • Provide a set of properties and guarantees as to what happens when certain operations are performed • Increase modularity of code • In concurrent programming they provide similar benefits, and in addition: • Allow threads to communicate in a simple and maintainable manner • Can be used as a focal point for the work done by multiple threads
Safety and Liveness Properties • Safety • Assures that ‘nothing bad will happen’; for example, two calls to the ‘push’ function of a stack should result in two elements being added to the stack • Liveness • Assures that progress continues • Deadlock • Livelock • Starvation • All bad!
Differing Semantics • Structures must share properties and guarantees with the sequential versions they mimic, thus their operations must be deterministic (with a few exceptions) • Semantics of use and implementation differ greatly purely due to the concurrent environment • Example: if two threads concurrently push distinct elements onto a stack, the result obtained from popping the stack is non-deterministic, even though the implementations of the interfaces themselves are deterministic
Differing Semantics • So how can we write the program in such a way that it is well-behaved for our purposes? • De facto standard: use a lock • Parallelism suffers, as other threads may not operate at all during the entire given section of code • Introduces liveness problems
Constructing Concurrent Data Structures • A concurrent data structure must abide by its sequential counterpart’s properties and guarantees when operations are performed on it • It must be ‘thread-safe’: no matter how many parallel calls are made to it, the data structure will never be corrupted • It should be free from any liveness issues such as Deadlock • Just as sequential ones are constructed for abstraction, concurrent data structures should be opaque in their implementation
Constructing Concurrent Data Structures • Consider the sequential version of this data structure, a simple stack (sketched below) • Not suitable as-is for concurrent programming • Lacks any safety properties, though it has no liveness issues • How can we resolve the issue? • Lock it
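The slide’s code listing is not reproduced here; a minimal sequential linked-list stack in C along these lines might look like the following (all names are mine):

#include <stdlib.h>

typedef struct Node {
    void        *value;
    struct Node *next;
} Node;

typedef struct { Node *top; } Stack;

void push(Stack *s, void *value) {
    Node *n = malloc(sizeof *n);
    n->value = value;
    n->next  = s->top;    /* sequential assumption: top cannot change... */
    s->top   = n;         /* ...between these two statements             */
}

void *pop(Stack *s) {
    Node *n = s->top;     /* same assumption on the way out */
    if (n == NULL) return NULL;
    s->top = n->next;
    void *v = n->value;
    free(n);
    return v;
}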
Constructing Concurrent Data Structures • With a lock around each operation (sketched below), safety is no longer a concern, though liveness now is • Deadlock is possible should a thread die while holding the lock • Starvation in case of an interrupt • Lock overhead will overwhelm applications with many pushes/pops • Look back to the original implementation; what sequential assumptions were made? (push)
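A sketch of the locked variant, wrapping each operation of the sequential stack above in one coarse pthreads mutex (stack_lock is an illustrative name):

#include <pthread.h>

pthread_mutex_t stack_lock = PTHREAD_MUTEX_INITIALIZER;

void locked_push(Stack *s, void *value) {
    pthread_mutex_lock(&stack_lock);
    push(s, value);                  /* the sequential push, now exclusive */
    pthread_mutex_unlock(&stack_lock);
}

void *locked_pop(Stack *s) {
    pthread_mutex_lock(&stack_lock);
    void *v = pop(s);
    pthread_mutex_unlock(&stack_lock);
    return v;
}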
Constructing Concurrent Data Structures • Replacing the final assignment in push with a single CAS is correct, but an original property is lost: pushing onto the stack does not always place the element on the stack • Easy solution: keep trying (see the sketch below)
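A sketch of the retrying push, the classic Treiber-stack idiom, written here with C11 atomics over the Node type from the sequential sketch (LockFreeStack is my name for it):

#include <stdatomic.h>
#include <stdlib.h>

typedef struct { _Atomic(Node *) top; } LockFreeStack;

void lf_push(LockFreeStack *s, void *value) {
    Node *n = malloc(sizeof *n);
    n->value = value;
    for (;;) {                               /* keep trying            */
        Node *old = atomic_load(&s->top);
        n->next = old;                       /* link to the observed top */
        /* commit only if top is still the node we read; otherwise retry */
        if (atomic_compare_exchange_weak(&s->top, &old, n))
            return;
    }
}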
Constructing Concurrent Data Structures • Pop is implemented using the same retry logic (sketched below):
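A matching sketch of pop; note that reclaiming popped nodes safely needs extra machinery, and node reuse is precisely where the ABA problem mentioned earlier strikes:

void *lf_pop(LockFreeStack *s) {
    for (;;) {
        Node *old = atomic_load(&s->top);
        if (old == NULL) return NULL;        /* empty stack */
        Node *next = old->next;
        /* commit only if top is still the node we read */
        if (atomic_compare_exchange_weak(&s->top, &old, next)) {
            void *v = old->value;
            /* freeing old here is unsafe without hazard pointers or a
               similar reclamation scheme; omitted in this sketch */
            return v;
        }
    }
}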
Outline • Basics of Concurrency • Synchronization Techniques • Concurrent Data Structures • Threading A.P.I.’s • pthreads • M.C.A.S., W.S.T.M., O.S.T.M.
Threading API’s • pthreads • C library for multithreading; contains utilities such as mutexes, semaphores, and others • Available on *nix platforms, though subset ports exist for Windows • MCAS • A C API that provides a software-built MCAS (Multiple-Compare-And-Swap) operation • Very powerful, though with larger overhead than CAS • WSTM • Word-Based Software Transactional Memory • API for easy use of the transactional model • Mixes normal objects with WSTM datatypes • Easy to implement on existing systems • OSTM • Object-Based Software Transactional Memory • Similar to WSTM, except more streamlined in its implementation due to operating exclusively on its own data types • More difficult to implement on existing systems
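A minimal pthreads usage sketch (function names and messages are illustrative): create two threads, then wait for both to finish:

#include <pthread.h>
#include <stdio.h>

void *work(void *arg) {
    printf("hello from thread %ld\n", (long)arg);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, work, (void *)1L);
    pthread_create(&b, NULL, work, (void *)2L);
    pthread_join(a, NULL);      /* wait for each thread to finish */
    pthread_join(b, NULL);
    return 0;                   /* build with: cc demo.c -pthread */
}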
References • Concurrent Programming Without Locks • http://research.microsoft.com/en-us/um/people/tharris/papers/2007-tocs.pdf • MCAS, WSTM, and OSTM are implemented in this paper • The Art of Multiprocessor Programming • By Maurice Herlihy and Nir Shavit • http://books.google.com/books?id=pFSwuqtJgxYC&printsec=frontcover#v=onepage&q&f=false • DCAS is not a Silver Bullet for Nonblocking Algorithm Design • http://labs.oracle.com/scalable/pubs/SPAA04.pdf