Learn about compiler support for atomic sections using pessimistic concurrency to reduce contention and avoid deadlocks in concurrent programming. Discover different strategies and challenges in implementing fine-grain locks for better parallelism.
Inferring Locks for Atomic Sections
Sigmund Cherem, Cornell University (summer intern at Microsoft Research)
Trishul Chilimbi, Microsoft Research
Sumit Gulwani, Microsoft Research
What Is This Talk About?
• Multi-cores are widely available
• Developing concurrent software is not trivial
• Many challenges: parallelization, synchronization, isolation
• Manual locking is error-prone and non-compositional
• Recent proposal: atomic sections
• Raise the level of abstraction and are compositional
• Optimistic (transactional) implementations [Herlihy, Moss ISCA’93; Hammond et al. ISCA’04] [Shavit, Touitou PODC’95; Dice et al. DISC’06; Fraser, Harris TOPLAS’07]
• Limitations: non-reversible operations, overhead
• This talk: compiler support for atomic sections via pessimistic concurrency
Inferring Locks for Atomic Sections | Sigmund Cherem, Trishul Chilimbi, and Sumit Gulwani
Static Lock Inference Framework
• Compiler support for atomic sections based on pessimistic concurrency
• Prevent conflicts using locks, no deadlocks
• Goal: reduce contention while avoiding deadlocks
[Diagram: concurrent program with atomic sections (runs on STM) → Lock Inference Compiler → same program with locks for implementing atomic sections]
• Specifies “where”, but not “how”
• Lightweight runtime support (locking library)
• Automatically supports non-reversible operations
Moving List Elements
    move (list* to, list* from) {
      atomic {
        elem* x = to->head;
        elem* y = from->head;
        from->head = null;
        …
        while (x->next != null) {
          x = x->next;
        }
        x->next = y;
      }
    }
[Diagram: lists to and from, their head pointers, and the cursors x and y]
Attempt 1: Global Lock
    move (list* to, list* from) {
      acquire( GLOBAL );
      elem* x = to->head;
      elem* y = from->head;
      from->head = null;
      …
      while (x->next != null) {
        x = x->next;
      }
      x->next = y;
      release( GLOBAL );
    }
[Diagram: lists to and from, their head pointers, and the cursors x and y]
• Global lock protects entire memory
• Problem with Attempt 1: no parallelism with any other atomic section
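A minimal Python sketch of Attempt 1 may make the trade-off concrete. The names `Elem`, `List`, `to_values`, and the module-level `GLOBAL` lock are illustrative assumptions, and the empty-list guard stands in for whatever the slide elides with "…".

```python
import threading

# Hypothetical single lock that guards all shared memory.
GLOBAL = threading.Lock()


class Elem:
    def __init__(self, value):
        self.value = value
        self.next = None


class List:
    def __init__(self, *values):
        self.head = None
        for v in reversed(values):
            e = Elem(v)
            e.next = self.head
            self.head = e


def move(to, frm):
    """Append every element of frm to the end of to, under the global lock."""
    with GLOBAL:  # acquire(GLOBAL) ... release(GLOBAL)
        y = frm.head
        frm.head = None
        if to.head is None:  # assumed guard; the slide elides this part
            to.head = y
            return
        x = to.head
        while x.next is not None:
            x = x.next
        x.next = y


def to_values(lst):
    """Collect the list contents for inspection."""
    out, e = [], lst.head
    while e is not None:
        out.append(e.value)
        e = e.next
    return out
```

Every call to `move` serializes on `GLOBAL`, even when two calls touch four unrelated lists, which is the lack of parallelism the slide points out.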
Attempt 2: Fine-Grain Locks
• A fine-grain lock protects an individual memory address
    move (list* to, list* from) {
      acquire( &(to->head) );
      elem* x = to->head;
      acquire( &(from->head) );
      elem* y = from->head;
      from->head = null;
      …
      acquire( &(x->next) );
      while (x->next != null) {
        x = x->next;
        acquire( &(x->next) );
      }
      x->next = y;
      releaseAll();
    }
[Diagram: lists to and from, their head pointers, and the cursors x and y]
Attempt 2: Fine-Grain Locks
    move (list* to, list* from) {
      acquire( &(to->head) );
      elem* x = to->head;
      acquire( &(from->head) );
      elem* y = from->head;
      from->head = null;
      …
      acquire( &(x->next) );
      while (x->next != null) {
        x = x->next;
        acquire( &(x->next) );
      }
      x->next = y;
      releaseAll();
    }
[Diagram: lists to and from, their head pointers, and the cursors x and y]
• Problem with Attempt 2: may lead to deadlock
    move(a, b)              ||   move(b, a)
    acq( &(a->head) );           acq( &(b->head) );
    // deadlock here: each thread now waits for the lock the other holds
    acq( &(b->head) );           acq( &(a->head) );
Attempt 3: Fine-Grain Locks at Entry
    move (list* to, list* from) {
      acquireAll({ });
      acquire( &(to->head) );
      elem* x = to->head;
      acquire( &(from->head) );
      elem* y = from->head;
      from->head = null;
      …
      acquire( &(x->next) );
      while (x->next != null) {
        x = x->next;
        acquire( &(x->next) );
      }
      x->next = y;
      releaseAll();
    }
[Diagram: lists to and from, their head pointers, and the cursors x and y]
• Challenge #1: protect locations ahead of time (at entry of atomic), i.e., find which addresses will be used inside the atomic section
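The slides do not say how `acquireAll` itself avoids the deadlock from Attempt 2. One common approach, assumed here, is to take all requested locks in a single global order (sorted addresses), so that move(a, b) and move(b, a) request the same locks in the same sequence. `LockTable` and `run_demo` are hypothetical names, and plain integers stand in for addresses such as &(a->head).

```python
import threading


class LockTable:
    """Hypothetical locking library: one lock per protected address."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()   # protects the _locks dictionary
        self._held = threading.local()   # locks held by the calling thread

    def _lock_for(self, addr):
        with self._guard:
            return self._locks.setdefault(addr, threading.Lock())

    def acquire_all(self, addrs):
        """Acquire every lock in sorted address order; return that order."""
        order = sorted(set(addrs))
        locks = [self._lock_for(a) for a in order]
        for lock in locks:
            lock.acquire()
        self._held.locks = locks
        return order

    def release_all(self):
        """Release what the calling thread took in its last acquire_all."""
        for lock in reversed(getattr(self._held, "locks", [])):
            lock.release()
        self._held.locks = []


def run_demo(rounds=1000):
    """Two threads request the same two locks in opposite argument order."""
    table = LockTable()
    counter = {"n": 0}
    a_head, b_head = 100, 200  # stand-ins for &(a->head) and &(b->head)

    def worker(first, second):
        for _ in range(rounds):
            table.acquire_all([first, second])
            counter["n"] += 1  # protected by both locks
            table.release_all()

    t1 = threading.Thread(target=worker, args=(a_head, b_head))
    t2 = threading.Thread(target=worker, args=(b_head, a_head))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return counter["n"]
```

This sketch does not support nested `acquire_all` calls on one thread, and it only works because the full lock set is known before the first lock is taken, which is exactly what Challenge #1 asks the compiler to provide.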
Protect when Entering Atomic Block
• Find corresponding expressions
• Acquire a lock for each shared location accessed within the atomic section, expressed in terms of expressions valid at the entry of the atomic block
    atomic {
      list* x = y[5];
      list* d = x;
      d->head = NULL;
    }
• Working backward from the access d->head = NULL, the lock expression is rewritten across each assignment:
    acquire( &(d->head) )       just before d->head = NULL
    acquire( &(x->head) )       just before list* d = x
    acquire( &(y[5]->head) )    at entry of the atomic block
• Contribution #1: identifying appropriate fine-grain locks at entry (via inter-procedural backward data-flow analysis)
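The rewriting on this slide can be sketched as a backward substitution over straight-line assignments. This is a deliberately small model, not the inter-procedural data-flow analysis the talk describes: it assumes only local-variable assignments, with no branches, calls, or stores through pointers. The tuple encoding and the names `render`, `push_back`, and `lock_at_entry` are illustrative assumptions.

```python
def render(lock):
    """Format a lock expression (root, fields) as &(root->f1->f2...)."""
    root, path = lock
    return "&(" + "->".join((root,) + tuple(path)) + ")"


def push_back(lock, assignment):
    """Rewrite a lock expression so it is valid before `lhs = rhs` runs."""
    lhs, (rhs_root, rhs_path) = assignment
    root, path = lock
    if root == lhs:
        # The variable is defined here, so name the location via the rhs.
        return (rhs_root, tuple(rhs_path) + tuple(path))
    return lock


def lock_at_entry(lock, assignments_before_access):
    """Walk the assignments backward and record each intermediate form."""
    trace = [render(lock)]
    for assignment in reversed(assignments_before_access):
        lock = push_back(lock, assignment)
        trace.append(render(lock))
    return trace


# The slide's example: x = y[5]; d = x; d->head = NULL;
assignments = [("x", ("y[5]", ())), ("d", ("x", ()))]
trace = lock_at_entry(("d", ("head",)), assignments)
```

Here `trace` holds the three forms from the slide, ending with the expression that is valid at entry; the last element is the one a compiler would place in the entry lock set.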
Attempt 3: Fine-Grain at Entry
    move (list* to, list* from) {
      acquireAll({ });
      elem* x = to->head;
      elem* y = from->head;
      from->head = null;
      …
      while (x->next != null) {
        x = x->next;
      }
      x->next = y;
      releaseAll();
    }
[Diagram: lists to and from with the locations to lock at entry marked: &(to->head), &(from->head), &(to->head->next)]
• Problem with Attempt 3: can’t protect an unbounded number of locations
Attempt 4: Multi-Grain Locks at Entry
• A coarse-grain lock protects a set of memory locations
    move (list* to, list* from) {
      acquireAll({ });
      elem* x = to->head;
      elem* y = from->head;
      from->head = null;
      …
      while (x->next != null) {
        x = x->next;
      }
      x->next = y;
      releaseAll();
    }
[Diagram: lists to and from with their head pointers]
• Challenge #2: mixing locks of multiple granularities while avoiding deadlocks
Defining Multi-Grain Locks
• A fine-grain lock protects a single location
• A coarse-grain lock protects a set of locations
• Any traditional heap abstraction can be used to define coarse-grain locks
• E.g. types, points-to sets, shape abstractions
• Our compiler framework is parameterized
• Clients can specify the kind of locks they want to use
Mixing Locks of Multiple Granularities
• Borrow the databases’ locking protocol based on intention locks [Gray ’76]
[Diagram: lock hierarchy over the memory locations, top to bottom: global lock, coarse-grain locks, fine-grain locks; annotation: “Can’t be held concurrently”]
• Contribution #2: we allow mixing locks of multiple granularities and avoid deadlocks
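For readers unfamiliar with intention locks, the sketch below encodes the standard compatibility matrix from the database literature (modes IS, IX, S, SIX, X) and the rule that a lock on a node is preceded by intention locks on its ancestors. Which modes the framework actually uses is not stated on the slide, and the node names and the helpers `compatible`, `plan`, and `can_grant` are illustrative assumptions.

```python
# COMPATIBLE[held] is the set of modes another thread may still be granted
# on the same node (the classic multi-granularity locking matrix).
COMPATIBLE = {
    "IS": {"IS", "IX", "S", "SIX"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "SIX": {"IS"},
    "X": set(),
}


def compatible(held, requested):
    return requested in COMPATIBLE[held]


def plan(path, mode):
    """Requests for locking path[-1] in mode 'S' or 'X', root first.

    Ancestors get the matching intention mode (IS for S, IX for X).
    """
    if mode not in ("S", "X"):
        raise ValueError("this sketch only plans S or X requests")
    intention = "IS" if mode == "S" else "IX"
    return [(node, intention) for node in path[:-1]] + [(path[-1], mode)]


def can_grant(held_by_others, requests):
    """True if every request is compatible with what other threads hold."""
    return all(
        compatible(held, mode)
        for node, mode in requests
        for held in held_by_others.get(node, [])
    )


# Thread A write-locks one fine-grain location under a coarse lock.
a_requests = plan(["global", "pts(to)", "&(to->head)"], "X")
held = {}
for node, mode in a_requests:
    held.setdefault(node, []).append(mode)

# Thread B wants a different fine-grain location under the same coarse lock.
b_fine = plan(["global", "pts(to)", "&(to->head->next)"], "X")
# Thread C wants the whole coarse-grain lock exclusively.
c_coarse = plan(["global", "pts(to)"], "X")
```

With A's locks in `held`, `can_grant(held, b_fine)` is true because IX is compatible with IX on the shared ancestors, while `can_grant(held, c_coarse)` is false: a coarse lock and a fine lock beneath it can't be held concurrently in conflicting modes.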
Soundness Results
• Sound locking structure provided
• Whatever is protected by a child is also protected by its parent
• Map of expressions to locks
• Bounded (for termination)
• Soundness Theorem
• Compiler chooses a set of locks protecting all memory accesses within the atomic block
[Diagram: example lock structure containing *, …, &(*->next), &(to->head->next), &(to->head->next->next)]
• Contribution #3: framework is sound (for any sound lock structure instantiation)
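One way to read the "bounded" requirement is as a limit on expression length: access paths up to some depth k keep their own fine-grain lock, and anything longer is mapped to a coarse lock higher in the structure. The sketch below assumes that reading, treats `&(*->f)` as a coarse lock covering field f of every object, and uses `*` as the global lock. The functions and the parameter `k` are hypothetical, not the paper's definitions.

```python
GLOBAL = "*"


def fine(root, path):
    """Fine-grain lock for one access path, e.g. &(to->head->next)."""
    return "&(" + "->".join((root,) + tuple(path)) + ")"


def coarse(field):
    """Assumed coarse lock covering `field` of every object."""
    return "&(*->" + field + ")"


def lock_for(root, path, k):
    """Map an access path to a lock, bounding expression length by k."""
    if len(path) <= k:
        return fine(root, path)
    return coarse(path[-1])


def parent(lock):
    """Parent in the assumed structure: fine -> coarse -> global."""
    if lock == GLOBAL:
        return None
    if lock.startswith("&(*->"):
        return GLOBAL
    last_field = lock[2:-1].split("->")[-1]
    return coarse(last_field)


def ancestors(lock):
    out = []
    p = parent(lock)
    while p is not None:
        out.append(p)
        p = parent(p)
    return out


def locks_for_walk(root, first, field, steps, k):
    """Locks for root->first->field->field->... followed `steps` times."""
    path = list(first)
    locks = set()
    for _ in range(steps):
        path = path + [field]
        locks.add(lock_for(root, path, k))
    return locks
```

For the loop in `move`, `locks_for_walk("to", ["head"], "next", n, 3)` yields the same three locks whether n is 10 or 1000, so the entry lock set stays finite even though the list length is unknown at compile time.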
Experimental Evaluation
• Lock structure instance: 3-level locks + effects
• Experiments
• Concurrent data structures: rb-tree, hashtable
• Concurrent get (read-only), put, and remove operations
• 1.86 GHz Intel Xeon dual quad-core machine
[Diagram: 3-level lock structure, top to bottom: global lock; points-to set locks [Steensgaard ’96]; expression locks (limited in size). Locks are labeled with ro or rw effects]
Scalability Results
[Plot: execution time (sec, 0 to 70) versus number of threads (1 to 8) for four configurations: global lock; TL2 STM [Dice et al. DISC’06]; only coarse-grain locks; coarse + fine-grain locks]
TH (rb-tree + hash w/rehash): 80% gets
[Plot: execution time (sec, 0 to 70) versus number of threads (1 to 8) for four configurations: global lock; TL2 STM [Dice et al. DISC’06]; only coarse-grain locks; coarse + fine-grain locks]
• Global lock (exclusive) doesn’t scale
• Scalability comparable to STM
• Compiler didn’t use fine-grain locks
TH (rb-tree + hash w/rehash): 80% puts
[Plot: execution time (sec, 0 to 70) versus number of threads (1 to 8) for four configurations: global lock; TL2 STM [Dice et al. DISC’06]; only coarse-grain locks; coarse + fine-grain locks]
• High contention from re-hashing degrades STM performance
• 2 coarse-grain (exclusive) locks are better than a single global lock
simple-hashtable: 80% gets
[Plot: execution time (sec, 0 to 45) versus number of threads (1 to 8) for four configurations: global lock; TL2 STM [Dice et al. DISC’06]; only coarse-grain locks; coarse + fine-grain locks]
• Compiler didn’t use fine-grain locks for gets
• STM allows put and get concurrently
simple-hashtable: 80% puts
[Plot: execution time (sec, 0 to 45) versus number of threads (1 to 8) for four configurations: global lock; TL2 STM [Dice et al. DISC’06]; only coarse-grain locks; coarse + fine-grain locks]
• Compiler uses fine-grain locks for puts
Differences with Recent Work
• No programmer annotations (other than atomic)
• Autolocker [McCloskey et al. POPL’06] requires programmer annotations to choose appropriate granularity
• Moving fine-grain lock acquisitions to entry of atomic
• Acquiring fine-grain locks right before first use [Hindman, Grossman MSPC’06] is not fully pessimistic
• May generate deadlocks and need rollbacks
• Multi-grain locks without deadlocks
• Several pessimistic approaches use coarse-grain locks only [Hicks et al. ’06; Halpert et al. ’07; Emmi et al. ’07]
Conclusions and Future Work
• Lock inference framework for atomic sections
• Multi-grain locks to reduce contention and avoid deadlocks
• Soundness: accesses are protected, atomicity preserved
• Validation: resulting performance depends on application
• Locks preferable for non-reversible operations or high contention
• Future directions
• Better locking hierarchy instantiations (e.g. ownership)
• Optimizations (e.g. delay lock acquisitions)
• Hybrid systems (e.g. compiler support to optimize STMs)