Reactive Spin-locks: A Self-tuning Approach. Phuong Hoai Ha, Marina Papatriantafilou, Philippas Tsigas. I-SPAN ’05, Las Vegas, Dec. 7th – 9th, 2005. Outline: Mutual exclusion; Overhead; Available reactive spin-locks; New reactive spin-lock; Model; Algorithm; Evaluation; Conclusions.
Reactive Spin-locks: A Self-tuning Approach Phuong Hoai Ha Marina Papatriantafilou Philippas Tsigas I-SPAN ’05, Las Vegas, Dec. 7th – 9th, 2005
Outline
• Mutual exclusion
• Overhead
• Available reactive spin-locks
• New reactive spin-lock
• Model
• Algorithm
• Evaluation
• Conclusions
I-SPAN '05
Mutual exclusion
[Figure: process structure, labelled: noncritical section, entry section, critical section, exit section]
• Performance goals:
  • Low latency
  • Low contention
  • …
[Figure: lock hand-over, labelled: requests issued, lock released, arbitration, lock sent to winner]
Spin-lock categories
• Arbitrating locks:
  • Determine in advance who the next lock-holder is, e.g. ticket locks, queue locks.
  • Advantages:
    • Prevent processors from causing bursts of network traffic and high contention on the lock.
• Non-arbitrating locks:
  • E.g. test-and-set locks.
  • Advantages:
    • Exploit locality/cache.
    • Tolerate failures in the entry section.
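The two categories can be contrasted with a small single-threaded model. This is only a sketch: the class and method names are hypothetical, and the plain Python updates stand in for the atomic fetch-and-increment and test-and-set instructions a real lock would need.

```python
class TicketLock:
    """Arbitrating: tickets fix the order of lock-holders in advance."""

    def __init__(self):
        self.next_ticket = 0   # next ticket to hand out
        self.now_serving = 0   # ticket currently allowed to enter

    def take_ticket(self):
        ticket = self.next_ticket
        self.next_ticket += 1  # models an atomic fetch-and-increment
        return ticket

    def may_enter(self, ticket):
        return ticket == self.now_serving

    def release(self):
        self.now_serving += 1  # hand the lock to the next ticket holder


class TASLock:
    """Non-arbitrating: the first test-and-set after a release wins."""

    def __init__(self):
        self.held = False

    def try_acquire(self):
        was_held = self.held
        self.held = True       # models an atomic test-and-set
        return not was_held

    def release(self):
        self.held = False


# Ticket lock: three requesters enter strictly in ticket order.
ticket_lock = TicketLock()
tickets = [ticket_lock.take_ticket() for _ in range(3)]
entry_order = []
while len(entry_order) < len(tickets):
    for t in tickets:
        if t not in entry_order and ticket_lock.may_enter(t):
            entry_order.append(t)
            ticket_lock.release()

# Test-and-set lock: no queue, so the releaser itself may win again.
tas_lock = TASLock()
first = tas_lock.try_acquire()    # lock was free
second = tas_lock.try_acquire()   # lock is held
tas_lock.release()
again = tas_lock.try_acquire()    # lock was free again
```

In the model, the ticket lock admits requesters in the order the tickets were drawn, while the test-and-set lock lets the same processor reacquire immediately, which is the locality advantage listed above.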
Arbitrating vs. non-arbitrating locks
[Figure: two diagrams of processors attached to an interconnection network; node labels 1 to 6]
Available reactive spin-lock algorithms
• Drawbacks: their reactive schemes rely on
  • Fixed experimental thresholds
    • The thresholds frequently become inappropriate in variable and unpredictable environments such as multiprogramming systems.
    • E.g. ticket locks with proportional backoff, test-and-test-and-set locks with exponential backoff.
  • Known probability distributions of some inputs
    • This assumption is usually not feasible.
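The two backoff rules named above can be sketched as plain delay functions. This is an illustration, not code from the talk: the function names, the parameters `base_delay` and `max_delay`, and the abstract time units are assumptions. The constants passed in are exactly the kind of hand-tuned thresholds the slide criticizes.

```python
def exponential_backoff(failed_attempts, base_delay, max_delay):
    """Test-and-test-and-set style: delay doubles per failed attempt, capped."""
    return min(base_delay * (2 ** failed_attempts), max_delay)


def proportional_backoff(my_ticket, now_serving, base_delay):
    """Ticket-lock style: delay proportional to the number of waiters ahead."""
    return base_delay * (my_ticket - now_serving)


# base_delay and max_delay must be tuned by experiment for each machine and load.
exp_delays = [exponential_backoff(k, base_delay=1, max_delay=32) for k in range(8)]
prop_delay = proportional_backoff(my_ticket=7, now_serving=3, base_delay=10)
```

A `base_delay` that works well at one load level can be too long or too short at another, which is why fixed values become inappropriate when the environment changes.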
New reactive spin-lock algorithm
• Idea
  • A non-arbitrating lock with an adaptive, sensible backoff delay.
• Advantages
  • Its reactive scheme is self-tuning.
    • It needs neither experimentally tuned thresholds nor probability distributions of inputs.
  • It combines the advantages of both arbitrating and non-arbitrating spin-locks.
    • It can exploit locality as well as reduce contention on the lock.
Find sensible backoff delay
• Need to optimize the trade-off between:
  • Latency
    • The interval between a lock release and the next lock acquisition.
  • Contention on the lock
• This is an online problem.
[Figure: backoff delay (delay = ?) against the load on the lock]
Reactive scheme
• Bounds for loads on the lock: $1 \le l_t \le P$
• During a load-rising phase:
  • Increase the delay only when the load on the lock is the highest so far.
  • When increasing the delay, increase it just enough to keep the competitive ratio $c = P - \frac{P-1}{P^{1/(P-1)}}$.
• Similar for the load-dropping phase.
• In each load-rising/load-dropping phase, the reactive scheme is competitive with competitive ratio $c = \Theta(\ln P)$.
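The logarithmic growth of the competitive ratio can be checked numerically. The sketch below evaluates c = P - (P-1)/P^(1/(P-1)) from the slide and prints it next to 1 + ln(P); the function name `competitive_ratio` and the sample values of P are chosen here for illustration.

```python
import math


def competitive_ratio(P):
    """c = P - (P - 1) / P**(1 / (P - 1)), defined for P >= 2."""
    return P - (P - 1) / P ** (1.0 / (P - 1))


for P in (2, 4, 28, 1024):
    c = competitive_ratio(P)
    print(f"P={P:5d}  c={c:.3f}  1+ln(P)={1 + math.log(P):.3f}")
```

For every P tried, c stays between ln(P) and 1 + ln(P), which is what a ratio of order ln(P) predicts.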
Algorithm
• The algorithm guarantees mutual exclusion and non-livelock. Its space complexity is log(P).
[Figure: processors attached to an interconnection network, annotated with numbers 0 to 4]
Evaluation
• Benchmarks
  • Spark98 kernel: lmv
  • SPLASH-2 suite: Volrend and Radiosity
• Representatives:
  • Arbitrating: ticket lock with (tuned) proportional backoff
  • Non-arbitrating: test-and-test-and-set lock with (tuned) exponential backoff
• System
  • A ccNUMA SGI Origin2000 with 28 MIPS R10000 processors running at 250 MHz.
Experimental results
Experimental results (2)
Experimental results (3)
Conclusions
• We have designed and implemented a new reactive spin-lock:
  • It is self-tuning.
  • It combines the advantages of both arbitrating and non-arbitrating locks.
  • Its reactive scheme is competitive with $c = \Theta(\ln P)$.
• The lock automatically adjusts its backoff delay to the load on the lock and to the application.
Estimate delay bases
• Fairness
  • A fair lock helps a parallel application gain performance, since the application's threads can execute their non-critical sections in parallel.
  • Definition: [equation], where $n_i$ is the number of lock acquisitions of a processor in $t$ and $N$ is the number of processors.
• Heuristic to estimate $base_l$: [equation], where a, b are system documented constants and DoCS is the delay outside the critical section.
NUMA
• NUMA is another parameter that makes the problem harder.
  • Memory access latency differs widely depending on where the data resides.
  • E.g. the ccNUMA SGI Origin2000.
Model: An online problem
• A sequence of loads on the lock unfolds on the fly.
• When observing a load, the algorithm must decide how much to lengthen its current backoff delay.
  • If it increases the delay too soon, it wastes time on a long delay when the lock becomes available.
  • If it does not increase the delay in time, it causes high contention on the lock.
  • Hence it must increase the delay reasonably at high loads.
• Goal: maximize $\sum_t delay_t \cdot load_t$, where $\sum_t delay_t \le P$.
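The online setting can be illustrated with a toy comparison between an online rule and an offline optimum that knows the whole load sequence. This is a sketch under assumptions: the total delay is treated as a budget that may not be exceeded, all names are hypothetical, and the fixed-fraction rule in `online_delays` is a stand-in chosen for simplicity, not the reactive scheme from the talk.

```python
def objective(delays, loads):
    """Value of a schedule: sum over t of delay_t * load_t."""
    return sum(d * l for d, l in zip(delays, loads))


def offline_optimum(loads, budget):
    """Knowing the whole sequence, spend the entire budget at the highest load."""
    return budget * max(loads)


def online_delays(loads, budget, fraction=0.5):
    """Stand-in online rule: at each new highest load, spend a fixed
    fraction of the remaining budget; otherwise spend nothing."""
    remaining = budget
    highest = 0
    delays = []
    for load in loads:
        spend = 0.0
        if load > highest:
            highest = load
            spend = remaining * fraction
            remaining -= spend
        delays.append(spend)
    return delays


loads = [1, 3, 2, 5]
delays = online_delays(loads, budget=4)
ratio = offline_optimum(loads, budget=4) / objective(delays, loads)
```

The ratio between the offline and the online value is the quantity a competitive analysis bounds. The stand-in rule spends too much at early, low loads, which is the "too soon" failure mode described in the slide.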
Algorithm
LockType: <lock, counter>
Initial delay = L.counter x base_l

Acquire(Lock pL)
    L = FAA(pL.L, <1,1>)
    if L.lock then
        delay = ComputeDelay(L)
        cond = <1,0>
        do
            sleep(delay)
            L = pL.L
            if L.lock then
                delay = ComputeDelay(L)
                continue
            cond = FAA(pL.L, <1,0>)
        while cond.lock

Release(Lock pL)
    do
        L = pL.L
    while not CAS(pL.L, L, <0, L.counter-1>)
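The Acquire/Release pseudocode can be exercised with a small threaded emulation. This is a sketch, not the authors' implementation: `AtomicWord` emulates the hardware FAA and CAS on the <lock, counter> word with an ordinary mutex, `BASE_DELAY` is a hypothetical constant, and `compute_delay` assumes delay = counter x base because the slides do not define ComputeDelay.

```python
import threading
import time
from collections import namedtuple

Word = namedtuple("Word", ["lock", "counter"])
BASE_DELAY = 1e-5  # hypothetical delay base, in seconds


class AtomicWord:
    """Emulates the <lock, counter> word; a mutex stands in for hardware atomicity."""

    def __init__(self):
        self._value = Word(0, 0)
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def fetch_and_add(self, d_lock, d_counter):
        with self._guard:
            old = self._value
            self._value = Word(old.lock + d_lock, old.counter + d_counter)
            return old

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


def compute_delay(word):
    # Assumption: delay grows with the number of processors using the lock.
    return word.counter * BASE_DELAY


def acquire(p):
    old = p.fetch_and_add(1, 1)          # register and try to take the lock
    if old.lock == 0:
        return                           # lock was free: acquired
    delay = compute_delay(old)
    while True:
        time.sleep(delay)
        seen = p.load()
        if seen.lock != 0:               # still held: back off again
            delay = compute_delay(seen)
            continue
        if p.fetch_and_add(1, 0).lock == 0:
            return                       # won the race for the free lock


def release(p):
    while True:
        seen = p.load()
        if p.compare_and_swap(seen, Word(0, seen.counter - 1)):
            return


def run_demo(threads=4, rounds=50):
    word, total = AtomicWord(), [0]

    def worker():
        for _ in range(rounds):
            acquire(word)
            total[0] += 1                # critical section
            release(word)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return total[0], word.load()
```

Calling `run_demo()` runs four threads through the lock; if mutual exclusion holds, the shared total equals threads x rounds and the word returns to <0, 0> once every thread has released.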