Wait-Free Linked-Lists. Shahar Timnat, Anastasia Braginsky, Alex Kogan, Erez Petrank (Technion, Israel). Presented by Shahar Timnat. [Figure: sorted list -∞, 4, 6, 9, +∞]
Our Contribution • A fast, wait-free linked-list • The first wait-free list fast enough to be used in practice
Agenda • What is a wait-free linked-list? • Related work and existing tools • Wait-Free Linked-List design • Performance
Concurrent Data Structures • Allow several threads to read or modify the data structure simultaneously • Increasing demand due to highly parallel systems
Progress Guarantees • Obstruction-Free – a thread running exclusively makes progress • Lock-Free – at least one of the running threads makes progress • Wait-Free – every thread that gets the CPU makes progress
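To make the lock-free definition concrete, here is a minimal Java sketch of a CAS retry loop. The class `LockFreeCounter` is a hypothetical illustration and not code from the talk. A failed CAS means some other thread's CAS succeeded, so the system as a whole makes progress; a single thread, however, has no bound on how often it retries, which is exactly what wait-freedom rules out.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not code from the talk.
class LockFreeCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Lock-free: a failed CAS implies another thread's CAS succeeded.
    // Not wait-free: this thread has no bound on its number of retries.
    int increment() {
        while (true) {
            int current = value.get();
            if (value.compareAndSet(current, current + 1)) {
                return current + 1;
            }
        }
    }

    int get() {
        return value.get();
    }
}
```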
Wait Free Algorithms • Provides the strongest progress guarantee • Always desirable, particularly in real-time systems. • Relatively rare • Hard to design • Typically slower
The Linked List Interface • Following the traditional choice: a sorted list-based set of integers • insert(int x); delete(int x); contains(int x) [Figure: sorted list -∞, 4, 6, 9, +∞]
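The interface can be written down as a small Java sketch. The names `IntSet` and `SequentialSortedList` are assumptions made for illustration, as are the boolean return values (the later slides do speak of a delete returning true or false). The implementation is single-threaded on purpose: it only pins down the intended set semantics with the -∞ and +∞ sentinels, and it assumes keys lie strictly between `Integer.MIN_VALUE` and `Integer.MAX_VALUE`.

```java
// Hypothetical names; the slide only lists the three operations.
interface IntSet {
    boolean insert(int x);   // true if x was added, false if already present
    boolean delete(int x);   // true if x was removed, false if absent
    boolean contains(int x); // true if x is in the set
}

// Single-threaded reference semantics only; not a concurrent list.
class SequentialSortedList implements IntSet {
    private static final class Node {
        final int key;
        Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    // Sentinels play the roles of -infinity and +infinity.
    private final Node head =
        new Node(Integer.MIN_VALUE, new Node(Integer.MAX_VALUE, null));

    // Returns the last node whose key is strictly smaller than x.
    private Node findPred(int x) {
        Node pred = head;
        while (pred.next.key < x) {
            pred = pred.next;
        }
        return pred;
    }

    public boolean insert(int x) {
        Node pred = findPred(x);
        if (pred.next.key == x) return false;
        pred.next = new Node(x, pred.next);
        return true;
    }

    public boolean delete(int x) {
        Node pred = findPred(x);
        if (pred.next.key != x) return false;
        pred.next = pred.next.next;
        return true;
    }

    public boolean contains(int x) {
        return findPred(x).next.key == x;
    }
}
```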
Prior Wait-Free Lists • Only universal constructions • Non-scalable (by nature?) • Achieve good complexity, but poor performance • The state-of-the-art construction (Chuong, Ellen, Ramachandran) significantly underperforms our construction
Linked-Lists with Progress Guarantees • No practical wait-free linked-lists available • Lock-free linked-lists exist • Most notably: Harris’s linked-list
Existing Lock-Free List (by Harris) • Deletion in two steps • Logical: mark the next field using a CAS • Physical: remove the node [Figure: two copies of the list 4, 6, 9]
Existing Lock-Free List (by Harris) • Use the least significant bit in each next field as a mark bit • The mark bit signals that a node is logically deleted • A node’s next field cannot be changed (the CAS will fail) if the node is logically deleted [Figure: two copies of the list 4, 6, 9]
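The two-step deletion can be sketched in Java as follows. Java cannot steal the low bit of a pointer, so this sketch substitutes `java.util.concurrent.atomic.AtomicMarkableReference`, which pairs a reference with a boolean mark and updates both in one CAS. The class and method names are hypothetical, and the sketch covers only the two deletion steps, not a full list.

```java
import java.util.concurrent.atomic.AtomicMarkableReference;

// Hypothetical names; a sketch of the mark-bit idea, not Harris's full list.
class MarkedNode {
    final int key;
    // The mark on this field means THIS node is logically deleted.
    final AtomicMarkableReference<MarkedNode> next;

    MarkedNode(int key, MarkedNode next) {
        this.key = key;
        this.next = new AtomicMarkableReference<>(next, false);
    }
}

class HarrisDeleteSketch {
    // Step 1 (logical): set the mark on victim.next, keeping the reference.
    // Returns false if the node was already marked by someone else.
    static boolean logicallyDelete(MarkedNode victim) {
        while (true) {
            boolean[] marked = {false};
            MarkedNode succ = victim.next.get(marked);
            if (marked[0]) return false;
            if (victim.next.compareAndSet(succ, succ, false, true)) return true;
        }
    }

    // Step 2 (physical): swing pred.next from victim to victim's successor.
    // The CAS expects pred to be unmarked, so it fails if pred was deleted.
    static boolean physicallyDelete(MarkedNode pred, MarkedNode victim) {
        MarkedNode succ = victim.next.getReference();
        return pred.next.compareAndSet(victim, succ, false, false);
    }
}
```

Because every CAS on a next field expects the mark to be false, any attempt to change the next field of a logically deleted node fails, which is the property the slide states.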
Help Mechanism • A common technique to achieve wait-freedom • Each thread declares in a designated state array the operation it desires • Many threads may attempt to execute it
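A state array of this kind might look like the Java sketch below. The descriptor layout (`OpDesc` with a status and a key) and all method names are assumptions; the slides only say that each thread declares its operation in a designated array and that other threads may execute it. The real work of an operation is left out: the sketch shows only the declare, scan, and report structure.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical descriptor layout; the slides only name a "state array".
final class OpDesc {
    enum Status { PENDING, SUCCESS, FAILURE }
    final Status status;
    final int key;

    OpDesc(Status status, int key) {
        this.status = status;
        this.key = key;
    }
}

class HelpArraySketch {
    private final AtomicReferenceArray<OpDesc> state;

    HelpArraySketch(int threads) {
        state = new AtomicReferenceArray<>(threads);
    }

    // A thread declares the operation it wants in its own slot.
    void announce(int tid, int key) {
        state.set(tid, new OpDesc(OpDesc.Status.PENDING, key));
    }

    // Any thread may call this: scan all slots and help pending operations.
    void helpAll() {
        for (int tid = 0; tid < state.length(); tid++) {
            OpDesc desc = state.get(tid);
            if (desc != null && desc.status == OpDesc.Status.PENDING) {
                helpOne(tid, desc);
            }
        }
    }

    // Placeholder for the real work; only the final report is shown.
    // The CAS replaces the exact pending descriptor, so when several
    // helpers race on the same operation only one report takes effect.
    private void helpOne(int tid, OpDesc pending) {
        OpDesc done = new OpDesc(OpDesc.Status.SUCCESS, pending.key);
        state.compareAndSet(tid, pending, done);
    }

    OpDesc.Status statusOf(int tid) {
        return state.get(tid).status;
    }
}
```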
Help Mechanism - Difficulties • Multiple threads should be able to work concurrently on the same operation • Many potential races • Difficult to design • Usually slower
Complication: Deletion Owning • T1, T2 both attempt delete(6) • T1, T2 both declare in the state array • T3 sees T1’s declaration and tries to help it, while T4 helps T2 [Figure: sorted list -∞, 4, 6, 9, +∞]
Complication: Deletion Owning • If both helpers T3, T4 “go to sleep” after the mark was done, which thread (T1 or T2) should return true and which false? [Figure: sorted list -∞, 4, 6, 9, +∞]
Solution: use a “success bit” • Each node holds an extra “success bit” (initially 0) • Potential owners compete to CAS it to 1 (no help in this part) • Note that the node is deleted before it is decided which thread owns its deletion
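The ownership race can be sketched with a single `AtomicBoolean` per node. The names `OwnedNode` and `tryClaimDeletion` are hypothetical; the sketch assumes the node has already been marked as deleted and shows only how the competing deleters decide who returns true.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical names; a sketch of the success-bit idea only.
class OwnedNode {
    final int key;
    // The "success bit": false (0) initially, set to true (1) exactly once.
    final AtomicBoolean successBit = new AtomicBoolean(false);

    OwnedNode(int key) {
        this.key = key;
    }

    // Called by each thread whose own delete(key) targeted this node,
    // after the node has been marked. Exactly one caller gets true;
    // that thread's delete returns true, all others return false.
    boolean tryClaimDeletion() {
        return successBit.compareAndSet(false, true);
    }
}
```

Since the bit changes from 0 to 1 only once, the outcome does not depend on which helper performed the marking.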
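Helping an Insert Operation • Search: find the adjacent nodes between which the new node belongs • Direct: point the new node’s next field at the successor • Insert: CAS the predecessor’s next field to the new node • Report: CAS the operation’s state from Pending to Success [Figure: list 4, 6, 9 with new node 7; the state entry shows Status: Pending, Operation: Insert, New node: 7, and changes to Status: Success after the insertion CAS]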
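The four steps can be sketched in Java in their naive form. This is deliberately not the authors' final algorithm: it omits mark bits and deletion, uses a single state slot instead of an array, and contains the very races that the following slides examine. All names are hypothetical, and keys are assumed to lie strictly between the two sentinel values.

```java
import java.util.concurrent.atomic.AtomicReference;

// Naive sketch with hypothetical names; NOT safe under concurrent helpers.
class InsertHelpSketch {
    enum Status { PENDING, SUCCESS, FAILURE }

    static final class Node {
        final int key;
        final AtomicReference<Node> next;
        Node(int key, Node next) {
            this.key = key;
            this.next = new AtomicReference<>(next);
        }
    }

    static final class OpDesc {
        final Status status;
        final Node newNode;
        OpDesc(Status status, Node newNode) {
            this.status = status;
            this.newNode = newNode;
        }
    }

    final Node head =
        new Node(Integer.MIN_VALUE, new Node(Integer.MAX_VALUE, null));
    // One slot for brevity; the talk uses one slot per thread.
    final AtomicReference<OpDesc> state = new AtomicReference<>();

    Status insert(int key) {
        OpDesc op = new OpDesc(Status.PENDING, new Node(key, null));
        state.set(op);
        helpInsert(op);
        return state.get().status;
    }

    void helpInsert(OpDesc op) {
        Node node = op.newNode;
        while (state.get() == op) {              // operation still pending
            // 1. Search: find pred and succ with pred.key < key <= succ.key.
            Node pred = head;
            Node succ = pred.next.get();
            while (succ.key < node.key) {
                pred = succ;
                succ = succ.next.get();
            }
            if (succ.key == node.key) {          // key already in the list
                state.compareAndSet(op, new OpDesc(Status.FAILURE, node));
                return;
            }
            // 2. Direct: a plain write to the new node's next field.
            node.next.set(succ);
            // 3. Insert: link the new node in with a CAS on pred.next.
            if (pred.next.compareAndSet(succ, node)) {
                // 4. Report: publish the result in the state slot.
                state.compareAndSet(op, new OpDesc(Status.SUCCESS, node));
                return;
            }
        }
    }
}
```

Run by a single thread, the sketch behaves like an ordinary sorted insert; the interesting failures appear only when several helpers execute `helpInsert` for the same descriptor.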
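Incorrect Result Returned • Consider two threads helping insert(7) • T1 { found (6,9); node.next = &9; inserts new node; CAS(state[tid], s, success) } • T2 { found (6,7); CAS(state[tid], s, failure) } • If T2 searches after T1’s insertion, it finds 7 already in the list; if its CAS wins, insert(7) reports failure although the node was inserted [Figure: list 4, 6, 9 with new node 7]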
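The interleaving on this slide can be replayed deterministically in a single thread. The Java sketch below uses hypothetical names and a simplified state slot; it plays T1's and T2's steps in the problematic order and checks that the node ends up in the list while the reported status is failure.

```java
import java.util.concurrent.atomic.AtomicReference;

// Single-threaded replay of the race; names are hypothetical.
class IncorrectResultReplay {
    enum Status { PENDING, SUCCESS, FAILURE }

    static final class Node {
        final int key;
        final AtomicReference<Node> next;
        Node(int key, Node next) {
            this.key = key;
            this.next = new AtomicReference<>(next);
        }
    }

    // Returns true if node 7 was linked in but FAILURE was reported.
    static boolean insertedButFailureReported() {
        Node n9 = new Node(9, null);
        Node n6 = new Node(6, n9);
        Node n7 = new Node(7, null);   // the node insert(7) wants to add
        AtomicReference<Status> state = new AtomicReference<>(Status.PENDING);

        // T1: found (6,9); node.next = &9; inserts the new node.
        n7.next.set(n9);
        boolean inserted = n6.next.compareAndSet(n9, n7);

        // T2: searches only now and finds (6,7): key 7 is already present,
        // so it reports failure before T1 gets to report.
        if (n6.next.get().key == 7) {
            state.compareAndSet(Status.PENDING, Status.FAILURE);
        }

        // T1: its report arrives too late and the CAS fails.
        boolean t1Reported = state.compareAndSet(Status.PENDING, Status.SUCCESS);

        return inserted && !t1Reported && state.get() == Status.FAILURE;
    }
}
```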
Incorrect Result Returned 2 • T1 { found (6,9); node.next = &9; inserts new node; CAS(-> success) } • T3 { Delete(7); Insert(7) } • T2 { found (6,7); CAS(-> failure) } [Figure: list 4, 6, 7, 9 and a second node 7′]
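Ill-timed Direct • Consider two threads helping insert(7) • T2 { found (6,9); node.next = &9; inserts the new node; CAS(-> success); ... Insert(8) (after 7) } • T1 { found (6,9); node.next = &9 } • T1’s late write of node.next = &9 overwrites 7’s link to 8 [Figure: list 4, 6, 9 with new nodes 7 and 8]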
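This interleaving can also be replayed step by step in one thread. The Java sketch below, with hypothetical names, assumes the Direct step is a plain write to the new node's next field, as the slide's `node.next = &9` suggests, and shows how T1's stale write makes node 8 unreachable.

```java
import java.util.concurrent.atomic.AtomicReference;

// Single-threaded replay of the race; names are hypothetical.
class IllTimedDirectReplay {
    static final class Node {
        final int key;
        final AtomicReference<Node> next;
        Node(int key, Node next) {
            this.key = key;
            this.next = new AtomicReference<>(next);
        }
    }

    // Returns true if node 8 is lost: the list ends up as 6 -> 7 -> 9.
    static boolean replayLosesEight() {
        Node n9 = new Node(9, null);
        Node n6 = new Node(6, n9);
        Node n7 = new Node(7, null);
        Node n8 = new Node(8, null);

        // T1: found (6,9), then stalls before its Direct step.
        Node t1Succ = n9;

        // T2: found (6,9); node.next = &9; inserts the new node.
        n7.next.set(n9);
        n6.next.compareAndSet(n9, n7);

        // T2, later: Insert(8) lands right after 7.
        n8.next.set(n9);
        boolean eightInserted = n7.next.compareAndSet(n9, n8);

        // T1 wakes up and performs its stale Direct: a plain write, no CAS.
        n7.next.set(t1Succ);

        return eightInserted
            && n6.next.get() == n7
            && n7.next.get() == n9;   // 8 is no longer reachable
    }
}
```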
More Races Exist • Additional races were handled in both the delete and insert operations • We constructed a formal proof for the correctness of the algorithm
Main Invariant • Each modification of a node’s next field falls into one of four categories • Marking (change the mark bit to true) • Snipping (removing a marked node) • Redirection (of an infant node) • Insertion (a non-infant to an infant) • Proof by induction and by following the code lines
Fast-Path-Slow-Path (Kogan and Petrank, PPoPP 2012) • Each thread: • Tries to complete the operation without help • Asks for help only if it fails due to contention • (Almost) as fast as the lock-free algorithm • Gives the stronger wait-free guarantee
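The control flow of this methodology can be sketched as a small Java skeleton. The names, the `String` return value, and the bound `MAX_FAST_ATTEMPTS` are assumptions made for illustration; the slide gives no concrete numbers. The fast path stands for one attempt of the lock-free operation, and the slow path stands for the help-based wait-free version.

```java
import java.util.function.BooleanSupplier;

// Hypothetical skeleton of the fast-path-slow-path control flow.
class FastPathSlowPathSketch {
    // Assumed bound on fast-path attempts; not a value from the talk.
    static final int MAX_FAST_ATTEMPTS = 3;

    // tryFast:  one lock-free attempt; returns true if the operation completed.
    // slowPath: the wait-free, help-based version of the same operation.
    // Returns which path finished the operation, for demonstration only.
    static String run(BooleanSupplier tryFast, Runnable slowPath) {
        for (int i = 0; i < MAX_FAST_ATTEMPTS; i++) {
            if (tryFast.getAsBoolean()) {
                return "fast";
            }
        }
        // Repeated failure is taken as a sign of contention: ask for help.
        slowPath.run();
        return "slow";
    }
}
```

Bounding the number of fast attempts is what keeps the overall operation wait-free: a thread never spins on the lock-free path indefinitely before switching to the helping mechanism.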