
Wait-Free Reference Counting and Memory Management



Presentation Transcript


  1. Wait-Free Reference Counting and Memory Management
     Håkan Sundell, Ph.D.

  2. Outline
     • Shared Memory
     • Synchronization Methods
     • Memory Management
     • Garbage Collection
     • Reference Counting
     • Memory Allocation
     • Performance
     • Conclusions
     IPDPS 2005

  3. Shared Memory
     • Uniform Memory Access (UMA)
     • Non-Uniform Memory Access (NUMA)
     [Figure: UMA, with CPUs and their caches attached to a single memory; NUMA, with groups of CPUs on separate cache buses, each group with its own memory]

  4. Synchronization
     • Shared data structures need synchronization!
     • Accesses and updates must be coordinated to establish consistency.

  5. Hardware Synchronization Primitives
     • Weak
       • Atomic Read/Write
     • Stronger
       • Atomic Test-And-Set (TAS), Fetch-And-Add (FAA), Swap
     • Universal
       • Atomic Compare-And-Swap (CAS)
       • Atomic Load-Linked/Store-Conditional (LL/SC)
     [Figure: read, write, and read-modify-write operations of the form M = f(M, …)]
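
As a rough illustration of the three classes, the sketch below maps them onto C++ `std::atomic` operations. The mapping and the function name `run_primitive_demo` are assumptions made for this example; the slides do not name a language or an API.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical demo, not from the slides: one call per primitive class.
// The asserts record the value each operation is expected to return.
int run_primitive_demo() {
    std::atomic<int> m{0};

    // Weak: plain atomic write and read.
    m.store(5);
    assert(m.load() == 5);

    // Stronger: Fetch-And-Add returns the old value and adds in one step.
    int before_faa = m.fetch_add(3);
    assert(before_faa == 5);            // m is now 8

    // Stronger: Swap returns the old value and installs a new one.
    int before_swap = m.exchange(1);
    assert(before_swap == 8);           // m is now 1

    // Universal: CAS writes only if the current value equals `expected`.
    int expected = 1;
    bool hit = m.compare_exchange_strong(expected, 42);
    assert(hit);                        // m is now 42

    expected = 1;                       // stale guess
    bool miss = m.compare_exchange_strong(expected, 7);
    assert(!miss && expected == 42);    // on failure, `expected` is refreshed

    return m.load();
}
```

Test-And-Set corresponds to `std::atomic_flag::test_and_set`; LL/SC has no direct C++ counterpart and is usually reached through `compare_exchange_weak`.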

  6. Mutual Exclusion
     • Access to shared data is atomic because of the lock
     • Reduced parallelism by definition
     • Blocking: danger of priority inversion and deadlocks
       • Solutions exist, but with high overhead, especially for multi-processor systems

  7. Non-blocking Synchronization
     • Perform operations/changes using atomic primitives
     • Lock-Free Synchronization
       • Optimistic approach
       • Retries until succeeding
     • Wait-Free Synchronization
       • Always finishes in a finite number of its own steps
       • Coordination with all participants
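
The difference between the two progress guarantees can be sketched with two small operations. Both function names are hypothetical, and the wait-free claim for `fetch_add` assumes hardware with a native Fetch-And-Add instruction; on LL/SC machines it may compile to a retry loop.

```cpp
#include <atomic>
#include <cassert>

// Lock-free: read optimistically, compute, then CAS. If another thread
// interferes, the CAS fails and this thread retries. Some thread always
// makes progress, but this particular thread has no bound on its retries.
void lock_free_store_max(std::atomic<int>& shared, int candidate) {
    int seen = shared.load();
    while (seen < candidate &&
           !shared.compare_exchange_weak(seen, candidate)) {
        // A failed CAS reloads `seen`; the loop condition re-checks it.
    }
}

// Wait-free (assuming a native FAA instruction): a single atomic step,
// so the caller finishes in a bounded number of its own steps no matter
// what other threads do.
int wait_free_next_ticket(std::atomic<int>& counter) {
    return counter.fetch_add(1);
}
```

Operations that cannot be expressed as one primitive, such as the dereference and allocation steps later in the talk, need the announce-and-help coordination the slide mentions to reach the wait-free guarantee.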

  8. Memory Management
     • Dynamic data structures need dynamic memory management
     • Concurrent data structures need concurrent memory management!

  9. Concurrent Memory Management
     • Concurrent Memory Allocation
       • i.e. malloc/free functionality
     • Concurrent Garbage Collection
     • Questions (among many):
       • When to re-use memory?
       • How to de-reference pointers safely?

  10. Lock-Free Memory Management
     • Memory Allocation
       • Valois 1995: fixed block-size, fixed purpose
       • Michael 2004, Gidenstam et al. 2004: any size, any purpose
     • Garbage Collection
       • Valois 1995, Detlefs et al. 2001: reference counting
       • Michael 2002, Herlihy et al. 2002: hazard pointers

  11. Wait-Free Memory Management
     • Hesselink and Groote, "Wait-free concurrent memory management by create and read until deletion (CaRuD)", Distributed Computing, 2001
       • Limited to the problem of shared static terms
     • New Wait-Free Algorithm:
       • Memory Allocation: fixed block-size, fixed purpose
       • Garbage Collection: reference counting

  12. Wait-Free Reference Counting
     • De-referencing links
       • 1. Read the link contents, i.e. a pointer.
       • 2. Increment (FAA) the reference count on the corresponding object.
     • What if the link is changed between steps 1 and 2?
     • Wait-Free solution:
       • The de-referencing operation should announce the link before reading it.
       • Operations that change the link should help the de-referencing operation.
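
The two-step de-reference and the window it leaves open can be sketched as follows. The `Node` layout and the names `naive_deref` and `release` are assumptions for illustration, not the talk's definitions.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node: starts with one reference, held by the link to it.
struct Node {
    std::atomic<int> ref_count{1};
};

// Steps 1 and 2 exactly as listed on the slide. This is correct when run
// alone but NOT safe under concurrency, which is the problem being posed.
Node* naive_deref(std::atomic<Node*>& link) {
    Node* p = link.load();              // step 1: read the pointer
    // Window: another thread may redirect `link` here and drop the last
    // reference to *p, so the node may already be reclaimed or reused.
    if (p != nullptr) {
        p->ref_count.fetch_add(1);      // step 2: FAA on a possibly stale node
    }
    return p;
}

// Drops one reference; returns true if it was the last one.
bool release(Node* p) {
    return p->ref_count.fetch_sub(1) == 1;
}
```

Retrying until the link is unchanged after the increment closes the window but is only lock-free, since a thread can be forced to retry indefinitely. That is why the slide moves to announcing and helping.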

  13. Wait-Free Reference Counting
     • Announcing
       • Writes the link address to a shared variable (one per thread and per new de-reference).
       • Atomically removes the announcement and retrieves a possible answer (from helping) by Swap with null.
     • Helping
       • If an announcement matches the changed link, atomically answer with a proper pointer using CAS.
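
A heavily simplified sketch of the announce/help handshake follows. It is not the published algorithm: it uses a single announcement slot, assumes node memory is never handed back to the operating system (so a stale increment still hits a valid counter), and assumes that the "proper pointer" is the node the link now points to. The slot encoding and all names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct Node { std::atomic<int> ref_count{1}; };
using Link = std::atomic<Node*>;

// Assumed slot encoding: (link address | 1) = announced; any other value,
// including 0, is either empty or an answered Node*.
using Slot = std::atomic<std::uintptr_t>;

std::uintptr_t announced(Link* link) {
    return reinterpret_cast<std::uintptr_t>(link) | 1u;
}

Node* deref(Link& link, Slot& slot) {
    slot.store(announced(&link));              // announce before reading
    Node* p = link.load();                     // step 1
    if (p) p->ref_count.fetch_add(1);          // step 2
    std::uintptr_t got = slot.exchange(0);     // Swap: retract and collect
    if (got != announced(&link)) {
        // A helper answered and already took a reference on our behalf.
        if (p) p->ref_count.fetch_sub(1);      // undo our own increment
        return reinterpret_cast<Node*>(got);
    }
    return p;                                  // nobody changed the link
}

// Called by an operation that has just redirected `link` to `fresh`
// (the caller is assumed to hold its own reference to `fresh`).
void help_deref(Link& link, Node* fresh, Slot& slot) {
    std::uintptr_t expected = announced(&link);
    if (slot.load() != expected) return;       // no matching announcement
    if (fresh) fresh->ref_count.fetch_add(1);  // reference for the helped thread
    if (!slot.compare_exchange_strong(
            expected, reinterpret_cast<std::uintptr_t>(fresh))) {
        if (fresh) fresh->ref_count.fetch_sub(1);  // announcement gone: undo
    }
}
```

The Swap in `deref` and the CAS in `help_deref` target the same word, so exactly one of "answer delivered" or "announcement retracted" wins, and neither side has to retry.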

  14. Wait-Free Memory Allocation
     • Solution (lock-free), IBM freelists:
       • Create a linked list of the free nodes, allocate/reclaim using CAS
     • How to guarantee that the CAS of an alloc/free operation eventually succeeds?
     [Figure: freelist Head → Mem 1 → Mem 2 → … → Mem i; Allocate takes a node from the list, Reclaim returns a used node to it]
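
A minimal sketch of such a CAS-based freelist is shown below. The class name, the use of block indices instead of pointers, and the version tag packed next to the head index (added here to guard against the ABA problem) are assumptions of this example rather than details from the slide.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

class FreeList {
public:
    static constexpr std::uint32_t kNil = 0xFFFFFFFFu;  // "no block"

    // Links blocks 0..n-1 into one list with block 0 at the head.
    explicit FreeList(std::uint32_t n) : next_(n) {
        for (std::uint32_t i = 0; i < n; ++i)
            next_[i].store(i + 1 < n ? i + 1 : kNil);
        head_.store(pack(0, n ? 0 : kNil));
    }

    // Pops the head block; returns kNil when the list is empty.
    std::uint32_t allocate() {
        std::uint64_t old = head_.load();
        for (;;) {
            std::uint32_t idx = index_of(old);
            if (idx == kNil) return kNil;
            std::uint64_t desired = pack(tag_of(old) + 1, next_[idx].load());
            if (head_.compare_exchange_weak(old, desired)) return idx;
            // CAS failed: `old` was refreshed. Retry, with no upper bound.
        }
    }

    // Pushes a block back onto the head of the list.
    void reclaim(std::uint32_t idx) {
        assert(idx < next_.size());
        std::uint64_t old = head_.load();
        for (;;) {
            next_[idx].store(index_of(old));
            if (head_.compare_exchange_weak(old, pack(tag_of(old) + 1, idx)))
                return;
        }
    }

private:
    static std::uint64_t pack(std::uint32_t tag, std::uint32_t idx) {
        return (static_cast<std::uint64_t>(tag) << 32) | idx;
    }
    static std::uint32_t tag_of(std::uint64_t h) { return static_cast<std::uint32_t>(h >> 32); }
    static std::uint32_t index_of(std::uint64_t h) { return static_cast<std::uint32_t>(h); }

    std::vector<std::atomic<std::uint32_t>> next_;  // next_[i] = successor of block i
    std::atomic<std::uint64_t> head_{0};            // (tag << 32) | head index
};
```

Both loops are lock-free only: a thread can lose the CAS to other threads arbitrarily often, which is exactly the question the slide raises.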

  15. Wait-Free Memory Allocation
     • Wait-Free Solution:
       • Create 2*N freelists.
       • Alloc operations concurrently try to allocate from the current (globally agreed on) freelist.
       • When the current freelist is empty, the current freelist is changed in a round-robin manner.
       • A Free operation of thread i only works on freelist i or N+i.
       • Alloc operations announce their interest.
       • All Free and Alloc operations try to help announced Alloc operations in round-robin order.

  16. Wait-Free Memory Allocation
     • Announcing
       • A value of null in the per-thread shared variable indicates interest.
       • Alloc atomically announces and receives a possible answer by using Swap.
     • Helping
       • Which thread to help is globally agreed on; the choice is incremented in round-robin when agreed.
       • Free atomically answers the selected thread of interest with a free node using CAS.
       • The first time that Alloc succeeds in getting a node from the current freelist, it tries to atomically answer the selected thread of interest with that node using CAS.
     [Figure: per-thread announcement variables, each holding null or a node, with an id selecting the thread to help; helpers answer with CAS, Alloc collects with Swap]
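
The announcement variables and the helping step might look roughly like the sketch below. The thread count, the struct and member names, and the rule for when the helped-thread id advances are assumptions; the slide only states that the id is globally agreed on and moves in round-robin.

```cpp
#include <atomic>
#include <cassert>

struct Block { Block* next{nullptr}; };

struct Announcements {
    static constexpr int kThreads = 4;       // hypothetical N
    std::atomic<Block*> slot[kThreads];      // null = thread wants a block
    std::atomic<int> help_id{0};             // globally agreed thread to help

    Announcements() {
        for (auto& s : slot) s.store(nullptr);
    }

    // Alloc side: one Swap both (re)announces interest and collects any
    // block that a helper deposited earlier. Returns null if none arrived.
    Block* take_gift(int self) {
        return slot[self].exchange(nullptr);
    }

    // Helper side (Free, or an Alloc that just obtained a block): offer `b`
    // to the selected thread with one CAS. Returns true if `b` was handed
    // over; on false the caller keeps `b`, e.g. for its own freelist.
    bool try_help(Block* b) {
        int id = help_id.load();
        Block* expected = nullptr;
        bool given = slot[id].compare_exchange_strong(expected, b);
        // Thread `id` held an answer at the time of the CAS either way,
        // so move on. CAS keeps concurrent helpers from skipping a thread.
        help_id.compare_exchange_strong(id, (id + 1) % kThreads);
        return given;
    }
};
```

Because every Free and every successful Alloc passes through `try_help`, a thread whose own CAS on the freelist keeps failing is eventually handed a block by someone else, which is what bounds its number of steps.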

  17. Performance
     • Worst-case
       • Requires analysis of the maximum execution path, followed by known worst-case execution time (WCET) techniques.
       • e.g. at most 2*N^2 CAS retries for alloc.
     • Average and Overhead
       • Experiments in the scope of dynamic data structures (e.g. lock-free skip list)
       • H. Sundell and P. Tsigas, "Fast and Lock-Free Concurrent Priority Queues for Multi-thread Systems", IPDPS 2003
       • Performed on a NUMA (SGI Origin 2000) architecture, full concurrency.

  18. Average Performance

  19. Conclusions
     • New algorithms for concurrent and dynamic Memory Management
       • Wait-Free and Linearizable.
       • Reference counting.
       • Fixed-size memory allocation.
     • To the best of our knowledge, the first wait-free memory management scheme that supports implementing arbitrary dynamic concurrent data structures.
     • Will be available as part of the NOBLE software library, http://www.noble-library.org
     • Future work
       • Implement new wait-free dynamic data structures.
       • Provide upper bounds on memory usage.

  20. Questions?
     • Contact Information:
       • Address: Håkan Sundell, Computing Science, Chalmers University of Technology
       • Email: phs@cs.chalmers.se
       • Web: http://www.cs.chalmers.se/~phs
