1 / 48

Efficient Techniques for Persistent Memory Programming - Challenges and Solutions

This paper explores the challenges and solutions in programming for persistent memory (PM), which allows direct manipulation of data in memory. It discusses crash consistency, overhead in storage layer, and the need for application-layer responsibility. Various programming levels, including low-level primitives and high-level transactions, are compared for their effectiveness in ensuring crash consistency. The text highlights difficulties in ensuring durability and ordering guarantees, emphasizing the complexities and nuances of PM programming.

bernardl
Download Presentation

Efficient Techniques for Persistent Memory Programming - Challenges and Solutions

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. PMTEST Testing Persistent Memory Applications Samira Khan

  2. TWO-LEVEL STORAGE MODEL CPU Ld/St VOLATILE MEMORY FAST DRAM BYTE ADDR FILE I/O NONVOLATILE STORAGE SLOW BLOCK ADDR

  3. TWO-LEVEL STORAGE MODEL CPU Ld/St VOLATILE MEMORY FAST DRAM BYTE ADDR NVM FILE I/O PCM, STT-RAM NONVOLATILE STORAGE SLOW BLOCK ADDR Non-volatile memories combine characteristics of memory and storage

  4. VISION: UNIFY MEMORY AND STORAGE CPU Ld/St PERSISTENTMEMORY NVM Provides an opportunity to manipulate persistent data directly in memory Avoids reading and writing back data to/from storage

  5. CHALLENGE: NEED ALL STORAGE SYSTEM SUPPORTS APPLICATION APPLICATION Crash Consistency OS/SYSTEM OS/SYSTEM Ld/St Ld/St NVM MEMORY FILE I/O PERSISTENT MEMORY STORAGE Overhead in OS/storage layer overshadows the benefit of nanosecond access latency of NVM

  6. CHALLENGE: NEED ALL STORAGE SYSTEM SUPPORTS APPLICATION APPLICATION Crash Consistency OS/SYSTEM Ld/St Ld/St NVM MEMORY FILE I/O PERSISTENT MEMORY STORAGE Not the operating system, Application layer is responsible for crash consistency in PM

  7. CHALLENGE: PM Programming is Hard! Requirements and Key Ideas NON-VOLATILE MEMORY PERSISTENT MEMORY PMTEST: Interface and Mechanism SIGMETRICS’14 Programming Persistent Memory Applications ASPLOS’19 Results and Conclusion

  8. PERSISTENT MEMORY PROGRAMMING • Support for crash consistency have two fundamental guarantees • Durability:writes become persistent in PM • Ordering:one write becomes persistent in PM before another Core • Durability Guarantee: • writeback data from cache • Flush A Volatile Cache Persistent PM-DIMM

  9. PERSISTENT MEMORY PROGRAMMING • Support for crash consistency have two fundamental guarantees • Durability:writes become persistent in PM • Ordering:one write becomes persistent in PM before another Core • Ordering Guarantee: • Write A before B • Writeback A • Barrier • Writeback B Volatile Cache Persistent B A PM-DIMM

  10. PERSISTENT MEMORY PROGRAMMING Normal Expert PM Programming • Uses low-level primitives • Understands the hardware • Understands the algorithm • Uses a high-level interface • Does not need to know details of hardware or algorithm Two different ways to program persistent applications

  11. PERSISTENT MEMORY PROGRAMMING (LOW-LEVEL) • Hardware provides low-level primitives for crash consistency • Exposes instructions for cache flush and barriers • sfence, clwbfrom x86 • dc cvapfrom ARM • Academic proposals, e.g., ofence, dfence. x86 ARM clwb sfence dc cvap dsb New Instr PM-DIMM PM-DIMM PM-DIMM [Kiln’13, ThyNVM’15, DPO’16, JUSTDOLogging’16, ATOM’17, HOPS’17, etc.]

  12. PROGRAMMING USING LOW-LEVEL PRIMITIVES 1 void listAppend(item_tnew_val) { 2  node_t* new_node = new node_t(new_val); 3  new_node->next = head; 4  head = new_node; 5  persist_barrier(); 6 } Createnew_node   2 node_t* new_node = new node_t(new_val); Updatenew_node 3 new_node->next = head; Update head pointer Writeback updates 4 head = new_node; Writes to PM can reorder 5 persist_barrier(); Head In cache new_nodeis lost after failure Inconsistent linked list

  13. PROGRAMMING USING LOW-LEVEL PRIMITIVES 1 void listAppend(item_tnew_val) { 2  node_t* new_node = new node_t(new_val); 3  new_node->next = head; Enforce writeback before changing head persist_barrier(); 4  head = new_node; 5  persist_barrier(); 6 } Head In cache In PM Ensuring crash consistency with low-level primitives is HARD! Consistent linked list

  14. PERSISTENT MEMORY PROGRAMMING Normal Expert PM Programming • Uses low-level primitives • Understands the hardware • Understands the algorithm • Uses a high-level interface • Does not need to know details of hardware or algorithm

  15. PERSISTENT MEMORY PROGRAMMING (HIGH-LEVEL) • Libraries provide transactions on top of low-level primitives • Intel’s PMDK • Academic proposals AtomicBegin { Append a new node; } AtomicEnd; Uses logging mechanisms to atomically commit the updates [NV-Heaps’11, Mnemosyne’11, ATLAS’14, REWIND’15, NVL-C’16, NVThreads’17 LSNVMM’17, etc.]

  16. PROGRAMMING USING TRANSACTIONS 1 void ListAppend(item_tnew_val) { 2 TX_BEGIN { 3 node_t *new_node = makeNode(new_val); 4 TX_ADD(list.head, sizeof(node_t*)); 5 List.head = new_node; 6 List.length++; 7 } TX_END 8 } Createnew_node backuphead Update head Update length 3 node_t *new_node = makeNode(new_val); 4 TX_ADD(list.head, sizeof(node_t*)); 5 List.head = new_node; 6 List.length++; length is not backed up before update!

  17. PROGRAMMING USING TRANSACTIONS 1 void ListAppend(item_tnew_val) { 2 TX_BEGIN { 3 node_t *new_node = makeNode(new_val); 4 TX_ADD(list.head, sizeof(node_t*)); 5 List.head = new_node; TX_ADD(list.length, sizeof(unsigned)); 6 List.length++; 7 } TX_END 8 } Backup length before update Ensuring crash consistency with transactions is still HARD!

  18. PERSISTENCE MEMORY PROGRAMMING IS HARD Normal Expert PM Programming • Uses low-level primitives • Understands the hardware • Understands the algorithm • Uses a high-level interface • Does not need to know details of hardware or algorithm Both expert and normal programmers can make mistakes

  19. PERSISTENT MEMORY PROGRAMMING IS HARD Detect crash consistency bugs We need a tool to detect crash consistency bugs!

  20. CHALLENGE: PM Programming is Hard! Requirements and Key Ideas NON-VOLATILE MEMORY PERSISTENT MEMORY PMTEST: Interface and Mechanism SIGMETRICS’14 Programming Persistent Memory Applications ASPLOS’19 Results

  21. REQUIREMENTS OF THE TOOL Flexible Fast PM Libraries Kernel Modules Custom Programs Existing HW Future HW and Models [PMDK, NV-Heaps’11, Mnemosyne’11, ATLAS’14, REWIND’15, NVL-C’16, NVThreads’17 LSNVMM’17, etc.] [PMFS’14, BPFS’09, NOVA’16, NOVA-Fortis’17, Strata’17, SCMFS’11 etc.] [DPO’16, HOPS’17, etc.] E.g., custom database, key-value store, etc. [x86, ARM, etc.]

  22. PMTEST KEY IDEAS: FLEXIBLE • Many different programming models and hardware primitives available PM Program PM Kernel Module PM Program Call library Call library Mnemosyne Library PMDK Library write, sfence, clwb write, dc cvap, dsb write, sfence, clwb ARM x86 x86 The challenge is to support different hardware and software models

  23. PMTEST KEY IDEAS: FLEXIBLE Operations that maintain crash consistency are similar: orderinganddurability guarantees PM Program PM Kernel Module PM Program Call library Call library Mnemosyne Library PMDK Library write, sfence, clwb write, dc cvap, dsb write, sfence, clwb ARM x86 x86 Our key idea is to test for these two fundamental guarantees which in turn can cover all hardware-software variations

  24. PMTEST KEY IDEAS: FAST sfence write C write A write B ... sfence sfence write B write A write C ... sfence sfence write A write B write C ... sfence sfence write A write C write B ... sfence sfence write B write C write A ... sfence sfence write C write B write A ... sfence • Prior work [Yat’14] uses exhaustive testing n O(n!) Recoverable? Exhaustive testing is time consuming and not practical

  25. PMTEST KEY IDEAS: FAST sfence write C write B write A ... sfence • Reduce test time by using only one dynamic trace Runtime Trace Persistent Memory Application Recoverable? A significant improvement over O(n!) testing

  26. PMTEST KEY IDEAS: FAST • PMTestinfers the persistence intervalfrom PM operation traceThe interval in which a write can possibly become persistent write A A clwb A sfence A persists before B write B B clwb B sfence Trace Timeline A disjoint interval indicates that no re-ordering in the hardware will lead to a case where A does not persist before B

  27. PMTEST KEY IDEAS: FAST • PMTestinfers the persistence intervalfrom PM operation traceThe interval in which a write can possibly become persistent write A Interleaving A write B B clwb A sfence A may NOT persist before B clwb B sfence Trace Timeline An overlapping interval indicates that there is a case where A does not persist before B

  28. PMTEST KEY IDEAS: FAST • PMTestinfers the persistence intervalfrom PM operation traceThe interval in which a write can possibly become persistent write A A write B B clwb A sfence clwb B sfence A persists before B? No Trace Timeline Querying the trace can detect any violation in ordering and durability guarantee at runtime

  29. CHALLENGE: PM Programming is Hard! Requirements and Key Ideas NON-VOLATILE MEMORY PERSISTENT MEMORY PMTEST: Interface and Mechanism SIGMETRICS’14 Programming Persistent Memory Applications ASPLOS’19 Results and Conclusion

  30. PMTEST OVERVIEW Testing Annotation Checking Rules Testing Results PMTest Persistent Memory Application Offline Online

  31. PMTEST OVERVIEW Testing Annotation Checking Rules Testing Results PMTest Persistent Memory Application Offline Online

  32. PMTEST INTERFACE Normal Expert PMTest • Assertion-like low-level interface • Check behavior vs. specification • High-level interface • Minimize programmer’s effort • Automatically inject low-level checkers PMTest provides two different interfaces

  33. PMTESTLOW-LEVEL INTERFACE • Two low-level checkers • isOrderedBefore(A, sizeA, B, sizeB) Checks whether A is persisted before B (Ordering guarantee) • IsPersisted(A, sizeA) Checks whether A has been written back to PM (Durability guarantee)

  34. PMTESTLOW-LEVEL INTERFACE • Two low-level checkers • isOrderedBefore(A, sizeA, B, sizeB) Checks whether A is persisted before B (Ordering guarantee) • IsPersisted(A, sizeA) Checks whether A has been written back to PM (Durability guarantee) • Help check if implementation meets specification for • Programs/kernel modules based on low-level primitives • PM libraries

  35. EXAMPLE void hashMapRemove() { ... remove(buckets->bucket[hash]); count--; persist_barrier(); ... hashmap_rebuild(); isOrderedBefore(&count, sizeof(unsigned), &hashmap, sizeof(hashmap)); isPersisted(&hashmap, size); } Check if count has been persisted before rebuilding Check if all updates have been persisted in rebuilding PMTest helps the programmers to reason about the code *This example is inspired by hashmap_atomicfrom PMDK

  36. PMTESTLOW-LEVEL INTERFACE • Two low-level checkers • isOrderedBefore(A, sizeA, B, sizeB) Check whether A is persisted before B (Ordering guarantee) • IsPersisted(A, sizeA) Check whether A has been written back to PM (Durability guarantee) • Help check if implementation meets specification for • Programs/kernel modules based on low-level primitives • PM libraries • Further enables high-level checkers to automate testing

  37. PMTESTHIGH-LEVEL INTERFACE • Currently provides high-level checkers for PMDK transactions • Automatically detects crash consistency bugs void ListAppend(item_tnew_val) { TX_CHECKER_START; //Start of TX checker TX_BEGIN { node_t *new_node = makeNode(new_val); TX_ADD(list.head, sizeof(node_t*)); List.head = new_node; List.length++; } TX_END TX_CHECKER_END; //End of TX checker } Automatically check if there is a backup before update Automatically check if all updates have been persisted * This example does not include initialization and communication with PMTest

  38. PMTESTHIGH-LEVEL INTERFACE • Currently provides high-level checkers for PMDK transactions • Automatically detects crash consistency bugs • If all updates have been persisted at the end of the transaction • If there is a backup before update during the transaction • Automatically detects performance bugs • Redundant log/backup • Duplicated writeback/flush operations (for all programs) High-level checkers minimize programmer’s effort

  39. PMTEST OVERVIEW Testing Annotation Checking Rules Testing Results PMTest Persistent Memory Application Offline Online

  40. PMTEST CHECKING MECHANISM Auto inject low-level checkers for high-level checkers for (...) { TX_CHECKER_START; TX_BEGIN; ... TX_END; TX_CHECKER_END; PMTest_SEND_TRACE; } ... write A write B clwb B sfence TX_END Checking Engine Checking Engine PM Trace Result: At Runtime Ais not persistent! The checking engine tests the trace

  41. CHECKING ENGINE ALGORITHM • Infer the persistence interval in which a write can become persistent • Check the interval against the low-level checkers sfence sfence B A write A A and B can be persisted any time clwb  A write B sfence sfence isOrderedBeforeA B B may not persist Time isPersist B PM Trace Persistence Interval Our interval-based check enables faster testing

  42. CHALLENGE: PM Programming is Hard! Requirements and Key Ideas NON-VOLATILE MEMORY PERSISTENT MEMORY PMTEST: Interface and Mechanism SIGMETRICS’14 Programming Persistent Memory Applications ASPLOS’19 Results and Conclusion

  43. METHODOLOGY Platform CPU: 8-core Skylake2.1GHz, OS: Ubuntu 14.04, Linux kernel 4.4 Memory: 64GB DDR4 NVM: 64GB Battery-backed NVDIMM Workloads Micro-benchmarks (from PMDK) • C-Tree • B-Tree • RB-Tree • HashMap • Real-world workloads • PM-optimized file system • Intel’s PMFS (kernel module) • PM-optimized database • Redis (PMDK Library) • Memcached (Mnemosyne Library) Baselines • No testing tool • With Intel’s Pmemcheck (only for PMDK-based programs)

  44. MICRO-BENCHMARK Speedup vs. Pmemcheck (Transaction) PMTest is 7.1X faster than Pmemcheck

  45. REAL-WORLD WORKLOADS 22X slowdown with Pmemcheck PMTest has < 2X overhead in real-world workloads

  46. BUG DETECTION • New bugs found • 1 crash consistency bug in PMDK applications • 1 performance bug in PMFS • 1 performance bug in PMDK applications • Validated with • 42 synthetic bugs injected to micro-benchmarks • 3 existing bugs from commit history

  47. CONCLUSION • It is hard to guarantee crash consistency in persistent memory applications • Our tool PMTestis fast and flexible • Flexible: Supports kernel modules, custom PM programs, transaction-based programs • Fast: Incurs < 2X overhead in real-workload applications • PMTest has detected 3 new bugs in PMFS and PMDK applications pmtest.persistentmemory.org PMTest

  48. PMTEST Testing Persistent Memory Applications Samira Khan

More Related