Learn about efficient backend memory operations to enhance security, bandwidth, endurance, and latency of non-volatile memory systems. Discover the Janus framework for parallelization and pre-execution, optimizing NVM performance.
Janus: Optimizing Memory and Storage Support for Non-Volatile Memory Systems
Sihang Liu, Korakit Seemakhupt, Gennady Pekhimenko, Aasheesh Kolli, and Samira Khan
BACKGROUND
• The new non-volatile memory (NVM) is:
  • High-speed
  • Persistent
  • Byte-addressable
• NVM allows programs to manipulate persistent data in memory through a load/store interface
• Unlike conventional DRAM, NVM is both memory and storage (example: Intel 3D XPoint)
A practical NVM system requires both memory and storage support
MEMORY AND STORAGE SUPPORT
These supports are designed for:
• Security: prevent attackers from stealing or tampering with data (encryption, integrity verification, etc.)
• Bandwidth: improve NVM’s limited bandwidth (deduplication, compression, etc.)
• Endurance: extend NVM’s limited lifetime (wear-leveling, error correction, etc.)
We refer to these memory and storage supports as backend memory operations
BACKEND MEMORY OPERATION LATENCY
Write access timeline: Cache Writeback → Memory Controller (Backend Memory Operations) → NVM Access
• Timeline regions: Volatile, Non-volatile
• Latencies marked on the timeline: ~15 ns, >100 ns
• Overall span: Latency to Persistence
WRITE LATENCY IN NVM PROGRAMS
Example: an undo logging transaction
• Timeline: Backup, Update, Commit, with a persist_barrier (writeback from cache) between steps
• The crash consistency mechanism puts write latency on the critical path
• Backend memory operations increase the writeback latency of each step (Backup, Update, Commit)
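The undo logging transaction on this slide can be sketched in a few lines. This is a toy model for illustration, not code from the Janus work: `SimulatedNVM`, `undo_log_update`, and `recover` are hypothetical names, and the "cache" is just a dictionary that loses its contents on a simulated crash. Each call to `persist_barrier` stands for a point where the program has to wait for the writeback to complete, which is where write latency lands on the critical path.

```python
class SimulatedNVM:
    """Toy model (assumption): stores stay in a volatile cache until a barrier."""

    def __init__(self):
        self.cache = {}        # volatile: lost on a crash
        self.persistent = {}   # survives a crash
        self.barriers = 0      # how many times the program had to wait

    def store(self, addr, value):
        self.cache[addr] = value

    def load(self, addr):
        return self.cache.get(addr, self.persistent.get(addr))

    def persist_barrier(self):
        # Write back all pending stores and wait for them to become persistent.
        self.persistent.update(self.cache)
        self.cache.clear()
        self.barriers += 1

    def crash(self):
        self.cache.clear()


def undo_log_update(nvm, addr, new_value):
    # Backup: save the old value in the log and mark the log valid.
    nvm.store(("log", addr), nvm.load(addr))
    nvm.store("log_valid", addr)
    nvm.persist_barrier()
    # Update: modify the data in place.
    nvm.store(addr, new_value)
    nvm.persist_barrier()
    # Commit: invalidate the log.
    nvm.store("log_valid", None)
    nvm.persist_barrier()


def recover(nvm):
    # After a crash, roll back any transaction whose log is still valid.
    addr = nvm.load("log_valid")
    if addr is not None:
        nvm.store(addr, nvm.load(("log", addr)))
        nvm.store("log_valid", None)
        nvm.persist_barrier()
```

In this model a single update waits on three barriers, one per step, so any extra latency added to a writeback is paid three times per transaction.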
Backend memory operations are on the critical path. How can we reduce their latency?
OBSERVATION
• Each backend memory operation seems indivisible
• Integration leads to serialized operations
An example: Counter-mode Encryption, Integrity Verification, and Deduplication, executed one after another
OBSERVATION
However, it is possible to decompose them into sub-operations
An example: decomposing Counter-mode Encryption
• Generate counter
• Encrypt counter
• Combine Data with the Encrypted counter
• Generate MAC (for integrity verification)
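To make the decomposition concrete, here is a minimal software sketch of counter-mode encryption split into the sub-operations named on the slide. Several things are assumptions for illustration: the function names, the demo key, the per-address counter table, and the use of a keyed BLAKE2b hash as a stand-in for the block cipher that real hardware would use (typically AES). The comments note which inputs each sub-operation needs.

```python
import hashlib
import hmac

KEY = b"hypothetical-demo-key"   # assumption: demo key only
counters = {}                    # assumption: per-address write counters


def generate_counter(addr):
    # Needs only the address.
    counters[addr] = counters.get(addr, 0) + 1
    return counters[addr]


def encrypt_counter(addr, counter, length):
    # Needs only the address and counter. A keyed hash stands in for the
    # block cipher and produces a one-time pad of `length` bytes (max 64).
    seed = addr.to_bytes(8, "little") + counter.to_bytes(8, "little")
    return hashlib.blake2b(seed, key=KEY, digest_size=length).digest()


def xor_data(data, pad):
    # Needs the data and the encrypted counter (pad).
    return bytes(d ^ p for d, p in zip(data, pad))


def generate_mac(ciphertext, counter):
    # Needs the ciphertext and the counter; used for integrity verification.
    message = ciphertext + counter.to_bytes(8, "little")
    return hmac.new(KEY, message, hashlib.sha256).digest()


def encrypt_write(addr, data):
    # The serialized view: one sub-operation after another.
    counter = generate_counter(addr)
    pad = encrypt_counter(addr, counter, len(data))
    ciphertext = xor_data(data, pad)
    mac = generate_mac(ciphertext, counter)
    return counter, ciphertext, mac
```

Decryption applies the same pad again, so `xor_data(ciphertext, encrypt_counter(addr, counter, len(ciphertext)))` returns the original data.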
KEY IDEA I: PARALLELIZATION
After decomposing the example operations (Counter-mode Encryption, Integrity Verification, Deduplication), there are two types of dependencies:
• Intra-operation dependency
• Inter-operation dependency
Sub-operations without a dependency between them are parallelizable: they can execute in parallel
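The benefit of parallelization can be estimated with a small dependency-graph calculation. The sketch below is illustrative only: the sub-operation names, the latencies, and the dependency edges are hypothetical and are not taken from the Janus evaluation. The serialized latency is the sum of all sub-operation latencies, while the parallelized latency is the length of the longest dependency chain.

```python
# name: (latency in ns, list of sub-operations it depends on)
# All values below are hypothetical, chosen only to illustrate the idea.
SUB_OPS = {
    "E1_generate_counter": (10, []),
    "E2_encrypt_counter": (40, ["E1_generate_counter"]),   # intra-operation
    "D1_hash_data": (40, []),
    "D2_lookup_duplicate": (20, ["D1_hash_data"]),         # intra-operation
    "E3_xor_data": (1, ["E2_encrypt_counter",
                        "D2_lookup_duplicate"]),           # inter-operation
    "I1_generate_mac": (40, ["E3_xor_data"]),              # inter-operation
}


def serialized_latency(ops):
    # Every sub-operation runs one after another.
    return sum(latency for latency, _ in ops.values())


def parallel_latency(ops):
    # Each sub-operation starts as soon as all of its dependencies finish,
    # so the total is the longest chain through the graph.
    finish = {}

    def finish_time(name):
        if name not in finish:
            latency, deps = ops[name]
            start = max((finish_time(d) for d in deps), default=0)
            finish[name] = start + latency
        return finish[name]

    return max(finish_time(name) for name in ops)


print(serialized_latency(SUB_OPS))  # 151
print(parallel_latency(SUB_OPS))    # 101
```

With these made-up numbers the counter chain and the hashing chain overlap, so only the longer of the two contributes to the total.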
KEY IDEA II: PRE-EXECUTION
A write consists of an Address and Data. These are the external dependencies of the sub-operations of Counter-mode Encryption, Integrity Verification, and Deduplication.
Sub-operations can pre-execute as soon as their data/address dependency is resolved:
• Address-dependent sub-operations can pre-execute as soon as the address of the write is available
• Data-dependent sub-operations can pre-execute as soon as the data of the write is available
• Both-dependent sub-operations can pre-execute as soon as both the data and address of the write are available
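The three dependency classes can be illustrated with a small bookkeeping sketch. It is a simplified model, not the Janus hardware or its software interface: `PreExecutor`, `provide`, and the classification in `SUB_OP_INPUTS` are hypothetical, and dependencies between sub-operations are ignored so that only the external address/data dependency is shown.

```python
from dataclasses import dataclass, field

# Hypothetical classification: which external inputs each sub-operation needs.
SUB_OP_INPUTS = {
    "generate_counter": {"addr"},            # address-dependent
    "encrypt_counter": {"addr"},             # address-dependent
    "hash_data": {"data"},                   # data-dependent
    "xor_data_with_pad": {"addr", "data"},   # both-dependent
    "generate_mac": {"addr", "data"},        # both-dependent
}


@dataclass
class PreExecutor:
    sub_ops: dict                                  # name -> set of needed inputs
    available: set = field(default_factory=set)    # inputs known so far
    done: list = field(default_factory=list)       # pre-executed sub-operations

    def provide(self, what):
        # Called when the program learns the "addr" or the "data" of a
        # future write; runs every sub-operation whose inputs are now ready.
        self.available.add(what)
        for name, needs in self.sub_ops.items():
            if name not in self.done and needs <= self.available:
                self.done.append(name)
        return list(self.done)

    def remaining_at_writeback(self):
        # Whatever was not pre-executed still runs on the critical path.
        return [name for name in self.sub_ops if name not in self.done]


pe = PreExecutor(dict(SUB_OP_INPUTS))
pe.provide("addr")   # address known early: counter sub-operations run
print(pe.remaining_at_writeback())
pe.provide("data")   # data known: the rest can run before the writeback
print(pe.remaining_at_writeback())
```

In this model, the earlier the program can supply the address or data of a write, the fewer sub-operations are left to run when the writeback actually happens.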
JANUS OVERVIEW
Timeline of Backup, Update, Commit: Serialized → Parallelized → Pre-executed
Janus:
• Parallelization: reduces the latency of each operation
• Pre-execution: moves the operations’ latency off the critical path
PERFORMANCE
• Janus provides a software interface to issue pre-execution
• Compared to a baseline with serialized operations:
  • Janus Manual: 2.35X speedup
  • Janus Automated: 2X speedup