Improving Privacy and Lifetime of PCM-based Main Memory
Jingfei Kong, Huiyang Zhou
School of EECS, University of Central Florida
Department of Electrical and Computer Engineering, North Carolina State University
Phase Change Memory (PCM): A New Emerging Memory Technology
[Figure: cost versus performance for SRAM, DRAM, flash memory, PCM, and hard disk drive. Source: Geoffrey W. Burr, IBM, Non-volatile Memories Workshop 2010]
The Next Generation Memory Hierarchy
• Processor core
• L2 cache (SRAM)
• DRAM cache
• PCM-based main memory (PRAM)
• Flash memory disk cache
• Hard disk drive
Challenges for PCM-based Main Memory (PRAM) #1: Privacy
• PCM is a non-volatile memory technology
  • contents can last for 10 years without power
  • non-volatility contributes to low power consumption
Challenges for PCM-based Main Memory (PRAM) #1: Privacy
• PCM is a non-volatile memory technology
• The non-volatility poses a privacy concern: your social security number, your medical record, your emails…
Challenges for PCM-based Main Memory (PRAM) #1: Privacy
• PCM is a non-volatile memory technology
• The non-volatility poses a privacy concern
• The content needs to be encrypted for privacy
Security Model [Lie et al. ASPLOS 2000]
• Secure domain (on chip): processor core, CPU cache, encryption engine (???)
• Insecure domain (off chip): main memory (PRAM), holding encrypted data
Secure Processor Architecture with Counter-mode Encryption [Chhabra et al. TACO 09]
[Figure: cache and encryption engine in the secure domain; PRAM (encrypted data) in the insecure domain; a counter data block holds one counter per cache line]
Security Requirement of Counter-mode Encryption
• The counter data block holds the counter for the to-be-written-back cache line
• That counter is incremented by one for every cache-line write-back to main memory
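The counter requirement above can be illustrated with a minimal sketch. It is not the authors' hardware design: SHA-256 stands in for the AES pad generator so the example runs with the standard library only, a cache line is simplified to a single 128-bit block, and the names `CounterModeMemory`, `make_pad`, and `xor_bytes` are hypothetical.

```python
import hashlib

BLOCK_BYTES = 16  # one 128-bit encryption block


def make_pad(key: bytes, address: int, counter: int) -> bytes:
    """Stand-in pad generator (assumption: SHA-256 truncated to 128 bits, NOT AES)."""
    seed = key + address.to_bytes(8, "big") + counter.to_bytes(8, "big")
    return hashlib.sha256(seed).digest()[:BLOCK_BYTES]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class CounterModeMemory:
    """Toy model of encrypted main memory with one counter per line."""

    def __init__(self, key: bytes):
        self.key = key
        self.counters = {}  # address -> write-back counter
        self.cells = {}     # address -> ciphertext stored off chip

    def write_back(self, address: int, plaintext: bytes) -> None:
        # The counter is incremented on every write-back, so a pad is never reused.
        counter = self.counters.get(address, 0) + 1
        self.counters[address] = counter
        self.cells[address] = xor_bytes(plaintext, make_pad(self.key, address, counter))

    def read(self, address: int) -> bytes:
        pad = make_pad(self.key, address, self.counters[address])
        return xor_bytes(self.cells[address], pad)


memory = CounterModeMemory(key=b"k" * 16)
data = bytes(BLOCK_BYTES)          # all-zero plaintext
memory.write_back(0x40, data)
first_ciphertext = memory.cells[0x40]
memory.write_back(0x40, data)      # same plaintext, new counter
second_ciphertext = memory.cells[0x40]
print(first_ciphertext != second_ciphertext, memory.read(0x40) == data)
```

Because the pad depends on the counter, writing the same plaintext twice produces different ciphertexts, which is the property that later interacts with write-reduction techniques.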
Challenges for PCM-based Main Memory (PRAM) #2: Limited Lifetime
• PCM memory cells have a limited number of write programming cycles
• Write endurance:
  • DRAM: ~∞
  • PCM: 10^8-10^9
  • NAND flash: 10^5-10^6
Wear-Leveling for PRAM – Rotation/Swapping
[Figure: cache and PRAM]
Wear-Leveling for PRAM – Write Traffic Reduction
[Figure: cache and PRAM]
Outline
• Motivation
• Encryption impact on wear-leveling techniques for PRAM
• An adaptive ECC management scheme to improve PRAM lifetime
• Conclusions
Write Traffic Reduction: Redundant Bit-write Removal [Zhou et al. ISCA 09]
• 128-bit data: 0x00000000000000000000000000000000 and 0x01000000000000000000000000000000
• The two values differ in a single bit, so only 1 bit is written to PRAM
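A minimal sketch of the idea, assuming the stored word is compared with the incoming word and only differing bit positions are programmed. The function names and the choice of which value is "old" and which is "new" are assumptions made for illustration; the count is the same either way.

```python
def bits_to_program(old_word: int, new_word: int) -> int:
    """Mask with a 1 at every bit position whose stored value must change."""
    return old_word ^ new_word


def count_bit_writes(old_word: int, new_word: int) -> int:
    """Number of cells actually written after redundant bit-writes are removed."""
    return bin(bits_to_program(old_word, new_word)).count("1")


old_word = 0x00000000000000000000000000000000
new_word = 0x01000000000000000000000000000000
bit_writes = count_bit_writes(old_word, new_word)
print(bit_writes)  # prints 1
```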
Encryption Impact on Redundant Bit-write Removal
• Non-encrypted data: 0x00000000000000000000000000000000 versus 0x01000000000000000000000000000000; only 1 bit is different and needs to be written
• Encrypted data: 0xBB4D3007975CC603475D4F1FAEE50FB7 versus 0xA44CD5B097033E6D15F9317BC9D664B0; 65 bits are different and need to be written
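The bit counts quoted on this slide can be checked with a short Hamming-distance calculation over the four hex values shown. The helper name `hamming_distance` is an illustrative choice, not something from the talk.

```python
def hamming_distance(a_hex: str, b_hex: str) -> int:
    """Count the bit positions in which two hex-encoded words differ."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")


plain_pair = (
    "00000000000000000000000000000000",
    "01000000000000000000000000000000",
)
cipher_pair = (
    "BB4D3007975CC603475D4F1FAEE50FB7",
    "A44CD5B097033E6D15F9317BC9D664B0",
)

plain_bits = hamming_distance(*plain_pair)
cipher_bits = hamming_distance(*cipher_pair)
print(plain_bits, cipher_bits)  # prints 1 65
```

Roughly half of the 128 bits differ between the two ciphertexts, which is what one would expect when the pad changes on every write-back.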
Write Traffic Reduction: Partial Writes [Lee et al. ISCA 09]
• The CPU writes to Data Block 1 of a cache line holding Data Blocks 1 to 4
• Dirty vector after the write: 1 0 0 0
• Each dirty bit is used to monitor whether the corresponding data block is modified or not
• PRAM still holds the old data blocks; only the modified Data Block 1 needs to be written back
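The dirty-vector bookkeeping can be sketched as follows. This is a simplified software model under assumed names (`CacheLine`, `cpu_write`, `write_back`), not the hardware mechanism from the cited paper.

```python
class CacheLine:
    """Toy cache line split into data blocks, with one dirty bit per block."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.dirty = [False] * len(self.blocks)

    def cpu_write(self, index, value):
        self.blocks[index] = value
        self.dirty[index] = True  # mark only the modified block

    def write_back(self, pram_line):
        """Write only dirty blocks to PRAM and return their indices."""
        written = [i for i, is_dirty in enumerate(self.dirty) if is_dirty]
        for i in written:
            pram_line[i] = self.blocks[i]
        self.dirty = [False] * len(self.blocks)
        return written


pram_line = ["old1", "old2", "old3", "old4"]
line = CacheLine(pram_line)
line.cpu_write(0, "new1")
dirty_vector = list(line.dirty)
written_blocks = line.write_back(pram_line)
print(dirty_vector, written_blocks, pram_line)
```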
Encryption Impact on Partial Writes
[Figure: cache line with Data Blocks 1 to 4 and its dirty vector; the encryption engine takes the cache line counter from the counter data block, incremented to counter+1, and produces Encrypted Data Blocks 1 to 4]
• The whole cache line has to be written
A New Encryption Counter Scheme to Mitigate the Impact on Partial Writes
• Counter data block: address info, counter per cache line, padding
• New counter data block: address info, counter per cache line, counter per encryption block, padding
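A rough sketch of why per-block counters help, assuming each block's ciphertext depends on the counter that covers it. Both function names are hypothetical, and the model ignores the address info and padding fields of the counter data block.

```python
def blocks_written_with_line_counter(dirty):
    """One counter per cache line: any dirty block bumps the shared counter,
    so every block gets a new ciphertext and the whole line is written."""
    return list(range(len(dirty))) if any(dirty) else []


def blocks_written_with_block_counters(dirty, block_counters):
    """One counter per encryption block: only dirty blocks get a new counter,
    so only those blocks are re-encrypted and written."""
    written = []
    for index, is_dirty in enumerate(dirty):
        if is_dirty:
            block_counters[index] += 1
            written.append(index)
    return written


dirty = [True, False, False, False]
block_counters = [0, 0, 0, 0]
line_scheme = blocks_written_with_line_counter(dirty)
block_scheme = blocks_written_with_block_counters(dirty, block_counters)
print(line_scheme, block_scheme, block_counters)
```

With a single dirty block out of four, the per-line counter forces four block writes while the per-block counters need one.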
Methodology
• Experimental setup
  • MIPS-like cycle-accurate timing simulator based on SimpleScalar
  • Encryption engine: the Advanced Encryption Standard (AES)
    • 128-bit data block, 128-bit key, 80-cycle latency
  • PCM-based main memory
    • 4GB, 1024-cycle latency, write endurance 10^8
    • wear-leveling techniques: ideal rotation/swapping
  • Memory benchmarks with high cache miss rates from SPEC 2000 and SPEC 2006
• Lifetime estimation = PCM_write_endurance * benchmark_memory_footprint / benchmark_write_traffic_rate
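The lifetime formula can be written out as a small function. The units (bytes and bytes per second) and the example footprint and write rate below are assumptions chosen only to show how the estimate scales; they are not measurements from the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def lifetime_years(write_endurance, footprint_bytes, write_bytes_per_second):
    """Lifetime under ideal wear-leveling: every cell in the footprint absorbs
    an equal share of the write traffic until it reaches its endurance."""
    seconds = write_endurance * footprint_bytes / write_bytes_per_second
    return seconds / SECONDS_PER_YEAR


# Hypothetical workload: 256 MiB footprint, 400 MiB/s of write traffic.
baseline = lifetime_years(10**8, 256 * 2**20, 400 * 2**20)
# Same workload with write traffic cut in half.
halved_traffic = lifetime_years(10**8, 256 * 2**20, 200 * 2**20)
print(halved_traffic / baseline)
```

The estimate is inversely proportional to write traffic, so any technique that halves the bits written doubles the estimated lifetime.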
Encryption Impact on Redundant Bit-write Removal: Write Traffic Comparison
[Figure: write traffic comparison. Annotations: 50% of write traffic remaining (50% write traffic reduction); 5% of write traffic remaining (95% write traffic reduction); around 10x gap]
Encryption Impact on Redundant Bit-write Removal: Lifetime Comparison
[Figure: lifetime comparison. Annotations: 51.3 years; 2.6 years; 1.3 years; around 20x gap]
Encryption Impact on Partial Writes: Write Traffic Comparison
[Figure: write traffic comparison. Annotations: 100%; 37% (63% of write traffic removed); around 2.7x gap]
Encryption Impact on Partial Writes: Lifetime Comparison
[Figure: lifetime comparison. Annotations: 3.9 years; 1.3 years; 1.3 years; around 3x gap]
Impact of New Encryption Scheme on Partial Writes: Lifetime Comparison
• Lifetime improves from 1.3 to 2.5 years
• 4.9 years when combined with redundant bit-write removal
Outline
• Motivation
• Encryption impact on wear-leveling techniques for PRAM
• An adaptive ECC management scheme to improve PRAM lifetime
• Conclusions
Error Correcting Code (ECC)
• ECC stands for Error Correcting Code
  • uses extra memory storage to detect and correct memory bit errors
  • the amount of ECC storage depends on the number of bit errors to be detected and corrected
• Example: for 512-bit data, a 10-bit BCH ECC corrects a 1-bit error, and 10 more BCH ECC bits correct 1 more bit error
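The storage cost in the example can be tabulated with a short sketch. It assumes the 10-bits-per-correctable-error pattern from the slide continues for higher protection levels, which the slide only shows for the first two steps; the function names are hypothetical.

```python
DATA_BITS = 512            # protected data size from the slide
BCH_BITS_PER_ERROR = 10    # ECC bits added per correctable bit error


def ecc_bits(correctable_errors: int) -> int:
    """ECC storage for a 512-bit block (assumes the linear pattern continues)."""
    return BCH_BITS_PER_ERROR * correctable_errors


def ecc_overhead(correctable_errors: int) -> float:
    """ECC storage as a fraction of the data it protects."""
    return ecc_bits(correctable_errors) / DATA_BITS


for level in (1, 2, 4, 8):
    print(level, ecc_bits(level), round(ecc_overhead(level) * 100, 2))
```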
Dynamic Requirement for ECC to Protect PRAM
[Figure: PRAM memory failures over time, and the minimum ECC storage required over time compared with the PRAM storage allocated for ECC (maximum), up to the expected lifetime]
An Adaptive ECC Management Scheme
• Monitor PRAM wear-out status
  • leverage encryption counters to obtain the number of writes
• Adjust ECC protection level adaptively based on PRAM wear-out status
  • data and associated ECC are stored in a unified memory space to avoid fixed memory allocation for ECC
  • data pages with associated ECC pages are grouped together in the physical memory and all data pages in the group share the same level of ECC protection
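One way to picture the adaptive policy is the sketch below. The slides do not give the actual policy, so the thresholds, the use of the most-written line in a group, and the name `group_ecc_level` are all assumptions made for illustration.

```python
import bisect

# Hypothetical write-count thresholds at which protection steps up by one level.
WEAR_THRESHOLDS = [10**6, 10**7, 5 * 10**7]


def group_ecc_level(encryption_counters, thresholds=WEAR_THRESHOLDS):
    """Pick one ECC level for a whole page group from its encryption counters.

    Assumption: the most-written line in the group determines the level,
    since all data pages in a group share the same protection.
    """
    worst_case_writes = max(encryption_counters)
    # Level 1 is the starting protection; each threshold reached adds a level.
    return 1 + bisect.bisect_right(thresholds, worst_case_writes)


fresh_group = [120, 4_000, 950]
worn_group = [120, 4_000, 20_000_000]
print(group_ecc_level(fresh_group), group_ecc_level(worn_group))
```

Reusing the encryption counters as write counts avoids separate wear-tracking storage, which is the point made in the first bullet.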
Unified ECC and Data Memory Space
[Figure: PRAM divided into groups of N pages (Group i, Group j), each containing data pages and ECC pages]
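A back-of-the-envelope sketch of how a group of N pages might split between data and ECC as the protection level rises. The group size of 64 pages, the 10 ECC bits per correctable error per 512 data bits, and the name `split_group` are assumptions; the slides do not specify the layout arithmetic.

```python
DATA_BITS = 512
BCH_BITS_PER_ERROR = 10


def split_group(group_pages: int, correctable_errors: int):
    """Return (data_pages, ecc_pages) for the largest data area whose ECC
    still fits inside the same group of pages."""
    ecc_per_data = BCH_BITS_PER_ERROR * correctable_errors
    for data_pages in range(group_pages, 0, -1):
        # Ceiling division: ECC pages needed to cover this many data pages.
        ecc_pages = -(-data_pages * ecc_per_data // DATA_BITS)
        if data_pages + ecc_pages <= group_pages:
            return data_pages, ecc_pages
    return 0, 0


low_protection = split_group(64, 1)
high_protection = split_group(64, 4)
print(low_protection, high_protection)  # prints (62, 2) (59, 5)
```

Under these assumptions, a lightly worn group gives up few pages to ECC, and pages move from data to ECC only as the group wears out.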
PRAM Architecture with ECC and Encryption
[Figure: cache, encryption engine, and ECC logic in front of PRAM; PRAM is divided into groups of N pages (Group i, Group j), each containing data pages and ECC pages]
Conclusions
• Phase Change Memory (PCM) is a promising new memory technology.
• Encryption is necessary to protect the privacy of PCM-based main memory (PRAM).
• Encryption has a significant impact on various wear-leveling techniques.
• To improve PRAM lifetime, we propose an adaptive ECC management scheme.
Thank You and Questions?
The Architecture
[Figure: cache and encryption engine in the secure domain; PRAM in the insecure domain, holding encrypted data and … LPID, LC (non-encrypted data)]
Methodology
• Experimental setup
  • MIPS-like cycle-accurate timing simulator based on SimpleScalar
    • 32KB 2-way L1 data cache, block size 64 bytes
    • 1MB 16-way L2 unified cache, block size 256 bytes, 1024-cycle miss penalty
  • Encryption engine: the Advanced Encryption Standard (AES)
    • 128-bit data block, 128-bit key, 80-cycle latency
  • PCM-based main memory
    • 4GB, 1024-cycle latency, write endurance 10^8
    • wear-leveling techniques: ideal rotation/swapping
  • Benchmarks with high cache miss rates
    • ammp, art, equake, mcf, swim, vpr from SPEC 2000; lbm, mcf, milc and sphinx3 from SPEC 2006
• Lifetime estimation = PCM_write_endurance * benchmark_memory_footprint / benchmark_write_traffic_rate
A New Encryption Counter Scheme to Mitigate the Impact on Partial Writes
[Figure: cache line with Data Blocks 1 to 4 and its dirty vector; the counter data block holds address info, the cache line counter, and block counters 1 to 4; the counter of the dirty block is incremented (block counter1+1) before the encryption engine produces the encrypted data blocks]
• Only the dirty data block has to be written
Effectiveness of Our New Encryption Counter Scheme on Partial Writes