Project 1: DRAM timing violation due to process variation (PV) • Due to PV, the transistor and capacitor of a cell may vary in their dimensions, causing the charging time of the cell to vary • The situation is getting worse with smaller technologies • Threatens yield
Initial Study: Distribution Data • Challenge: maintain yield • How to overcome slow-to-write cells? • Naïve solution: there are both fast and slow cells. Can fast cells balance slow cells?
Issues • Cannot do it at cell granularity: the memory controller would not be able to handle different write speeds at the cell level • A practical way is to handle it at the row level • The write speed of a row is determined by its slowest cell: is that good enough, or do we need a different granularity, say a chunk (sub-row, super-row)? • Cost of fine granularity: the memory controller needs to bookkeep the information, which is a huge hardware overhead
Issues continued • Problem of coarse granularity: the write speed is limited by the slowest cell, so the fast cells may not be exploited • Question 1: what is the best granularity that the memory controller should consider in distinguishing different write speeds?
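The granularity trade-off in Question 1 can be explored with a small experiment. The sketch below is only an illustration: the per-cell write-time distribution (normal, 15 ns mean, 1.5 ns sigma), the array size, and the function names are all assumptions, not data from the initial study. It shows that larger chunks need fewer bookkeeping entries but are pinned to a slower worst-case cell.

```python
import random

def chunk_latencies(cell_times, chunk_size):
    """Write time of each chunk = write time of its slowest cell."""
    return [max(cell_times[i:i + chunk_size])
            for i in range(0, len(cell_times), chunk_size)]

def granularity_tradeoff(cell_times, chunk_sizes):
    """Map chunk size -> (bookkeeping entries, mean chunk write time)."""
    result = {}
    for size in chunk_sizes:
        lat = chunk_latencies(cell_times, size)
        result[size] = (len(lat), sum(lat) / len(lat))
    return result

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical per-cell write times in ns (assumed distribution, not measured data).
cells = [rng.gauss(15.0, 1.5) for _ in range(1 << 14)]

tradeoff = granularity_tradeoff(cells, [64, 512, 4096])
for size, (entries, mean_t) in tradeoff.items():
    print(f"chunk={size:5d} cells  entries={entries:4d}  mean write time={mean_t:.2f} ns")
```

In a real study the `cells` list would come from a variation model such as VARIUS rather than from `random.gauss`.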
Another Question • Suggestions: use only a few write times, and put memory chunks into bins • Question 2: given a chunk size, how to decide ?
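One possible way to experiment with the binning suggestion is sketched below. The slide leaves the binning policy open, so the equal-population (quantile) rule used here is purely an assumption, as are the function names and the sample chunk write times. Each chunk is assigned the smallest bin write time that is still safe for it.

```python
import bisect

def bin_write_times(chunk_times, num_bins):
    """Pick up to num_bins write times from equal-population quantiles.

    Assumed policy: the upper edge of each quantile group becomes a bin
    write time, and the last edge is always the slowest chunk.
    """
    ordered = sorted(chunk_times)
    n = len(ordered)
    edges = [ordered[((k + 1) * n + num_bins - 1) // num_bins - 1]
             for k in range(num_bins)]
    return sorted(set(edges))

def assign_bin(chunk_time, bin_times):
    """Return the smallest bin write time that is >= the chunk's write time."""
    i = bisect.bisect_left(bin_times, chunk_time)
    if i == len(bin_times):
        raise ValueError("chunk is slower than the slowest bin")
    return bin_times[i]

# Hypothetical chunk write times in ns.
chunk_times = [14.2, 15.1, 15.8, 16.0, 16.4, 17.3, 17.9, 19.5]
bins = bin_write_times(chunk_times, 3)
assigned = [assign_bin(t, bins) for t in chunk_times]
print(bins)
print(assigned)
```

Other policies (for example, choosing bin edges that minimize the average assigned write time) can be swapped in by replacing `bin_write_times` alone.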
A Reference Reading • Bo Zhao et al. “Variation-Tolerant Non-Uniform 3D Cache Management in Die Stacked Multicore Processor”, in MICRO 2009.
Tools You May Need • DRAMSim: Paul Rosenfeld, Elliott Cooper-Balis, and Bruce Jacob. “DRAMSim2: A Cycle Accurate Memory System Simulator”, IEEE Computer Architecture Letters, 10(1):16–19, January 2011. • VARIUS: S. Sarangi, B. Greskamp, R. Teodorescu, J. Nakano, A. Tiwari, and J. Torrellas. “VARIUS: A Model of Process Variation and Resulting Timing Errors for Microarchitects”, IEEE Transactions on Semiconductor Manufacturing, 21(1):3–13, 2008.
WoM Encoding for PCM • Slow write operation • A write blocks reads, which causes slowdown • SET: long latency (~8x that of a read) • RESET: short latency (about the same as a read) [Figure: power vs. time for the RESET and SET pulses]
PreSET Scheme [Qureshi_hpca’13] • Exploits the asymmetry (slow SET vs. fast RESET) in write operations • Performs the SET ahead of the actual write • Proactively SETs (proactive-SET) the memory line of a dirty cache line • Only RESETs are performed when the line is actually written back to memory (fast write) [Figure: DRAM cache eviction to PCM memory, with a proactive SET (fast write) and without it (slow write)]
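The effect of moving SETs off the write-back path can be sketched by counting bit operations. This is a simplified model under stated assumptions: the baseline is assumed to use a differential write (only changed bits are pulsed), the proactive SET is assumed to pulse only cells currently holding 0, and the latency units (SET = 8, RESET = 1, relative to a read) are the approximate ratios quoted on the previous slide. The function names are hypothetical.

```python
SET_LATENCY = 8    # approx. 8x a read, per the slide's rough figure
RESET_LATENCY = 1  # approx. the same as a read

def baseline_write(old, new):
    """Assumed differential write: SET 0->1 bits and RESET 1->0 bits at write-back."""
    sets = sum(o == "0" and n == "1" for o, n in zip(old, new))
    resets = sum(o == "1" and n == "0" for o, n in zip(old, new))
    return {"proactive_sets": 0, "sets_on_write": sets, "resets_on_write": resets}

def preset_write(old, new):
    """PreSET: SET the line to all 1s ahead of time, then RESET the 0 bits of new."""
    return {"proactive_sets": old.count("0"),  # done before the write-back
            "sets_on_write": 0,
            "resets_on_write": new.count("0")}

def write_latency(ops):
    """Write-back latency, assuming all bits of a line are written in parallel."""
    if ops["sets_on_write"]:
        return SET_LATENCY
    return RESET_LATENCY if ops["resets_on_write"] else 0

old, new = "10101010", "01010101"
base, pre = baseline_write(old, new), preset_write(old, new)
print("baseline:", base, "latency", write_latency(base))
print("PreSET:  ", pre, "latency", write_latency(pre))
```

For this pattern both schemes perform the same number of bit changes, but PreSET's write-back contains only RESETs. On lines where few bits change, PreSET performs more bit changes than the baseline.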
PreSET Increases no. of Bit Changes • For a 128B line: • Baseline sets 91 bits and resets 77 bits • PreSET sets 180 bits (1.98X) and resets 200 bits (2.6X)
PreSET Overall Effects • Positive: • Improves performance by 34% • Decreases energy-delay product (EDP) by 24% • Drawbacks: • Greatly increases write power (225%) and system power (30%) • Impairs lifetime of PCM (~60%) • Can we cut down PreSET’s power consumption without losing performance?
Write-Once Memory (WoM) • Original WoM code • First introduced for uni-directional (0→1) write-once memories [Rivest & Shamir’82] • Recently adopted in Flash [Jiang A.’07] • Cuts the no. of erasures by half • Improves write performance and lifetime • WoM code for PCM • Both writes have RESETs only [Figure: code tables mapping 2-bit data to 3-bit 1st-write and 2nd-write codewords, for the original WoM code and for the WoM code for PCM; in the PCM table the 1st-write codewords are 111, 110, 101, 011 and the 2nd-write codewords are 000, 001, 010, 100]
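The RESET-only property of a two-write WoM code can be checked exhaustively. In the sketch below, the data-to-codeword mapping is an assumption: it is the bitwise complement of the classic Rivest-Shamir table, which uses the same codeword sets as the slide's figure but may pair them with data values differently. The function names are hypothetical.

```python
from itertools import product

# Assumed mapping: bitwise complement of the classic Rivest-Shamir table.
FIRST_WRITE = {"00": "111", "01": "011", "10": "101", "11": "110"}
SECOND_WRITE = {"00": "000", "01": "100", "10": "010", "11": "001"}
DECODE = {cw: data
          for table in (FIRST_WRITE, SECOND_WRITE)
          for data, cw in table.items()}

def is_reset_only(old, new):
    """True if going from old to new never turns a 0 into a 1 (no SET needed)."""
    return all(not (o == "0" and n == "1") for o, n in zip(old, new))

def wom_write(cell, data, write_no):
    """Return the new 3-bit cell state for the 1st or 2nd write after a SET to 111."""
    if write_no == 1:
        new = FIRST_WRITE[data]
    elif write_no == 2:
        # Same data as stored: leave the cell alone; otherwise use the 2nd-write code.
        new = cell if DECODE[cell] == data else SECOND_WRITE[data]
    else:
        raise ValueError("this code supports only two writes per SET")
    if not is_reset_only(cell, new):
        raise ValueError("write would need a SET")
    return new

# Exhaustive check over every (first data, second data) pair.
for d1, d2 in product(FIRST_WRITE, repeat=2):
    c1 = wom_write("111", d1, 1)
    c2 = wom_write(c1, d2, 2)
    assert DECODE[c1] == d1 and DECODE[c2] == d2
print("all 16 two-write sequences use RESETs only")
```

The controller must track which write generation a cell is in, since the all-ones state doubles as both the freshly SET state and the 1st-write codeword for 00 in this mapping.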
WoM-SET • A proactive-SET based write scheme using WoM code. [Figure: timeline of a memory line under Baseline, PreSET, and WoM-SET, annotated with bit-change counts (5 RESET, 5 SET; 3 RESET, 1 SET; 11 RESET, 9 SET)]
Questions to Solve • What if we just apply WoM codes to the baseline (i.e., without PreSET)? How would that improve (or degrade) the baseline? • After applying code 1 and code 2 (the 1st-write and 2nd-write codes), how to proceed on the third write of a cell? • Option 1: write code 1 directly • Option 2: use PreSET and code 1 • Which one is better?