1 / 58

CROW A Low-Cost Substrate for Improving DRAM Performance, Energy Efficiency, and Reliability

This paper introduces CROW, a flexible substrate for improving DRAM performance, energy efficiency, and reliability. CROW-cache reduces DRAM latency by using copy rows, while CROW-ref reduces refresh overhead by remapping weak rows to strong copy rows. The paper also discusses the challenges of DRAM scaling and evaluates the effectiveness of CROW.

cirizarry
Download Presentation

CROW A Low-Cost Substrate for Improving DRAM Performance, Energy Efficiency, and Reliability

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CROWA Low-Cost Substrate for Improving DRAM Performance, Energy Efficiency, and Reliability Hasan HassanMinesh Patel Jeremie S. Kim A. GirayYaglikci Nandita Vijaykumar Nika Mansouri GhiasiSaugata Ghose Onur Mutlu

  2. Summary Source code available in July: github.com/CMU-SAFARI/CROW Challenges of DRAM scaling: • High access latency→ bottleneck for improving system performance/energy • Refresh overhead→ reduces performance and consume high energy • Exposure to vulnerabilities (e.g., RowHammer) Copy-Row DRAM (CROW) • Introduces copy rows into a subarray • The benefits of a copy row: • Efficiently duplicating data from regular row to a copy row • Quick access to a duplicated row • Remapping a regular row to a copy row CROW is a flexible substrate with many use cases: • CROW-cache & CROW-ref • Mitigating RowHammer • We hope CROW enables many other use cases going forward (20% speedup andconsumes22% less DRAM energy)

  3. Outline 1. DRAM Operation Basics 2. The CROW Substrate CROW-cache: Reducing DRAM Latency CROW-ref: Reducing DRAM Refresh Mitigating RowHammer 3.Evaluation 4. Conclusion

  4. DRAM Organization DRAM Cell DRAM Subarray DRAM Row Memory Bus CPU Memory Controller Sense Amplifier

  5. Accessing DRAM DRAM Cell DRAM Subarray DRAM Row Activate Precharge Read Sense Amplifier

  6. Outline 1. DRAM Operation Basics 2. The CROW Substrate CROW-cache: Reducing DRAM Latency CROW-ref: Reducing DRAM Refresh Mitigating RowHammer 3.Evaluation 4. Conclusion

  7. Challenges of DRAM Scaling access latency 1 refresh overhead 2 exposure to vulnerabilities 3

  8. Our Goal We want a substrate that enables the duplication and remapping of data within a subarray

  9. The Components of CROW DRAM Subarray CROW-table Memory Controller

  10. CROW Operation 1: Row Copy DRAM Subarray ACT-c (copy) Memory Controller

  11. Row Copy: Steps Restoration of both rows to source data source row: Activation of the destination row Activation of the source row Charge sharing Beginning of restoration 1 2 3 4 5 destination row: Sense Amplifier

  12. Row Copy: Steps Restoration of both rows to source data source row: Activation of the destination row Activation of the source row Charge sharing Beginning of restoration 1 2 3 4 5 destination row: Enables quickly copying a regular row into a copy row Sense Amplifier

  13. CROW Operation 2: Two-Row Activation DRAM Subarray ACT-t (two row) Memory Controller

  14. Two-Row Activation: Steps Activation of two rows Charge sharing Restoration 1 2 3 both charged or discharged fast Sense Amplifier

  15. Two-Row Activation: Steps Activation of two rows Charge sharing Restoration 1 2 3 both charged or discharged fast Enables fast access to data that is duplicated across a regular row and a copy row Sense Amplifier

  16. Outline 1. DRAM Operation Basics 2. The CROW Substrate CROW-cache: Reducing DRAM Latency CROW-ref: Reducing DRAM Refresh Mitigating RowHammer 3.Evaluation 4. Conclusion

  17. CROW-cache Problem:High access latency Key idea: Use copy rows to enable low-latency access to most-recently-activated regular rows in a subarray CROW-cache combines: • row copy → copy a newly activated regular row into a copy row • two-row activation → activate the regular row and copy row together on the next access Reduces activation latency by 38%

  18. CROW-cache Operation Request Queue load row X [bank conflict] DRAM Subarray load row X ACT-c CROW-table miss Issue ACT-c (copy) Allocate a copy row Issue ACT-t (two row) CROW-table hit 3 2 1 2 1 ACT-t CROW-table Memory Controller copy row 0 row X

  19. CROW-cache Operation Request Queue load row X [bank conflict] DRAM Subarray load row X Allocate a copy row CROW-table hit CROW-table miss Issue ACT-t Issue ACT-c 1 2 2 1 3 ACT-t Second activation of row X is faster CROW-table Memory Controller copy row 0 row X

  20. Outline 1. DRAM Operation Basics 2. The CROW Substrate CROW-cache: Reducing DRAM Latency CROW-ref: Reducing DRAM Refresh Mitigating RowHammer 3.Evaluation 4. Conclusion

  21. CROW-ref Problem:Refresh has high overheads. Weak rows lead to high refresh rate • weak row: at least one of the row’s cells cannot retain data correctly when refresh rate is decreased Key idea: Safelyreduce refresh rate by remappinga weak regular row to a strongcopy row CROW-ref uses: • row copy → copy a weak regular row to a strong copy row CROW-ref eliminates more than half of the refresh requests

  22. CROW-ref Operation strong strong weak strong strong If remapped, activate a copy row Perform retention time profiling Remap weak rows to strong copy rows On ACT, check the CROW-table 1 3 4 2 strong Retention Time Profiler

  23. CROW-ref Operation strong strong weak strong strong If remapped, activate a copy row Remap weak rows to strong copy rows Perform retention time profiling On ACT, check the CROW-table 3 4 2 1 strong Retention Time Profiler How many weak rows exist in a DRAM chip?

  24. Identifying Weak Rows Weak cells are rare [Liu+, ISCA’13] weak cell: retention < 256ms ~1000/238 (32 GiB) failing cells DRAM Retention Time Profiler • REAPER [Patel+, ISCA’17]PARBOR [Khan+, DSN’16]AVATAR [Qureshi+, DSN’15] • At system boot or during runtime

  25. Identifying Weak Rows Weak cells are rare [Liu+, ISCA’13] weak cell: retention < 256ms ~1000/238 (32 GiB) failing cells DRAM Retention Time Profiler • REAPER [Patel+, ISCA’17]PARBOR [Khan+, DSN’16]AVATAR [Qureshi+, DSN’15] • At system boot or during runtime A few copy rows are sufficient to halve the refresh rate

  26. Outline 1. DRAM Operation Basics 2. The CROW Substrate CROW-cache: Reducing DRAM Latency CROW-ref: Reducing DRAM Refresh Mitigating RowHammer 3.Evaluation 4. Conclusion

  27. Mitigating RowHammer victim aggressor precharge victim activate Key idea: remap victim rows to copy rows

  28. Outline 1. DRAM Operation Basics 2. The CROW Substrate CROW-cache: Reducing DRAM Latency CROW-ref: Reducing DRAM Refresh Mitigating RowHammer 3.Evaluation 4. Conclusion

  29. Methodology • Simulator • DRAM Simulator (Ramulator [Kim+, CAL’15]) https://github.com/CMU-SAFARI/ramulator • Workloads • 44 single-core workloads • SPEC CPU2006, TPC, STREAM, MediaBench • 160 multi-programmed four-core workloads • By randomly choosing from single-core workloads • Execute at least 200million representative instructions per core • System Parameters • 1/4 core system with 8 MiB LLC • LPDDR4 main memory • 8 copy rows per 512-row subarray Source code available in July: github.com/CMU-SAFARI/CROW

  30. CROW-cache Results 7.5% 7.1% 8.2% 6.9% * with 8 copy rows and a 64Gb DRAM chip (sensitivity in paper)

  31. CROW-cache Results 7.5% 7.1% 8.2% 6.9% CROW-cache improvessingle-/four-core performance and energy * with 8 copy rows a 64Gb DRAM chip (sensitivity in paper)

  32. CROW-ref Results 11.9% 7.8% 17.2% 7.1% * with 8 copy rows and a 64Gb DRAM chip (sensitivity in paper)

  33. CROW-ref Results 11.9% 7.8% 17.2% 7.1% CROW-ref significantly reduces the performance and energy overhead of DRAM refresh * with 8 copy rows a 64Gb DRAM chip (sensitivity in paper)

  34. Combining CROW-cache and CROW-ref 20% 22% 17% 23%

  35. Combining CROW-cache and CROW-ref 20% 22% 17% 23% CROW-(cache+ref) provides more performance and DRAM energy benefits than each mechanism alone

  36. Hardware Overhead For 8 copy rows and 16 GiB DRAM: • 0.5% DRAM chip area • 1.6% DRAM capacity • 11.3 KiB memory controller storage CROW is a low-cost substrate

  37. Other Results in the Paper • Performance and energy sensitivity to: • Number of copy-rows per subarray • DRAM chip density • Last-level cache capacity • CROW-cache with prefetching • CROW-cache compared to other in-DRAM caching mechanisms: • TL-DRAM [Lee+, HPCA’13] • SALP [Kim+, ISCA’12]

  38. Outline 1. DRAM Operation Basics 2. The CROW Substrate CROW-cache: Reducing DRAM Latency CROW-ref: Reducing DRAM Refresh Mitigating RowHammer 3.Evaluation 4. Conclusion

  39. Conclusion Source code available in July: github.com/CMU-SAFARI/CROW Challenges of DRAM scaling: • High access latency→ bottleneck for improving system performance/energy • Refresh overhead→ reduces performance and consume high energy • Exposure to vulnerabilities (e.g., RowHammer) Copy-Row DRAM (CROW) • Introduces copy rows into a subarray • The benefits of a copy row: • Efficiently duplicating data from regular row to a copy row • Quick access to a duplicated row • Remapping a regular row to a copy row CROW is a flexible substrate with many use cases: • CROW-cache & CROW-ref • Mitigating RowHammer • We hope CROW enables many other use cases going forward (20% speedup andconsumes 22% less DRAM energy)

  40. CROWA Low-Cost Substrate for Improving DRAM Performance, Energy Efficiency, and Reliability Hasan HassanMinesh Patel Jeremie S. Kim A. GirayYaglikci Nandita Vijaykumar Nika Mansouri GhiasiSaugata Ghose Onur Mutlu

  41. Backup Slides

  42. Latency Reduction with MRA

  43. Mitigating RowHammer victim aggressor precharge victim activate Key idea: remap victim rows to copy rows

  44. CROW-cache Performance 7.5% 6.6% 7.1% 0.7% single-core four-core

  45. CROW-ref Performance 11.9% 7.1% four-core four-core single-core single-core

  46. CROW-ref Energy Savings 17.2% 7.8% single-core four-core

  47. Speedup - CROW-cache Single-core

  48. Speedup - CROW-cache Four-core

  49. Energy – CROW-cache

  50. Comparison to TL-DRAM and SALP

More Related