Storage Class Memory Architecture for Energy Efficient Data Centers
Bruce Childers, Sangyeun Cho, Rami Melhem, Daniel Mossé, Jun Yang, Youtao Zhang
Computer Science Department, University of Pittsburgh
Server power consumption (Watts)
[Figure: server power breakdown with Memory and Processors labeled; values shown: 2,972W and 1,614W] (Lefurgy et al., ’03)
Challenges with DRAM
• Power wall
  • A large fraction of system power is consumed in DRAM
• Cost wall
  • Memory accounts for a major fraction of overall server cost
• Scaling wall
  • DRAM scaling becomes harder and harder
  • Higher speed (bandwidth) means faster clocking
  • Larger size means more loading on buses and higher refresh overheads (power & performance)
New non-volatile memory to the rescue
• Non-volatile
• Byte-addressable
• Acceptable performance
• Good scaling potential
* Subject to write endurance limit
[Figure: US patents granted for PCM (PRAM), MRAM, and FRAM (Lam, VLSI-TSA ’08)]
Agenda
• Storage class memory architecture
• Industry progress
• Our vision
• Some research questions
Storage class memory architecture
• PCM is slow and write-endurance limited; we need DRAM buffering
• [Diagram components: two L1 caches ($$), L2 cache ($$), smart memory controller, DRAM, PCM-Small, PCM-Large]
• “Smart mem. controller”: handles the different technologies; cache management, wear leveling, error handling (ECC, sparing), trim, and low-level scheduling
• PCM-Small: this is PCM working memory; a better species (e.g., SLC)?
• PCM-Large: this is PCM “storage” space; maybe equivalent to PCM-Small, or maybe slower and larger (e.g., MLC)?
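The DRAM-buffering idea on this slide can be illustrated with a toy model. This is only a sketch under stated assumptions, not the controller design from the talk: the class name `HybridMemory`, the LRU replacement policy, and the write-back behavior are all hypothetical choices made for illustration. It shows how a small DRAM buffer can absorb repeated writes so that PCM cells are programmed only when a dirty line is evicted.

```python
from collections import OrderedDict


class HybridMemory:
    """Toy model (hypothetical): a small LRU DRAM buffer in front of PCM."""

    def __init__(self, dram_lines, pcm_lines):
        self.dram = OrderedDict()           # line address -> (value, dirty)
        self.dram_lines = dram_lines        # DRAM buffer capacity in lines
        self.pcm = [0] * pcm_lines          # backing PCM contents
        self.pcm_writes = [0] * pcm_lines   # per-line PCM write counts (wear)

    def _evict_if_full(self):
        if len(self.dram) >= self.dram_lines:
            addr, (value, dirty) = self.dram.popitem(last=False)  # LRU victim
            if dirty:                       # only dirty lines cost a PCM write
                self.pcm[addr] = value
                self.pcm_writes[addr] += 1

    def read(self, addr):
        if addr in self.dram:
            self.dram.move_to_end(addr)     # mark as most recently used
            return self.dram[addr][0]
        self._evict_if_full()
        value = self.pcm[addr]              # miss: fetch from PCM
        self.dram[addr] = (value, False)
        return value

    def write(self, addr, value):
        if addr in self.dram:
            self.dram.move_to_end(addr)
        else:
            self._evict_if_full()
        self.dram[addr] = (value, True)     # absorb the write in DRAM


mem = HybridMemory(dram_lines=2, pcm_lines=8)
for i in range(100):
    mem.write(0, i)          # 100 writes to one hot line stay in DRAM
mem.write(1, 7)
mem.write(2, 9)              # buffer is full, so line 0 is written back
print(mem.pcm_writes[0])     # 1
```

In this model the hot line costs a single PCM write instead of one hundred, which is the effect DRAM buffering is meant to have on both latency and endurance.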
Prior work & findings
• Memory energy savings
  • Sizable savings of 20~90% [Zhou et al., ’09, Park et al., ’11]
  • At a manageable performance hit of ~5% or so
• Hardware wear leveling is feasible [Qureshi et al., ’09, Seong et al., ’10]
• Other system implications
  • Fast system on and off [Doh et al., ’09]
  • Single-level data store [Venkataraman et al., ’11]
  • Rapid checkpointing [Dong et al., ’09]
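To make "hardware wear leveling" concrete, here is a minimal sketch of Start-Gap, one published hardware scheme that needs only two registers and one spare line. That this is the specific scheme behind the slide's citations is an assumption; the class name `StartGap`, the parameter `psi` (writes between gap moves), and the tiny sizes are illustrative choices.

```python
class StartGap:
    """Sketch of Start-Gap wear leveling: N logical lines, N + 1 physical slots."""

    def __init__(self, n_lines, psi=100):
        self.n = n_lines
        self.psi = psi                      # gap moves once every psi writes
        self.start = 0                      # rotation register
        self.gap = n_lines                  # index of the unused physical slot
        self.cells = [0] * (n_lines + 1)
        self.wear = [0] * (n_lines + 1)     # per-slot write counts
        self.writes_since_move = 0

    def _map(self, la):
        """Translate a logical line address to a physical slot."""
        pa = (la + self.start) % self.n
        return pa + 1 if pa >= self.gap else pa

    def _move_gap(self):
        if self.gap == 0:
            # Wrap around: the last slot moves to slot 0, start advances.
            self.cells[0] = self.cells[self.n]
            self.wear[0] += 1
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            # Shift the line just below the gap into the gap.
            self.cells[self.gap] = self.cells[self.gap - 1]
            self.wear[self.gap] += 1
            self.gap -= 1

    def write(self, la, value):
        pa = self._map(la)
        self.cells[pa] = value
        self.wear[pa] += 1
        self.writes_since_move += 1
        if self.writes_since_move == self.psi:
            self.writes_since_move = 0
            self._move_gap()

    def read(self, la):
        return self.cells[self._map(la)]


sg = StartGap(n_lines=8, psi=4)
for la in range(8):
    sg.write(la, la)
for _ in range(1000):
    sg.write(3, 3)           # hammer a single logical line
print(max(sg.wear))          # far below 1000: the hot line keeps moving
```

Because the hot logical line is shifted to a new physical slot each time the gap passes it, its writes are spread over all slots rather than wearing out one.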
Industry progress: Samsung
• Lee et al., ISSCC ’07; Lee et al., JSSC ’08: 512Mb @90nm; diode switch design; 266MB/s read; 4.64MB/s write (x16)
• TechInsights decap ’10: 512Mb @60nm?; diode switch design; believed to be a tech.-migrated design
• Chung et al., ISSCC ’11: 1Gb @58nm; LPDDR2-N; “write skewing”; 6.4MB/s write; “DCWI” (~Flip-N-Write)
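The slide likens “DCWI” to Flip-N-Write. As a rough illustration of the Flip-N-Write idea only (not of Samsung's circuit, whose details are not given here), the sketch below compares the new word with what is already stored and writes either the word or its complement, plus a flip bit, whichever programs fewer cells. The function names and the 16-bit width are assumptions.

```python
def flip_n_write(stored, flip, new, width=16):
    """Return (stored', flip', bits_programmed) after writing `new`.

    `stored` and `flip` are the current cell contents; only cells whose
    value changes are counted as programmed.
    """
    mask = (1 << width) - 1
    new &= mask
    inverted = ~new & mask
    # Cost of storing the word as-is, with the flip bit cleared.
    plain_cost = bin((stored ^ new) & mask).count("1") + int(flip != 0)
    # Cost of storing the complement, with the flip bit set.
    inv_cost = bin((stored ^ inverted) & mask).count("1") + int(flip != 1)
    if inv_cost < plain_cost:
        return inverted, 1, inv_cost
    return new, 0, plain_cost


def decode(stored, flip, width=16):
    """Recover the logical word from the stored cells and the flip bit."""
    mask = (1 << width) - 1
    return (~stored & mask) if flip else (stored & mask)


stored, flip, cost = flip_n_write(0x0000, 0, 0xFFFF)
print(cost)                          # 1: only the flip bit is programmed
print(hex(decode(stored, flip)))     # 0xffff
```

The two costs always add up to `width + 1`, so the cheaper option never programs more than about half of the cells, which is how this style of encoding raises effective write bandwidth and reduces wear.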
Industry progress: Numonyx (Micron)
• “Alverstone” (OMNEO): 128Mb @90nm; TR switch design; 40MB/s read (?); <1MB/s write (?)
• “Bonelli”: 1Gb @45nm; 1.8V I/O
• “Imola” and “Mandello”: 2Gb & 4Gb @45nm; 1.2V & 1.8V I/O; LPDDR2-NVM & DDR3-NVM
• Timeline annotations: early access program (2009); (Servalli, IEDM ’09); numerous press releases (slated for MP in 2011); (2011~2012?)
Our vision
• To drastically reduce the power needed for TB-capacity main memory
• Cross-cutting, holistic system design
• With heterogeneous resources, management tasks are best handled by collaboration across layers
• MemVisor
Research questions (infra)
• PCM has the potential to beat DRAM in terms of capacity and power…
  • But what about performance? How much performance is “good enough” for key applications?
• What cross-layer information is critical for MemVisor?
  • What are appropriate interfaces?
• Can we predictively allocate different amounts of DRAM and PCM to a virtual machine?
  • Hardware and software support?
Research questions (application)
• How can we best utilize persistency in memory?
  • Extension of storage? How?
  • New algorithms and data structures?
• PCM provides “storage” that is orders of magnitude faster than HDDs
  • Any changes needed in OS? DBMS?
• New algorithms that work synergistically with the underlying hardware and system layers for longer lifetime and higher reliability?
Storage Class Memory Architecture for Energy Efficient Data Centers
www.cs.pitt.edu/PCM