
An Analysis of Virtual Channel Memory and Enhanced Memories Technologies


Presentation Transcript


  1. An Analysis of Virtual Channel Memory and Enhanced Memories Technologies Bill Gervasi Technology Analyst Chairman, JEDEC Memory Parametrics Committee

  2. Agenda • Next Generation PC Controllers • Concerns With Standard SDRAM • Cached SDRAMs: Enhanced & Virtual Channel • Controller Complexity vs DRAM type • Pros and Cons of Cached DRAMs • Conclusions & Call to Action

  3. Next Gen PC Controllers [Block diagram: requestors R1, R2, R3 feed a request arbiter (ReqArb) and path arbiter (PathArb) through command FIFOs and data FIFOs to the I/O and DRAM]

  4. Requestors • CPU port: Cache fills dominate • DRAM frequency = 1/3 to 1/5 of CPU frequency • Big L2 caches randomize memory accesses • Graphics port: Lots of random accesses • Especially 3D rendering • South Bus port: Mix of short, long packets

  5. SDRAM Roadmap: SDRAM 66 → PC-100 → PC-133 → DDR-266 → DDR-333 → DDR II. The good news: ever faster cores, manageable power, simple evolutionary changes, at about the same price. The bad news: random access time has not changed appreciably, and power is higher than it needs to be.

  6. Concerns With SDRAM • Power • Latency • Refresh Overhead

  7. SDRAM Power vs Latency • Active power is very high • Active-on power is 500X inactive-off power • Encourages controllers to close pages • But access time to a closed page is long • Row activation time + column read time

  8. SDRAM Power Profile
     Power State*                 Relative Power    CPU Clock Latency**
     Active on (open page)        100%              0 x 5 = 0
     Inactive on (closed page)    12%               3 x 5 = 15
     Active off                   4%                1 x 5 = 5
     Inactive off                 0.2%              4 x 5 = 20
     Sleep                        0.4%              200 x 5 = 1000
     * Not industry standard terms – simplified for brevity
     ** Assuming memory clock frequency = 1/5 CPU frequency
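The power states on this slide can be turned into a small model. The sketch below estimates average power for two controller policies; the residency mixes and function names are illustrative assumptions, not numbers from the presentation:

```python
# Simplified SDRAM power-state model using the slide's figures.
# Relative power is normalized to "active on" = 100%; exit latency is in
# memory clocks, and CPU clocks = 5x memory clocks per the slide's footnote.
CPU_PER_MEM_CLOCK = 5

POWER_STATES = {            # state: (relative power, exit latency in mem clocks)
    "active_on":    (1.000, 0),
    "inactive_on":  (0.120, 3),
    "active_off":   (0.040, 1),
    "inactive_off": (0.002, 4),
    "sleep":        (0.004, 200),
}

def cpu_clock_latency(state):
    """Access latency from this state, in CPU clocks."""
    return POWER_STATES[state][1] * CPU_PER_MEM_CLOCK

def average_power(residency):
    """Average relative power for a residency profile (fractions sum to 1)."""
    return sum(POWER_STATES[s][0] * frac for s, frac in residency.items())

# Hypothetical residency mixes (illustrative, not measured):
open_page   = {"active_on": 0.6, "inactive_on": 0.4}    # pages held open
closed_page = {"active_on": 0.2, "inactive_off": 0.8}   # aggressive power-down
print(average_power(open_page))    # ~0.65 of peak power
print(average_power(closed_page))  # ~0.20 of peak power, but latency rises
```

This mirrors the slide's tension: closing pages cuts power dramatically (active-on is 500X inactive-off), but every access from a closed state pays the reactivation latency.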

  9. Refresh • Gigabit generation refresh overhead • 256Mb generation: 75ns every 15.6us • 1Gb generation will be 120ns every 7.8us • This is a ~3X increase in refresh overhead
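The 3X figure follows directly from the slide's numbers; a quick arithmetic check:

```python
def refresh_overhead(trfc_ns, trefi_ns):
    """Fraction of DRAM time consumed by refresh: refresh duration / interval."""
    return trfc_ns / trefi_ns

# Slide numbers: 75 ns every 15.6 us at 256Mb; 120 ns every 7.8 us at 1Gb.
mb256 = refresh_overhead(75, 15_600)   # ~0.48% of time spent refreshing
gb1   = refresh_overhead(120, 7_800)   # ~1.54%
print(gb1 / mb256)                     # ~3.2, the slide's "3X" penalty
```

The overhead roughly triples because the refresh cycle gets longer (120 ns vs 75 ns) at the same time as the refresh interval halves (7.8 us vs 15.6 us).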

  10. An Argument for Cached SDRAM Architectures

  11. What Are Cached SDRAMs? [Diagram: on chip, a DRAM array connects to an SRAM array over a wide internal I/O channel (x256 to x1024); the SRAM array connects to the external narrow I/O channel (x4 to x32)]

  12. Cached DRAM architectures can address SDRAM’s key limitations. However, only commodity memories are affordable for mass market use. I hope to encourage the adoption of cache for all standard future SDRAMs.

  13. Cached SDRAM Solutions • Power: Encourage closed page use • Latency: Fast access to closed pages • Refresh: Background operation

  14. Cached DRAM • SRAM cache before DRAM array • Much like CPU onchip caches • Exploit wide internal buses for fast block transfer of DRAM to SRAM and back • Allow DRAM core to be deactivated while… • … I/O performed on SRAM
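The block-transfer idea on this slide can be sketched in a toy model. The class name, sizes, and eviction policy below are illustrative assumptions, not any vendor's interface:

```python
# Toy model of a cached DRAM: a small SRAM line cache sits in front of the
# DRAM array, and a miss triggers a wide internal block transfer of one
# whole row into SRAM, after which I/O is served from SRAM alone.
class CachedDRAM:
    def __init__(self, rows, row_bits=1024, lines=4):
        self.array = {r: bytearray(row_bits // 8) for r in range(rows)}
        self.lines = lines
        self.sram = {}            # row -> cached copy (the SRAM line cache)

    def read(self, row, col):
        if row not in self.sram:              # miss: wide internal transfer
            if len(self.sram) >= self.lines:  # evict (policy is illustrative)
                self.sram.pop(next(iter(self.sram)))
            self.sram[row] = bytearray(self.array[row])
            # The DRAM bank could now be precharged/deactivated;
            # subsequent I/O for this row touches only SRAM.
        return self.sram[row][col]            # hit: fast SRAM access

mem = CachedDRAM(rows=8)
mem.array[3][0] = 42
print(mem.read(3, 0))   # 42, via a fresh SRAM fill
print(mem.read(3, 1))   # served from SRAM; DRAM row untouched
```

The key point the model captures is that the expensive operation (the DRAM row transfer) happens once per miss over a very wide internal bus, while the narrow external bus only ever talks to SRAM.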

  15. Two Leading Cached DRAMs • Enhanced SDRAM • SRAM in sense amps • Direct mapped into array • Virtual Channel SDRAM • SRAM in periphery • DRAM associativity maintained by controller Note: Other cached DRAM architectures exist, however none have been proposed as a commodity DRAM standard.

  16. Enhanced & Virtual Channel [Diagram. Enhanced: DRAM array → sense amps & SRAM (4Kb x 4) → I/O. Virtual Channel: DRAM array → segment sense amps → channel SRAM (1Kb x 17) → I/O]

  17. Basic Functions • Enhanced: Activate, Read (& cache fill), Write (array & cache update), Precharge, Auto Refresh, Self Refresh (note: identical to standard SDRAM) • Virtual Channel: Activate, Prefetch (cache fill), Read (from cache), Write (cache update), Restore (array update), Precharge, Auto Refresh, Self Refresh

  18. Power Factors • Enhanced • Page open to read to cache • Closed while reading • Open during writes • Virtual Channel • Page open to read to cache • Closed while reading • Closed during writes • Reopen for restore

  19. Latency Factors • Enhanced • Read automatically fills cache • Reads on closed page permitted • No write latencies • Masking increases write recovery time (normal for SDRAM architectures) • Virtual Channel • Prefetch for cache fill • Reads on closed page permitted • Writes on closed page permitted • Reactivation to restore • Inherently RMW – no penalty for masking

  20. Refresh Factors • Enhanced • Activation not allowed • Permits reads during auto refresh • Virtual Channel • Activation not allowed • No reads during auto refresh

  21. Let’s Compare the Power and Activity Profiles of SDRAM, Enhanced, and Virtual Channel

  22. Closed Page SDRAM Profile [Power and command activity traces, lower to higher power. Read: NOP, ACT, R, R, PRE, NOP. Write: NOP, ACT, W, W, PRE, NOP]

  23. Enhanced Profile [Power and command activity traces, lower to higher power. Read: NOP, ACT, R-PRE, R, R, NOP. Write: NOP, ACT, W, W, PRE, NOP]

  24. Virtual Channel Profile [Power and command activity traces, lower to higher power. First: NOP, ACT, F-PRE, R, W, NOP. Second: NOP, R, W, ACT, RST, NOP]

  25. Power Profile Factors • Standard L3 cache hit analysis • Profile of memory operations affected by randomness of accesses • Balance of activations & precharges, reads & writes • Requestor channel profile depends on application – games, video, or office app?

  26. Cache Entry Size • Max bursts per hit before reload needed = entry size ÷ bus width ÷ burst length • Miss impacts performance & power • Affects controller association overhead • Die size impacted by entry size
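As a sanity check on the formula above (the device width and burst length chosen here are hypothetical examples, not from the slide):

```python
def bursts_per_entry(entry_bits, bus_width_bits, burst_length):
    """Max full bursts served from one cache entry before a reload is needed:
    entry size / bus width / burst length, per the slide's formula."""
    return entry_bits // bus_width_bits // burst_length

# e.g. a Virtual Channel 1Kb channel on a x16 device with burst length 4:
print(bursts_per_entry(1024, 16, 4))   # 16 bursts per channel fill
# e.g. a 4Kb entry on a x4 device with burst length 8:
print(bursts_per_entry(4096, 4, 8))    # 128 bursts per fill
```

Wider buses and longer bursts drain an entry faster, so narrow-entry designs see more replacement traffic, which is the slide's performance and power concern.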

  27. Randomness • Enhanced controllers replace any entry • Physical memory locality determines • Virtual controllers can lock channels, e.g. • Screen refresh channel never replaced • Priorities can be assigned to channels • Weighted replacement algorithms possible
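A controller-side replacement policy along these lines might look like the sketch below. The class, its weights, and the weighted-random victim choice are illustrative assumptions about what a Virtual Channel controller could do, not anything from the VCM specification:

```python
import random

# Sketch of a channel allocator for a Virtual Channel controller: channels
# can be locked (e.g. a screen-refresh channel that is never replaced) and
# carry weights so victim selection prefers evicting low-priority channels.
class ChannelAllocator:
    def __init__(self, n_channels=16):
        self.tags   = [None] * n_channels    # which DRAM segment each holds
        self.locked = [False] * n_channels
        self.weight = [1.0] * n_channels     # higher weight = evict sooner

    def lock(self, ch):
        self.locked[ch] = True               # never chosen as a victim

    def set_priority(self, ch, weight):
        self.weight[ch] = weight             # weighted replacement knob

    def victim(self):
        """Pick a channel to replace, honoring locks and weights."""
        candidates = [c for c in range(len(self.tags)) if not self.locked[c]]
        return random.choices(
            candidates, weights=[self.weight[c] for c in candidates])[0]

alloc = ChannelAllocator()
alloc.lock(0)                   # e.g. reserve channel 0 for screen refresh
print(alloc.victim())           # some channel 1..15, never the locked one
```

This is the flexibility the slide credits to VCM: because the controller owns the associativity, it can pin, prioritize, and weight channels per requestor, where an Enhanced controller's direct map fixes the replacement choice in hardware.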

  28. Enhanced Controller Issues • Direct mapping restricts complexity • 4 physical banks x 4 cache lines • Comparator for cache hit
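The direct-mapped hit check is cheap precisely because it is a single tag comparison per access. A sketch using the slide's 4 banks x 4 cache lines (the row-to-line mapping function here is an illustrative assumption):

```python
# Direct-mapped hit logic for an Enhanced-style controller: each (bank, line)
# slot caches exactly one row, so a hit check is one comparison -- the
# "comparator" on the slide. Sizes per the slide: 4 banks x 4 cache lines.
BANKS, LINES = 4, 4
tags = [[None] * LINES for _ in range(BANKS)]   # cached row tag per slot

def is_hit(bank, row):
    line = row % LINES            # direct map: the row address picks its slot
    return tags[bank][line] == row

def fill(bank, row):
    tags[bank][row % LINES] = row  # replace whatever occupied that slot

fill(0, 17)
print(is_hit(0, 17))   # True
print(is_hit(0, 21))   # False: row 21 maps to the same slot (21 % 4 == 1)
```

The downside is visible in the example: two hot rows that map to the same line evict each other constantly, which is why the slide credits VCM's associativity with better headroom in random-access workloads.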

  29. Virtual Controller Issues • Tags required to do closed page writes • Keep track of where to restore • 4 physical banks x 16 channels • CAM needed to exploit large # channels • CAM routing challenging – metal over array?
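The CAM requirement follows from full associativity: any of a bank's 16 channels may hold any DRAM segment, so a lookup must compare the incoming segment tag against all 16 entries at once. The sketch below models that as a scan (hardware does the compares in parallel in one cycle); names are illustrative:

```python
# Why a Virtual Channel controller wants a CAM: with 16 fully associative
# channels per physical bank, finding which channel (if any) holds a given
# DRAM segment means comparing against every channel tag simultaneously.
N_CHANNELS = 16

def cam_lookup(channel_tags, segment):
    """Return the channel holding `segment`, or None on a miss.
    Software scan standing in for a hardware parallel compare."""
    for ch, tag in enumerate(channel_tags):
        if tag == segment:
            return ch
    return None

bank_channels = [None] * N_CHANNELS
bank_channels[5] = 0x2A                  # channel 5 holds segment 0x2A
print(cam_lookup(bank_channels, 0x2A))   # 5  (hit)
print(cam_lookup(bank_channels, 0x07))   # None (miss)
```

Multiplied across 4 physical banks, that is 64 parallel comparators plus priority encoding, which is where the slide's routing concern and the large gate-count gap versus a direct-mapped comparator come from.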

  30. Gate Complexity vs Throughput (controller gate counts, ordered from lowest to highest throughput: SDRAM, then EMS, then VCM)
      10-20K gates: 2 cmd FIFO, 1 entry per phys bank, write flush, direct map
      35-50K gates: one entry hit logic, closed bank, visible refresh, 3 cmd FIFO
      100-120K gates: 4 entries per phys bank, scheduled writeback, hidden refresh, 6-8 cmd FIFO
      200+K gates: full CAM (16 channels per phys bank x 4 phys banks), scheduled writeback with reorder

  31. Brian Davis @ University of Michigan analysis [simulation chart not reproduced]

  32. Brian Davis @ University of Michigan analysis [simulation chart not reproduced]

  33. Virtual Channel • PROS: Best performance headroom from cache associativity; Flexible channel maps; Saves power on reads and writes; More hits in random access environment; Pin compatible in SDR & DDR I configurations • CONS: Die penalty of 7-13%; Incompatible command set; Cannot have one die for standard and virtual; Refresh blocks access; Simple controller performance < “standard”; Small cache entry sizes cause more replacement; 2 banks = higher power

  34. Enhanced Memory • PROS: Die penalty of only 2-5%; Compatible command set; Same die supports standard or enhanced; Optional features; Simple to implement; Pin compatible; No performance drop even for simple controller; Refresh does not block reads; 4 banks = lower power • CONS: Royalty to DRAM suppliers; Performance boost lower than VCM max; Less flexible maps; No power savings on writes

  35. User Perspective • Users want a standard cached DRAM architecture – solves real system problems • Lower power for closed bank policy • Lower latency • Hidden refresh • Costs cannot exceed 5% over non-cached

  36. Conclusions • Cached DRAM is highly desirable • Both Enhanced and Virtual offer significant benefits • VCM requires more controller complexity • Low end not well served • Next “commodity” must fit all markets • Enhanced preferred over Virtual Channel • Better chance at becoming “standard” • Compatibility, simple transition lowers adoption risk • Like L2 went direct → set associative, maybe VCM is a good fit for DDR III in 2006/2007

  37. Call to Action • Act now! • Demand vendors provide cached DRAM • Express support for Enhanced style • Commit controller designs • Industry likely to decide in March 2001

  38. Thank You
