3. Overall design space of main memories
Dezső Sima, September 2008 (Ver. 1.0)
Sima Dezső, 2008
• Instruction Set Architecture (ISA)
• Microarchitecture
• Underlying principle of operation
• Underlying principles of implementation
• Principles of attaching memory and I/O
• Von Neumann computational model
Figure: Design space of processors
• Control Set Architecture (CSA)
• Microarchitecture of the MM
• Underlying principle of operation
• Underlying principles of implementation
Figure: Design space of main memories (MM)
Underlying principle of operation
• Refreshing (not discussed)
• Basic operation
Figure: Underlying principle of operation of DRAM devices
Basic operation of DRAM devices
(Assuming device/bank/row/column addressing)
• Reads: Activate, Read, Precharge
• Writes: Activate, Write, Precharge
The Activate step transfers C, AD, AB and AR; the Read or Write step transfers C, AD, AB and AC. Read data (RD) or write data (WD) appear on the data lines; tRAC is marked on the time axis (t).
C: Command
AD: Device address
AB: Bank address
AR: Row address
AC: Column address
Figure: Basic operation of DRAM devices
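The Activate, Read/Write, Precharge sequence can be sketched as a list of command records. This is only an illustrative model, not part of the slides: the `Command` class and the `access_sequence` function are hypothetical names, and the field names simply reuse the slide's abbreviations (C, AD, AB, AR, AC).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Command:
    """One command transfer, using the slide's field abbreviations."""
    c: str                      # C: command name
    ad: int                     # AD: device address
    ab: int                     # AB: bank address
    ar: Optional[int] = None    # AR: row address (Activate only)
    ac: Optional[int] = None    # AC: column address (Read/Write only)


def access_sequence(op: str, device: int, bank: int,
                    row: int, column: int) -> List[Command]:
    """Return the three-step sequence for a single read or write."""
    if op not in ("Read", "Write"):
        raise ValueError("op must be 'Read' or 'Write'")
    return [
        Command("Activate", device, bank, ar=row),   # open the row
        Command(op, device, bank, ac=column),        # access one column
        Command("Precharge", device, bank),          # close the bank
    ]


if __name__ == "__main__":
    for cmd in access_sequence("Read", device=0, bank=2, row=17, column=5):
        print(cmd)
```

The sketch ignores timing (for example tRAC) and refreshing, which the slides do not discuss either.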
Underlying principles of the implementation of MMs
• One/two level implementation
• Multiplexing commands, addresses and data
• Bus topology
• Type of signaling
• Managing the DRAM status
• Principle of communication
• Bus width
• Type of synchronisation
Figure: Main dimensions of the design space of the underlying principles of implementation of MMs
One/two-level implementation
One-level implementation
• MM is built up of DRAM devices
• Type of mounting: typically soldered
• Expandability: not expandable
• Board space requirement: large board space
• Signal integrity: good signal integrity
• E.g. (earliest PC main memories), XDR memories
Two-level implementation
• MM is built up of modules, modules are built up of DRAM devices
• Type of mounting: typically socketed
• Expandability: easily expandable
• Board space requirement: small board space
• Signal integrity: unfavorable signal integrity
• E.g. all other types of main memories
Figure: One/two level implementation of main memories
Managing DRAM status
• Detached from the basic operation (via a second dedicated interface): RDRAM, XDR
• Along with the basic operation: all other types of main memories
Figure: Options to manage DRAM status
This dimension of the design space is not discussed.
Assumptions for multiplexing commands, addresses and data
Commands and addresses
• are unidirectional (they flow in one direction, from the MC to the MM),
• are transferred using the same communication principle.
Data
• is bidirectional (read data flow from the MM to the MC, write data from the MC to the MM),
• is transferred separately from the addresses/commands.
AR/AC multiplexing
• AR/AC separate: first DRAMs (before the MK4096)
• AR/AC multiplexed: asynchronous DRAMs (from the MK4096 on), synchronous DRAMs (SDRAMs)
DW/DR multiplexing
• DW/DR separate (unidirectional)
• DW/DR multiplexed
• DW/DR not multiplexed (bi-directional)
Figure: Multiplexing row and column addresses (AR/AC) vs read and write data (DW/DR)
Principle of communication
• Via a parallel bus in a single cycle (e.g. 1 cycle)
• Packet-based in a number of cycles (e.g. 4 cycles, or e.g. 16 cycles for packet transfer over a one bit wide data path)
Figure: Principles of communication used in main memories
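The cycle counts in this figure follow from dividing the packet size by the width of the data path. A minimal sketch, assuming a 16-bit packet (a size chosen here only because it reproduces the 1, 4 and 16 cycle examples; the slide does not state it) and a hypothetical helper name `transfer_cycles`:

```python
def transfer_cycles(packet_bits: int, path_width_bits: int) -> int:
    """Cycles needed to move one packet over a path of the given width."""
    if packet_bits <= 0 or path_width_bits <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division: a partly filled last cycle still counts.
    return (packet_bits + path_width_bits - 1) // path_width_bits


PACKET_BITS = 16  # assumption for illustration, not given on the slide

if __name__ == "__main__":
    for width in (16, 4, 1):
        cycles = transfer_cycles(PACKET_BITS, width)
        print(f"{width:2d}-bit wide path: {cycles} cycle(s)")
```

With these assumed numbers a 16-bit parallel bus needs a single cycle, whereas narrower packet-based paths trade cycles for fewer lines.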
Bus topology
• Multi-drop (stub-bus, fly-by)
• Point-to-point
• Daisy-chained
Attaching DRAM devices (soldered) or DIMMs (socketed)
Signal integrity: unfavorable (due to TL discontinuities), better, good, excellent
Peak transfer rate (recently): up to 16 Gb/s (with increasingly sophisticated termination), up to 4.8 Gb/s, up to 4.8 Gb/s, up to --- Gb/s
Figure: Bus topologies used to connect DRAM devices or modules to the memory controller
Bus width
• Parallel bus: Pentium, 32, 64
• Serial bus: width of serial bus
• Transmission: parallel-based, parallel
Figure: Bus topologies used to attach DRAM devices or DIMMs
(Shown separately for the data bus and the address/control bus; topologies: multi-drop (stub-bus, fly-by), point-to-point, daisy-chained; memory types shown: SDRAM, DDR, DDR2, DDR3, RDRAM, XDR, XDR2, FB-DIMM)
Figure: Synchronisation alternatives
(Central, source and mesochronous synchronization for capturing control/address information; memory types shown: SDRAM, DDR/2/3, CRDRAM, RDRAM, XDR (?), XDR2 (?))
Signals
• Open ended
  - TTL (5 V): PCI
  - LVTTL (3.3 V): SDRAM, PCI, PCI-X, AGP1.0
• Voltage referenced
  - SSTL: SSTL2 (DDR), SSTL1.8 (DDR2), SSTL1.5 (DDR3)
  - AGP2.0 (1.5 V), AGP3.0 (0.8 V)
• Differential
  - HVDS: SCSI-1
  - LVDS: Hypertransport, SATA, Ultra-2 SCSI and later, PCI-E
(The figure also marks the direction of higher data rates.)
LVTTL: Low Voltage TTL
LVDS: Low Voltage Differential Signaling
HVDS: High Voltage Differential Signaling
SSTL: Stub Series Terminated Logic
VREF: Reference Voltage
VCM: Common Mode Voltage
Figure: Different kinds of signals used in buses or interfaces
Serial connected DRAMs
• RDRAMs: devices, aimed at consumer products (PS2); RIMMs, aimed at desktops (PIII/P4)
• XDRs: devices, aimed at consumer products (PS3) and servers (QS20/21)
• FB-DIMMs: DIMMs, aimed at servers (Intel’s 5000/7000, Sun’s Niagara II)
Produced by: Micron, Elpida, Samsung, Toshiba, Qimonda, Nanya, Hynix
Figure: Use and production of serial connected DRAMs
Principle of operation (1)
The set of buses defined
Separate buses for
• data,
• memory requests and
• control register (CR) reads/writes.
Designation of the buses
• Data bus: DQ/N [15:0]
• Request bus: RQ [11:0]
• CR reads/writes: serial bus (SCK, CMD, SDI, SDO, RST)
Comparison: The set and direction of buses defined in major memory types
(Direction is interpreted from the point of view of the memory device/module)
XDR
• read/write data (I/O): DQ [15:0]
• memory requests (I): RQ [11:0]
• control register (CR) reads (O): SDI
• control register (CR) writes (I): SDO
FB-DIMM (between the AMBs and the memory controller)
• read data/device status (O): PN [13:0]
• memory requests/write data/CR reads or writes (I): PS [9:0]
Synchronous DRAMs
• read/write data (I/O): DQ [3:0/7:0/15:0]
• commands (set of individual command lines) (I): CS, RAS, CAS, WE
• addresses (bank address/address within a bank) (I): BA [7:0], A [N:0]
Principle of operation (1)
Topology of the buses interconnecting the memory controller and the XDR devices
• Point-to-point topology for the data bus,
• Fly-by topology for the system clock, request bus and serial bus.
Principle of operation (1/1)
Point-to-point topology for the data bus rather than a multidrop or daisy chained topology.
Point-to-point data bus
• Data from only one device can be accessed
• Small memory size
• Good signal integrity
• High data rate of 3.2...4.8 Gb/s
Figure: Point-to-point implementation of the data bus (showing data packets, CC R/W packets and control packets) [4]
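The quoted per-line data rate can be turned into a peak bandwidth per device by multiplying with the number of data lines. A rough sketch, assuming one XDR device with the 16 data lines DQ [15:0] named earlier and no protocol overhead; the function name is hypothetical:

```python
def peak_bandwidth_gbytes(data_lines: int, rate_gbits_per_line: float) -> float:
    """Peak bandwidth in GB/s for a bus of `data_lines` lines."""
    if data_lines <= 0 or rate_gbits_per_line <= 0:
        raise ValueError("arguments must be positive")
    return data_lines * rate_gbits_per_line / 8.0  # 8 bits per byte


DATA_LINES = 16  # DQ [15:0]; one device assumed

if __name__ == "__main__":
    for rate in (3.2, 4.8):  # Gb/s per line, the range given on the slide
        bw = peak_bandwidth_gbytes(DATA_LINES, rate)
        print(f"{rate} Gb/s per line -> {bw:.1f} GB/s per device")
```

Under these assumptions the range of 3.2...4.8 Gb/s per line corresponds to 6.4...9.6 GB/s per device.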
Figure: Implementation of a two-channel XDR memory with two XDR devices/channel (point-to-point) [6]
XDR: point-to-point connection
• Data from only one device can be accessed
• Good signal integrity
• Small memory size
• High data rate of 3.2...4.8 Gb/s
FB-DIMM: daisy chained connection
• Data from multiple modules can be accessed
• Good signal integrity
• Large memory size
• High data rate of 3.2...4.8 Gb/s
Figure: Contrasting the point-to-point and daisy chained bus implementations of the data bus (showing the memory controller, memory modules, data packets, CC R/W packets and control packets) [4]
Figure: Daisy chained connection of the AMBs in FB-DIMMs [7] (There are two Command/Address (C/A) buses to reduce the loading caused by the 9 to 36 DRAMs mounted on the module)
Note
With respect to termination, the daisy chained connection behaves like a point-to-point connection: the controller "sees" only the first memory device/module, whereas further devices/modules are hidden from the controller by the repeater chain of the daisy chain topology, and vice versa.
Point-to-point connection
• Small memory size
• Data from only one device can be accessed
• Good signal integrity
• High data rate of 3.2...4.8 Gb/s
Stub-bus connection
• Large memory size
• Data from multiple modules can be accessed
• Unfavourable signal integrity
• Low data rate of 0.8...1.6 Gb/s
Figure: Contrasting the point-to-point and multidrop bus implementations of the data bus (figure labels: fly-by command, address, control and CK, CK# signals; ODT terminated DQ, DQS/#, DM signals)
Principle of operation (1/2)
Fly-by topology for the
• request and
• CR read/write buses
Request bus: RQ [11:0]
CR reads/writes: SDI, SDO
Comparison: Bus topologies chosen for the major memory types
XDR
• read/write data (I/O), DQ [15:0]: point-to-point
• memory requests (I), RQ [11:0]: fly-by
• control register (CR) reads (O), SDI: fly-by
• control register (CR) writes (I), SDO: fly-by
FB-DIMM (AMBs - memory controller)
• read data/device status (O), PN [13:0]: daisy-chained
• memory requests/write data/CR reads or writes (I), PS [9:0]: daisy-chained
Synchronous DRAMs (except DDR3)
• read/write data (I/O), DQ [3:0/7:0/15:0]: stub bus
• commands (I), CS, RAS, CAS, WE: stub bus
• addresses (I), BA [7:0], A [N:0]: stub bus
DDR3
• read/write data (I/O), DQ [3:0/7:0/15:0]: stub bus
• commands (I), CS, RAS, CAS, WE: fly-by
• addresses (I), BA [7:0], A [N:0]: fly-by
Comparison: Signaling chosen for the major memory types
(Each entry gives the bus topology, then the signaling)
XDR
• read/write data (I/O), DQ [15:0]: point-to-point, differential
• memory requests (I), RQ [11:0]: fly-by, volt. ref.
• control register (CR) reads (O), SDI: fly-by, volt. ref.
• control register (CR) writes (I), SDO: fly-by, volt. ref.
FB-DIMM (AMBs - memory controller)
• read data/device status (O), PN [13:0]: daisy-chained, differential
• memory requests/write data/CR reads or writes (I), PS [9:0]: daisy-chained, differential
Synchronous DRAMs (except DDR3)
• read/write data (I/O), DQ [3:0/7:0/15:0]: stub bus, volt. ref.
• commands (I), CS, RAS, CAS, WE: stub bus, volt. ref.
• addresses (I), BA [7:0], A [N:0]: stub bus, volt. ref.
DDR3
• read/write data (I/O), DQ [3:0/7:0/15:0]: stub bus, volt. ref.
• commands (I), CS, RAS, CAS, WE: fly-by, volt. ref.
• addresses (I), BA [7:0], A [N:0]: fly-by, volt. ref.
• Communication principle: DDR3 parallel bus based; XDR packet based
• Signaling, data: DDR3 bus, voltage ref. (SSTL); XDR PTP, differential (DRSL)
• Signaling, commands/addresses: DDR3 bus, fly-by, volt. ref. (SSTL); XDR bus, fly-by, volt. ref. (RSL)
• Signaling, clock: DDR3 fly-by, diff. (diff. SSTL); XDR fly-by, diff. (DRSL)
• Control register manipulation: DDR3 n.a.; XDR serial 1-bit, volt. ref. (RSL)
• Synchronisation: DDR3 read/write leveling; XDR FlexPhase
Figure: Contrasting communication and synchronisation in XDR and DDR3 memories (figure labels: fly-by command, address, control and CK, CK# signals; ODT terminated DQ, DQS/#, DM signals) [4], [9]
Principle of operation (2)
Packet based communication between the memory controller and the XDR devices (like in FB-DIMM modules)
Packets
• Data packets over the DQ/N lines
• Request packets over the RQ lines
• CC R/W packets over the serial interface
Interface lines
• DQ/N [15:0]: Data lines
• RQ [11:0]: Request lines
• CFM/N: Clock From Master
• SCK...: Serial interface
CC: Control Register
R/W: Read/Write
/N: Negative signal
Figure: Principle of operation [4]
Remark
XDR
• Data packets
• CC R/W packets
• Control packets
FB-DIMM
• Southbound packets: commands, write data
• Northbound packets: read data, status
Figure: Contrasting the packet concepts of XDR and FB-DIMM memories (1) [4]
Contrasting the packet concepts of XDR and FB-DIMM memories (2)
Both XDR and FB-DIMM memories use packet based communication between the memory controller and the memory devices/modules.
Differences in the packet policies
XDRs
"Clean" packets of
• memory access and maintenance commands (termed request packets),
• data,
• control register read/write commands.
FB-DIMMs
"Clean" packets of
• read data or status (termed northbound packets).
Mixed packets of
• commands and write data (termed southbound packets).
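The distinction between "clean" and mixed packets can be expressed as a small data model: a packet type is clean if it carries a single kind of content. The sketch below is only an illustration of that classification; the class, the field names and the content labels are hypothetical and do not reflect any real packet format.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class PacketType:
    memory: str                # "XDR" or "FB-DIMM"
    name: str                  # packet name as used on the slide
    carries: FrozenSet[str]    # kinds of content in one packet

    @property
    def is_clean(self) -> bool:
        # "Clean" here means exactly one kind of content per packet.
        return len(self.carries) == 1


PACKET_TYPES = [
    PacketType("XDR", "request", frozenset({"commands"})),
    PacketType("XDR", "data", frozenset({"data"})),
    PacketType("XDR", "control register R/W", frozenset({"CR commands"})),
    PacketType("FB-DIMM", "northbound (read data)", frozenset({"read data"})),
    PacketType("FB-DIMM", "northbound (status)", frozenset({"status"})),
    PacketType("FB-DIMM", "southbound",
               frozenset({"commands", "write data"})),
]

if __name__ == "__main__":
    for p in PACKET_TYPES:
        kind = "clean" if p.is_clean else "mixed"
        print(f"{p.memory:8s} {p.name:24s} {kind}")
```

In this model only the FB-DIMM southbound packet comes out as mixed, matching the slide's summary.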
Principle of operation (2)
• The memory controller sends request packets to the XDR devices,
• the XDR devices satisfy these requests, e.g. by sending read data packets to the memory controller.
The basic command sequence is the same as for synchronous DRAMs: Activate, Read/Write, Precharge.
Example 1
Operation
• Activate Bank a, Row a
• Read Bank a, Column a1
• Read Bank a, Column a2
• Precharge Bank a
Data
• Read data packet Q(a1)
• Read data packet Q(a2)
Figure: Example for reading from the XDR device [3]
Example 2
Operation
• Activate Bank a, Row a
• Write Bank a, Column a1
• Write Bank a, Column a2
• Precharge Bank a
Data
• Write data packet D(a1)
• Write data packet D(a2)
Figure: Example for writing to the XDR device [3]
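The two examples can be replayed against a toy model of a single bank with one open row. This is a behavioural sketch only: `BankSketch` and its methods are hypothetical names, packets are reduced to plain values, and timing, multiple banks and refresh are left out.

```python
class BankSketch:
    """Toy model of one DRAM bank: at most one row is open at a time."""

    def __init__(self):
        self.rows = {}         # row -> {column: stored value}
        self.open_row = None   # None means the bank is precharged

    def activate(self, row):
        if self.open_row is not None:
            raise RuntimeError("bank must be precharged before Activate")
        self.open_row = row

    def write(self, column, data):
        self._require_open_row()
        self.rows.setdefault(self.open_row, {})[column] = data

    def read(self, column):
        self._require_open_row()
        return self.rows.get(self.open_row, {}).get(column)

    def precharge(self):
        self.open_row = None

    def _require_open_row(self):
        if self.open_row is None:
            raise RuntimeError("no open row: Activate first")


if __name__ == "__main__":
    bank = BankSketch()

    # Example 2: Activate, Write a1, Write a2, Precharge
    bank.activate("a")
    bank.write("a1", "D(a1)")
    bank.write("a2", "D(a2)")
    bank.precharge()

    # Example 1: Activate, Read a1, Read a2, Precharge
    bank.activate("a")
    print(bank.read("a1"), bank.read("a2"))
    bank.precharge()
```

The model makes the ordering constraint explicit: a Read or Write is only accepted between an Activate and the following Precharge.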
Figure: Peak memory size vs peak bandwidth (BW) of particular DRAM technologies in Intel’s chipsets, IBM’s QS2x blades and Sun’s T2
(Axes: memory size in GB, peak bandwidth in GB/s; technologies plotted: SDRAM, DDR, DDR (reg), DDR2, DDR2 (reg), RDRAM, XDR, FB-DIMM DDR-2; platforms plotted: P4 desktops and servers, Core 2 desktops and servers, QS20, QS21, QS22, Sun T2)