1 / 32

14:332:331 Computer Architecture and Assembly Language Spring 2006 Week 13 Buses and I/O system

14:332:331 Computer Architecture and Assembly Language Spring 2006 Week 13 Buses and I/O system. [Adapted from Dave Patterson’s UCB CS152 slides and Mary Jane Irwin’s PSU CSE331 slides]. Head’s Up. This week’s material Buses: Connecting I/O devices Reading assignment – PH 8.4

smcclanahan
Download Presentation

14:332:331 Computer Architecture and Assembly Language Spring 2006 Week 13 Buses and I/O system

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 14:332:331Computer Architecture and Assembly LanguageSpring 2006Week 13Buses and I/O system [Adapted from Dave Patterson’s UCB CS152 slides and Mary Jane Irwin’s PSU CSE331 slides]

  2. Head’s Up • This week’s material • Buses: Connecting I/O devices • Reading assignment – PH 8.4 • Memory hierarchies • Reading assignment – PH 7.1 and B.8-9 • Reminders • Next week’s material • Basics of caches • Reading assignment – PH 7.2

  3. Review: Major Components of a Computer Processor Devices Control Output Memory Datapath Input Cache Main Memory Secondary Memory (Disk)

  4. Input and Output Devices • I/O devices are incredibly diverse wrt • Behavior • Partner • Data rate

  5. Magnetic Disk • Purpose • Long term, nonvolatile storage • Lowest level in the memory hierarchy • slow, large, inexpensive • General structure • A rotating platter coated with a magnetic surface • Use a moveable read/write head to access the disk • Advantages of hard disks over floppy disks • Platters are more rigid (metal or glass) so they can be larger • Higher density because it can be controlled more precisely • Higher data rate because it spins faster • Can incorporate more than one platter

  6. Organization of a Magnetic Disk • Typical numbers (depending on the disk size) • 1 to 15 (2 surface) platters per disk with 1” to 8” diameter • 1,000 to 5,000 tracks per surface • 63 to 256 sectors per track • the smallest unit that can be read/written (typically 512 to 1,024 B) • Traditionally all tracks have the same number of sectors • Newer disks with smart controllers can record more sectors on the outer tracks (constant bit density) Sector Platters Track

  7. Track Sector Cylinder Platter Head Magnetic Disk Characteristic • Cylinder: all the tracks under the heads at a given point on all surfaces • Read/write data is a three-stage process: • Seek time: position the arm over the proper track (6 to 14 ms avg.) • due to locality of disk references the actual average seek time may be only 25% to 33% of the advertised number • Rotational latency: wait for the desired sectorto rotate under the read/write head (½ of 1/RPM) • Transfer time: transfer a block of bits (sector)under the read-write head (2 to 20 MB/sec typical) • Controller time: the overhead the disk controller imposes in performing an disk I/O access (typically < 2 ms)

  8. Magnetic Disk Examples

  9. bus I/O System Interconnect Issues • A bus is a shared communication link (a set of wires used to connect multiple subsystems) • Performance • Expandability • Resilience in the face of failure – fault tolerance Processor Receiver Main Memory Keyboard

  10. Performance Measures • Latency (execution time, response time) is the total time from the start to finish of one instruction or action • usually used to measure processor performance • Throughput – total amount of work done in a given amount of time • aka execution bandwidth • the number of operations performed per second • Bandwidth – amount of information communicated across an interconnect (e.g., a bus) per unit time • the bit width of the operation * rate of the operation • usually used to measure I/O performance

  11. I/O System Expandability • Usually have more than one I/O device in the system • each I/O device is controlled by an I/O Controller interrupt signals Processor Cache Memory Memory - I/O Bus I/O Controller I/O Controller I/O Controller Main Memory Terminal Disk Disk Network

  12. Bus Characteristics • Control lines • Signal requests and acknowledgments • Indicate what type of information is on the data lines • Data lines • Data, complex commands, and addresses • Bus transaction consists of • Sending the address • Receiving (or sending) the data Control Lines Data Lines

  13. Step 1: Processor sends read request and read address to memory Control Main Memory Processor Data Step 2: Memory accesses data Control Main Memory Processor Data Step 3: Memory transfers data to disk Control Main Memory Processor Data Output (Read) Bus Transaction • Defined by what they do to memory • read = output: transfers data from memory (read) to I/O device (write)

  14. Step 1: Processor sends write request and write address to memory Control Main Memory Processor Data Step 2: Disk transfers data to memory Control Main Memory Processor Data Input (Write) Bus Transaction • Defined by what they do to memory • write = input: transfers data from I/O device (read) to memory (write)

  15. Advantages and Disadvantages of Buses • Advantages • Versatility: • New devices can be added easily • Peripherals can be moved between computer systems that use the same bus standard • Low Cost: • A single set of wires is shared in multiple ways • Disadvantages • It creates a communication bottleneck • The bus bandwidth limits the maximum I/O throughput • The maximum bus speed is largely limited by • The length of the bus • The number of devices on the bus • It needs to support a range of devices with widely varying latencies and data transfer rates

  16. Types of Buses • Processor-Memory Bus (proprietary) • Short and high speed • Matched to the memory system to maximize the memory-processor bandwidth • Optimized for cache block transfers • I/O Bus (industry standard, e.g., SCSI, USB, ISA, IDE) • Usually is lengthy and slower • Needs to accommodate a wide range of I/O devices • Connects to the processor-memory bus or backplane bus • Backplane Bus (industry standard, e.g., PCI) • The backplane is an interconnection structure within the chassis • Used as an intermediary bus connecting I/O busses to the processor-memory bus

  17. Bus Adaptor Bus Adaptor Bus Adaptor I/O Bus I/O Bus I/O Bus A Two Bus System • I/O buses tap into the processor-memory bus via Bus Adaptors (that do speed matching between buses) • Processor-memory bus: mainly for processor-memory traffic • I/O busses: provide expansion slots for I/O devices Processor-Memory Bus Processor Memory

  18. Bus Adaptor Bus Adaptor I/O Bus Backplane Bus I/O Bus Bus Adaptor A Three Bus System • A small number of Backplane Buses tap into the Processor-Memory Bus • Processor-Memory Bus is used for processor memory traffic • I/O buses are connected to the Backplane Bus • Advantage: loading on the Processor-Memory Bus is greatly reduced Processor-Memory Bus Processor Memory

  19. I/O System Example (Apple Mac 7200) • Typical of midrange to high-end desktop system in 1997 Processor Processor-Memory Bus Cache Memory Audio I/O Serial ports PCI Interface/ Memory Controller Main Memory I/O Controller I/O Controller PCI CDRom I/O Controller I/O Controller SCSI bus Disk Graphic Terminal Network Tape

  20. Example: Pentium System Organization Processor-Memory Bus Memory controller (“Northbridge”) PCI Bus I/O Busses http://developer.intel.com/design/chipsets/850/animate.htm?iid=PCG+devside&

  21. Synchronous and Asynchronous Buses • Synchronous Bus • Includes a clock in the control lines • A fixed protocol for communication that is relative to the clock • Advantage: involves very little logic and can run very fast • Disadvantages: • Every device on the bus must run at the same clock rate • To avoid clock skew, they cannot be long if they are fast • Asynchronous Bus • It is not clocked, so requires handshaking protocol (req, ack) • Implemented with additional control lines • Advantages: • Can accommodate a wide range of devices • Can be lengthened without worrying about clock skew or synchronization problems • Disadvantage: slow(er)

  22. ReadReq 1 2 addr data Data 3 4 Ack 6 5 7 DataRdy Asynchronous Handshaking Protocol • Output (read) data from memory to an I/O device. • Memory sees ReadReq, reads addr from data lines, and raises Ack • I/O device sees Ack and releases the ReadReq and data lines • Memory sees ReadReq go low and drops Ack • When memory has data ready, it places it on data lines and raises DataRdy • I/O device sees DataRdy, reads the data from data lines, and raises Ack • Memory sees Ack, releases the data lines, and drops DataRdy • I/O device sees DataRdy go low and drops Ack I/O device signals a request by raising ReadReq and putting the addr on the data lines

  23. Key Characteristics of Two Bus Standards

  24. Memory Hierarchy Technologies • Random Access • “Random” is good: access time is the same for all locations • DRAM: Dynamic Random Access Memory • High density (1 transistor cells), low power, cheap, slow • Dynamic: need to be “refreshed” regularly (~ every 8 ms) • SRAM: Static Random Access Memory • Low density (6 transistor cells), high power, expensive, fast • Static: content will last “forever” (until power turned off) • Size: DRAM/SRAM ­ 4 to 8 • Cost/Cycle time: SRAM/DRAM ­ 8 to 16 • “Non-so-random” Access Technology • Access time varies from location to location and from time to time (e.g., Disk, CDROM)

  25. bit (data) lines Each intersection represents a 6-T SRAM cell word (row) select Classical SRAM Organization (~Square) r o w d e c o d e r RAM Cell Array Column Selector & I/O Circuits column address row address One memory row holds a block of data, so the column address selects the requested word from that block data word

  26. RAM Cell Array Classical DRAM Organization (~Square Planes) bit (data) lines The column address selects the requested bit from the row in each plane . . . r o w d e c o d e r Each intersection represents a 1-T DRAM cell word (row) select column address Column Selector & I/O Circuits row address . . . data bit data bit data bit data word

  27. RAM Memory Definitions • Caches use SRAM for speed • Main Memory is DRAM for density • Addresses divided into 2 halves (row and column) • RASor Row Access Strobe triggering row decoder • CAS or Column Access Strobe triggering column selector • Performance of Main Memory DRAMs • Latency: Time to access one word • Access Time: time between request and when word arrives • Cycle Time: time between requests • Usually cycle time > access time • Bandwidth: How much data can be supplied per unit time • width of the data channel * the rate at which it can be used

  28. N cols RAS Classical DRAM Operation Column Address • DRAM Organization: • N rows x N column x M-bit • Read or Write M-bit at a time • Each M-bit access requiresa RAS / CAS cycle DRAM Row Address N rows M bits M-bit Output Cycle Time 1st M-bit Access 2nd M-bit Access CAS Row Address Col Address Row Address Col Address

  29. Ways to Improve DRAM Performance • Memory interleaving • Fast Page Mode DRAMs – FPM DRAMs • www.usa.samsungsemi.com/products/newsummary/asyncdram/K4F661612D.htm • Extended Data Out DRAMs – EDO DRAMs • www.chips.ibm.com/products/memory/88H2011/88H2011.pdf • Synchronous DRAMS – SDRAMS • www.usa.samsungsemi.com/products/newsummary/sdramcomp/K4S641632D.htm • Rambus DRAMS • www.rambus.com/developer/quickfind_documents.html • www.usa.samsungsemi.com/products/newsummary/rambuscomp/K4R271669B.htm • Double Data Rate DRAMs – DDR DRAMS • www.usa.samsungsemi.com/products/newsummary/ddrsyncdram/K4D62323HA.htm • . . .

  30. Memory Bank 0 Memory Bank 1 CPU Memory Bank 2 Memory Bank 3 Access Bank 1 Access Bank 0 Access Bank 2 Access Bank 3 We can Access Bank 0 again Increasing Bandwidth - Interleaving Access pattern without Interleaving: Cycle Time CPU Memory Access Time D1 available Start Access for D1 D2 available Start Access for D2 Access pattern with 4-way Interleaving:

  31. Problems with Interleaving • How many banks? • Ideally, the number of banks  number of clocks we have to wait to access the next word in the bank • Only works for sequential accesses (i.e., first word requested in first bank, second word requested in second bank, etc.) • Increasing DRAM sizes => fewer chips => harder to have banks • Growth bits/chip DRAM : 50%-60%/yr • Only can use for very large memory systems (e.g., those encountered in supercomputer systems)

  32. N x M “SRAM” M bits 1st M-bit Access 2nd M-bit 3rd M-bit 4th M-bit RAS CAS Row Address Col Address Col Address Col Address Col Address Fast Page Mode DRAM Operation Column Address • Fast Page Mode DRAM • N x M “SRAM” to save a row N cols DRAM Row Address • After a row is read into the SRAM “register” • Only CAS is needed to access other M-bit blocks on that row • RAS remains asserted while CAS is toggled N rows M-bit Output

More Related