
Input / Output

Input / Output. CPS 104, Week 14, lecture 1. Administrivia: HW 5 due; HW 6 assigned, due last day of class. Overview: I/O devices, device controller, rotational media (disks), device drivers, memory mapped I/O, programmed I/O, direct memory access (DMA), I/O bus and memory bus, RAID (if time).


Presentation Transcript


  1. Input / Output CPS 104 Week 14 lecture 1

  2. Administrivia • HW 5 Due • HW 6 Assigned • Due last day of class CPS 104

  3. Overview • I/O devices • device controller • Rotational media (disks) • Device drivers • Memory Mapped I/O • Programmed I/O • Direct Memory Access (DMA) • I/O bus and memory bus • RAID (if time)

  4. I/O Systems • [Diagram components: Processor, Cache, Memory Bus, Main Memory, I/O Bridge, I/O Bus, Disk Controller (Disk, Disk), Graphics Controller (Graphics), Network Interface (Network); interrupts go to the processor] • Time(workload) = Time(CPU) + Time(I/O) - Time(Overlap)
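
The workload-time relation on slide 4 can be checked with a few lines of arithmetic. This is a minimal sketch; the function name and the 10 s / 4 s figures are hypothetical and chosen only for illustration.

```python
def workload_time(cpu_s, io_s, overlap_s):
    """Time(workload) = Time(CPU) + Time(I/O) - Time(Overlap)."""
    # Overlap can never exceed the shorter of the two activities.
    if overlap_s < 0 or overlap_s > min(cpu_s, io_s):
        raise ValueError("overlap must be between 0 and min(cpu, io)")
    return cpu_s + io_s - overlap_s


# Hypothetical workload: 10 s of CPU work and 4 s of I/O.
no_overlap = workload_time(10.0, 4.0, 0.0)    # I/O fully serialized with CPU
full_overlap = workload_time(10.0, 4.0, 4.0)  # I/O fully hidden behind CPU
print(no_overlap, full_overlap)
```

The more I/O the system can overlap with computation (for example with interrupts and DMA), the closer total time gets to the CPU time alone.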

  5. Why I/O? • Interactive Apps • Long term storage (files, data repository) • Swap for VM • Many different devices • character vs. block • Networks are everywhere! • 10^6 difference between CPU (10^-9 s) & I/O (10^-3 s) • Response Time vs. Throughput • Not always another process to execute • OS hides (some) differences in devices • same (similar) interface to many devices • Permits many apps to share one device

  6. Device Drivers • top-half: API (open, close, read, write, ioctl); I/O Control (IOCTL, device specific arguments) • bottom-half: interrupt handler; communicates with device; resumes process • Must have access to user address space and device control registers => runs in kernel mode.

  7. Review: Interrupts and Exceptions • Unnatural change in control flow • Interrupt is an external event: devices (disk, network, keyboard, etc.); clock for timeslicing; these are useful events, must do something when they occur • Exception is often a potential problem with the program: segmentation fault; bus error; divide by 0; don’t want my bug to crash the entire machine; page fault (virtual memory…)

  8. Review: Handling an Interrupt/Exception • [Diagram: a User Program (ld, add, st, mul, beq, ld, sub, bne) is interrupted; control passes to the Interrupt Handler and its Service Routines, then RETT returns] • Invoke specific kernel routine based on type of interrupt (the interrupt/exception handler) • Must determine what caused the interrupt: either software examines each device (PC = interrupt_handler), or Vectored Interrupts are used (PC = interrupt_table[i]; kernel initializes table at boot time) • Clear the interrupt • May return from interrupt (RETT) to a different process (e.g., context switch)
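
The two dispatch styles on slide 8 can be contrasted in a short simulation. This is only a sketch: the handler names, the three-entry table, and the `pending` list are hypothetical stand-ins for what hardware and a kernel would provide.

```python
def disk_handler():
    return "disk serviced"


def network_handler():
    return "network serviced"


def keyboard_handler():
    return "keyboard serviced"


# The kernel would fill this table once, at boot time.
interrupt_table = [disk_handler, network_handler, keyboard_handler]


def dispatch_vectored(i):
    # Models PC = interrupt_table[i]: hardware supplies the index i.
    return interrupt_table[i]()


def dispatch_polled(pending):
    # Models PC = interrupt_handler: one entry point, and software
    # examines each device in turn to find out who raised the interrupt.
    for i, is_pending in enumerate(pending):
        if is_pending:
            return interrupt_table[i]()
    return None


print(dispatch_vectored(1))
print(dispatch_polled([False, False, True]))
```

Vectoring skips the search loop, which matters more as the number of devices grows.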

  9. Types of Storage Devices • Magnetic Disks • Magnetic Tapes • CD ROM • Juke Box (automated tape library, robots)

  10. Magnetic Disks • Long term nonvolatile storage • Another slower, less expensive level of memory hierarchy • [Diagram labels: Platter, Track, Sector, Cylinder, Arm, Head]

  11. Disk Access • Access time = queue + seek + rotational + transfer + overhead • Seek time: move arm over track; average is confusing (startup, slowdown, locality of accesses) • Rotational latency: wait for sector to rotate under head; average = 0.5 rotation / (3600 RPM / 60) = 8.3 ms • Transfer time: f(size, BW bytes/sec)

  12. Disk Access Time Example • Disk Parameters: transfer size is 8K bytes; advertised average seek is 12 ms; disk spins at 7200 RPM; transfer rate is 4 MB/sec; controller overhead is 2 ms • Assume that disk is idle so no queuing delay • What is Average Disk Access Time for a Sector? • Ave seek + ave rot delay + transfer time + controller overhead • 12 ms + 0.5/(7200 RPM/60) + 8 KB/(4 MB/s) + 2 ms • 12 + 4.17 + 2 + 2 = 20 ms • Advertised seek time assumes no locality; with locality, seek is typically 1/4 to 1/3 of the advertised seek time: 20 ms => 12 ms
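
The arithmetic on slides 11 and 12 can be reproduced directly. This is a minimal sketch; the function names are made up for illustration, and it assumes binary units (1 KB = 1024 bytes, 1 MB = 1024 KB), so the transfer term comes out slightly under the slide's rounded 2 ms.

```python
KB = 1024
MB = 1024 * KB  # assumption: binary units


def avg_rotational_latency_ms(rpm):
    # On average the head waits half a revolution; rpm / 60 = rev per second.
    return 0.5 / (rpm / 60.0) * 1000.0


def disk_access_ms(seek_ms, rpm, size_bytes, bytes_per_sec,
                   overhead_ms, queue_ms=0.0):
    # Access time = queue + seek + rotational + transfer + overhead
    transfer_ms = size_bytes / bytes_per_sec * 1000.0
    return (queue_ms + seek_ms + avg_rotational_latency_ms(rpm)
            + transfer_ms + overhead_ms)


# Slide 12 parameters: 12 ms seek, 7200 RPM, 8 KB at 4 MB/s, 2 ms overhead.
advertised = disk_access_ms(12.0, 7200, 8 * KB, 4 * MB, 2.0)
# With locality, seek is taken here as 1/3 of the advertised figure.
with_locality = disk_access_ms(12.0 / 3, 7200, 8 * KB, 4 * MB, 2.0)

print(round(avg_rotational_latency_ms(3600), 1))  # slide 11 example
print(round(advertised), round(with_locality))
```

Rotational latency and controller overhead do not shrink with locality, so once seek drops to a few milliseconds they account for most of the remaining 12 ms.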

  13. DRAM as Disk • Solid state disk, Expanded Storage, NVRAM • Disk is slow, DRAM is fast => replace Disk with battery backed DRAM • BUT, Disk is cheap, much cheaper than DRAM • Network Memory • fast networks (e.g., Myrinet) • use DRAM of other workstations as backing store • Trapeze/GMS project here

  14. Alternative Storage • CD ROM • read only: good distribution, archiving • Magnetic Tape • Sequential Access • R-DAT (Rotating Digital Audio Tape) • Helical Scan (angle to tape, high density ~5GB) • Tera to peta bytes of storage (NASA EOS)

  15. Connecting I/O Devices to CPU/Memory • Memory Bus: short; fast; known set of components; proprietary (design is not released for free) • Separate I/O Bus (e.g., PCI): standard; accepts a variety of components (w/ different BW performance); long; slow

  16. Processor Interface Issues • Interconnections • Busses • Processor interface • I/O Instructions • Memory mapped I/O • I/O Control Structures • Device Controllers • Polling/Interrupts • Data movement • Programmed I/O / DMA • Capacity, Access Time, Bandwidth

  17. Device Controllers • Controller deals with mundane control (e.g., position head, error detection/correction) • Processor communicates with Controller • [Diagram: the Device Controller sits between the Bus and the Device; register labels: Command, Status, Data 0, Data 1, ..., Data n-1; status labels: Interrupt?, Busy, Done, Error]

  18. I/O Instructions • Independent I/O bus: separate I/O instructions (in, out) [Diagram: CPU connects to Memory over the memory bus and to Controllers and Devices over the independent I/O bus] • Common memory & I/O bus: lines distinguish between I/O and memory transfers; examples: VME bus, Multibus-II, Nubus [Diagram: CPU, Memory, Controllers, and Devices share one bus] • 40 Mbytes/sec optimistically • 10 MIP processor completely saturates the bus!

  19. Memory Mapped I/O • Single Memory & I/O Bus • No Separate I/O Instructions • Physical address space holds ROM, RAM, and I/O • Issue command through store instruction • Check status with load instruction • Caches? • [Diagram 1: CPU, Memory, Controllers, and Devices on a single bus] • [Diagram 2: CPU with $ and L2 $ on the Memory Bus; a Bridge (Memory Bus Adapter) connects to the I/O bus and the Device Controller]

  20. Communicating with the processor • Polling: can waste time waiting for slow I/O device (busy wait); can interleave with useful work • Interrupts: interrupt overhead; interrupt could happen anytime (asynchronous); no busy wait

  21. Data Movement • Programmed I/O: processor has to touch all the data; too much processor overhead for high bandwidth devices (disk, network) • DMA: processor sets up transfer(s); DMA controller transfers data; complicates memory system

  22. Programmed I/O & Polling • [Flowchart: Is the data ready? If no, ask again (busy wait loop); if yes, load data from the device and store data to memory; done? If no, repeat; if yes, finish] • Busy wait loop is not an efficient way to use the CPU unless the device is very fast! • But checks for I/O completion can be dispersed among computationally intensive code • [Diagram: CPU with $ and L2 $ on the Memory Bus; Memory Bus Adapter to the I/O bus; Device Controller on the I/O bus]
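
The flowchart on slide 22 corresponds to a loop like the one below. This is a simulation only: `SlowDevice`, its `delay` parameter, and the poll counter are hypothetical, standing in for loads from a real status register and data register.

```python
class SlowDevice:
    """Hypothetical device whose Ready bit rises after `delay` status reads."""

    def __init__(self, data, delay):
        self.data = list(data)
        self.delay = delay
        self.countdown = delay

    def read_status(self):
        # Models one load from the status register.
        if self.countdown > 0:
            self.countdown -= 1
            return 0  # not ready yet
        return 1      # data valid

    def read_data(self):
        # Reading the data register clears Ready until the next item arrives.
        self.countdown = self.delay
        return self.data.pop(0)


def programmed_io_read(device, count):
    received, wasted_polls = [], 0
    for _ in range(count):
        while device.read_status() == 0:  # "Is the data ready?" -> no
            wasted_polls += 1             # busy wait: CPU does nothing useful
        received.append(device.read_data())  # CPU touches every item itself
    return received, wasted_polls


data, wasted = programmed_io_read(SlowDevice(b"abc", delay=5), 3)
print(data, wasted)
```

The wasted-poll count grows with device latency, which is why polling only pays off for very fast devices or when the checks are interleaved with other work.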

  23. Interrupt Driven Data Transfer • [Diagram: user program (add, sub, and, or, nop) and interrupt service routine (read, store, ..., rti) in memory; steps: (1) I/O interrupt, (2) save PC, (3) interrupt service addr, (4) memory; CPU with $ and L2 $ on the Memory Bus; Memory Bus Adapter to the I/O bus; Device Controller on the I/O bus] • User program progress only halted during actual transfer • Interrupt overhead can dominate transfer time • 1000 xfers of 1000 bytes each: 2 µsec for interrupt, 98 µsec for service • Device xfer rate: 10 MB/s => 0.1 µsec/byte => 0.1 ms for 1000 bytes

  24. Direct Memory Access • CPU sends a starting address, direction, and length count to DMAC, then issues "start" • DMAC provides handshake signals for device controller, and memory addresses and handshake signals for memory • Time to do 1000 x 1000 bytes: 1 DMA set-up sequence @ 50 µsec + 1 interrupt @ 2 µsec + 1 interrupt service sequence @ 48 µsec = .0001 second of CPU time • [Diagram: memory mapped I/O address space from 0 to n holding ROM, RAM, Peripherals, and DMAC; CPU with $ and L2 $ on the Memory Bus; Memory Bus Adapter to the I/O bus; DMA CNTRL]
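
The CPU cost figures on slides 23 and 24 can be compared side by side. This is a sketch under one stated assumption: in the interrupt-driven case each of the 1000 transfers pays its own interrupt and service cost, while DMA pays a single set-up, interrupt, and service sequence for the whole job. The function names are illustrative.

```python
US = 1e-6  # one microsecond, in seconds


def interrupt_driven_cpu_s(transfers, interrupt_us=2, service_us=98):
    # Assumption: every transfer raises one interrupt and runs one
    # service routine, so CPU cost scales with the transfer count.
    return transfers * (interrupt_us + service_us) * US


def dma_cpu_s(setup_us=50, interrupt_us=2, service_us=48):
    # One set-up sequence, one completion interrupt, one service sequence.
    return (setup_us + interrupt_us + service_us) * US


per_transfer = interrupt_driven_cpu_s(1000)
dma = dma_cpu_s()
print(per_transfer, dma, round(per_transfer / dma))
```

Under these numbers DMA needs roughly a thousandth of the CPU time, at the price of the memory-system complications listed on slide 21.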

  25. I/O Data Flow • Impediment to high performance: multiple copies, complex hierarchy

  26. Communication Networks • Performance limiter is memory system, OS overhead, not HW protocols • Send/receive queues in processor memories • Network controller copies back and forth via DMA • No host intervention needed • Interrupt host when message sent or received

  27. Relationship to Processor Architecture • Virtual memory frustrates DMA: page faults during DMA? • Synchronization between controller and CPU • Caches required for processor performance cause problems for I/O: flushing is expensive, I/O pollutes cache; solution is borrowed from shared memory multiprocessors, "snooping" (coherent DMA) • Caches and write buffers: need uncached accesses and a write buffer flush for memory mapped I/O

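  28. Bus Arbitration • Parallel (Centralized) Arbitration [Diagram: each master (M) has its own Bus Request (BR) and Bus Grant (BG) line] • Serial Arbitration (daisy chaining) [Diagram: the arbitration unit (A.U.) receives BR from the masters (M) and issues BG, which passes from master to master through BGi/BGo] • Self Selection • Collision Detection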
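
Serial arbitration (daisy chaining) on slide 28 can be modeled as a grant that travels down the chain until a requesting master keeps it. This is a toy sketch; the function name and list encoding are hypothetical, and position 0 is taken to be the master closest to the arbitration unit.

```python
def daisy_chain_grant(requests):
    """Return the index of the master that keeps the bus grant, or None.

    requests[i] is True when master i is asserting Bus Request.
    """
    for position, requesting in enumerate(requests):
        if requesting:
            # This master keeps the grant and does not pass it on (BGo stays
            # low), so masters further down the chain never see it.
            return position
        # Otherwise the master forwards BGi to BGo unchanged.
    return None


print(daisy_chain_grant([False, True, True]))
```

Priority is fixed by position on the chain, so a master far from the arbitration unit can be starved by busy masters ahead of it.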

  29. Bus Options (each option listed as high performance vs. low cost) • Bus width: separate address & data lines vs. multiplexed address & data lines • Data width: wider is faster (e.g., 32 bits) vs. narrower is cheaper (e.g., 8 bits) • Transfer size: multiple words has less bus overhead vs. single-word transfer is simpler • Bus masters: multiple (requires arbitration) vs. single master (no arbitration) • Split transaction?: yes, separate Request and Reply packets gets higher bandwidth (needs multiple masters) vs. no, continuous connection is cheaper and has lower latency • Clocking: synchronous vs. asynchronous

  30. Asynchronous Handshake: Write Transaction (4 Cycle Handshake) • [Timing diagram signals: Address, Data, Read, Req., Ack.; Master Asserts Address, then Next Address; Master Asserts Data; time marks t0 to t5] • t0: Master has obtained control and asserts address, direction, data; waits a specified amount of time for slaves to decode target • t1: Master asserts request line • t2: Slave asserts ack, indicating data received • t3: Master releases req • t4: Slave releases ack

  31. Read Transaction (4 Cycle Handshake) • [Timing diagram signals: Address, Data, Read, Req, Ack; Master Asserts Address, then Next Address; time marks t0 to t5] • t0: Master has obtained control and asserts address, direction, data; waits a specified amount of time for slaves to decode target • t1: Master asserts request line • t2: Slave asserts ack, indicating ready to transmit data • t3: Master releases req, data received • t4: Slave releases ack • Time Multiplexed Bus: address and data share lines

  32. Manufacturing Advantages of Disk Arrays • Disk Product Families • Conventional: 4 disk designs (3.5”, 5.25”, 10”, 14”), spanning Low End to High End • Disk Array: 1 disk design (3.5”)

  33. Redundant Arrays of Disks • Files are "striped" across multiple spindles • Redundancy yields high data availability • Disks will fail • Contents reconstructed from data redundantly stored in the array • Capacity penalty to store it • Bandwidth penalty to update • Techniques: Mirroring/Shadowing (high capacity cost); Horizontal Hamming Codes (overkill); Parity & Reed-Solomon Codes; Failure Prediction (no capacity overhead!)
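
Of the techniques on slide 33, parity is the easiest to demonstrate: the XOR of all data blocks is stored on an extra disk, and any single lost block is the XOR of everything that survives. The sketch below uses made-up four-byte blocks and hypothetical function names; it illustrates the idea only and does not model a particular RAID level.

```python
from functools import reduce


def parity(blocks):
    # XOR the blocks together byte by byte (all blocks have equal length).
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity_block):
    # XOR of the survivors and the parity yields the one missing block.
    return parity(list(surviving_blocks) + [parity_block])


stripe = [b"disk", b"fail", b"safe"]  # one block per data disk
p = parity(stripe)                    # stored on the parity disk

# Suppose the second disk fails: rebuild its block from the other two.
recovered = reconstruct([stripe[0], stripe[2]], p)
print(recovered)
```

The capacity penalty is one extra block per stripe, and the bandwidth penalty shows up on writes, since every update must also update the parity.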

  34. Summary • I/O devices • device controller • Rotational media (disks) • Device drivers (two parts): help isolate specifics of device • Memory Mapped I/O • Programmed I/O • Direct Memory Access (DMA) • I/O bus and memory bus • RAID

  35. Homework 6

  36. Interrupt Handler • MIPS/SPIM program • Use memory-mapped I/O • Use interrupts • Program should: accept keyboard input (interrupts); echo input to terminal (polling); exit if user typed ‘q’ • Programmed I/O?

  37. Terminal Control • Memory mapped I/O: use LW, SW; -mapped_io command line option • Receiver (input): ready=1 when data valid • Transmitter: ready=1 when ready to print next char

  38. Interrupt Driven I/O • [Register layout: Receiver control (0xffff0000): Unused bits, then a 1-bit Interrupt enable field and a 1-bit Ready field] • Set Interrupt Enable = 1 • generates a level 0 interrupt when Ready becomes 1 • if interrupt is enabled in Status Register also • Run spim with -notrap • allows you to install interrupt handler

  39. Status Register • Bit 0 = interrupt enable • Bit 8 = allow level 0 interrupts (terminal input generates level 0 int.) • Coprocessor 0, register 12: use mfc0, mtc0 • On interrupt, bits 0-5 are shifted left by 2: disables interrupts and enters kernel mode • When done servicing interrupt, use rfe to restore
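
The shift described on slide 39 can be traced with plain bit operations. This sketch assumes the R2000-style layout in which bits 0-5 hold three (interrupt enable, kernel/user) pairs with 0 meaning kernel mode, and it models rfe as shifting those pairs back right by 2; the function names and the 0x103 starting value are illustrative only.

```python
LOW6 = 0x3F  # bits 0-5: three (interrupt enable, kernel/user) pairs


def on_interrupt(status):
    # Shift bits 0-5 left by 2. The new bits 0-1 are zero, so interrupts
    # are disabled and (under the stated assumption) the CPU is in kernel mode.
    shifted = ((status & LOW6) << 2) & LOW6
    return (status & ~LOW6) | shifted


def rfe(status):
    # Assumed rfe model: shift the pairs back right by 2, leaving the
    # oldest pair (bits 4-5) in place.
    low = status & LOW6
    restored = (low >> 2) | (low & 0x30)
    return (status & ~LOW6) | restored


def interrupts_enabled(status):
    # Bit 0 = interrupt enable, bit 8 = allow level 0 interrupts.
    return bool(status & 0x001) and bool(status & 0x100)


status = 0x103             # bit 8 set, user mode, interrupts enabled
in_handler = on_interrupt(status)
after_rfe = rfe(in_handler)
print(hex(in_handler), hex(after_rfe))
```

Because the old bits are saved rather than overwritten, the handler needs no extra instructions to remember whether interrupts were enabled before it ran.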

  40. Cause Register • [Register layout: Pending interrupts in bits 15-10, Exception code in bits 5-2] • Code 0000 = external interrupt • terminal interrupt
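
Given the field positions on slide 40, a handler can pull the two fields out of the Cause register with shifts and masks. This is a sketch in Python rather than MIPS assembly; the function name and the sample value 0x0400 are hypothetical.

```python
def decode_cause(cause):
    exception_code = (cause >> 2) & 0xF   # bits 5-2
    pending = (cause >> 10) & 0x3F        # bits 15-10
    return exception_code, pending


# Hypothetical register value: lowest pending bit set, exception code 0.
code, pending = decode_cause(0x0400)
is_external_interrupt = (code == 0)       # Code 0000 = external interrupt
print(code, bin(pending), is_external_interrupt)
```

A handler would typically branch on the exception code first and look at the pending bits only when the code says the cause was an external interrupt.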
