Digital Interface Design

Learn about designing interfaces in computer systems, from basic principles to advanced protocols and interfaces such as SPI, I2C, PCIe, and more. Understand metrics like bandwidth, latency, and control overhead, essential for optimizing performance. Dive into datapath and control concepts, synchronization methods, and interface design for CPUs. Explore reuse, standardization, modeling, verification, and debugging in interface development.


Presentation Transcript


  1. Digital Interface Design EECS150 Fall 2008 – Lecture #23 Greg Gibeling Slides adapted from everywhere EECS150 Lecture #23

  2. Motivation • Any useful system includes at least two interfaces: input and output • In a computer: keyboard & screen • In your project: audio & video • The most difficult work in any system is matching incompatible interfaces • Compare CS70 and CS61B • Compare K-maps or adder design and your project • You will be designing interfaces • Either hardware or software • The basic ideas presented here apply fairly widely EECS150 Lecture #23

  3. Outline • Quick Review: SDRAM and Audio • Principles • Metrics: Bandwidth, Latency, Pin Count & Logic Overhead • Datapath & Control (States & Events) • Synchronization: Clock & Reset • Handshaking (Ready/Valid) • Protocols (structure, syntax, semantics) • Interfaces • Simple Interfaces: SPI, I2C, UART, N64 • Intermediate Interfaces: LCD, Ethernet (10M-10G), Interchip • CPU Interfaces: ISA, PCIe • Design • Back to principles • Reuse & Standardization • Modeling, Verification & Debugging EECS150 Lecture #23

  4. Quick Review (1 of 4) • So What? • Almost everything needs storage • Lots of space -> DRAM • SDRAM • SDRAM is BIG • Time multiplex address lines • 2 Dimensional Address (Row & Column) • Often Shared • Arbitration for access • Affects performance EECS150 Lecture #23

  5. Quick Review (2 of 4) • SDRAM (cont) • Steps to Read/Write • Send Row Address (RAS) • Send Column Address (CAS) • Send/Get Data (For 2,4,8 cycles) • Wait (precharge, autorefresh, etc) • Synchronous Interface • Uses a clock & bursts to increase bandwidth • Control requires precise timing • Issue sequences of commands • Timing must be matched to clock frequency EECS150 Lecture #23
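
The read/write steps on this slide can be sketched in software. This is only an illustration: the field widths (9 column bits, 13 row bits), the command names, and both function names are assumptions, not taken from the lecture or from any particular SDRAM part.

```python
def split_sdram_address(addr, col_bits=9, row_bits=13):
    """Split a flat word address into (bank, row, column) fields.

    The widths are assumed for illustration; a real device's datasheet
    defines them.
    """
    col = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    bank = addr >> (col_bits + row_bits)
    return bank, row, col


def read_command_sequence(addr, burst_len=4):
    """List the steps of one burst read in the order given on the slide."""
    bank, row, col = split_sdram_address(addr)
    steps = [("ROW_ADDRESS", bank, row),      # RAS: open the row
             ("COLUMN_ADDRESS", bank, col)]   # CAS: select the column
    steps += [("DATA", beat) for beat in range(burst_len)]  # 2, 4 or 8 beats
    steps.append(("PRECHARGE", bank))         # wait before the next access
    return steps


if __name__ == "__main__":
    for step in read_command_sequence((1 << 22) | (5 << 9) | 3):
        print(step)
```

Because the row and column fields share the same address pins, the controller has to send them on different cycles, which is where the two-dimensional addressing comes from.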

  6. Quick Review (3 of 4) • So what? • Example data stream • Low bandwidth • Includes control • Audio • Primary interfaces are analog • Audio is analog • Mixers, etc… • Bit Serial • Low & fixed bandwidth • Low complexity • Expandable (e.g. 5.1, 7.1) EECS150 Lecture #23

  7. Quick Review (4 of 4) • Audio (cont) • Driver • Pair of shifters • Simple sync framing • Control • Abstract registers • Highly stateful • VERY low bandwidth EECS150 Lecture #23

  8. Metrics (1 of 3) • So What? • We need some way to judge good vs bad • Allows us to compare interfaces without guessing • Evaluate tradeoffs and requirements in a formal manner • Objective Metrics • Bandwidth • Latency • Pin Count • State & Logic Overhead • Subjective Metrics • Documentation • Ease of use or debugging • Elegance EECS150 Lecture #23

  9. Metrics (2 of 3) • Bandwidth • High or Low • Higher is always better, but e.g. humans can only hear so much • Video and audio are the classic examples, but programs need instructions, which means DRAM bandwidth • Fixed or Variable • Raw video or audio have fixed bandwidth, compression (e.g. MP3) can make this vary • Network bandwidth varies because of sharing • Latency • High or Low • Lower is usually better • If there’s no elastic buffer (no way to say “I’m not ready”) • This can cause data loss or require extra buffering, which is costly • Humans are very sensitive to gross latency • Generally reducing latency is VERY HARD without affecting the clock rate • Fixed or variable • Variation in latency is generally referred to as “jitter” • E.g. on VoIP phones, audio is fixed latency, network is variable, so we have a problem EECS150 Lecture #23
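
As a rough worked example of fixed bandwidth, the raw rate of an uncompressed stream is just samples per second times bits per sample times channels. The stream parameters below (48 kHz 16-bit stereo audio, 640x480 24-bit video at 30 frames per second) are assumed for illustration and are not figures from the lecture.

```python
def raw_bandwidth_bps(samples_per_second, bits_per_sample, channels=1):
    """Raw (uncompressed) bandwidth of a fixed-rate stream in bits/second."""
    return samples_per_second * bits_per_sample * channels


if __name__ == "__main__":
    audio = raw_bandwidth_bps(48_000, 16, channels=2)
    video = raw_bandwidth_bps(640 * 480 * 30, 24)  # pixels per second x bits
    print(f"audio: {audio / 1e6:.3f} Mb/s")   # 1.536 Mb/s
    print(f"video: {video / 1e6:.3f} Mb/s")   # 221.184 Mb/s
```

The two orders of magnitude between these numbers is one reason audio can use a simple bit-serial link while video usually cannot.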

  10. Metrics (3 of 3) • Pin Count • Fast becoming a major problem • Chip area grows with N², pins grow with N for DIPs or N² for BGAs • Either way pins are just physically large • They require a lot of area • They are slow and power hungry • Serial vs Parallel • Old: Parallel for high bandwidth • New: Serial for high bandwidth • What changed? • State & Logic Overhead • This is where major cost & complexity come into play • The bigger the circuit the more places to have a bug • Also affects power, yield and price • Interfaces can be very large • For example DDR2 SDRAM on a Virtex2 Pro • The FPGA couldn’t support the clocking/handshaking easily • Required an incredible amount of logic to make up for this • Never very reliable as a result EECS150 Lecture #23

  11. Datapath & Control (1 of 4) • So What? • Separates the data & control • Allows us to understand the meaning of signals • Separates timing from dataflow • Datapath • Variable information not known until runtime • Regular structure or meaning (e.g. all integers) • Easy to design and debug • Control • Circuits which deal with meaning and timing • Small, irregular and complicated • Difficult to design and debug, even harder to extend EECS150 Lecture #23

  12. Datapath & Control (2 of 4) • Datapath Signals • Wires which carry a value with temporal significance • Form the backbone of the datapath • May include “control” values • E.g. that this is a value to be written to DRAM • This is common in “data stationary control” • Coding • Common Codes • Binary: easy to understand, easy to work with • One-hot: allows inexpensive decoding • Gray Code: asynchronous logic, one bit change at a time • Other issues: state coding, floating point, etc EECS150 Lecture #23
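
The three common codes can be compared with a short sketch. The function names are chosen for this example only; the Gray conversion is the standard reflected-binary form.

```python
def binary_to_gray(n):
    """Reflected-binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(g):
    """Invert binary_to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def one_hot(index, width):
    """One-hot code: exactly one of `width` bits is set."""
    if not 0 <= index < width:
        raise ValueError("index out of range")
    return 1 << index


if __name__ == "__main__":
    for value in range(8):
        print(f"{value:03b}  gray={binary_to_gray(value):03b}  "
              f"one-hot={one_hot(value, 8):08b}")
```

Consecutive Gray codes differ in exactly one bit, which is why they are useful when a counter value has to be sampled asynchronously. One-hot decoding only needs to look at a single wire, at the cost of many more wires.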

  13. Datapath & Control (3 of 4) • Control Signals • Wires which carry timing, but little data • Form the backbone of the control logic • Enables, resets, and so forth fall into this category • Event Coding • Edge (neg or pos) • Generally we only use the clock edge in FPGA designs • Latch based designs use edges all the time, of course • Pulse High • Do something when a wire is 1, usually relative to a clock edge • Pulse Change • Do something when a signal is different than on the last cycle • Time • Do something a certain amount of time after a previous event • Measured with a clock in synchronous systems • Possible to build “delay lines” using transistors and gates EECS150 Lecture #23
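
A minimal sketch of the "pulse change" style of event coding, assuming the wire is sampled once per clock cycle and represented as a list of 0/1 values. The function name and the event labels are hypothetical.

```python
def detect_events(samples):
    """Return (cycle, kind) for each cycle whose sample differs from the last.

    `samples` holds one 0/1 value per clock cycle; `kind` is "rise" or "fall".
    """
    events = []
    prev = samples[0]
    for cycle, cur in enumerate(samples[1:], start=1):
        if cur != prev:                      # signal changed since last cycle
            events.append((cycle, "rise" if cur else "fall"))
        prev = cur                           # models a one-cycle delay register
    return events


if __name__ == "__main__":
    print(detect_events([0, 0, 1, 1, 0]))    # [(2, 'rise'), (4, 'fall')]
```

In hardware, `prev` corresponds to a single flip-flop holding last cycle's value, and the comparison is one XOR gate.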

  14. Synchronization (1 of 4) • Clocking • 1 Clock • Fully synchronous, no need to worry about the issue • May have multiple resets • E.g. hold video in reset until SDRAM is ready • Can get pretty complex (e.g. CPU & JTAG) • 2 Clocks • Clock Crossing, easy to keep straight • Often use Async FIFOs and dual port RAMs on FPGAs • These are expensive in ASICs, use synchronizers • Obviously multiple resets • Local Clocks & LocalResetGen • Often restricted to use in an interface (e.g. interchip) • May not be free-running • Often require careful design to avoid issues EECS150 Lecture #23

  15. Synchronization (2 of 4) • Reset • 1 Clock, no initialization • Multistage Initialization • Reset for one module depends on state of another • Using the ButtonParser is an example of this • 2 Clocks • Usually reset is synchronous to one clock • May need a shift register to resynchronize reset • Self starting • Useful for generating a reset for the rest of the system • Any device which “just works” on power-up has one • Can be built on FPGA by using a shift register with an initial value • Local Resets & LocalResetGen • Reset logic can affect clocking & reliability • May be requirements like holding reset for some time EECS150 Lecture #23
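
The self-starting reset built from a shift register with an initial value can be modeled cycle by cycle. This is a behavioral sketch, not the course's LocalResetGen module. The depth of 4 and the function name are assumptions.

```python
def reset_generator(cycles, depth=4):
    """Model a shift register that powers up as all ones and shifts in zeros.

    The reset output is the last bit of the register, so it stays high for
    `depth` cycles after power-up and then stays low.
    """
    shreg = (1 << depth) - 1        # initial value loaded at configuration
    reset_trace = []
    for _ in range(cycles):
        reset_trace.append(shreg & 1)
        shreg >>= 1                 # shift in a zero each clock cycle
    return reset_trace


if __name__ == "__main__":
    print(reset_generator(6))       # [1, 1, 1, 1, 0, 0]
```

A deeper register holds reset longer, which is one way to meet a requirement such as "hold reset for some time".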

  16. Synchronization (3 of 4) EECS150 Lecture #23

  17. Synchronization (4 of 4) EECS150 Lecture #23

  18. Handshaking (1 of 4) • So What? • When things happen is vital • Hardware modules must cooperate in order to be useful • Planning out all interaction timings on the drawing board is best, but often hopeless • Handshakes • Pipelined (None) • 2 & 4 Cycle (Self-timed) • Ready/Valid (Synchronous) EECS150 Lecture #23

  19. Handshaking (2 of 4) • 4 Cycle • RTZ: Return to Zero • Fewer transistors • Easier to debug • 2 Cycle • More transistors • Not really faster • NRTZ: Non-RTZ • Can be synchronous • GasP • RTZ handshaking • Carefully delay matched circuits • No clock! EECS150 Lecture #23

  20. Handshaking (3 of 4) • Ready/Valid • Independent • Avoid combinational loops • Simplifies generation and checking • Symmetric • Composable • Allows the pass-through • Coregen FIFOs asymmetric • Latency Insensitive • Allows modules to run at their own pace • Trades cost to do this!! • Send/Accept • Same signals, new names! • Why? Read on…. EECS150 Lecture #23
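
A small cycle-level sketch of Ready/Valid, assuming a transfer happens on every cycle in which both signals are high. The producer's valid depends only on whether it has data, and the consumer's ready comes from a fixed pattern, so neither signal is computed from the other. All names are hypothetical.

```python
def simulate_ready_valid(items, ready_pattern):
    """Return (cycle, item) for every cycle where valid and ready are both 1."""
    pending = list(items)
    transfers = []
    for cycle, ready in enumerate(ready_pattern):
        valid = bool(pending)               # producer: "I have data"
        if valid and ready:                 # both sides agree: data moves
            transfers.append((cycle, pending.pop(0)))
    return transfers


if __name__ == "__main__":
    print(simulate_ready_valid(["a", "b", "c"], [0, 1, 1, 0, 1]))
    # [(1, 'a'), (2, 'b'), (4, 'c')]
```

The consumer stalls the producer simply by dropping ready, so neither side needs to know the other's timing in advance. This is the latency-insensitive property the slide refers to.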

  21. Handshaking (4 of 4) • Composition Failure • Arbiter chooses one of two inputs • Router chooses one of two outputs • Read0 & Valid1 • Any time two modules are connected by two paths… • Classes • Class1: No dependencies • Class2: Dependencies between ports • Class3: Dependencies within ports EECS150 Lecture #23

  22. Protocols (1 of 5) • So What? • Knowing the data isn’t enough; we need meaning • Just like language, we build representations of meaning • Knowing the patterns of meaning allows us to abstract it • Structure • Parallel: all the bits at once • Counted: there are a fixed number of words, we count them off • Framed: adding a higher level handshake allows variable length • Syntax • How the data fits together • We’ll cover this more in the next few slides • Semantics • What the data means • Highly dependent on the interface in question • Terms: The Band • In Band: the data we’re trying to move • Out of Band: control, metadata and other issues EECS150 Lecture #23

  23. Protocols (2 of 5) • Dataflow Based • Audio, video, instructions in a CPU • Generally when there’s little (no) OOB data • Usually parallel or counted for simplicity • Benefits • Excellent handling of LTI or independent data values • Simple production and consumption • Little or no state, e.g. a valid bit is all you need • Allows construction of specialized hardware (DSP designs for example) • Drawbacks • Very difficult, if not impossible to deal with exceptions • For playing audio: what if you need data but it’s not there? • When things fail there’s often nothing you can do EECS150 Lecture #23

  24. Protocols (3 of 5) • Command Based • Useful for low bandwidth peripherals • Organized according to master/slave • E.g. draw a line, write a word to memory • Benefits • Very easy to build new slaves • Clear demarcation of responsibility (Good for CPUs) • Generally very easy to expand, just add new commands • Drawbacks • Tends to be very low performance • Overhead to specify command • No parallelism • Usually requires some polling (interrupts are poll based) • Requires master to know state at all times EECS150 Lecture #23

  25. Protocols (4 of 5) • Register Based • Stateful peripherals with lots of config • Organized according to master/slave • Often used alongside a dataflow interface • Benefits • Provides a memory-like abstraction • Allows the master to read state easily • Easy to deal with exceptional conditions (error flag) • Drawbacks • Medium performance • Overhead to specify read/write and register address • DMA can help with this • Requires a clear master, often meaning an FSM/CPU EECS150 Lecture #23

  26. Protocols (5 of 5) • Layering • Uncommon to have one syntax • They are easy to layer • Dataflow on top of command • Each command can be a “write <data>” • Not entirely efficient, but gets the job done • This is how software FIFOs and networks work • Register on top of command • Two commands: read & write • Relatively common, allows command wires to be shared • This is how most memories, especially DRAMs work • Command on top of register • Writing a certain value to a register indicates the command • Perhaps a series of writes to registers • Many CPU peripherals do this EECS150 Lecture #23
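
"Command on top of register" can be sketched with a toy peripheral in which writing a value to one register triggers an action. The register addresses, command codes, and the accumulate behavior are all invented for this example.

```python
class RegisterPeripheral:
    """Hypothetical peripheral with a memory-like register interface."""

    CMD, ARG, RESULT = 0, 1, 2      # register addresses (assumed)
    CMD_CLEAR, CMD_ADD = 1, 2       # command codes (assumed)

    def __init__(self):
        self.regs = [0, 0, 0]

    def write(self, addr, value):
        self.regs[addr] = value
        if addr == self.CMD:        # a write to CMD is what issues the command
            self._execute(value)

    def read(self, addr):
        return self.regs[addr]

    def _execute(self, cmd):
        if cmd == self.CMD_CLEAR:
            self.regs[self.RESULT] = 0
        elif cmd == self.CMD_ADD:
            self.regs[self.RESULT] += self.regs[self.ARG]


if __name__ == "__main__":
    dev = RegisterPeripheral()
    dev.write(dev.ARG, 5)           # a series of register writes...
    dev.write(dev.CMD, dev.CMD_ADD) # ...ending with the one that triggers
    print(dev.read(dev.RESULT))     # 5
```

The master only ever performs reads and writes, so the same bus can carry any number of commands without new wires.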

  27. Simple Interfaces (1 of 4) • So What? • Uses few wires • No tristates • Synchronous • SPI • Signals: SO, SI, CS, CLK • Uses: CC2420, ADC • Bit Serial • Bidirectional • Often used with register syntax EECS150 Lecture #23
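
The bit-serial, bidirectional nature of SPI can be modeled as two shift registers that swap contents one bit per clock. This sketch assumes 8-bit words sent MSB first and ignores chip select and clock phase. The function name is hypothetical.

```python
def spi_transfer(master_byte, slave_byte, bits=8):
    """Exchange one word between a master and a slave shift register.

    Each clock, both sides shift their MSB out and the other side's bit in.
    Returns (value now held by master, value now held by slave).
    """
    mask = (1 << bits) - 1
    m, s = master_byte & mask, slave_byte & mask
    for _ in range(bits):
        master_out = (m >> (bits - 1)) & 1   # drives the slave's SI pin
        slave_out = (s >> (bits - 1)) & 1    # drives the slave's SO pin
        m = ((m << 1) & mask) | slave_out
        s = ((s << 1) & mask) | master_out
    return m, s


if __name__ == "__main__":
    print([hex(v) for v in spi_transfer(0xA5, 0x3C)])   # ['0x3c', '0xa5']
```

After eight clocks the two registers have traded values, which is why a read and a write cost the same number of cycles on this kind of link.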

  28. Simple Interfaces (2 of 4) • So What? • Fewest pins (almost) • Control, not data • Long distance • I2C • Uses two wires • Master/Slave • Includes handshake • Bit Serial • Bidirectional • Often used with register syntax EECS150 Lecture #23

  29. Simple Interfaces (3 of 4) • History • In IBM PCs • RS232 and RS485 • Still widely used • Simple/cheap • Noise resistant • Problems • Low bandwidth • Limited by internal timing clocks • Very low level protocol • So What? • Very few pins (3) • No clock required • Long distance • UART • Bit serial • No clock signal • Good & Bad • Relies on timing for events • Often used with dataflow syntax EECS150 Lecture #23
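
Because a UART sends no clock, the receiver has to recover bit boundaries from timing alone. The sketch below assumes the common 8N1 framing (one start bit, eight data bits LSB first, one stop bit) and a receiver that oversamples the line 16 times per bit. These parameters and all function names are assumptions for illustration.

```python
def uart_encode(byte):
    """Frame one byte as start bit (0), 8 data bits LSB first, stop bit (1)."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def to_waveform(bits, samples_per_bit, idle_samples=5):
    """Expand bits into line samples, preceded by an idle-high period."""
    wave = [1] * idle_samples
    for bit in bits:
        wave.extend([bit] * samples_per_bit)
    return wave


def uart_decode(wave, samples_per_bit):
    """Find the start edge, then sample the middle of each following bit."""
    start = 0
    while start < len(wave) and wave[start] == 1:
        start += 1                                  # wait for line to go low
    center = start + samples_per_bit // 2           # middle of the start bit
    stop_index = center + 9 * samples_per_bit
    if stop_index >= len(wave):
        return None                                 # no complete frame found
    byte = 0
    for n in range(8):
        byte |= wave[center + (n + 1) * samples_per_bit] << n
    return byte if wave[stop_index] == 1 else None  # stop bit must be high


if __name__ == "__main__":
    line = to_waveform(uart_encode(0x41), samples_per_bit=16)
    print(hex(uart_decode(line, samples_per_bit=16)))   # 0x41
```

Sampling in the middle of each bit gives the most margin against the two sides' clocks drifting apart, which is the "relies on timing for events" tradeoff noted on the slide.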

  30. Simple Interfaces (4 of 4) • So What? • N64 Controllers • Used in projects • N64 • Asynchronous • More robust than UART • Command Syntax • Main: Reset & Read Buttons • Other: Status, Mempack, EEPROM • Receiving a bit: • Look for 1’b1 (Stop) -> 1’b0 (Start) • Wait 1us (why 1us?!?) • Capture Data EECS150 Lecture #23

  31. Intermediate Interfaces (1 of 4) • So What? • HD44780, standard • 4 or 8b operation • Interesting timing • LCD • Interface • LCD_DB[7:0]: Data • LCD_RS: Register select • LCD_RW: Read/Write • LCD_E • Enable/Strobe • Provides timing EECS150 Lecture #23

  32. Intermediate Interfaces (2 of 4) • So What? • Used everywhere • Framed structure • Dataflow syntax • 10M-1G Ethernet • Bit Serial Link • 4/5bit Encoding takes 20% overhead • Bit5 is used for Data-Valid and Error • Preamble used for clock extraction • Inter Frame Gap ensures packets aren’t back-to-back • CRC used to avoid errors from transmission EECS150 Lecture #23

  33. Intermediate Interfaces (3 of 4) • 10M-1G Ethernet • Receive • Wait for DataValid & SFD • Start shifting/FIFOing data • Wait for DataValid to go low • Check CRC, discard/mark packet • Transmit is similar • CRC • An LFSR based code • Appended to the end of each frame • Used to ensure nothing is corrupted EECS150 Lecture #23
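
The LFSR behind the Ethernet CRC can be written as a bit-at-a-time loop. This sketch uses the standard CRC-32 parameters (reflected polynomial 0xEDB88320, all-ones initial value, final inversion) and checks the result against Python's `zlib.crc32`. It says nothing about how the course's hardware implements it.

```python
import zlib


def crc32_bitwise(data):
    """CRC-32 computed one bit at a time, like a 32-bit LFSR in hardware."""
    crc = 0xFFFFFFFF                       # register starts as all ones
    for byte in data:
        crc ^= byte
        for _ in range(8):                 # one shift per input bit
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320   # feedback taps
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF                # final inversion


if __name__ == "__main__":
    payload = b"123456789"
    print(hex(crc32_bitwise(payload)))     # 0xcbf43926
    print(crc32_bitwise(payload) == zlib.crc32(payload))   # True
```

A receiver runs the same loop over the incoming frame and compares against the appended value. A mismatch means the packet is discarded or marked, as in the receive steps above.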

  34. Intermediate Interfaces (4 of 4) • So What? • Source Synchronous • Very high bandwidth • 966Mbps per pair • Interchip • Dataflow structure • Send clock alongside data • Requires async FIFO • Differential pairs require special signaling for this EECS150 Lecture #23

  35. CPU Interfaces (1 of 3) • So What? • Allow CPU to control peripherals • Old: Simplicity of I/O devices (no FPGAs back in the day) • New: Bandwidth (audio & video) • Key Assumptions • CPU is in control • Separation of data (high bandwidth) and control (very low latency) • Basic Organization • Historically “bus” based • Single arbiter, or even single master • Most devices are simple and respond only • Memory/register centric (e.g. read/write ops) • Newer point to point designs • PCIe, HyperTransport • Based on command packets (e.g. read/write ops) EECS150 Lecture #23

  36. CPU Interfaces (2 of 3) • So What? • Very widespread standard • Simple enough to describe here • ISA • Synchronous bus • Assumes 1 cycle access • 8MHz standard • Basic Operations • Address (CPU -> IO) • Control (CPU -> IO) • Data (CPU <-> IO) • Extensions • DMA • Interrupts • History • IBM PC XT • 8b and then 16b • PnP Added Later • Open Standard EECS150 Lecture #23

  37. CPU Interfaces (3 of 3) • So What? • Higher bandwidth than old parallel buses • Overcomes pin limitations • Separates physical and logical transport to allow more complex analog design • PCIe • Based on bit-serial lanes • Very high bandwidth • Channel bonding, similar to 10Gbps Ethernet • Point to Point • Packet/Switch Based • High overhead for small messages (interrupts) • Layers • Physical • Data Link (ack/nak) • Transactions (memory/int) • History • Developed by Intel • 2.5 GT/s, 5 GT/s … EECS150 Lecture #23

  38. Design (1 of 3) • So What? • Well, you’ve been designing some interfaces • You will keep using them • Similar principles apply to hardware and software • Back to Principles • What do you want from the interface (SHOULD) • What do you need from the interface (MUST) EECS150 Lecture #23

  39. Design (2 of 3) • Reuse & Standardization • May introduce overhead • Leverage well tested modules • Eases debugging & documentation • Modeling, Verification & Debugging • Requires two implementations • E.g. transmitter & receiver • Automated testing • Allows you to quickly verify any changes • Greatly simplifies life for someone else EECS150 Lecture #23

  40. Design (3 of 3) • Good Interfaces • Simplify the interacting modules • Both the design and implementation • Simplify doesn’t always mean “making smaller” • Are self-documenting • Are naturally widely applicable • Bad Interfaces • Are complex, or hard to debug • Are expensive to design and implement • Make incorrect assumptions • Do more work than necessary • Eliminating timing assumptions, when we know the timing • Otherwise checking invariants we know to be true EECS150 Lecture #23

  41. A Case Study (1 of 2) • The RAMP DRAM Interface • What MUST we do • Convey address to the controller • Convey data in both directions • Support handshaking to deal with variable latency in controller • What should we do • Allow multiple users to share DRAM • Support extremely high bandwidth • The Design • 3 FIFOs with Ready/Valid • Command: read/write and address to controller • DataIn: data to be written (and mask) • DataOut: data which was read (and any error counts for ECC) EECS150 Lecture #23

  42. A Case Study (2 of 2) • Metrics • Bandwidth: maximized by using wide data FIFOs • Latency: minimized by avoiding any serialization • Pin Count: dictated by need for maximum bandwidth • Complexity: low thanks to ready/valid • Datapath & Control • All 3 FIFOs are datapath • Separate initialization & power state for control • Clocking: Each FIFO can have a separate clock • Handshaking is Ready/Valid • Protocol • Low level: dataflow • Intermediate level: commands • High level: register EECS150 Lecture #23

  43. Summary (1 of 2) • Any useful system includes at least two interfaces: input and output • The most difficult work in any system is matching incompatible interfaces • Principles • Metrics: Bandwidth, Latency, Pin Count & Logic Overhead • Datapath & Control (States & Events) • Synchronization: Clock & Reset • Handshaking (Ready/Valid) • Protocols (structure, syntax, semantics) • Design • Back to principles • Reuse & Standardization • Modeling, Verification & Debugging EECS150 Lecture #23

  44. Summary (2 of 2) • Interfaces • Simple Interfaces • SPI, I2C, UART, N64 • JTAG, Slave Serial, MDI (Ethernet) • Intermediate Interfaces • SDRAM, Audio, LCD, Ethernet (10M-10G), Interchip • CC2420, Video Encoder/Decoder • CPU Interfaces • ISA, PCIe • MCA, PCI, PCI-X, HyperTransport, Intel FSB, AGP, AMBA EECS150 Lecture #23
