Effects of EMI on Digital Systems • Participants: Prof. P. Mazumder (The University of Michigan), Prof. M. Bridgwood (Clemson University), Prof. S. Dutt (University of Illinois, Chicago) • MURI-01 Kick-off Meeting, June 13, 2001
Task 3 Effects of EMI on Digital Systems • ESD protection: how good is it for suppressing EMI? (Task 3.1: Bridgwood) • How simultaneous switching of millions of devices causes EMI and reduces chip operating margin and reliability • How external EM pulses can further aggravate chip margins and reliability (Task 3.2: Mazumder) • Characterization of functional and behavioral failures attributed to EMI (Task 3.3: Dutt)
Task 3.1 Conducted WB and NB Interaction with Digital Devices • Impedances of interfaces through switching: inputs, outputs, control lines, power supplies • Vulnerability thresholds: disruption and damage; single nodes; multiple nodes and devices • Waveforms: pulses, damped rings Clemson University
Task 3.1 DRAM Input Circuit Structure Clemson University
Task 3.1 Interaction With Devices – Injection System Clemson University
Task 3.1 Modeling of Simple Capacitive Structure Through Breakdown Events Clemson University
Task 3.2 SIA Roadmap - IC Technology • Vdd scaling substantially reduces operating and noise margins • Frequency: ~2-20 GHz; dissipation: 100-200 W; EM emissions and P/G noise • On-chip wiring: 3-25 km; a source of parasitics, noise and delay • No. of transistors: 200 M to 1.7 B; 85%-95% of the chip area will be occupied by random-access memories, CAMs and ROMs University of Michigan
Task 3.2 EMI Impacts on Digital Circuits • Task 3.2.1: EMI generation due to transistor switching • Task 3.2.2: Effects of EMI on chip operations • Task 3.2.3: EMI simulator design University of Michigan
Task 3.2 Noise Distribution Paths • Direct radiation from the chip surface: caused by high-frequency current within the chip; the level of radiation is small compared with the two paths below • Conducted noise from the signal ports: off-chip wires act as antennae; the effect of this noise source is significant, but it is an easy problem • Power-line conducted noise: high-frequency, large power/ground current; the most significant source of EMI problems, and difficult to solve University of Michigan
Task 3.2 Power-Line Conducted Noise • Modeling the core power network • Power-line capacitance modeling • Switching current model University of Michigan
Task 3.2 Power Network Modeling University of Michigan
Task 3.2 Switching Current Simulation • Time-varying switching current consumed in circuit blocks is first simulated assuming ideal power supply voltage using SPICE • Circuit simulation is performed for the power network with the switching current information added • By iterating this annotation process, one can achieve better simulation accuracy University of Michigan
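The iteration described on this slide can be sketched in a few lines of Python. This is only a toy illustration, not the project's SPICE flow: `block_current` and `network_voltage` are hypothetical stand-ins for the two circuit simulations, the block current is assumed to scale linearly with the local supply voltage, and the power network is assumed to be a single series resistance `r_grid`. All numeric values are made up.

```python
def block_current(i_ideal, v_supply, vdd):
    """Toy stand-in for the block-level SPICE run (assumption: the
    switching current scales linearly with the local supply voltage)."""
    return [i * v / vdd for i, v in zip(i_ideal, v_supply)]


def network_voltage(i_block, vdd, r_grid):
    """Toy stand-in for the power-network simulation (assumption: a
    purely resistive rail, so the supply sags by the IR drop)."""
    return [vdd - r_grid * i for i in i_block]


def iterate_annotation(i_ideal, vdd=1.2, r_grid=0.05, max_iter=50, tol=1e-9):
    """Alternate the two simulations until the supply waveform settles.

    i_ideal holds the switching current per time step, obtained with an
    ideal supply. Returns (supply voltages, block currents, passes used).
    """
    v = [vdd] * len(i_ideal)  # first pass assumes an ideal supply
    for n in range(1, max_iter + 1):
        i = block_current(i_ideal, v, vdd)       # re-simulate the blocks
        v_new = network_voltage(i, vdd, r_grid)  # annotate the power network
        delta = max(abs(a - b) for a, b in zip(v, v_new))
        v = v_new
        if delta < tol:
            break
    return v, i, n


if __name__ == "__main__":
    v, i, n = iterate_annotation([0.0, 2.0, 4.0, 1.0])
    print(n, [round(x, 4) for x in v])
```

In this toy model each pass shrinks the error by the factor `r_grid * i_ideal / vdd`, so the loop converges quickly when the IR drop is small relative to Vdd; a real power network with inductance and decoupling capacitance would need a transient solver in place of `network_voltage`.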
Task 3.2 Power-Line Conducted Noise • The UofM NDR Group developed the Quantum SPICE simulator and has extensive experience with SPICE • Use the UofM RAM compiler to generate a memory array and estimate its switching noise and noise spectrum • Study failure modes for read and write operations, with and without external EMI noise University of Michigan
BisramGen, a Self-testable and Self-repairable RAM Compiler Designed at Univ. of Michigan
Task 3.2 Clock Network • Transmission line modeling of clock wires • Differential Quadrature Method (DQM) • Model Reduction by Krylov Subspace Method • Study of clock jitters and synchronization failures due to ringing and deformities • FD-TLM based VEDICS Tool (designed at Univ. of Michigan) • More accurate than lumped model yet more efficient than other field solvers University of Michigan
Task 3.2 EMI Simulator • EMI simulation requires SPICE-like simulators • Give accurate results • Orders of magnitude faster than full-wave simulator • We will • incorporate new transmission line modeling methods • incorporate new device model including parasitic devices • combine FD-TLM based simulator with SPICE University of Michigan
Task 3.2 System Level Studies for Estimating EMI Effects • Subcircuit level: • Simple gates (Inverter, NAND, NOR, XOR, MUX, etc.) • Logic families (Static CMOS, Domino, DCVS, etc.) • Subsystem level (chips are already designed at UM): • 16-bit ALU • 16x16 multiplier • 4Kx32x32 (4 Mb) SRAM • System level: • Power/ground network • Clock distribution network • 32-bit RISC microprocessor (such as DEC Alpha, Pentium) University of Michigan
Task 3.3 Funded and Past Work • Recently-funded work on FT (S. Dutt) [Verma, MS Thesis, UIC,’01], [Verma & Dutt, ICCAD’01 subm.] [Dutt, et al., ICCAD’99], [Mahapatra & Dutt, FTCS’99]-- Funded in part by DARPA-ACS, Xilinx Inc.: • On-line test and fault reconfiguration of field-programmable gate arrays using a roving tester • Key is effective incremental re-placement and re-routing to dynamically move the roving tester • EM-induced faults: • High level computer failure detection due to different types of EM signals [Mojert et al., EMC’01]; no cause-effect or classification analysis. • Failure in real-time communication & control systems from communication line errors due to EM signals [Kohlberg & Carter, EMC’01] University of Illinois
Task 3.3 Assumptions/Scenarios of Past Work • Past Work on general fault detection: • Faults directly affect transistors & on-chip interconnects • Random single (sometimes double) faults • Deterministic faults • Types of faults: permanent, transient, intermittent; intermittent not generally tackled • Past Work on EM-induced faults: • No how/why/what analysis and classification of computer failure due to EM interference University of Illinois
Task 3.3 Different Scenarios in Proposed Work • Faults directly affect off-chip signal lines (memory address, data and control lines) and power/ground (p/g) lines • p/g line faults => multiple faults (clustered if p/g lines are partitioned, else random) • Signal line faults => incorrect instr./data => multiple clustered faults along control/data path • Window of susceptibility if p/g lines are shielded -- probabilistic model (e.g., susceptible on cache misses) • May need to tackle intermittent faults due to periodic EM pulses • Detailed analysis and classification of errors due to EM-induced faults University of Illinois
Task 3.3 Proposed Work • Comprehensive VHDL processor and memory model • Will include variable-width variable-period fault injection capability for off-chip signal lines (to simulate different pulse widths and periods) • Similar fault-injection capability for on-chip wires with a probabilistic component • [Figure: fault-injector schematic; labels: signal line, select line, MUX (inputs 0 and 1), var-width var-period pulse gen., to other fault injectors] University of Illinois
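A behavioral sketch of such a fault injector is given here in Python rather than VHDL, purely for illustration. It assumes discrete time steps, a pulse generator that raises the MUX select line for `width` steps out of every `period`, and a MUX that substitutes a fixed `fault_value` while select is high. The function names, the `phase` parameter and the stuck-at style fault value are assumptions of this sketch, not details taken from the slide.

```python
def pulse_gen(t, width, period, phase=0):
    """Select line: 1 for `width` steps at the start of every `period`."""
    return 1 if (t - phase) % period < width else 0


def inject(signal, width, period, fault_value=1, phase=0):
    """2:1 MUX on a signal line.

    Passes the original bit when select is 0 and forces `fault_value`
    when select is 1, so the fault pattern follows the pulse train.
    """
    out = []
    for t, bit in enumerate(signal):
        select = pulse_gen(t, width, period, phase)
        out.append(fault_value if select else bit)
    return out


if __name__ == "__main__":
    clean = [0] * 10
    # Two-step pulses every five steps corrupt steps 0, 1, 5 and 6.
    print(inject(clean, width=2, period=5))
```

Sweeping `width` and `period` over a range of values is one way to emulate EM pulses of different durations and repetition rates; the probabilistic component mentioned for on-chip wires could be added by gating `select` with a random draw.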
Task 3.3 Proposed Work (contd.) • Will determine and classify the following types of computer system behavioral errors (i.e., program errors) due to different patterns, extent, duration and location of faults: • Control flow errors -- incorrect sequence of instruction execution. Causes: address gen. errors, memory faults, bus faults • Data errors. Causes: computation errors, memory & bus faults • Hung processor & crashes. Causes: C.U. transition to dead-end states, invalid instruction, out-of-bound address, divide-by-zero, spurious interrupts (?) • To the best of our knowledge, this is a more comprehensive analysis of fault effects on a computer system than any attempted previously • Comprehensive analysis is needed due to the nature of EM effects -- all-pervasive, periodic, clustered University of Illinois
Proposed Work--Methodologies: Control Flow Checking [Mahmood & McCluskey, TC’88] • A node is a block of instructions with a branch at the end • A derived signature of a node is a function (e.g., xor, LFSR) of all its instructions • A program graph is one in which there is an arc from node u to v if the branch at u can lead to node v • [Figure: processor, memory hierarchy and memory bus with a watchdog (WD) that receives a signal from the branch circuit; example program graph with nodes v1-v6 and sample instructions ADD, LD, BLT]
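The three definitions above can be illustrated with a small Python sketch. It is a simplification rather than the scheme of the cited paper: the watchdog here is told which node was executed, the derived signature is a plain XOR of instruction words, and the instruction encodings, node names and the `Watchdog` class are all hypothetical.

```python
from functools import reduce
from operator import xor


def signature(instructions):
    """Derived signature of a node: XOR of all its instruction words."""
    return reduce(xor, instructions, 0)


class Watchdog:
    """Checks each executed node against the program graph and the
    reference signatures computed ahead of time."""

    def __init__(self, blocks, arcs, entry):
        self.reference = {node: signature(words) for node, words in blocks.items()}
        self.arcs = arcs          # arcs[u] = set of nodes reachable from u
        self.entry = entry
        self.previous = None

    def observe(self, node, executed_words):
        """Return True if the transition and the run-time signature are valid."""
        if self.previous is None:
            legal_branch = node == self.entry
        else:
            legal_branch = node in self.arcs.get(self.previous, set())
        signature_ok = signature(executed_words) == self.reference.get(node)
        self.previous = node
        return legal_branch and signature_ok


if __name__ == "__main__":
    blocks = {"v1": [0x1A2B, 0x3C4D], "v2": [0x5E6F], "v3": [0x0102, 0x0304]}
    arcs = {"v1": {"v2", "v3"}, "v2": {"v3"}}
    wd = Watchdog(blocks, arcs, entry="v1")
    print(wd.observe("v1", [0x1A2B, 0x3C4D]))  # clean execution of the entry node
    print(wd.observe("v2", [0x5E6E]))          # one bit flipped on the bus
```

A plain XOR misses some multi-bit errors (two flips in the same bit position cancel), which is one reason the slide also lists an LFSR as a possible signature function.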
Proposed Work--Methodologies: Algorithm-Based Fault Tolerance [Huang & Abraham, TC’84], [Dutt & Assad, TC’96] • Use properties of the computation to check correctness of computed data • E.g., linearity property: f(v1+v2) = f(v1) + f(v2), of computation f( ) can be used to check it: • Pre-compute v’ = v1 + v2 + … + vk (input checksum) • Compute f(v1), …, f(vk) • Compute u = f(v1) + f(v2) + … + f(vk) (output checksum) • Check if f(v’) = u; inequality indicates computation error(s) • Can be used for linear computations such as matrix multiplication, matrix addition, Gaussian elimination [Huang & Abraham, TC’84], [Dutt & Assad, TC’96] University of Illinois
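The checksum test above can be sketched for one linear computation, a matrix-vector product f(v) = A·v. The sketch uses integer data so that the equality check is exact; with floating-point data the comparison would need a tolerance. The helper names and the sample matrix are illustrative assumptions.

```python
def matvec(a, v):
    """Linear computation under test: f(v) = A * v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def vec_sum(vectors):
    """Element-wise sum of equal-length vectors."""
    return [sum(column) for column in zip(*vectors)]


def abft_check(f, inputs, outputs):
    """Return True if the outputs are consistent with linearity of f.

    inputs  : v1 ... vk
    outputs : the computed f(v1) ... f(vk), possibly corrupted
    """
    v_prime = vec_sum(inputs)   # input checksum  v' = v1 + ... + vk
    u = vec_sum(outputs)        # output checksum u  = f(v1) + ... + f(vk)
    return f(v_prime) == u      # inequality indicates a computation error


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    inputs = [[1, 0], [0, 1], [2, 3]]
    outputs = [matvec(a, v) for v in inputs]
    print(abft_check(lambda v: matvec(a, v), inputs, outputs))  # fault-free run
    outputs[1][0] += 1  # emulate a single computation error
    print(abft_check(lambda v: matvec(a, v), inputs, outputs))
```

The check costs one extra evaluation of f plus two vector sums, regardless of k. Errors that cancel in the output checksum go undetected, so it is a detection aid rather than a proof of correctness.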
Task 3.3 Goals, Questions & Future Outlook • Correlate the probability/frequency of different types of computer system errors to [pattern, extent, duration, location] of EM-induced faults • Correlate types of logic faults w/ similar descriptors to functional errors (output error of ALU, Control Unit) -- classification of catastrophic vs. non-catastrophic logic faults • Q: Are there patterns of errors that lead to computer crashes w/ high probability? • Q: If so, can the detection of such patterns be used to shut down the computer in a fail-safe manner (save state & data for later resumption)? University of Illinois
Task 3.3 Goals, Questions & Future Outlook (contd.) • Q: Are there patterns of errors that are characteristic of EM-induced faults versus random single/double faults? • Q: If so, can these be used as “early detection & warning” of EM interference? • Future: Based on the correlation of system errors to EM faults, determine fault tolerance/error minimization techniques for EM-induced faults University of Illinois