Faults and Uncertainty – Do we need a Totally New Approach? Lou Scheffer
Many causes of faults and uncertainty
• The obvious DSM (deep submicron) physics, plus
• Is the spec solid, and is it what the market wants?
• Did I implement it correctly (bugs)?
• Manufacturing uncertainty
• How much will this chip cost?
• How many mask iterations to an acceptable chip?
• What’s my yield?
• Even if it works, will the completed chip be reliable?
What’s the cause of this uncertainty?
• Time-to-market concerns
• Specs/needs open to change (DVD-RW, DVD+RW, DVD-R, etc.)
• Logic verification not as complete as desired
• Physics of DSM devices and manufacturing
• Every gate is slightly different (environment and mfg)
• Some gates may not work at all (yield)
• Some gates may work, but slowly (delay faults)
• Some gates may fail occasionally (SEU – Single Event Upset)
What can synthesis/logic do about spec uncertainty?
• Allow for uncertain logic (specs and bugs)
• FPGAs for functions likely to change
• Easy switch (with cost estimation) from hardware to FPGA to software and back
• Plan for minimal mask ECOs
• Why? Some estimates are $10M/mask set at 50 nm
• Requires a specialized incremental re-synthesis
• Also requires physical tool changes: filler cells with transistors, specialized router options, etc.
• This is a subset of the general “DSM masks are expensive” problem
What can synthesis/logic do about process variability?
• Adopt statistical timing
• Accurately account for uncertainty
• Reduce unnecessary pessimism
• Estimate yield effects of timing
• Unlikely (in my opinion) to result in substantially different logic implementations
• Asynchronous design
• Very hard for a number of reasons, including designers’ mental models
• Would require huge synthesis changes if accepted
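The pessimism that statistical timing removes can be illustrated with a small sketch. It assumes, purely for illustration, that gate delays are independent Gaussians; real variation is partly correlated, and the path and its numbers below are hypothetical. The function names are not from any tool.

```python
import math

def corner_delay(gates, k=3.0):
    """Worst-case corner: every gate is simultaneously k sigma slow."""
    return sum(mean + k * sigma for mean, sigma in gates)

def statistical_delay(gates):
    """Mean and sigma of the path delay, ASSUMING independent Gaussian gates."""
    mean = sum(m for m, _ in gates)
    sigma = math.sqrt(sum(s * s for _, s in gates))
    return mean, sigma

def timing_yield(gates, clock_period):
    """Estimated fraction of parts whose path delay fits in the clock period."""
    mean, sigma = statistical_delay(gates)
    z = (clock_period - mean) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical path: 10 gates, each 100 ps mean delay with 10 ps sigma.
path = [(100.0, 10.0)] * 10
path_mean, path_sigma = statistical_delay(path)

print("corner delay (ps):           ", corner_delay(path))
print("statistical 3-sigma delay (ps):", path_mean + 3.0 * path_sigma)
print("timing yield at 1050 ps:     ", timing_yield(path, 1050.0))
```

Under these assumptions the corner analysis budgets 1300 ps for the path, while the statistical 3-sigma bound is a little under 1100 ps, because it is unlikely that all ten gates are slow at once.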
What can synthesis/logic do about variability?
• Determine actual performance at manufacturing
• Non-trivial since each gate/path may vary in timing
• Only possible for some errors: hold errors spell doom, but setup errors just yield slower parts
• Built-in self-test that can run at speed
• Scan logic that can test for delay faults
• May need modifications to test/scan insertion
• Treat these as first-class paths
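Why hold errors are fatal while setup errors only cost speed can be seen in a minimal sketch. It uses the textbook single-path checks and ignores clock skew (an assumption); the bin periods and delays are hypothetical, in picoseconds.

```python
def min_clock_period(max_path_delay, t_setup):
    """Setup check: the slowest path plus setup time must fit in one period."""
    return max_path_delay + t_setup

def hold_ok(min_path_delay, t_hold):
    """Hold check: the fastest path must not beat the hold time.
    Note that the clock period does not appear here at all."""
    return min_path_delay >= t_hold

def speed_bin(max_path_delay, min_path_delay, t_setup, t_hold, bins):
    """Return the shortest bin period the part meets, or None if no bin fits."""
    if not hold_ok(min_path_delay, t_hold):
        return None  # slowing the clock cannot repair a hold violation
    needed = min_clock_period(max_path_delay, t_setup)
    usable = [period for period in bins if period >= needed]
    return min(usable) if usable else None

bins = [800, 1000, 1250]  # hypothetical clock periods offered for sale

# Slow but functional part: sold in a slower bin.
print(speed_bin(900, 60, t_setup=50, t_hold=40, bins=bins))
# Fast part with a hold violation: unusable at any clock period.
print(speed_bin(700, 20, t_setup=50, t_hold=40, bins=bins))
```

The at-speed self-test and delay-fault scan mentioned above are what would supply the measured path delays in practice; here they are simply passed in as numbers.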
What can synthesis/logic do about faults?
• Reconfiguration/repair at manufacturing
• Laser or fuse reconfiguration (currently used only for memories)
• Self-test reconfiguration at power-up
• Synthesis could help support this
• Designer indicates which units need redundancy
• Synthesis tool does the legwork/bookkeeping
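The bookkeeping behind self-test reconfiguration might look roughly like the sketch below. It assumes a block of identical units with a few spares appended after them, and a self-test that reports the indices of failing units; the function names and layout are hypothetical, not taken from any tool.

```python
def build_repair_map(failing_units, num_spares):
    """Assign each failing unit a spare; return None if there are too few spares."""
    failing = sorted(set(failing_units))
    if len(failing) > num_spares:
        return None  # not repairable: the part is a yield loss
    return {unit: spare for spare, unit in enumerate(failing)}

def resolve(unit, repair_map, num_units):
    """Map a logical unit index to a physical one.
    Spares are assumed to sit at indices num_units, num_units + 1, ..."""
    if unit in repair_map:
        return num_units + repair_map[unit]
    return unit

# Hypothetical block: 8 units plus 2 spares; self-test reports units 5 and 2 bad.
repair = build_repair_map([5, 2, 5], num_spares=2)
print(repair)
print([resolve(u, repair, num_units=8) for u in range(8)])
```

With laser or fuse repair the map would be fixed once at manufacturing; with power-up self-test it would be rebuilt at every start.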
What can synthesis/logic do about faults?
• Transient error handling (Single Event Upset)
• Considerable experience in aerospace
• Synthesis could do a lot here, if needed
• Triplicated state elements
• Error-correcting codes (now used for memories, can be extended to logic)
• Algorithm-based fault tolerance (for high-level primitives)
• Check-and-restart pipelines (mostly CPUs)
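As an illustration of triplicated state elements, here is a minimal software model of a register stored three times and read through a bitwise majority voter. The class and method names are hypothetical; a synthesis tool would emit the equivalent flip-flops and voter gates.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 vote across three copies of a value."""
    return (a & b) | (a & c) | (b & c)

class TmrRegister:
    """A register held in three copies; reads go through the voter."""

    def __init__(self, value=0):
        self.copies = [value, value, value]

    def write(self, value):
        self.copies = [value, value, value]

    def read(self):
        return majority(*self.copies)

    def inject_upset(self, copy_index, bit):
        """Model an SEU by flipping one bit in one copy."""
        self.copies[copy_index] ^= 1 << bit

    def scrub(self):
        """Rewrite all copies with the voted value so upsets do not accumulate."""
        self.write(self.read())

reg = TmrRegister(0b1011)
reg.inject_upset(copy_index=1, bit=2)   # one copy now holds 0b1111
print(bin(reg.read()))                  # voter still returns 0b1011
reg.scrub()
```

The voter masks any single upset per bit position. Two upsets in the same bit of different copies would defeat it, which is why the model includes a scrub step.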
So, do we need a totally new approach?
• In the short term, no
• FPGA synthesis
• Minimalist masks and ECOs
• Reconfiguration at manufacturing/startup time
• At-speed on-chip testing
• SEU handling
• Statistical timing
• All are more or less incremental improvements
Do we need a totally new approach?
• In the long term, the answer is still no
• The limit to reducing uncertainty is the spec
• And the limit to this is human understanding
• This is exactly the programming problem
• Programming languages are the best known way to specify the intended behavior of large systems
• So until we find a way to do programming better, synthesis tools will look more or less like compilers, as they do today