BackSpace: Formal Analysis for Post-Silicon Debug • Flavio M. de Paula*, Marcel Gort*, Alan J. Hu*, Steve Wilton*, Jin Yang+ • * University of British Columbia • + Intel Corporation
Outline • Motivation • Current Practices • BackSpace – The Intuition • Proof-of-Concept Experimental Results • (Recent Experiments) • Conclusions and Future Work
Motivation • Chip is back from fab! • Chips with manufacturing defects have been screened out • A bring-up procedure follows: • Diagnostics run without problems; everything looks fine! • But the system becomes unresponsive while running the real application… • Every single chip fails in the same way (1M DPM: functional bugs) • What do we do now?
Current Practices • Apply the inputs, then scan out the buggy state • But the cause is not obvious! • Alternative: guess when to stop, then single-step and scan out • Problems: • Single-stepping interference • Non-determinism • Too early/late to stop? • [Diagram labels: Inputs; Scan-out; Non-buggy path]
Current Practices • Leveraging additional debugging support: • Trace buffer of the internal state • Provides only a narrow view of the design, e.g., program counter, address/data fetches • Record all I/O and replay • Solves the non-determinism problem, but… • Requires highly specialized bring-up systems • Just having additional hardware does NOT solve the problem
A Better Solution: BackSpace • Goal: • Avoid guesswork • Avoid interfering with the system • Run at speed • Portable debug support • Compute an accurate trace to the bug
A Better Solution: BackSpace • Requires: • Hardware: • Existing test infrastructure and scan-chains; • Breakpoint circuit; • Good signature scheme; • Software: • Efficient SAT solver; • BackSpace Manager
A Better Solution: BackSpace • 1. Run at speed on the inputs until the chip hits the buggy state • [Diagram labels: Inputs; Non-buggy path]
A Better Solution: BackSpace • 2. Scan out the buggy state and the history of signatures
A Better Solution: BackSpace • Off-chip formal analysis: the formal engine computes the pre-image of the scanned-out state
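To make the pre-image step concrete, here is a minimal sketch on a toy 4-bit circuit. Everything in it is an assumption for illustration: the `next_state` logic, the 1-bit `parity` signature, and the brute-force enumeration, which stands in for the SAT solver that the slides list as a requirement. It only shows what "pre-image" means and how a recorded signature can shrink the candidate set.

```python
from itertools import product

N_BITS = 4                      # toy state width (hypothetical)
MASK = (1 << N_BITS) - 1

def next_state(state: int, inp: int) -> int:
    """Hypothetical next-state logic; a real flow would use the design's netlist."""
    return ((state * 3) ^ (state >> 1) ^ inp) & MASK

def parity(state: int) -> int:
    """Hypothetical 1-bit signature: XOR of all state bits."""
    return bin(state).count("1") & 1

def pre_image(target, sig_fn=None, sig=None):
    """All states with at least one input that leads to `target` in one cycle.

    If a signature function and a recorded signature are supplied, keep only
    the predecessors whose signature matches the recorded one.
    """
    preds = set()
    for state, inp in product(range(1 << N_BITS), (0, 1)):
        if next_state(state, inp) != target:
            continue
        if sig_fn is None or sig_fn(state) == sig:
            preds.add(state)
    return sorted(preds)

# Usage: state 5 with input 1 leads to the "crash" state.
crash = next_state(5, 1)
unconstrained = pre_image(crash)                    # every possible predecessor
constrained = pre_image(crash, parity, parity(5))   # filtered by the signature
print(crash, unconstrained, constrained)
```

Exhaustive enumeration is only workable for a handful of bits; for designs with hundreds or thousands of latches, the same question would be posed to a SAT solver as a constraint over the transition relation.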
A Better Solution: BackSpace • Pick a candidate state from the pre-image and load it into the breakpoint circuit
A Better Solution: BackSpace • Run until the chip hits the breakpoint
A Better Solution: BackSpace • Pick another state
A Better Solution: BackSpace • Run until the chip hits the breakpoint
A Better Solution: BackSpace • Computed trace of length 2
A Better Solution: BackSpace • Iterate (formal engine)
A Better Solution: BackSpace • Result: the BackSpace trace
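The walkthrough above can be sketched end to end on a simulated toy chip. This is an illustration under stated assumptions, not the authors' implementation: the 6-bit `next_state` logic, the 2-bit `signature`, and the `run_chip` helper are hypothetical; the pre-image is found by brute force rather than by a SAT solver; only one previous signature is kept instead of a full history; and the run is assumed to be exactly repeatable.

```python
N_BITS = 6                      # toy state width (hypothetical)
MASK = (1 << N_BITS) - 1

def next_state(state, inp):
    """Hypothetical next-state logic standing in for the chip."""
    return ((state * 5 + 3) ^ (state >> 2) ^ inp) & MASK

def signature(state):
    """Hypothetical 2-bit signature: XOR-fold of the state bits."""
    return (state ^ (state >> 2) ^ (state >> 4)) & 0b11

def run_chip(inputs, breakpoint_state=None, init=0):
    """Simulated at-speed run. Stops at the breakpoint if one is loaded.

    Returns (state, signature of the previous state or None, breakpoint hit).
    """
    state, prev_sig = init, None
    if state == breakpoint_state:
        return state, prev_sig, True
    for inp in inputs:
        prev_sig = signature(state)
        state = next_state(state, inp)
        if state == breakpoint_state:
            return state, prev_sig, True
    return state, prev_sig, False

def pre_image(target, sig):
    """Brute-force predecessors of `target` whose signature equals `sig`."""
    return [s for s in range(1 << N_BITS)
            if signature(s) == sig
            and any(next_state(s, i) == target for i in (0, 1))]

def backspace(inputs, max_steps=10):
    """Extend a trace backwards from the crash state, one cycle per iteration."""
    crash, prev_sig, _ = run_chip(inputs)           # step 1 and 2: run, scan out
    trace = [crash]
    for _ in range(max_steps):
        if prev_sig is None:                        # reached the initial state
            break
        for cand in pre_image(trace[0], prev_sig):  # off-chip analysis
            _, cand_prev_sig, hit = run_chip(inputs, breakpoint_state=cand)
            if hit:                                 # candidate is reachable
                trace.insert(0, cand)
                prev_sig = cand_prev_sig
                break
        else:
            break                                   # no candidate was reached
    return trace

inputs = [1, 0, 1, 1, 0, 0, 1, 0]
trace = backspace(inputs)
print(trace)
```

The `max_steps` bound plays the role of the cycle limit used in the experiments; it also guards the toy against walking around a loop in the state graph forever.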
Outline • Motivation • Current Practices • BackSpace – The Intuition • Proof-of-Concept Experimental Results • Recent Experiments • Future Work
Proof-of-Concept Experimental Results • Intended setup: Chip on Silicon + BackSpace Manager + SAT Solver • Proof-of-concept setup: Logic Simulator (in place of the chip) + BackSpace Manager + SAT Solver
Proof-of-Concept Experimental Results • Setup: • OpenCores’ designs: • 68HC05: 109 latches • oc8051: 702 latches • Run real applications
Proof-of-Concept Experimental Results • Can we find a signature that reduces the size of the pre-image? • Experiment: • Select 10 arbitrary ‘crash’ states on 68HC05; • Try different signatures
Proof-of-Concept Experimental Results • How far can we go back? • Experiment: • Select arbitrary ‘crash’ states: • 10 each for 68HC05 and oc8051; • Set a limit of 500 cycles of BackSpacing; • Set a limit of 300 states on the size of the pre-image; • Compare the two best types of signature: • Hand-picked • Universal Hashing of the entire state
Proof-of-Concept Experimental Results • Results • Signature: Universal Hashing • Small size of pre-images • All 20 cases successfully BackSpaced to the limit
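The slides do not say which universal hash family was used. As one possible reading, the sketch below uses the standard random-binary-matrix family over GF(2): each signature bit is the XOR of a random subset of the state bits. The function name `make_signature_fn`, the 16-bit signature width, and the seed are assumptions for illustration; the 109-bit state width matches the 68HC05 latch count from the setup slide.

```python
import random

def make_signature_fn(state_bits: int, sig_bits: int, seed: int = 0):
    """Build a signature function from a random sig_bits x state_bits 0/1 matrix.

    Each matrix row is stored as an integer bit mask over the state bits.
    """
    rng = random.Random(seed)
    rows = [rng.getrandbits(state_bits) for _ in range(sig_bits)]

    def signature(state: int) -> int:
        sig = 0
        for row in rows:
            # Signature bit = parity (XOR) of the state bits selected by the row.
            bit = bin(state & row).count("1") & 1
            sig = (sig << 1) | bit
        return sig

    return signature

# Usage: a 16-bit signature over a 109-latch state.
sig = make_signature_fn(state_bits=109, sig_bits=16, seed=1)
state_a = (1 << 108) | 0b1011
state_b = 0x5A5A5A
print(sig(state_a), sig(state_b))
```

In hardware, each signature bit of this family is an XOR tree over roughly half of the latches, which may help explain why a direct implementation is expensive in area.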
Proof-of-Concept Experimental Results • Breakpoint Circuitry • 40-50% area overhead • Signature Computation • A naïve implementation of Universal Hashing results in 150% area overhead
Recent Experiments • OpenRisc 1200: • 32-bit RISC processor; • Harvard micro-architecture; • 5-stage integer pipeline; • Virtual memory support; • Total of 3k+ latches • BackSpace implemented in HW/SW • AMIRIX AP1000 FPGA board (provided by CMC) • Board mimics bring-up systems • Host-PC: off-chip formal analysis
Recent Experiments • BackSpacing OpenRisc 1200: • Running a simple software application • BackSpaced for hundreds of cycles • Demonstrated robustness in the presence of non-determinism
Conclusions & Future Work • Introduced BackSpace: a new paradigm for post-silicon debug • Demonstrated it works • Main challenges: • Find hardware-friendly & SAT-friendly signatures • Minimize breakpoint circuitry overhead