Architectural Complexity: Opening the Black Box. Methods for Exposing Internal Functionality of Complex Single and Multiple Processor Systems. EECC-756
Modern Design Trends • Larger on-chip caches • Extended levels of cache • System-on-a-chip integration • Overall increasing design complexity • All of these make designs more complex to debug
The Good News • Automated design tools are minimizing design errors • IP reuse minimizes bugs • Simulation tools discover most logic errors before fabrication • Massive test suites allow comprehensive testing • So what happened with Intel's FPU flaw?
Past Methods for Debugging • Signal probing • Bus monitoring • Software debugging
Past Methods for Debugging (cont’d) • Signal probing: more internal logic per pin means less information on each pin; pins are inaccessible in modern packages (e.g., sockets, BGAs) • Bus monitoring: caches hide data accesses • Software debugging: impractical for real-time applications; little or no hardware support in the past
Solutions • Test Access Port (TAP): uses the JTAG (IEEE 1149.1) specification for boundary scan • Probe Mode: allows step-by-step analysis of the impact of code on internal registers • In-Circuit Emulation (ICE): allows execution tracing; applicable in real time
Test Access Port (TAP) • Implementation of the JTAG (IEEE 1149.1) boundary-scan specification • Allows access to all internal flip-flops in a boundary scan chain • Numerous chains serve different functions (e.g., I/O flip-flops) • Allows a non-destructive snapshot of internal state at any point in time
Test Access Port (cont’d) • Single instruction register • Multiple data registers (scan chains)
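The "single instruction register, multiple data registers" arrangement can be sketched in a few lines of Python. The transition table follows the 16-state TAP controller defined by IEEE 1149.1. The `ToyTap` class, its opcode names (`BOUNDARY`, `INTERNAL`), and the chain contents are hypothetical and only illustrate how the instruction register selects which scan chain sits between TDI and TDO. This is a minimal sketch, not a model of any particular processor.

```python
# TAP controller transitions (IEEE 1149.1): state -> (next if TMS=0, next if TMS=1)
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}


def walk(state, tms_bits):
    """Advance the TAP controller one state per TMS bit (one TCK each)."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


class ToyTap:
    """Hypothetical TAP: one instruction register selecting among scan chains."""

    def __init__(self, chains):
        self.chains = chains  # assumed mapping: opcode name -> list of bits
        self.ir = None

    def load_instruction(self, opcode):
        if opcode not in self.chains:
            raise KeyError("unknown opcode: %r" % (opcode,))
        self.ir = opcode

    def scan(self, bits_in):
        """Shift bits_in through the selected chain; return the bits shifted out."""
        chain = self.chains[self.ir]
        out = []
        for bit in bits_in:
            out.append(chain[-1])        # bit nearest TDO leaves first
            chain[:] = [bit] + chain[:-1]  # new bit enters at the TDI end
        return out


tap = ToyTap({"BOUNDARY": [1, 0, 1], "INTERNAL": [0, 0, 1, 1]})
tap.load_instruction("BOUNDARY")
snapshot = tap.scan([0, 0, 0])  # read the chain out
tap.scan(snapshot)              # shift the same bits back in to restore it
```

Shifting the captured bits back in is one simple way to picture a non-destructive snapshot: the chain ends up holding the values it started with.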
Probe Mode • Special processor mode that halts program execution • Uses the TAP interface to receive instructions and output internal data • Allows read/write access to any internal register • Allows memory accesses to test cache functionality
In-Circuit Emulation (ICE) Support • Special pins provide branching information • Example: Pentium dual pipeline, with 3 dedicated pins: IU – asserted when an instruction completes in the U instruction pipeline; IV – asserted when an instruction completes in the V instruction pipeline; IBT – (Instruction Branch Taken) asserted when a branch is taken
In-Circuit Emulation (cont’d) • Branch signal information provides real-time code tracing • Branch trace message buffers provide further information • Branch trace message buffers in conjunction with Probe Mode allow detailed real-time code tracing
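As a rough illustration of what an external tool can recover from these three pins, the sketch below takes one `(IU, IV, IBT)` sample per clock and splits the instruction stream into runs that end at a taken branch. The function name `block_lengths` and the sample data are hypothetical. The model is also simplified: it counts a taken branch as the last instruction completed in its clock and ignores any further timing rules in the actual pin specification.

```python
def block_lengths(samples):
    """Split a pin trace into instruction runs ending at taken branches.

    samples: iterable of (iu, iv, ibt) tuples, one per clock, each value 0 or 1.
    Returns (lengths, tail): instruction counts of the runs that ended in a
    taken branch, plus the count of instructions seen since the last one.
    """
    lengths = []
    current = 0
    for iu, iv, ibt in samples:
        current += iu + iv  # instructions completed this clock (0, 1, or 2)
        if ibt:             # taken branch closes the current run
            lengths.append(current)
            current = 0
    return lengths, current


# Made-up trace: six clocks of pin samples.
trace = [(1, 1, 0), (1, 0, 0), (1, 1, 1), (0, 0, 0), (1, 0, 1), (1, 1, 0)]
runs, tail = block_lengths(trace)
print(runs, tail)  # [5, 1] 2
```

Matching these run lengths against a disassembly of the program is how a trace tool could follow execution without seeing any addresses on the bus.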
Branch Trace Message Buffers • FIFO queue • Can be read through the TAP during program execution • Circular mode (trace-back from breakpoint) vs. Jump-to-Probe Mode (maintain instruction stream) • Incident counter expands buffer size • Intel processors automatically generate a special BTM cycle on the local bus to export BTM information
Multiprocessor Issues • Three methods for opening the “black box” on a single-processor system: TAP (boundary scan); Probe Mode; branch tracing methods for ICE • Multiple-processor system design also has challenges
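The two buffer modes can be contrasted with a small Python model. Everything here is an assumption made for illustration: the class name `BranchTraceBuffer`, the `(source, target)` entry format, and the reading of Jump-to-Probe Mode as "stop when the buffer fills so a debugger can drain it and no branch is lost". The incident counter is not modeled.

```python
from collections import deque


class BranchTraceBuffer:
    """Toy FIFO of (source, target) branch records with two overflow policies."""

    def __init__(self, depth, circular=True):
        self.depth = depth
        self.circular = circular
        self.entries = deque()
        self.halted = False  # stands in for "processor has entered Probe Mode"

    def record(self, source, target):
        if self.halted:
            raise RuntimeError("buffer full: drain it before resuming")
        if self.circular and len(self.entries) == self.depth:
            self.entries.popleft()  # circular mode: overwrite the oldest record
        self.entries.append((source, target))
        if not self.circular and len(self.entries) == self.depth:
            self.halted = True      # assumed jump-to-probe behavior: stop when full

    def drain(self):
        """Read out and clear the buffer, as a debugger might through the TAP."""
        out = list(self.entries)
        self.entries.clear()
        self.halted = False
        return out


# Circular mode keeps only the most recent branches (trace-back from a breakpoint).
ring = BranchTraceBuffer(depth=3, circular=True)
for i in range(5):
    ring.record(0x1000 + i, 0x2000 + i)
print(ring.drain())  # the last three records only
```

The trade-off is the one named on the slide: circular mode never stops the program but only keeps the recent past, while the halting policy preserves the whole stream at the cost of interrupting execution.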
Multiprocessor Challenges • Race conditions due to parallel data accesses • Inconsistent and unpredictable network paths • Differing processor behaviors on heterogeneous networks • Communication patterns that restrict performance or scalability
Multiprocessor Solutions: Debugging Code • Create sequential version of code • Execute parallel tasks on a single computer as separate processes • Visualization tools that create space-time diagrams or animations to show 2-dimensional changes of state • Unified Trace Environment (IBM)
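To make the idea of a space-time diagram concrete, here is a minimal text-only sketch: one row per process, one column per event time. The function name `space_time_diagram`, the event tuple layout, and the `S`/`R` symbols for send and receive are all hypothetical. This is not the output format of any of the tools named above.

```python
def space_time_diagram(events, num_procs):
    """Render trace events as text rows: processes down, time across.

    events: list of (time, process_index, symbol) tuples, symbol a single char.
    Returns one string per process; '.' marks a time with no event.
    """
    times = sorted({t for t, _, _ in events})
    column = {t: i for i, t in enumerate(times)}
    rows = [["."] * len(times) for _ in range(num_procs)]
    for t, proc, symbol in events:
        rows[proc][column[t]] = symbol
    return ["P%d %s" % (p, "".join(row)) for p, row in enumerate(rows)]


# Made-up trace: P0 sends, P1 receives and replies, P0 receives the reply.
events = [(0, 0, "S"), (1, 1, "R"), (2, 1, "S"), (3, 0, "R")]
for line in space_time_diagram(events, num_procs=2):
    print(line)
# P0 S..R
# P1 .RS.
```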
Multiprocessor Solutions: Debugging Designs • Ability to monitor communication packets circumvents most visibility problems • Debug messages can be included in packets • Network protocol simulations • Protocol verification programs (e.g., Petri nets) • Network communication pattern simulators • However ...
Multiprocessor Design Trends • Currently, uniprocessor designs are hitting roadblocks: large dies make signal transit times impractical; routing increases exponentially with die size • One possible solution: multiple processors on a single die • Result: re-emergence of visibility problems
Conclusion • Several methods are available for internal execution tracing of uniprocessors: Test Access Port (JTAG IEEE 1149.1); Probe Mode extension; branch tracing • Don’t count out TAP, Probe Mode, and ICE for multiprocessors