Using a Formal Specification and a Model Checker to Monitor and Guide Simulation: Verifying the Multiprocessing Hardware of the Alpha 21364 Microprocessor. Serdar Tasiran
Using a Formal Specification and a Model Checker to Monitor and Guide Simulation: Verifying the Multiprocessing Hardware of the Alpha 21364 Microprocessor • Serdar Tasiran, Koç University, Istanbul, Turkey (formerly Systems Research Center, Compaq/HP) • Yuan Yu (Microsoft Research, formerly Compaq) • Brannon Batson (Intel, formerly Compaq)
The Problem • Given • a formal specification: • An algorithm-level, executable description • an implementation: • Hardware described at RT level • Verify that • All executions of the implementation are consistent with the specification • An implementation verification problem • Design verification handled separately: • Verifying properties of the specification (absence of deadlock, …)
Earlier work: Validation Guided by Coverage [Diagram: inputs drive a simulator running the RT-level design; monitors and a reference model check the run; coverage analysis and a model checker guide further inputs]
Validation Guided by Formal Spec Coverage • Model checker: • Checks if spec is violated • Collects coverage data • Generates traces to coverage targets [Diagram: model checker, algorithm-level formal spec, abstraction map, RT-level design, inputs, simulator]
Alpha 21364 Multiprocessor System Block Diagram [Diagram: grid of processor nodes 0 to 11, each with attached memory (M) and I/O (IO)] • Distributed shared memory (seamless SMP) • Up to 256 processors, 32 GB per processor • Each processor “owns” portion of memory • Responsible for consistency of memory it owns: Cache coherence
Closer look: Chip Block Diagram [Diagram labels: EV7 core & cache system, data buffers, routing protocol engine (R), L2 cache controller (C), memory controllers (Z0, Z1) each attached to memory, IO]
EV7 Cache Coherence • Hardware implementation of multiprocessing engine: ~20K lines of HDL code [Diagram labels: EV7 core & cache system, data buffers, R, C, SVDB, FB0, FB1, Z0, Z1, mem, mem]
Why is the problem difficult? • Beyond the reach of automatic formal methods • Complex hardware, architecture: • Thousands of state variables per processor • Parallelism, several deep pipelines, speculation, redundancy • Complex system configuration: • Need several processors to exercise certain scenarios • Decomposition methods difficult for non-specialists, large design teams • Complete verification of hardware against protocol not practical • Simulation only viable approach • Even simulation is expensive • Must make judicious use of simulation resources
The Spec: EV7 Cache Coherence Protocol [Diagram: grid of processor nodes 0 to 11, each with attached memory (M) and I/O (IO)] • Distributed shared memory • Each address belongs to a “home node” • but may be in other caches • Directory-based protocol • Cache states: Modified (Dirty), Exclusive (Clean), Shared, Invalid • Directory states: Local, Shared, Exclusive, Incoherent • Directory distributed, stored in memory at each node with data • CPU requests that miss in local caches are sent to home node • Home node may forward request to other nodes • Directory In Flight Table (DIFT) keeps track of pending requests
Formal Specification of Protocol • Spec written in Temporal Logic of Actions [Leslie Lamport] • TLA: Formal language for writing high-level, executable specs of concurrent, reactive systems • Very expressive. Incorporates • first-order logic, set theory, temporal operators • sets, queues, records, tuples, … • Written by architects • Some help from verification researchers • Started from text documents at the same level of abstraction • Spec is a TLA formula, around 2000 lines, 60 pages
Formal Specification of Protocol • Architecture encapsulated in TLA spec at algorithm level • Spec state variables correspond to • Contents of major data structures • E.g. DIFT linked list of transactions per address • Messages in flight • Protocol transactions described by “TLA actions” [Figure: example spec state for address addr0, showing DIFT[addr0] entries with columns directory state, cache state, command, response; values shown include SharedtoDirty, {0,1,2}, Shared, Exclusive, Evicting, Invalid, S2DSuccess, S2DFailure]
[Figure: an annotated TLA action. Message diagram with nodes R, H, S1, S2, S3 and messages ReadMod, SharedInv (three times), BlkExclusiveCnt(3).] Parts of the action: action name, macro definitions, preconditions, messages sent, state variable updates. Annotations: • The request is a “Read Modify” • The block is in the “Shared” directory state • The caches and the victim buffer do not have a recent copy • Send “Invalidate” messages to the sharers • Tell the requestor how many “Invalidate Acknowledge” messages to wait for before modifying the line • Update the directory state • Free the memory controller state associated with this transaction
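The annotated action can be read as a guarded update: check the preconditions, then send messages and update state in one atomic step. The following Python sketch mirrors that shape for a ReadMod request to a block in the Shared directory state. It is not the TLA text from the slide. The message names come from the figure, but the `HomeEntry` fields, the new directory state after the update, and the exclusion of the requester from the invalidation targets are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    kind: str       # "SharedInv" or "BlkExclusiveCnt" (names taken from the figure)
    dest: int       # destination node id
    count: int = 0  # invalidate acks to wait for (BlkExclusiveCnt only)


@dataclass
class HomeEntry:
    # Hypothetical per-address state at the home node; field names are assumptions.
    dir_state: str = "Local"
    sharers: set = field(default_factory=set)
    owner: Optional[int] = None
    local_copy_recent: bool = False  # result of cache + victim buffer lookup
    mc_busy: bool = True             # memory controller state held by this transaction


def read_mod_on_shared(entry, requester):
    """Apply the action atomically; return the messages sent, or None if not enabled."""
    # Preconditions: Shared directory state, no recent local copy.
    if entry.dir_state != "Shared" or entry.local_copy_recent:
        return None
    # Messages sent: one invalidate per sharer, plus the ack count for the requester.
    targets = sorted(entry.sharers - {requester})
    msgs = [Message("SharedInv", node) for node in targets]
    msgs.append(Message("BlkExclusiveCnt", requester, count=len(targets)))
    # State variable updates (the resulting directory state is an assumption).
    entry.dir_state = "Exclusive"
    entry.owner = requester
    entry.sharers = set()
    entry.mc_busy = False  # free the memory controller state
    return msgs


entry = HomeEntry(dir_state="Shared", sharers={1, 2, 3})
sent = read_mod_on_shared(entry, requester=0)
```

With three sharers, the sketch sends three `SharedInv` messages and one `BlkExclusiveCnt` carrying a count of 3, which matches the message labels in the figure.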
The TLC Model Checker (Yu et al.) • Explicit-state model checker for TLA descriptions • Stores set of states reached during exploration • For large state spaces, can store a “signature” of a state instead • e.g. projection onto a subset of state variables (a “view”) • Can generate error trace to states violating correctness invariants
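To make the idea of explicit-state exploration with a “view” concrete, here is a minimal Python sketch. It is not TLC's implementation or API. The function `explore`, the toy transition system, and the invariant are all hypothetical. The sketch stores only the signature `view(s)` in the visited set and reconstructs an error trace when the invariant fails.

```python
from collections import deque


def explore(init_states, next_states, invariant, view=lambda s: s):
    """Breadth-first explicit-state search.

    Returns (trace, n) where trace is a path to an invariant violation
    (or None if no violation was reached) and n is the number of stored signatures.
    """
    seen = set()   # signatures of visited states
    parent = {}    # predecessor links, kept here only to rebuild the error trace
    queue = deque()
    for s in init_states:
        if view(s) not in seen:
            seen.add(view(s))
            parent[s] = None
            queue.append(s)
    while queue:
        s = queue.popleft()
        if not invariant(s):
            trace = []
            while s is not None:
                trace.append(s)
                s = parent[s]
            return trace[::-1], len(seen)
        for t in next_states(s):
            if view(t) not in seen:
                seen.add(view(t))
                parent[t] = s
                queue.append(t)
    return None, len(seen)


# Toy system (an assumption for illustration): a counter x and a toggle bit y.
def toy_next(state):
    x, y = state
    return [((x + 1) % 4, y), (x, 1 - y)]


def toy_invariant(state):
    return state[0] != 3


trace_full, n_full = explore([(0, 0)], toy_next, toy_invariant)
trace_view, n_view = explore([(0, 0)], toy_next, toy_invariant, view=lambda s: s[0])
```

Projecting onto `x` alone stores fewer signatures than storing full states, at the cost of treating states that differ only in `y` as already visited. That is the trade-off a view makes.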
Validation Guided by Formal Spec • Model checker: • Checks if spec is violated [Diagram: model checker, high-level formal spec, abstraction map, RT-level design, inputs, simulator]
Formal Spec as Simulation Monitor [Diagram: the abstraction mapping f_abs takes states in the implementation state space to states in the spec state space; for each mapped transition, the model checker (TLC) checks if the transition is legal]
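The monitoring scheme on this slide can be sketched in a few lines of Python: map each implementation state through the abstraction mapping and ask whether the resulting spec-level step is allowed. Everything named here (`monitor`, `F_ABS`, `SPEC_NEXT`, the three abstract states) is a hypothetical toy, not the EV7 map or spec. The sketch also assumes that implementation steps which leave the abstract state unchanged are accepted as stuttering steps.

```python
def monitor(impl_trace, f_abs, spec_next):
    """Return the index of the first implementation step whose image under f_abs
    is not a legal spec transition, or None if the whole trace is consistent."""
    prev = f_abs(impl_trace[0])
    for i, impl_state in enumerate(impl_trace[1:], start=1):
        cur = f_abs(impl_state)
        # Steps that do not change the abstract state are treated as stuttering.
        if cur != prev and cur not in spec_next(prev):
            return i
        prev = cur
    return None


# Toy abstraction map: several low-level FSM codes collapse to one abstract state.
F_ABS = {0: "Idle", 1: "Pending", 2: "Pending", 3: "Done"}

# Toy spec: the legal successors of each abstract state.
SPEC_NEXT = {"Idle": {"Pending"}, "Pending": {"Done"}, "Done": {"Idle"}}

good_run = [0, 1, 2, 3, 0]  # Idle, Pending, Pending, Done, Idle
bad_run = [0, 1, 2, 0]      # Pending back to Idle is not allowed by the toy spec

first_error_good = monitor(good_run, F_ABS.get, SPEC_NEXT.get)
first_error_bad = monitor(bad_run, F_ABS.get, SPEC_NEXT.get)
```

Because the check runs on every step, the discrepancy in `bad_run` is reported at the step where it occurs (index 3) rather than when some later symptom becomes visible.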
Formal Spec + Model Checker as Monitor: Benefits • Spec can be analyzed formally • Not true of some popular high-level description languages • More rigorous checking of each simulation run • Discrepancy from spec detected as soon as it occurs • Before it causes observable data corruption • Model checker + formal spec: • More modular: Specification and checking code separate • More reliable, easier to maintain than hand-written monitors • More sophisticated properties can be checked than automatically generated assertions
Formal Spec + Model Checker as Monitor • Price paid for benefit: Must write abstraction map • Unavoidable: Correctness checking code must reason at more abstract level • Either informally • Existing checking code constructs objects from signals • Or formally, as in our approach • When model checker signals error • Case 1: Error in implementation • Case 2: Error in mapping • Easy to distinguish which from the simulation run • In both cases quality of validation improved • Iterative scheme to debug map and implementation
Abstraction Map Issues • Protocol transactions appear atomic at spec level • In the implementation they happen • over many clock cycles, • interleaved with other transactions • In the hardware, a collection of lower level actions implement a protocol transaction. [Figure: timeline showing Transaction 1, Transaction 2 and Transaction 3 overlapping in time]
[Figure: a protocol transaction (preconditions, messages sent, updates to state) shown above a timeline of the low-level actions implementing it]
[Figure: timeline of three interleaved transactions, each marked with a commit point and a completion]
Delayed Aggregation of Actions [Figure: actions at the concrete level aggregated, along a time axis, into actions at the abstract level]
More implementation intricacies [Figure: timeline with labelled events: response written to DIFT; arbiter chooses this DIFT entry; response gets decoded; response arrives from Zbox middle end; memory read request sent to Zbox middle end]
Map = composition of two simpler maps [Diagram: three levels: spec level, intermediate level, implementation; example event label: “Response written to DIFT”]
Two-part recipe for map • Protocol events to protocol transactions: should be written by system architects • Hardware signal transitions to protocol events: should be written by component implementers
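The two-part recipe can be sketched as two small functions composed per clock cycle: a lower map from signal transitions to protocol events, and an upper map that aggregates events and emits a spec-level transaction once a commit event is seen. This Python sketch is illustrative only. The signal names (`resp_valid`, `dift_we`, and so on), the event names, and the choice of “response written to DIFT” as the commit event are assumptions, not details of the EV7 map.

```python
def signals_to_events(prev, cur):
    """Lower map (component implementers): detect rising edges on hypothetical
    signals between two consecutive cycles and emit protocol events."""
    events = []
    if cur["resp_valid"] and not prev["resp_valid"]:
        events.append(("ResponseArrived", cur["resp_id"]))
    if cur["dift_we"] and not prev["dift_we"]:
        events.append(("ResponseWrittenToDIFT", cur["dift_id"]))
    return events


class TransactionMap:
    """Upper map (system architects): collect events per transaction id and emit
    the transaction only when its commit event occurs (delayed aggregation)."""

    COMMIT = "ResponseWrittenToDIFT"  # assumed commit point

    def __init__(self):
        self.partial = {}

    def feed(self, event):
        name, txn_id = event
        self.partial.setdefault(txn_id, []).append(name)
        if name == self.COMMIT:
            return (txn_id, tuple(self.partial.pop(txn_id)))
        return None


def run_map(cycles):
    """Compose the two maps over a list of per-cycle signal snapshots."""
    upper = TransactionMap()
    committed = []
    for prev, cur in zip(cycles, cycles[1:]):
        for event in signals_to_events(prev, cur):
            txn = upper.feed(event)
            if txn is not None:
                committed.append(txn)
    return committed


cycles = [
    {"resp_valid": 0, "resp_id": None, "dift_we": 0, "dift_id": None},
    {"resp_valid": 1, "resp_id": 7, "dift_we": 0, "dift_id": None},
    {"resp_valid": 1, "resp_id": 7, "dift_we": 0, "dift_id": None},
    {"resp_valid": 0, "resp_id": None, "dift_we": 1, "dift_id": 7},
]
committed = run_map(cycles)
```

Keeping the two maps separate means a change to a signal name touches only `signals_to_events`, while a change to what counts as a transaction touches only `TransactionMap`.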
Advantages of mapping technique • Modular description • Clean division of responsibilities • Distinguishes hw block integration errors from hw block implementation errors • Easier to maintain the map • Updates to the two components independent • Portions of map re-usable for next generation of design • Mapping technique applicable to other hardware implementing a complex protocol
Validation Guided by Formal Spec Coverage • Model checker: • Collects coverage data • Generates traces to coverage targets [Diagram: model checker, high-level formal spec, abstraction map, RT-level design, inputs, simulator]
Formal Spec Coverage [Diagram: implementation state space mapped into the spec state space; the model checker stores visited spec states, leaving regions of unexplored spec states]
Model Checker Measures and Improves Coverage • Identify parts of spec not exercised by simulation • Path in spec state space = unexamined scenario • Very useful starting point for generating simulation inputs • Scripts convert protocol message sequences to RT-level inputs • Trial and error for getting the timing between messages right • Problem: Spec state space often too large [Diagram: spec state space with a coverage hole and a path to it generated by the model checker]
Coverage Metric on Formal Spec • Problem: Spec state space often too large • Solution: Record and target coverage of selected variables in TLA spec • Explore all combinations of • Memory controller FSM state • Result of cache + victim buffer lookup • Directory state • Type of message • Verification team had collected coverage data on prior simulations using this metric
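As a rough illustration of this metric, the Python sketch below projects each spec state onto the four selected variables, enumerates every combination as a coverage target, and reports the combinations never observed. The directory states are the ones listed in the protocol slide and the two message types appear elsewhere in the deck. The memory controller FSM values, the lookup results, and all function and key names are placeholders assumed for the example.

```python
from itertools import product

# Selected spec variables and their value sets.
DIMENSIONS = {
    "mc_fsm": ["Idle", "Busy"],                 # hypothetical FSM states
    "lookup": ["Hit", "Miss"],                  # hypothetical lookup results
    "dir_state": ["Local", "Shared", "Exclusive", "Incoherent"],
    "msg_type": ["ReadMod", "SharedtoDirty"],   # a subset, for brevity
}


def view(spec_state):
    """Project a spec state (a dict) onto the selected variables."""
    return tuple(spec_state[name] for name in DIMENSIONS)


def coverage_holes(observed_states, reachable=None):
    """Return the target combinations not yet observed.

    If `reachable` is given (for example, the set of views a model checker
    reports as reachable), unreachable combinations are not counted as holes.
    """
    targets = set(product(*DIMENSIONS.values()))
    if reachable is not None:
        targets &= set(reachable)
    covered = {view(s) for s in observed_states}
    return sorted(targets - covered)


observed = [
    {"mc_fsm": "Busy", "lookup": "Miss", "dir_state": "Shared",
     "msg_type": "ReadMod", "other_var": 42},
]
holes = coverage_holes(observed)
```

Each hole is a concrete combination of values that can be handed to the model checker as a target, which keeps the number of targets bounded even when the full spec state space is too large to enumerate.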
Demonstration of concept • Selected difficult bug from EV7 bug database • Discovered during prototype testing of 8-processor configuration. • Bug manifestation in the implementation: • Protocol state-machine hits deadlock state • Unexpected victim received at DIFT from victim buffer • No next state defined under this condition • Simulation doesn’t violate spec until deadlock state • Bug manifestation at spec level: • Assertion violation during model checking run • Model checking a 3-processor, 1-address configuration takes < 5 minutes, ~30 MB • Assertion says “no victim in the victim buffer at this state” • Assertion was part of original spec
Advantages of Formal Spec Coverage • Formal spec encapsulates design intent, important architectural features • Full coverage = All scenarios, important structures exercised • Model checker used to • Measure coverage, detect gaps • Generate spec-level traces to reach coverage holes • Spec at same level of abstraction as existing simulation coverage data • “Is this a real coverage gap or an unreachable scenario?” • Can be answered using model checker
Implementation Details • Abstraction map: ~12K lines of C++ code • Roughly the same size as other, “informal” checking code • No extra price for being formal • Compiled together with compiled-code simulator • Simulator has facility for extra modules being invoked at each cycle • Structure of map code much like composition of two combinational circuits, simulated in event-driven way • ~100% run-time overhead with rudimentary implementation • Efficiency was not a consideration • Model checker takes negligible time
Conclusions • Novel approach uses formal spec and model checker • to monitor simulation • to identify coverage gaps • to guide input generation • Many benefits to having all three be based on same formal spec • Abstraction map required • Provided recipe to make map construction practical • Found valuable by architects and verification engineers • EV8 design started with formal specification first!