Verification of Hierarchical Cache Coherence Protocols for Future Processors. Student: Xiaofang Chen Advisor: Ganesh Gopalakrishnan. Outline. Background Proposed solutions High level hierarchical coherence protocol verification
Verification of Hierarchical Cache Coherence Protocols for Future Processors • Student: Xiaofang Chen • Advisor: Ganesh Gopalakrishnan
Outline • Background • Proposed solutions • High level hierarchical coherence protocol verification • Refinement check: specifications vs. RTL implementations • Conclusion
Hierarchical Cache Coherence Protocols • Chip-level protocols • Intra-cluster protocols • Inter-cluster protocols [Diagram: clusters connected to directories (dir) and memories (mem)]
Modeling and Verification of Coherence Protocols • High-level modeling approaches • Model checking • Low-level modeling: RTL or VHDL • Simulation
Problems with Hierarchical Coherence Protocols • For high level modeling • Handling the complexity of hierarchical protocols • For RTL implementations • Verifying that an RTL implementation correctly implements the specification
Example: Verification Complexity (I) [Diagram: three clusters (Remote Cluster 1, Home Cluster, Remote Cluster 2), each with two L1 Caches, an L2 Cache+Local Dir, and a RAC, connected to a Global Dir and Main Mem]
Example: Verification Complexity (II) • Tool: Murphi • Verification • IA-64 machine • 18 GB memory • 40-bit hash compaction • Inconclusive after more than 30 hours of state enumeration
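The idea behind hash compaction in state enumeration can be sketched as follows. This is a minimal illustration, not Murphi's actual implementation: the helper names `compact` and `reachable_count`, the use of SHA-256, and the toy counter system are all assumptions made for the example. Only a 40-bit signature of each visited state is stored, which saves memory at the cost of a small chance that a collision hides an unvisited state.

```python
import hashlib
from collections import deque


def compact(state, bits=40):
    """Reduce a state to a `bits`-bit signature (hypothetical helper)."""
    digest = hashlib.sha256(repr(state).encode()).digest()
    return int.from_bytes(digest[:8], "big") >> (64 - bits)


def reachable_count(initial, successors, bits=40):
    """Breadth-first state enumeration that stores only compacted signatures."""
    seen = {compact(initial, bits)}
    queue = deque([initial])
    count = 1
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            sig = compact(nxt, bits)
            if sig not in seen:  # a signature collision would wrongly skip nxt
                seen.add(sig)
                queue.append(nxt)
                count += 1
    return count


# Toy system (an assumption for illustration): a counter modulo 1000
# that can step by 1 or by 7.
def toy_successors(n):
    return [(n + 1) % 1000, (n + 7) % 1000]


total = reachable_count(0, toy_successors)
```

The visited set holds 40-bit integers rather than full states, which is why the memory bound on the slide is stated together with the compaction width.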
Differences in Modeling: Specs vs. Impls • One step in high-level (step 1) • Multiple steps in low-level (steps 1.1 to 1.5) [Diagram: client, buf, home, local cache]
Differences in Execution: Specs vs. Impls • Interleaving in HL: steps 1, 2, 3 • Concurrency in LL: steps 1.1 to 1.3, 2.1 to 2.2, 3.1 to 3.3
Proposed Mechanisms • For high level modeling, develop • A few M-CMP coherence protocols • A compositional approach • For specifications vs. implementations, develop • A formal theory • A compositional approach • A practical tool
Outline • Background • Proposed solutions • High level hierarchical coherence protocol verification • Refinement check: specifications vs. RTL implementations • Conclusion
An M-CMP Benchmark Protocol [Diagram: three clusters (Remote Cluster 1, Home Cluster, Remote Cluster 2), each with two L1 Caches, an L2 Cache+Local Dir, and a RAC (intra-cluster level), connected to a Global Dir and Main Mem (inter-cluster level)]
Protocol Features • Both levels use MESI protocols • Intra-cluster: FLASH • Inter-cluster: DASH • Silent drop on non-Modified cache lines • Network channels are non-FIFO • Inclusive caches
Another Benchmark: Non-inclusive Caches [Diagram: three clusters (Remote Cluster 1, Home Cluster, Remote Cluster 2), each with two L1 Caches, an L2 Cache+Local Dir, and a RAC, connected to a Global Dir and Main Mem]
Our Compositional Approach Original protocol
One Way to Decompose Protocols • Create three abstract protocols • Each with 1 detailed cluster + 2 abstracted clusters
Abstract Protocol #1 [Diagram: Home Cluster kept in detail (two L1 Caches, L2 Cache+Local Dir, RAC); Remote Cluster 1 and Remote Cluster 2 each abstracted to L2 Cache+Local Dir’ and RAC; Global Dir; Main Mem]
Abstract Protocol #2 [Diagram: Remote Cluster 1 kept in detail (two L1 Caches, L2 Cache+Local Dir, RAC); Home Cluster and Remote Cluster 2 each abstracted to L2 Cache+Local Dir’ and RAC; Global Dir; Main Mem]
Problems with This Approach • Every abstract protocol contains 2 protocols • Duplicated behaviors in abstract protocols • State space still large
Model    # of states    Time (hours)    Mem (GB)
M1       284,088,425    12              18
M2       636,613,051    18              18
Second Way to Decompose Protocols [Diagram: ABS #1: Remote Cluster 1 intra-cluster protocol (two L1 Caches, L2 Cache+Local Dir); ABS #2: Home Cluster intra-cluster protocol (two L1 Caches, L2 Cache+Local Dir); ABS #3: inter-cluster protocol with Remote Cluster 1, Home Cluster, and Remote Cluster 2 each abstracted to L2 Cache+Local Dir’ and RAC, plus Global Dir and Main Mem]
Details of Our Approach • Abstraction • States • Transitions, properties • Constraining • Assume guarantee reasoning
Abstraction on States Intra-cluster Inter-cluster
State Representation [Diagram: an original cluster (two L1 Caches, L2 Cache+Local Dir, RAC) shown beside its abstract clusters; state components are labeled RAC, L2, L1s, Network, Local Dir where the cluster is kept in detail, and L2, Local Dir’, RAC for the abstracted cluster (L2 Cache+Local Dir’, RAC)]
Abstracting Transitions and Properties • Rule: guard → action • guard • Becomes more permissive • action • Allows more behaviors
An Example of Abstraction
• Abstract intra-cluster protocol (L1 Caches, L2 Cache+Local Dir, RAC), rule for a WB message:
  Clusters[c].WbMsg.Cmd = WB → Clusters[c].L2.Data := Clusters[c].WbMsg.Data; Clusters[c].L2.HeadPtr := L2; …
• Abstract inter-cluster protocol (L2 Cache+Local Dir’, RAC), the same rule after abstraction:
  True → Clusters[c].L2.Data := nondet; …
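The abstraction step above can be sketched in a few lines. This is an illustrative model only, assuming a tiny data domain and reducing the rule to its effect on `L2.Data`; the function names are hypothetical. The check confirms that every behavior of the concrete rule is also a behavior of the abstracted rule, which is what "more permissive guard, more behaviors in the action" amounts to.

```python
from itertools import product

DATA_VALUES = (0, 1)       # assumed tiny data domain, for illustration only
CMDS = ("NONE", "WB")      # "NONE" is a hypothetical stand-in for "no WB message"


def concrete_step(l2_data, wb_cmd, wb_data):
    """Concrete rule: guard WbMsg.Cmd = WB, action L2.Data := WbMsg.Data."""
    if wb_cmd == "WB":
        return {wb_data}
    return set()           # rule disabled: no successor values


def abstract_step(l2_data):
    """Abstracted rule: guard True, action L2.Data := nondet."""
    return set(DATA_VALUES)


def abstraction_is_sound():
    """Every concrete successor value must also be an abstract successor value."""
    for l2_data, wb_cmd, wb_data in product(DATA_VALUES, CMDS, DATA_VALUES):
        if not concrete_step(l2_data, wb_cmd, wb_data) <= abstract_step(l2_data):
            return False
    return True


sound = abstraction_is_sound()
```

Because the abstract rule no longer reads the WB message, the L1 caches and the intra-cluster network can be dropped from the abstracted cluster's state.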
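An Example of Constraining
• Abstracted rule (L2 Cache+Local Dir’, RAC), with its guard constrained:
  True & Clusters[c].L2.State = Excl → Clusters[c].L2.Data := nondet; …
• Property to establish on the detailed cluster (L1 Caches, L2 Cache+Local Dir, RAC) for a WB message:
  Clusters[c].WbMsg.Cmd = WB ⇒ Clusters[c].L2.State = Excl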
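One way to read the constraining step is as a pair of obligations: the constraint is assumed in the abstracted rule's guard and must be guaranteed by the detailed cluster. The sketch below illustrates that reading; the dictionary encoding of states, the state name "Shrd", and the sample state list are assumptions made for the example and are not taken from the benchmark.

```python
def concrete_guard(state):
    """Guard of the detailed rule: a WB message is pending."""
    return state["wb_cmd"] == "WB"


def constraint(state):
    """Constraint added to the abstract guard: L2.State = Excl."""
    return state["l2_state"] == "Excl"


def abstract_guard(state):
    """Constrained abstract guard: True & L2.State = Excl."""
    return True and constraint(state)


def guarantee_holds(states):
    """Obligation on the detailed model: WbMsg.Cmd = WB implies L2.State = Excl."""
    return all(constraint(s) for s in states if concrete_guard(s))


# Hypothetical sample of states from the detailed cluster.
detailed_states = [
    {"wb_cmd": "WB", "l2_state": "Excl"},
    {"wb_cmd": "NONE", "l2_state": "Shrd"},
    {"wb_cmd": "NONE", "l2_state": "Excl"},
]
holds = guarantee_holds(detailed_states)
```

If the obligation fails on the detailed model, the constraint cannot be used to prune the abstract model and has to be weakened or dropped.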
Non-inclusive Protocols: History Variables [Diagram: intra-cluster protocols for Remote Cluster 1 and Home Cluster (each with two L1 Caches and an L2 Cache+Local Dir), and an inter-cluster protocol with Remote Cluster 1, Home Cluster, and Remote Cluster 2 each abstracted to L2 Cache+Local Dir’ and RAC, plus Global Dir and Main Mem]
Outline • Background • Proposed solutions • High level hierarchical coherence protocol verification • Refinement check: specifications vs. RTL implementations • Conclusion
Our Approach • Use a hardware language • Hardware Murphi • Develop a formal theory of refinement check • Develop a compositional approach • Abstraction • Assume guarantee • Develop a practical tool
Hardware Murphi • Murphi extension by S. German and G. Janssen • A concurrent shared variable language • On each cycle • Multiple transitions execute concurrently • Exclusive write to a variable • Shared reads to variables • A write is immediately visible within the same transition • A write is visible to other transitions on the next cycle • Supports transactions, signals, etc.
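The cycle semantics listed above can be approximated with a small simulator. This is a sketch under stated assumptions, not Hardware Murphi itself: `run_cycle` and the `(name, guard, action)` encoding are hypothetical, and each action returns its writes as a dictionary. All enabled transitions read the start-of-cycle snapshot, their writes land in the next state, and two transitions writing the same variable in one cycle is treated as an error.

```python
def run_cycle(state, transitions):
    """Run one cycle: every enabled transition fires against the same snapshot."""
    next_state = dict(state)
    written_by = {}
    for name, guard, action in transitions:
        if not guard(state):
            continue
        # The action reads the start-of-cycle snapshot, so writes made by
        # other transitions in this cycle are not visible to it.
        for var, value in action(state).items():
            if var in written_by:
                raise ValueError(
                    f"write-write conflict on {var!r}: {written_by[var]} and {name}"
                )
            written_by[var] = name
            next_state[var] = value
    return next_state


# Two concurrent transitions that swap a and b in a single cycle.
swap = [
    ("t1", lambda s: True, lambda s: {"a": s["b"]}),
    ("t2", lambda s: True, lambda s: {"b": s["a"]}),
]
after = run_cycle({"a": 1, "b": 2}, swap)
```

Under an interleaving semantics the same two rules would not swap the values, which is one reason specification and implementation models cannot be compared step by step.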
Transaction • Groups multiple steps in the impl: Transaction Rule-1 … Rule-6 … End; [Diagram: six steps, numbered 1 to 6, grouped into one transaction]
Workflow of Our Refinement Check • Inputs: Murphi spec model and Hardware Murphi impl model • Product model in Hardware Murphi • Muv: product model in VHDL • Property check • Goal: check that the low-level model correctly implements the high-level model
Full List of Assertions for Refinement Check • Serializability for specifications • No write-write conflicts • Initial states containment • Write set variables containment • Enabledness for specifications • Joint variables match at the end of transactions
An Example
Impl transaction:
  Transaction
    Rule-1 guard1 → action1;
    Rule-2 guard2 → action2;
    Rule-3 guard3 → action3;
  End;
Spec rule:
  Rule spec_guard → spec_action;
An Example (Cont’d)
  Transaction
    Rule-1 guard1 → action1; assert spec_guard; spec_action;
    Rule-2 guard2 → action2;
    Rule-3 guard3 → action3;
  End;
  assert impl_var1 = spec_var1;
  assert impl_var2 = spec_var2;
  …
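The product construction in this example can be mimicked with a short sketch. Everything concrete here is an assumption for illustration: the variables `inp`, `buf`, `data`, and `req`, the three implementation steps, and the `lose_message` flag that injects a bug. The structure follows the slide: the spec rule is checked for enabledness and executed alongside Rule-1, and the joint variables are compared at the end of the transaction.

```python
JOINT_VARS = ("data", "req")   # hypothetical joint variables of spec and impl


def spec_guard(spec):
    return spec["req"] == 1


def spec_action(spec):
    # One atomic high-level step.
    spec["data"] = spec["inp"]
    spec["req"] = 0


def run_transaction(impl, spec, lose_message=False):
    """Run the impl transaction together with the spec rule and check refinement."""
    # Rule-1: latch the input; the spec rule fires here.
    impl["buf"] = impl["inp"]
    assert spec_guard(spec), "spec rule not enabled"
    spec_action(spec)
    # Rule-2: move the buffered value into place (skipped when a bug is injected).
    if not lose_message:
        impl["data"] = impl["buf"]
    # Rule-3: clear the request.
    impl["req"] = 0
    # End of transaction: joint variables must match.
    for var in JOINT_VARS:
        assert impl[var] == spec[var], f"mismatch on {var}"
    return True


def fresh():
    impl = {"inp": 7, "buf": 0, "data": 0, "req": 1}
    spec = {"inp": 7, "data": 0, "req": 1}
    return impl, spec


ok = run_transaction(*fresh())
```

Calling `run_transaction(*fresh(), lose_message=True)` trips the joint-variable assertion, loosely analogous to an implementation unit that loses a message.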
Driving Benchmark [Diagram: two nodes, each with Dir, Cache, Mem, Local Buf, Home Buf, and Remote Buf, connected through a Router] Source: S. German and G. Janssen, IBM Research Tech Report 2006
Bugs Found with Refinement Check • Benchmark satisfies cache coherence already • Bugs still found • Bug 1: router unit loses messages • Bug 2: home unit replies twice for one request • Bug 3: cache unit gets updated twice from one reply • Refinement check is an automatic way of constructing checks
Model Checking Approaches • Monolithic • Straightforward property check • Compositional • Divide and conquer [Diagram: the product model in VHDL checked either monolithically or compositionally]
Compositional Refinement Check • Reduce the verification complexity • Basic Techniques • Abstraction • Removing details to make verification easier • Assume guarantee • A simple form of induction which introduces assumptions and justifies them
In More Detail • Abstraction • Change variables to free input variables • E.g. change a latch to a free input signal • Assume guarantee • For the reads of a transaction, assume that (spec.Var = impl.Var) holds
Experimental Results • Configurations • 2 nodes, 2 addresses, SixthSense [Chart: verification time (axis marks: 30 min, 1 day) versus datapath width (1-bit, 10-bit) for the monolithic and compositional approaches]
Outline • Background • Proposed solutions • High level hierarchical coherence protocol verification • Refinement check: specifications vs. RTL implementations • Conclusion
Related Work • Parameterized verification • Chou et al. • Bluespec • Arvind et al. • Aggregation of distributed actions • Park and Dill • Compositional verification • Many previous works including McMillan, Jones, etc.