Dynamic Verification of Sequential Consistency 2nd Year Project Progress Report Albert Meixner
Introduction • Memory Consistency Model • Specifies the behavior of the memory system • Sequential Consistency • Verifying Memory Consistency = Verifying Correctness of the Memory System • End-to-end approach • Detects failures in all components • "...the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program." Leslie Lamport
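To make Lamport's definition concrete, the sketch below brute-forces the question "does some sequential order explain this execution?" for a tiny trace. It is only an illustration of the definition, not the DVSC mechanism; the function name, the `("ST" | "LD", address, value)` tuple format and the initial memory value of 0 are assumptions made for this example.

```python
def is_sequentially_consistent(programs, initial=0):
    """Return True if some interleaving that respects each CPU's program
    order lets every load observe the most recent store to its address.

    programs: one list per CPU of ("ST", addr, value_written) or
              ("LD", addr, value_observed) tuples (hypothetical format).
    """
    def search(positions, memory):
        # Every CPU has finished its program: a legal total order exists.
        if all(pos == len(prog) for pos, prog in zip(positions, programs)):
            return True
        for cpu, prog in enumerate(programs):
            pos = positions[cpu]
            if pos == len(prog):
                continue
            kind, addr, value = prog[pos]
            # A load can only be placed here if memory holds its value.
            if kind == "LD" and memory.get(addr, initial) != value:
                continue
            next_memory = memory if kind == "LD" else {**memory, addr: value}
            next_positions = positions[:cpu] + (pos + 1,) + positions[cpu + 1:]
            if search(next_positions, next_memory):
                return True
        return False

    return search((0,) * len(programs), {})


# Both CPUs store, then load the other's variable and see the old value:
# no sequential order can produce this outcome.
forbidden = [[("ST", "x", 1), ("LD", "y", 0)],
             [("ST", "y", 1), ("LD", "x", 0)]]
allowed = [[("ST", "x", 1), ("LD", "y", 0)],
           [("ST", "y", 1), ("LD", "x", 1)]]
print(is_sequentially_consistent(forbidden), is_sequentially_consistent(allowed))
```

The search is exponential in trace length, which is one way to see why a hardware verifier checks cheaper conditions instead of searching for an order.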
Outline • Introduction • DVSC-Direct • DVSC-Indirect • Results • Relaxed Consistency Models
DVSC-Direct • Construct total order of memory accesses • All CPUs send an Inform message for every load and store operation to Verifier • Verify that total order is consistent • Replay accesses in logical time order • Verify that every load received value from most recent store • Optimizations • Compress Informs, eliminate Informs that can be locally verified • Verification Memory Cache for replay storage
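The replay step can be sketched as follows. This is a minimal model under stated assumptions, not the hardware design: an Inform is represented as a hypothetical `(logical_time, cpu, kind, addr, value)` tuple, ties in logical time are broken by CPU id, unwritten memory reads as 0, and a plain dictionary stands in for the Verification Memory Cache.

```python
def replay_informs(informs, initial=0):
    """Replay Informs in logical-time order and return every load whose
    value differs from the most recent store to the same address."""
    verification_memory = {}   # stand-in for the VMC: addr -> last stored value
    violations = []
    # Assumed tie-break: equal logical times are ordered by CPU id.
    for inform in sorted(informs, key=lambda i: (i[0], i[1])):
        _, _, kind, addr, value = inform
        if kind == "ST":
            verification_memory[addr] = value
        elif verification_memory.get(addr, initial) != value:
            violations.append(inform)
    return violations


informs = [
    (3, 1, "LD", 0x40, 7),   # stale: the store at time 2 wrote 9
    (1, 0, "ST", 0x40, 7),
    (2, 0, "ST", 0x40, 9),
    (4, 2, "LD", 0x40, 9),   # correct
]
print(replay_informs(informs))
```

Informs may arrive out of order, which is why the sketch sorts before replaying; the slides assign that reordering role to a buffer in front of the verifier.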
DVSC-Direct [Diagram: three nodes, each showing CPU, Cache, Inform Log, Memory, VWB, Inform Verifier and VMC]
DVSC-Direct Summary • Fully End-To-End • Only requires a logical timebase • Excessive Bandwidth requirements • About 400% increase in bandwidth usage • Probabilistic • About 5% of memory accesses not verified
DVSC-Indirect Idea • Verify conditions known to be sufficient for Sequential Consistency • A load bound to transaction T receives the value of • The most recent store bound to T • The initial value of the block received in response to T • Exclusive epochs for block B do not overlap other epochs for block B • Every load/store operation is contained in an epoch and bound to the transaction that started the epoch • Stores are contained in exclusive epochs • Each word w of a block B received at the beginning of an epoch equals the most recent store to w • Plakal et al.
DVSC-Indirect • Epoch: Time interval between receiving and losing permissions on a block • Exclusive or Shared • Send information about every epoch to History Verifier • Verify • Exclusive epochs do not overlap any other epoch • Block data is propagated correctly • Cache accesses are contained in appropriate epochs • DIVA • Verify memory access order equivalent to program order
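The first check, that exclusive epochs do not overlap any other epoch, can be sketched for a single block as below. The `Epoch` record, its field names and the half-open `[begin, end)` interval convention are assumptions made for this example.

```python
from typing import NamedTuple


class Epoch(NamedTuple):
    cpu: int
    exclusive: bool   # True for an exclusive epoch, False for a shared one
    begin: int        # logical time at which permission was received
    end: int          # logical time at which permission was lost


def find_overlaps(epochs):
    """Return pairs of overlapping epochs in which at least one is exclusive.
    Shared epochs are allowed to overlap each other."""
    ordered = sorted(epochs, key=lambda e: e.begin)
    conflicts = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.begin >= first.end:
                break   # sorted by begin: no later epoch can overlap `first`
            if first.exclusive or second.exclusive:
                conflicts.append((first, second))
    return conflicts


epochs = [
    Epoch(cpu=1, exclusive=False, begin=1, end=5),
    Epoch(cpu=2, exclusive=False, begin=2, end=6),   # shared/shared overlap is fine
    Epoch(cpu=3, exclusive=True, begin=4, end=8),    # overlaps both shared epochs
]
for first, second in find_overlaps(epochs):
    print(f"CPU {first.cpu} and CPU {second.cpu} overlap")
```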
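DVSC-Indirect Epochs [Timeline figure: Shared and Exclusive epochs of one block on CPU 1, CPU 2 and CPU 3, started by GETS and GETX requests and annotated with block values (12, 14, 12→14, 14→7); one GETS is marked "Consistency Violation"]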
DVSC-Indirect [Diagram: three nodes, each showing CPU, DIVA, Cache, CET, Memory, VWB, History Verifier and MET]
DVSC-Indirect Summary • Coherence Protocol dependent • Bandwidth overhead is less than 25% • Covers all memory operations • No modification to CPU
Simulation Results • Implemented DVSC-Indirect in Ruby • MOSI-Directory • MOSI-Snooping • Simulated DVSC using full-system simulation of an 8-way SPARC SMP • Out-Of-Order Processor • Detailed memory system simulation • 2 GB RAM, 4-way 32KB I+D L1, 4-way 1MB L2 • Safety-Net
Relaxed Consistency Models • Expand DVSC-Indirect to other models • Total Store Order, Weak Consistency • Idea • Use 3 different orderings • p Program Order • c Cache Order • g Global Order • Use On-Chip DVSC-Direct to verify that re-orderings between p and c are legal • Use DVSC-Indirect to verify that there are no re-orderings between c and g
DVRC [Diagram: three nodes, each showing CPU, VMC, Order Verifier, CET, Cache, Memory, VWB, History Verifier and MET]
DVRC – Implementation CPU • CPU modifications • ID: assign sequence number to memory ops • MEM: remember results of load operations • COMMIT: replay memory op in VMC • Store Buffer • Needs EDC • Stores cannot leave before they are committed • VMC • Must contain all addresses with stores executed in pipe but not in cache (i.e. stores in the store buffer)
DVRC – Reorder Check • Store the highest sequence number seen for 3 operations • Membar (seqMB), Store (seqST) and Load (seqLD) • For every operation reaching the cache, perform reorder check • LD: seq > seqST (performed STs precede LD in p) and seq > seqLD • ST: seq > seqST and seq > seqMB • MB: seq > seqLD • For ST check seq < seq of last committed ST
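The three comparisons on this slide can be sketched as a small checker. This is an illustration under assumptions: sequence numbers are plain non-wrapping integers, the class and method names are hypothetical, and the additional comparison of a store against the last committed store is not modelled.

```python
class ReorderChecker:
    """Tracks the highest sequence number seen per operation type and
    applies the per-type comparisons when an operation reaches the cache."""

    # For each arriving operation type: the counters it must exceed.
    MUST_FOLLOW = {
        "LD": ("ST", "LD"),   # seq > seqST and seq > seqLD
        "ST": ("ST", "MB"),   # seq > seqST and seq > seqMB
        "MB": ("LD",),        # seq > seqLD
    }

    def __init__(self):
        self.highest = {"LD": -1, "ST": -1, "MB": -1}

    def check(self, kind, seq):
        """Return True if the operation passes the reorder check."""
        ok = all(seq > self.highest[other] for other in self.MUST_FOLLOW[kind])
        self.highest[kind] = max(self.highest[kind], seq)
        return ok


checker = ReorderChecker()
results = [
    checker.check("ST", 1),   # first store
    checker.check("LD", 3),   # load with a higher sequence number than any store seen
    checker.check("ST", 2),   # older store performs after the younger load: passes
    checker.check("LD", 0),   # load older than an already performed store: fails
]
print(results)
```

Changing the `MUST_FOLLOW` table is where a different consistency model would plug in, which matches the claim that the scheme adapts by adjusting the reorder check.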
DVRC Summary • If it works…. • Works for all relaxed consistency models by adjusting reorder check • Same bandwidth overhead as DVSC-Indirect • Little additional hardware • Methodology • Strict argument that it verifies TSO • Implement in Ruby/Opal for simulation
DVRC and Wisconsin TSO • Wisconsin TSO store operations • STpriv: Store in MEM • STpublic: Store when it reaches cache • Conditions • X and Y are LD or STpriv: X <p Y iff X <g Y • If either is a ST: guaranteed by VMC • If both are LD: guaranteed by Order Verifier • For every ST: STpriv <g STpublic • Guaranteed by reorder check
Wisconsin TSO Conditions • X, Y are ST: X <p Y → X <g Y • X <p Y → X <c Y enforced by reorder check • X <c Y → X <g Y enforced by block history verification • ST <p MB <p LD → ST <g MB <g LD • ST <p MB <p LD → ST <c MB <c LD by reorder check • ST <c MB <c LD → ST <g MB <g LD by history verification • Value returned by LD equals • Most recent STpriv, if ST <p LD and LD <g STpublic • Enforced by VMC check • Most recent STpublic • Enforced by correct ordering of stores (cond. 3), EDC on cache and store buffer, epoch ordering and epoch checksum
Epoch Messages • Epoch messages can appear in 3 formats • Shared Epoch: Type = Shared, Open = false, Begin Time, Begin CRC, End Time • Exclusive Epoch: Type = Exclusive, Open = false, Begin Time, Begin CRC, End Time • Open Epoch: Type = Ex/Shared, Open = true, Begin Time, Begin CRC
CET Implementation • One epoch entry per cache line • Epoch start • Epoch checksum • Exclusive/Shared bit • Data Ready bit • Timeout FIFO to avoid wrap-around • After 2tsbits-1 send open epoch • Operation in Snooping • Observe own GETX/GETS start epoch • Observe other GETX/GETS send epoch
MET Implementation • Order incoming epochs by start time • 256-entry reorder buffer (VWB) • One history entry for every block in a CPU cache • Latest end of exclusive/shared epoch • Checksum after latest exclusive epoch • CPUs with open shared epochs • CPU with open exclusive epoch • Check all incoming epochs • Begin checksum matches MET checksum • Shared: No open exclusive, starts after last exclusive • Exclusive: No open epochs, starts after last shared or exclusive
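The per-block history entry and the checks on incoming epochs can be sketched as below. Several details are assumptions made for this example rather than facts from the slides: the record and field names, the `end_crc` field (the slides list only a begin CRC in the message, but the MET needs some source for the checksum after an exclusive epoch), treating "starts after" as `begin >= end`, and ignoring the later closing of an open epoch. Epochs are assumed to arrive already sorted by start time.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class EpochMessage:
    cpu: int
    exclusive: bool
    begin: int
    begin_crc: int
    end: Optional[int] = None       # None models an open-epoch message
    end_crc: Optional[int] = None   # ASSUMPTION: not listed in the slides


@dataclass
class BlockHistory:
    checksum: int                        # checksum after latest exclusive epoch
    last_shared_end: int = 0
    last_exclusive_end: int = 0
    open_shared: Set[int] = field(default_factory=set)
    open_exclusive: Optional[int] = None


def check_epoch(history: BlockHistory, epoch: EpochMessage) -> List[str]:
    """Apply the MET checks to one incoming epoch, then update the history."""
    errors = []
    if epoch.begin_crc != history.checksum:
        errors.append("begin checksum mismatch")
    if history.open_exclusive is not None:
        errors.append("overlaps an open exclusive epoch")
    if epoch.begin < history.last_exclusive_end:
        errors.append("starts before last exclusive epoch ended")
    if epoch.exclusive:
        if history.open_shared:
            errors.append("overlaps an open shared epoch")
        if epoch.begin < history.last_shared_end:
            errors.append("starts before last shared epoch ended")

    if epoch.end is None:                # open epoch: remember who holds it
        if epoch.exclusive:
            history.open_exclusive = epoch.cpu
        else:
            history.open_shared.add(epoch.cpu)
    elif epoch.exclusive:
        history.last_exclusive_end = max(history.last_exclusive_end, epoch.end)
        history.checksum = epoch.end_crc
    else:
        history.last_shared_end = max(history.last_shared_end, epoch.end)
    return errors


history = BlockHistory(checksum=0xAB)
r1 = check_epoch(history, EpochMessage(0, False, begin=1, begin_crc=0xAB, end=5))
r2 = check_epoch(history, EpochMessage(1, True, begin=3, begin_crc=0xAB, end=9, end_crc=0xCD))
r3 = check_epoch(history, EpochMessage(2, False, begin=10, begin_crc=0xAB, end=12))
print(r1, r2, r3)
```

In the usage above, the second epoch is flagged because it begins before the shared epoch ended, and the third because its begin checksum no longer matches the value left by the exclusive epoch.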