Mutation Analysis with Coverage Discounting
Peter Lisherness, Nicole Lesperance, and Kwang-Ting (Tim) Cheng
University of California – Santa Barbara
Motivation
• Functional Coverage
  + Functionally meaningful
  + Evaluates design activation
  - Ignores propagation and checker quality
• Mutation Analysis
  + Evaluates propagation and checker quality
  - Mutants hard to analyze
• Coverage Discounting
  + Functionally meaningful
  + Evaluates design activation, propagation and checkers
Coverage Discounting: The Main Idea
• Use fault insertion to discount (revise downward) coverage scores
[Figure: coverage score on a 0% to 100% scale; labels: Functional Coverage, Coverage Discounting, Discounted Coverage Score, Mutation Analysis (detected + undetected mutants), Undetected Mutants]
An Existing Solution: Mutation Analysis
• Benefits
  • Evaluates quality of testbench checkers
  • Indicates if tests fail to propagate potential errors to the checker or fail to activate mutants
[Figure: flow diagram; labels in order of appearance: Add/Fix Tests, Detected, DUT, Checker, Fix, Undetected]
Mutation Analysis Drawbacks
• Long runtime for simulation
• Many man-hours required to analyze results:
  • Mutants are synthetic: it is difficult to identify testbench improvements that target mutant detection
  • Redundant mutants are never detectable: some undetected faults are redundant
• Example: the two versions below behave identically, because both branches yield b = 1 when a = 0
    if (a >= 0) b = 1 + a; else b = 1 - a;
    if (a > 0) b = 1 + a; else b = 1 - a;
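The redundancy in this example can be checked by brute force. The sketch below is only an illustration in Python, not part of the original flow: the function names `original` and `mutant` and the range bound of 1000 are assumptions chosen for the demonstration.

```python
def original(a: int) -> int:
    # Version with the condition a >= 0
    return 1 + a if a >= 0 else 1 - a


def mutant(a: int) -> int:
    # Version with the condition a > 0
    return 1 + a if a > 0 else 1 - a


# Compare both versions over a bounded range (the bound is an arbitrary assumption).
differences = [a for a in range(-1000, 1001) if original(a) != mutant(a)]
print(differences)  # []
```

The only input where the two conditions disagree is a = 0, and there both branches compute 1, so no test can ever distinguish the two versions.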
Premise of Coverage Discounting
1) A mutant may change design functionality
2) A coverpoint that the testbench covers in the original design may be suppressed (i.e. no longer covered) in the mutated design
3) If the mutant suppressing the coverpoint is not detected, then the coverpoint is no longer considered covered in the original design
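The three premises can be expressed as a small set computation. This is a minimal sketch under stated assumptions: the `MutantRun` record, the `discounted_coverage` function, and the coverpoint names are hypothetical and are not taken from any tool named in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MutantRun:
    name: str
    detected: bool      # did any checker flag this mutant?
    covered: frozenset  # coverpoints hit while the mutant was inserted


def discounted_coverage(original_covered, mutant_runs):
    """Return (still_covered, discounted) sets of coverpoints."""
    discounted = set()
    for run in mutant_runs:
        if run.detected:
            continue  # a detected mutant does not discount anything
        # Coverpoints covered originally but suppressed by this undetected mutant
        discounted |= set(original_covered) - run.covered
    return set(original_covered) - discounted, discounted


original = {"BURST_MODE", "SINGLE_MODE", "RESET"}
runs = [
    MutantRun("m1", detected=False, covered=frozenset({"SINGLE_MODE", "RESET"})),
    MutantRun("m2", detected=True, covered=frozenset({"RESET"})),
]
kept, discounted = discounted_coverage(original, runs)
print(sorted(kept))        # ['RESET', 'SINGLE_MODE']
print(sorted(discounted))  # ['BURST_MODE']
```

Mutant m2 suppresses two coverpoints but is detected, so only the undetected mutant m1 causes a discount.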
Coverage Discounting Flow
[Figure: flow diagram; labels: Tests, DUT, Checker, Detected, Undetected, Coverage Changed? (Yes / No), Add more tests, Fix]
An Example
Original Design:
    if (remaining_length >= 5) begin
        burst_en = true;
        length_next := remaining_length - 4
    end else begin
        burst_en = false;
        length_next := remaining_length - 1
    end
    ...
    BURST_MODE : coverpoint burst_en;
• Number of packets sent in burst mode increased from 4 to 5 in the condition, but designers forgot to increase packet number everywhere else
An Example
Mutated Design:
    if (false) begin
        burst_en = true;
        length_next := remaining_length - 4
    end else begin
        burst_en = false;
        length_next := remaining_length - 1
    end
    ...
    BURST_MODE : coverpoint burst_en;
* The mutant goes undetected because the checkers are only sensitive to bus content and ordering, not timing
An Example
• Since the coverpoint BURST_MODE is suppressed by the mutant and the mutant is undetected, BURST_MODE is discounted
• It is now clear that the checker must check for timing in order to adequately cover burst mode functionality
• If timing is checked, the real bug is caught
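The effect of the mutant on the checkers can be reproduced with a toy model. Everything in this sketch is an assumption made for illustration: the `send` function, the one-cycle-per-iteration bus model, the packet numbering, and the timing budget of 5 cycles are hypothetical and do not come from the design in the slides. It models only the mutant, not the real bug.

```python
def send(length, burst_condition):
    """Hypothetical bus model: returns (packets, cycles, burst_en_seen)."""
    packets, cycles, burst_seen = [], 0, False
    remaining, next_id = length, 0
    while remaining > 0:
        burst_en = burst_condition(remaining)
        burst_seen = burst_seen or burst_en  # models the BURST_MODE coverpoint
        count = 4 if burst_en else 1         # packets sent in this cycle
        packets.extend(range(next_id, next_id + count))
        next_id += count
        remaining -= count
        cycles += 1
    return packets, cycles, burst_seen


def content_checker(run, length):
    # Sensitive to bus content and ordering only
    return run[0] == list(range(length))


def timing_checker(run, max_cycles):
    # Additionally sensitive to how long the transfer took
    return run[1] <= max_cycles


orig = send(8, lambda r: r >= 5)  # original condition
mut = send(8, lambda r: False)    # mutated condition: if (false)

print(orig[1], mut[1])                                    # 5 8
print(content_checker(orig, 8), content_checker(mut, 8))  # True True
print(orig[2], mut[2])                                    # True False
print(timing_checker(mut, max_cycles=5))                  # False
```

The content-only checker passes both runs, so the mutant is undetected while the burst coverpoint is suppressed. That is exactly the situation in which BURST_MODE is discounted. Once a timing check is added, the mutant run fails.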
Experimental Results
OpenRISC CPU
• Experiment #1: Measure test quality
  • Simulation Infrastructure:
    • Functional simulator (ISS)
    • Ad-hoc fault insertion
  • Tests: Instruction-Set vs. Random
  • Coverpoints: Opcodes
• Experiment #2: Identify checker weaknesses
  • Simulation Infrastructure:
    • SoC full-chip RTL simulation
    • Certitude fault insertion
  • Tests: SoC Regression Suite
  • Coverpoints: CPU top-level signals
Measure Test Quality
[Figure: original vs. discounted coverage; panels: Functional Tests, Random Tests]
UART
• Illustrates success of coverage discounting applied to a sophisticated testbench
• Simulation Infrastructure:
  • Design RTL w/ OVM Testbench
  • SystemVerilog coverpoints (ModelSim)
  • Certitude fault insertion
• Tests: OVM Sequences
• Coverpoints: Hand-written, spec-based
UART Results
• Reduces debug effort from analysis of 146 mutants to examining 3 functional coverpoints
• 1588 Mutants:
  • 7 not activated
  • 106 not propagated
  • 33 not detected
  • Total: 146 mutants demand attention
• 846 Coverpoints:
  • 4 uncovered
  • 3 discounted
  • All discounted coverpoints relate to specific unchecked functions
    • Loopback, timeout interrupt identification register
Results Examined
• Coverage discounting:
  • Identified "good" tests (propagation)
  • Diagnosed checker problems
  • Identified coverpoints vacuously covered
• However:
  • Runtime of mutation analysis is still an issue
  • Technique is sensitive to the quantity and quality of faults
Confidence Metric
• Aims to answer:
  • Is a coverpoint adequately challenged by the set of mutants?
  • When have enough faults been inserted (when can simulation be stopped)?
  • What is the optimal simulation ordering of mutants for coverage discounting?
Detection Confidence (DECO) Score
• Discounting relies on coverpoints being suppressed by faults
• Point confidence: the number of times a coverpoint is suppressed
• DECO(n): the percentage of coverpoints with point confidence greater than n
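The two definitions above translate directly into a short computation. This is a sketch under assumptions: the function names `point_confidence` and `deco`, the input layout (one set of suppressed coverpoints per mutant), and the sample data are hypothetical.

```python
def point_confidence(coverpoints, suppressions):
    """Count how many times each coverpoint is suppressed.

    suppressions: one set per simulated mutant, holding the coverpoints
    that mutant suppressed.
    """
    counts = {cp: 0 for cp in coverpoints}
    for suppressed in suppressions:
        for cp in suppressed:
            if cp in counts:
                counts[cp] += 1
    return counts


def deco(confidence, n):
    """Percentage of coverpoints with point confidence greater than n."""
    if not confidence:
        return 0.0
    above = sum(1 for c in confidence.values() if c > n)
    return 100.0 * above / len(confidence)


coverpoints = ["A", "B", "C", "D"]
suppressions = [{"A", "B"}, {"A"}, {"A", "C"}]
conf = point_confidence(coverpoints, suppressions)  # A: 3, B: 1, C: 1, D: 0
print(deco(conf, 0))  # 75.0
print(deco(conf, 1))  # 25.0
```

In this sample, coverpoint D is never suppressed, so no mutant has challenged it and DECO(0) stays below 100%.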
DECO-Directed Ordering
• Optimize test/mutant simulation order to increase DECO score
• Test Selection: Choose the test covering the most low-confidence points
• Mutant Selection: Select the mutant activated by the fewest tests
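The two selection rules can be sketched as greedy picks. The slides do not say how "low confidence" is thresholded, how ties are broken, or how the two steps are combined, so the `threshold` parameter, the alphabetical tie-break, and the choice to restrict candidates to mutants activated by the chosen test are all assumptions of this sketch. The names are hypothetical.

```python
def select_test(test_coverage, confidence, threshold):
    """Pick the test covering the most coverpoints with confidence <= threshold."""
    def low_conf_hits(test):
        return sum(
            1 for cp in test_coverage[test] if confidence.get(cp, 0) <= threshold
        )
    # sorted() makes tie-breaking deterministic (alphabetical, an assumption)
    return max(sorted(test_coverage), key=low_conf_hits)


def select_mutant(mutant_activation, candidates):
    """Pick the candidate mutant activated by the fewest tests."""
    return min(sorted(candidates), key=lambda m: len(mutant_activation[m]))


test_coverage = {"t1": {"A", "B"}, "t2": {"B", "C", "D"}, "t3": {"A"}}
confidence = {"A": 3, "B": 1, "C": 0, "D": 0}
mutant_activation = {"m1": {"t1", "t2"}, "m2": {"t2"}, "m3": {"t1", "t2", "t3"}}

test = select_test(test_coverage, confidence, threshold=1)
# Assumption: only consider mutants that the chosen test activates
candidates = [m for m, tests in mutant_activation.items() if test in tests]
mutant = select_mutant(mutant_activation, candidates)
print(test, mutant)  # t2 m2
```

After each simulated test/mutant pair, the point confidences would be updated and the selection repeated.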
Quicker Confidence
[Figure: panels for CPU and UART]
Quicker Discounting
[Figure: panels for CPU and UART]
Summary – DECO
• Provides feedback earlier
• Improves confidence (both over time and at termination)
• Enables early termination
Summary
• Analyzed information gained from coverage discounting for two designs
• Developed a confidence metric to gauge mutant effectiveness and an ordering heuristic to reduce runtime