
Boosting Verification by Automatic Tuning of Decision Procedures


Presentation Transcript


1. Boosting Verification by Automatic Tuning of Decision Procedures
   Domagoj Babić, joint work with Frank Hutter, Holger H. Hoos, Alan J. Hu
   University of British Columbia

2. Decision procedures
   formula → decision procedure → SAT (solution) / UNSAT
   • Core technology for formal reasoning
   • Trend towards fully automated verification
   • Scalability is problematic
   • Better (more scalable) decision procedures are needed
   • Possible direction: application-specific tuning
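A minimal sketch of the contract diagrammed above: a formula goes in, and the answer is either UNSAT or SAT together with a satisfying assignment. The toy brute-force search below is only an illustration of the interface; it is not Spear's actual API or algorithm.

```python
# Toy decision procedure for CNF formulas: clauses are lists of signed variable
# indices; the result is a satisfying assignment (SAT) or None (UNSAT).
from itertools import product
from typing import Dict, List, Optional

def decide(clauses: List[List[int]], num_vars: int) -> Optional[Dict[int, bool]]:
    """Return a satisfying assignment if one exists, otherwise None (UNSAT)."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {v + 1: bits[v] for v in range(num_vars)}
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return assignment          # SAT, with a concrete solution
    return None                        # UNSAT

# (x1 or x2) and (not x1 or x2) is satisfiable, e.g. with x2 = True
print(decide([[1, 2], [-1, 2]], num_vars=2))
```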

3. Outline • Problem definition • Manual tuning • Automatic tuning • Experimental results • Found parameter sets • Future work

4. Performance of Decision Procedures
   • Heuristics
   • Learning (avoiding repeated, redundant work)
   • Algorithms

5. Heuristics and search parameters
   • The brain of every decision procedure; they determine its performance
   • Numerous heuristics: learning, clause database cleanup, variable/phase decision, ...
   • Numerous parameters: restart period, variable decay, priority increment, ...
   • These significantly influence performance
   • Parameters and heuristics perform differently on different benchmarks
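To make the kind of parameter space concrete, here is a hypothetical configuration of the sort described above. The names and values are made up for exposition; they are not Spear's actual parameter names or defaults.

```python
# Hypothetical solver configuration: a mix of categorical heuristic choices and
# numeric search parameters (illustrative names and values only).
config = {
    "variable_decision_heuristic": "activity",   # categorical heuristic choice
    "phase_selection": "always_false",           # categorical
    "restart_period": 700,                       # unsigned
    "variable_decay": 0.95,                      # double
    "clause_activity_increment": 1.0,            # double
    "use_random_decisions": False,               # bool
}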

6. Spear bit-vector decision procedure: parameter space
   Spear 1.9:
   • 4 heuristics × 22 optimization functions
   • 2 heuristics × 3 optimization functions
   • 12 double parameters
   • 4 unsigned parameters
   • 4 boolean parameters
   = 26 parameters in total
   Large number of combinations:
   • after limiting the ranges of the double and unsigned parameters
   • after discretization of the double parameters: 3.78 × 10^18
   • after exploiting dependencies: 8.34 × 10^17 combinations
   Finding a good combination is hard!
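The counting behind these figures is simply the product of the per-parameter domain sizes after discretization. The domain sizes in the sketch below are placeholders (Spear's actual discretization is not given on the slide), so the product only illustrates how a number of this magnitude arises; it is not expected to reproduce 8.34 × 10^17 exactly.

```python
# Size of a discretized configuration space = product of per-parameter domain sizes.
from math import prod

domain_sizes = (
    [22] * 4 +   # 4 heuristic parameters with 22 choices each (per the slide)
    [3] * 2 +    # 2 heuristic parameters with 3 choices each (per the slide)
    [5] * 12 +   # 12 double parameters, each discretized to a handful of values (assumed)
    [5] * 4 +    # 4 unsigned parameters, range-limited to a few values (assumed)
    [2] * 4      # 4 boolean parameters
)
assert len(domain_sizes) == 26
print(f"{prod(domain_sizes):.2e} configurations")   # ~5e18 with these assumed domains
```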

7. Goal
   • Find a good combination of parameters (and heuristics)
   • Optimize for different problem sets, minimizing the average runtime
   • Avoid time-consuming manual optimization
   • Learn from the parameter sets that are found
   • Apply that knowledge to the design of decision procedures

8. Outline • Problem definition • Manual tuning • Automatic tuning • Experimental results • Found parameter sets • Future work

9. Manual optimization
   • The standard way of finding parameter sets
   • Developers pick a small set of easy benchmarks (hard benchmarks = slow development cycle)
   • Hard to achieve robustness
   • Easy to over-fit (to small and specific benchmarks)
   • Manual tuning of Spear: approximately one week of tedious work

10. When to give up on manual optimization?
   • Depends mainly on how sensitive the decision procedure is to parameter modifications
   • Decision procedures for NP-hard problems are extremely sensitive to parameter modifications
   • Performance changes of 1–2 orders of magnitude are usual, sometimes up to 4 orders of magnitude

11. Sensitivity example
   • Same instance, same parameters, same machine, same solver
   • Spear compiled with 80-bit floating-point precision: 0.34 s
   • Spear compiled with 64-bit floating-point precision: times out after 6000 s
   • The first ~55,000 decisions are equal, then one mismatch, the next ~100 are equal, then complete divergence
   • Manual optimization for NP-hard problems is ineffective
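This kind of divergence can be reproduced in miniature: when activity scores are accumulated in 64-bit versus 80-bit floating point, a single comparison can come out differently, which flips one decision and sends otherwise identical searches down different paths. The sketch below is purely illustrative (it is not Spear's activity code), and it assumes a platform where numpy.longdouble is 80-bit extended precision, as on typical x86 Linux builds.

```python
# Variable A's activity receives many tiny bumps, variable B's a single moderate one.
# In 64-bit arithmetic the tiny bumps are absorbed by rounding, so B looks more
# active; in 80-bit extended precision they accumulate and A wins. One flipped
# decision like this is enough for two searches to diverge completely.
import numpy as np

def more_active(dtype):
    a = dtype(1.0)
    for _ in range(100_000):
        a += dtype(1e-17)          # many tiny activity bumps for variable A
    b = dtype(1.0) + dtype(5e-13)  # one moderate bump for variable B
    return "A" if a > b else "B"

print("64-bit decision:", more_active(np.float64))    # B: tiny bumps are lost to rounding
print("80-bit decision:", more_active(np.longdouble)) # A: bumps accumulate (x86 builds)
```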

12. Outline • Problem definition • Manual tuning • Automatic tuning • Experimental results • Found parameter sets • Future work

13. Automatic tuning
   • Loop until happy (with the parameters found):
   • Perturb the existing set of parameters
   • Perform hill-climbing: modify one parameter at a time, keep a modification if it improves performance, and stop when a local optimum is found
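A compact sketch of this loop is given below. It shows the general iterated-local-search pattern the slide describes, not the exact ParamILS implementation; the parameter domains and the mean_runtime() cost function are made up so the sketch runs on its own.

```python
import random

# Hypothetical discretized parameter domains; these are not Spear's actual parameters.
DOMAINS = {
    "restart_period": [100, 300, 700, 2000],
    "variable_decay": [0.90, 0.95, 0.99],
    "phase_selection": ["false", "true", "fewer_watched"],
}

def mean_runtime(config, instances):
    """Placeholder for 'run the solver with config on every training instance and
    average the runtimes'. A synthetic cost is used here so the sketch is runnable."""
    base = abs(config["restart_period"] - 700) / 1000 + abs(config["variable_decay"] - 0.95)
    base += 0.0 if config["phase_selection"] == "false" else 0.5
    return sum(base + 0.01 * i for i in range(len(instances))) / len(instances)

def hill_climb(config, instances):
    cost = mean_runtime(config, instances)
    improved = True
    while improved:                                   # stop at a local optimum
        improved = False
        for param, domain in DOMAINS.items():
            for value in domain:                      # modify one parameter at a time
                if value == config[param]:
                    continue
                candidate = {**config, param: value}
                c = mean_runtime(candidate, instances)
                if c < cost:                          # keep a modification only if it improves
                    config, cost, improved = candidate, c, True
    return config, cost

def tune(instances, iterations=20):
    best, best_cost = hill_climb({p: random.choice(d) for p, d in DOMAINS.items()}, instances)
    for _ in range(iterations):                       # "loop until happy"
        perturbed = dict(best)                        # perturb the current best configuration
        for param in random.sample(list(DOMAINS), 2):
            perturbed[param] = random.choice(DOMAINS[param])
        candidate, cost = hill_climb(perturbed, instances)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best

print(tune(instances=list(range(10))))
```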

14. Implementation: FocusedILS [Hutter, Hoos & Stützle '07]
   • Used for tuning Spear
   • Adaptively chooses training instances: poor parameter settings are discarded quickly, while better ones are evaluated more thoroughly
   • Any scalar metric can be optimized: runtime, precision, number of false positives, ...
   • Can optimize the median, the average, ...
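The core idea of the adaptive instance handling can be sketched as a comparison that evaluates a challenger configuration on progressively more training instances and abandons it as soon as it is clearly worse than the incumbent. This is a simplified illustration of the principle, not the actual FocusedILS algorithm of Hutter, Hoos and Stützle.

```python
def challenger_wins(incumbent_costs, measure_challenger, batch=5):
    """incumbent_costs[i] is the incumbent's cost (e.g. runtime) on instance i;
    measure_challenger(i) measures the challenger on the same instance.
    The cost can be any scalar metric."""
    challenger_costs = []
    while len(challenger_costs) < len(incumbent_costs):
        start = len(challenger_costs)
        for i in range(start, min(start + batch, len(incumbent_costs))):
            challenger_costs.append(measure_challenger(i))
        n = len(challenger_costs)
        if sum(challenger_costs) / n > sum(incumbent_costs[:n]) / n:
            return False       # clearly worse on the instances seen so far: discard early
    return True                # survived evaluation on the whole training set

# Toy usage: incumbent runtimes on 20 instances vs. a uniformly slower challenger.
incumbent = [1.0 + 0.1 * i for i in range(20)]
print(challenger_wins(incumbent, measure_challenger=lambda i: incumbent[i] * 1.5))  # False
```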

15. Outline • Problem definition • Manual tuning • Automatic tuning • Experimental results • Found parameter sets • Future work

16. Experimental setup: benchmarks
   Two experiments:
   • General-purpose tuning (Spear v0.9): industrial instances from previous SAT competitions
   • Application-specific tuning (Spear v1.8): bounded model checking (BMC) instances and Calysto software checking instances
   Machines:
   • A cluster of 55 dual 3.2 GHz Intel Xeon PCs with 2 GB RAM each
   Benchmark sets:
   • Divided into disjoint training and test sets
   • Test timeout: 10 hours
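For concreteness, a disjoint training/test split of a benchmark set can be produced as below. The split fraction, seed, and file names are illustrative; the slide does not state how the actual split was made.

```python
from random import Random

def split_benchmarks(instance_files, train_fraction=0.5, seed=0):
    """Shuffle a benchmark set and split it into disjoint training and test sets."""
    files = list(instance_files)
    Random(seed).shuffle(files)
    cut = int(len(files) * train_fraction)
    return files[:cut], files[cut:]   # disjoint: (training set, test set)

train, test = split_benchmarks([f"bmc_{i}.cnf" for i in range(100)])  # hypothetical files
```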

17. Tuning 1: general-purpose optimization
   Training:
   • Timeout: 10 s (risky, but there was no experimental evidence of over-fitting)
   • 3 days of computation on the cluster
   • Very heterogeneous training set: industrial instances from previous competitions
   Results: 21% geometric-mean speedup over the manual settings on the industrial test set
   • ~3x on bounded model checking
   • ~78x on Calysto software checking
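A speedup figure like the 21% above is the geometric mean of per-instance speedup ratios, the conventional way to average ratios. The runtimes in the small helper below are made up for illustration.

```python
from math import exp, log

def geometric_mean_speedup(baseline_runtimes, tuned_runtimes):
    """Geometric mean of per-instance speedup ratios baseline/tuned."""
    ratios = [b / t for b, t in zip(baseline_runtimes, tuned_runtimes)]
    return exp(sum(map(log, ratios)) / len(ratios))

baseline = [12.0, 3.5, 80.0, 0.9]   # hypothetical runtimes with the manual settings
tuned    = [10.0, 2.8, 70.0, 0.7]   # hypothetical runtimes with the tuned settings
print(f"{geometric_mean_speedup(baseline, tuned):.2f}x")   # 1.22x, i.e. ~22% faster here
```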

18. Tuning 1: Bounded model checking instances

19. Tuning 1: Calysto instances

20. Tuning 2: application-specific optimization
   Training:
   • Timeout: 300 s
   • Bounded model checking optimization: 2 days on the cluster
   • Calysto instances: 3 days on the cluster
   • Homogeneous training sets
   Speedups over the SAT-competition settings: ~2x on BMC, ~20x on SWV (software verification)
   Speedups over the manual settings: ~4.5x on BMC, ~500x on SWV

21. Tuning 2: Bounded model checking instances (~4.5x speedup)

22. Tuning 2: Calysto instances (~500x speedup)

23.–27. Overall results

28. Outline • Problem definition • Manual tuning • Automatic tuning • Experimental results • Found parameter sets • Future work

29. Software verification parameters
   • Greedy activity-based heuristic: probably helps focus on the most frequently used sub-expressions
   • Aggressive restarts: probably the standard heuristics and initial ordering do not work well for SWV problems
   • Phase selection: always false; probably related to the checked property (NULL-pointer dereference)
   • No randomness: Spear and Calysto are highly optimized

30. Bounded model checking parameters
   • Less aggressive activity heuristic
   • Infrequent restarts: probably the initial ordering (as encoded) works well
   • Phase selection: fewer watched clauses, which minimizes the amount of work
   • A small amount of randomness helps: 5% random variable and phase decisions
   • A simulated-annealing-like schedule works well: decrease randomness by 30% after each restart, which focuses the solver on the hard chunks of the design
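The annealing-like schedule from this slide can be sketched as follows. The 5% starting rate and the 30% per-restart reduction are taken from the slide; the surrounding solver loop and the function names are illustrative, not Spear's actual code.

```python
import random

random_decision_rate = 0.05            # 5% random variable/phase decisions initially

def pick_decision_variable(unassigned, activity):
    """Either a random decision or the most active variable (stand-in for the real heuristic)."""
    if random.random() < random_decision_rate:
        return random.choice(unassigned)                 # random decision
    return max(unassigned, key=lambda v: activity[v])    # activity-based decision

def on_restart():
    """Called whenever the solver restarts: shrink the randomness by 30%."""
    global random_decision_rate
    random_decision_rate *= 0.7
```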

31. Outline • Problem definition • Manual tuning • Automatic tuning • Experimental results • Found parameter sets • Future work

32. Future work
   • Per-instance tuning (machine-learning-based techniques)
   • Analysis of the relative importance of parameters, which could help simplify the solver
   • Tons of data, little analysis done so far; correlations between parameters and solver statistics could reveal important dependencies

33. Take-away messages
   • Automatic tuning is effective, especially application-specific tuning
   • It avoids time-consuming manual tuning
   • Decision procedures are highly sensitive to parameter modifications
   • Few benchmarks = inconclusive results?
