Learning Universally Quantified Invariants of Linear Data Structures
Pranav Garg¹, Christof Löding², P. Madhusudan¹ and Daniel Neider²
¹ University of Illinois at Urbana-Champaign
² RWTH Aachen, Germany
Black-box learning of invariants
• Renewed interest in applying learning to invariant synthesis [Sharma et al. CAV-12], [Sharma et al. SAS-13], [Kong et al. APLAS-10]
• Advantages with respect to white-box techniques:
- verification of complex programs that have simple invariants
- generalization
- extremely scalable machine learning algorithms can be applied to verification
[Figure: the Learner sends a hypothesis H to the Teacher, which checks it against the Program.]
Active Learning and Passive Learning
• Active learning:
- the learner queries a teacher with membership and equivalence queries
• Passive learning:
- given a sample S = (examples, counter-examples), learn the simplest concept
[Figure: an active learner exchanging membership/equivalence queries and yes/no answers with a teacher; a passive learner receiving a sample S.]
Overview
• Build active learning algorithms for learning quantified formulas over linear data structures (arrays/lists).
- introduce Quantified Data Automata (QDAs) as a normal form for such invariants
- build an active learning algorithm for QDAs
• Build a passive learning algorithm on top of the active learning algorithm.
- based on an imprecise teacher that answers queries with respect to the samples
• Introduce elastic QDAs (EQDAs), which translate to decidable logics.
- develop learning algorithms for EQDAs
• Example property: the list pointed to by head is sorted.
[Figure: a list with head pointing to the cells 5, 7, 8, 9.]
Program Configuration/Data words
[Figure: a program configuration (a list with data values 8, 9, 3, 2, 4, 7 and the pointer variables head and i) shown together with its data word.]
Quantified Data Automata
• QDAs represent universally quantified properties of linear data structures.
[Figure: example QDA with transitions on head, y1 and y2 and a state labeled data(y1) <= data(y2).]
Quantified Data Automata
• Fix P: the set of program pointer variables
• Fix Y: the set of quantified variables
• Fix F: a numerical abstract domain of data formulas
• A QDA over linear data structures:
- reads a data word annotated with the pointers P and the variables Y
- checks whether the data stored at these positions satisfies a data property
• A QDA accepts a data word w with pointers P if it accepts all possible extensions of w with valuations for Y.
[Figure: example QDA with transitions on head, y1 and y2 and a state labeled data(y1) <= data(y2).]
Valuation words
• Valuation word = data word over P + valuation for Y
• Universal quantification: a QDA accepts a data word iff it accepts ALL corresponding valuation words.
[Figure: a data word with pointers head and i, and several valuation words obtained by placing y1 and y2 at different positions.]
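The universal-quantification rule above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the data word is simplified to a list of integers, pointer variables are left out, and the names valuation_words, holds_universally and sorted_property are hypothetical.

```python
from itertools import product

def valuation_words(data_word, quantified_vars):
    """Yield every assignment of the quantified variables to positions of data_word."""
    positions = range(len(data_word))
    for choice in product(positions, repeat=len(quantified_vars)):
        yield dict(zip(quantified_vars, choice))

def holds_universally(data_word, quantified_vars, accepts):
    """A data word is accepted iff ALL of its valuation words are accepted."""
    return all(accepts(data_word, valuation)
               for valuation in valuation_words(data_word, quantified_vars))

def sorted_property(data_word, valuation):
    # Assumed example property: if y1 is not after y2, then data(y1) <= data(y2).
    y1, y2 = valuation["y1"], valuation["y2"]
    return y1 > y2 or data_word[y1] <= data_word[y2]

print(holds_universally([5, 7, 8, 9], ["y1", "y2"], sorted_property))        # True
print(holds_universally([8, 9, 3, 2, 4, 7], ["y1", "y2"], sorted_property))  # False
```

A word of length n with two quantified variables has n * n valuation words, so this brute-force check is only practical for the small bounded structures used in the experiments.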
Quantified Data Automata
• Deterministic, finite register automata over words
- each state q is labeled with a data formula f(q)
• On a valuation word, a QDA reads the pointer and universally quantified variables and stores the data values at their positions in the register reg.
• At the final state q, the QDA checks whether these data values satisfy the formula labeling that state.
- reg satisfies f(q): the valuation word is accepted
- reg does not satisfy f(q): the valuation word is rejected
[Figure: runs of a QDA on valuation words over head, i, y1 and y2, showing the register reg and the formula f(q) = data(y1) <= data(y2).]
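A minimal sketch of such a run, assuming a hand-built sortedness QDA over the variables head, y1 and y2. The state names, the table DELTA and the helper sym are illustrative choices, not taken from the slides. A valuation word is modeled as a list of (symbol, data value) pairs, where a symbol is the set of variables on that position and the empty set is the blank.

```python
from itertools import combinations

VARS = ("head", "y1", "y2")
# One input symbol per set of variables that can sit on a position.
SYMBOLS = [frozenset(c) for r in range(len(VARS) + 1) for c in combinations(VARS, r)]
STATES = ("start", "y1_seen", "ordered", "y2_first")

def step(state, symbol):
    """Track the relative order in which y1 and y2 are read."""
    if state == "start":
        if "y1" in symbol:
            return "ordered" if "y2" in symbol else "y1_seen"
        return "y2_first" if "y2" in symbol else "start"
    if state == "y1_seen" and "y2" in symbol:
        return "ordered"
    return state

# Explicit deterministic transition table and the data formula labeling each state.
DELTA = {(q, s): step(q, s) for q in STATES for s in SYMBOLS}
FORMULA = {
    "start": lambda reg: True,
    "y1_seen": lambda reg: True,
    "ordered": lambda reg: reg["y1"] <= reg["y2"],  # data(y1) <= data(y2)
    "y2_first": lambda reg: True,
}

def accepts(valuation_word):
    state, reg = "start", {}
    for symbol, value in valuation_word:
        for var in symbol:
            reg[var] = value              # store the data value in the register
        state = DELTA[(state, symbol)]
    return FORMULA[state](reg)            # check the formula of the final state

def sym(*names):
    return frozenset(names)

print(accepts([(sym("head"), 5), (sym("y1"), 7), (sym(), 8), (sym("y2"), 9)]))  # True
print(accepts([(sym("head", "y1"), 8), (sym("y2"), 3)]))                        # False
```

Every blank transition in this table is a self-loop, so the automaton never counts the distance between y1 and y2.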
Learning QDAs
• QDAs are finite automata that output data formulas.
• Lift Angluin’s L* algorithm for learning DFAs to learn QDAs.
• Given a teacher, the unique minimal QDA can be learned in time polynomial in the size of this minimal QDA.
[Figure: example QDA over head, y1 and y2 whose output is data(y1) <= data(y2); labels: regular expression, outputs.]
Elastic Quantified Data Automata (EQDA)
• Subclass of QDAs that translate to decidable logics
- Array Property Fragment (APF) [Bradley et al. VMCAI-06]
- decidable fragment of Strand over lists [Madhusudan et al. POPL-11]
• These logics cannot test whether two universal variables are a bounded distance apart.
• Restriction for EQDAs: all transitions on blank symbols (positions with no pointer or universal variable) must be self-loops.
[Figure: a QDA over y1 and y2 that falls outside APF, next to an EQDA that falls inside APF.]
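The EQDA restriction is a purely syntactic condition on the transition table, so it can be checked directly. The sketch below assumes transitions are stored as a dict from (state, symbol) to state with a designated blank symbol "b"; this representation and the name is_elastic are illustrative only.

```python
BLANK = "b"  # assumed name for the symbol carrying no pointer or universal variable

def is_elastic(delta, blank=BLANK):
    """True iff every transition on the blank symbol is a self-loop."""
    return all(target == source
               for (source, symbol), target in delta.items()
               if symbol == blank)

# Tests that y2 sits exactly two positions after y1: it has to count one blank.
counting_qda = {
    ("q0", "y1"): "q1",
    ("q1", BLANK): "q2",   # blank transition that changes state
    ("q2", "y2"): "q3",
}

# Only tests that y2 occurs somewhere after y1: blanks are skipped with self-loops.
elastic_qda = {
    ("q0", BLANK): "q0",
    ("q0", "y1"): "q1",
    ("q1", BLANK): "q1",
    ("q1", "y2"): "q2",
    ("q2", BLANK): "q2",
}

print(is_elastic(counting_qda))  # False
print(is_elastic(elastic_qda))   # True
```

The first automaton illustrates the kind of bounded-distance test that the restriction rules out.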
Elastic Quantified Data Automata (EQDA)
• Unique minimal over-approximation theorem: a QDA A can be uniquely minimally over-approximated by a language of valuation words that is accepted by an EQDA Ael.
• The construction of Ael from a given QDA A is called elastification.
• Learning EQDAs reduces to learning QDAs + elastification.
[Figure: the languages of A and of the EQDAs Ael, Bel and Cel.]
Passively learning QDAs
• Given the samples S+ and S-, the teacher uses them to answer the active learner's queries. The teacher wants the active learner to construct a QDA that includes S+ and excludes S-.
• Membership query on s:
- if s belongs to S+, return yes
- if s belongs to S-, return no
- otherwise, return no (errs on the side of keeping the learned concept semantically small)
• Equivalence query:
- checks whether the conjectured invariant is consistent with S+ and S-
• The learned QDA might be non-optimal (usually small).
• Running time is polynomial in the size of the learned QDA.
[Figure: the passive learner, given the sample S+, S-, wraps an active learner and a teacher.]
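The sample-based teacher described above might look roughly as follows. This is a sketch under simplifying assumptions: words are plain tuples of integers rather than valuation words, the hypothesis is any Python predicate, and the class name SampleTeacher and its methods are hypothetical.

```python
class SampleTeacher:
    """Imprecise teacher that answers queries only from the samples S+ and S-."""

    def __init__(self, positives, negatives):
        self.positives = set(positives)
        self.negatives = set(negatives)

    def membership(self, word):
        if word in self.positives:
            return True
        if word in self.negatives:
            return False
        return False  # unknown words: answer "no" to keep the concept small

    def equivalence(self, hypothesis):
        """Return a counter-example from the samples, or None if consistent."""
        for word in sorted(self.positives):
            if not hypothesis(word):
                return word
        for word in sorted(self.negatives):
            if hypothesis(word):
                return word
        return None

teacher = SampleTeacher(positives={(1, 2), (0, 0, 1)}, negatives={(2, 1)})
is_sorted = lambda w: list(w) == sorted(w)

print(teacher.membership((1, 2)))        # True
print(teacher.membership((7, 7)))        # False (not in either sample)
print(teacher.equivalence(is_sorted))    # None: consistent with S+ and S-
print(teacher.equivalence(lambda w: True))  # (2, 1)
```

Because unknown words are answered with "no", two different active learners could end up with different consistent hypotheses; the teacher only guarantees consistency with the samples.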
Experiments
• Run the program on arrays/lists of small bounded sizes, with data values from a bounded data domain, e.g. {0, 1, 2}.
• Extract the concrete data structures that manifest at loop headers.
• Obtain the set S+ on which passive learning is performed.
- fix F to the Cartesian lattice of atomic formulas over the relations {=, <, ≤}
• Learn QDAs using Angluin’s algorithm.
- the learner never asks long membership queries
- the teacher, thus, often has correct answers
• The learned QDA is over-approximated by an elastic QDA to obtain a quantified invariant in decidable Strand or APF.
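The sample-collection step can be illustrated with a small harness. Everything here is an assumption made for illustration: insertion sort stands in for the program under analysis, a configuration is recorded as the pair (array contents, loop index i), and the function names are hypothetical.

```python
from itertools import product

def insertion_sort_snapshots(values):
    """Run insertion sort and record the configuration at each outer loop header."""
    a = list(values)
    snapshots = []
    i = 0
    while True:
        snapshots.append((tuple(a), i))   # configuration at the loop header
        if i >= len(a):
            break
        j = i
        while j > 0 and a[j - 1] > a[j]:  # insert a[i] into the sorted prefix
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
        i += 1
    return snapshots

def collect_positive_sample(max_len=3, domain=(0, 1, 2)):
    """Collect S+ from all arrays of bounded size over a bounded data domain."""
    sample = set()
    for n in range(max_len + 1):
        for values in product(domain, repeat=n):
            sample.update(insertion_sort_snapshots(values))
    return sample

sample = collect_positive_sample()
# Every collected configuration satisfies the intended invariant:
# the prefix a[0..i) is sorted.
print(all(list(a[:i]) == sorted(a[:i]) for a, i in sample))  # True
```

Each collected pair would still have to be encoded as a data word (with i as a pointer variable) before it can serve as a positive example for the passive learner.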
Related Work
• Daikon [Ernst et al. ICSE-00]
- conjunctive Boolean learning
- learns quantified invariants over arrays, to some extent
• Applications of learning in verification
- rely-guarantee contracts [Cobleigh et al. TACAS-03, Alur et al. CAV-05]
- stateful interfaces [Alur et al. POPL-05]
- learning quantified invariants over predicates [Kong et al. APLAS-10]
• Machine learning algorithms for invariant synthesis [Sharma et al. CAV-12, SAS-13, ESOP-13]
Conclusion
• Learning universally quantified invariants over linear data structures
- Quantified Data Automata (QDA) / elastic QDAs
- Active learning for QDAs
- Unique elastification
- Algorithm for passive learning of QDAs/EQDAs
- Experimental validation
Future Work:
• Extensions to trees, to capture universally quantified properties like binary-search-tree, max-heap, …
• Combining automata-based structural learning with machine learning algorithms for learning data formulas
Thank you!