Automatic Generation of Programs Using Model Checking and Genetic Programming

Gal Katz, Doron Peled, Bar Ilan University

Agenda • Introduction & motivation • Genetic Programming • Program synthesis • Model Checking • Combined method • Application to mutual exclusion • Conclusions & future work

Introduction • Genetic programming • A methodology for automatic programming inspired by Darwinian evolution [Koza 92]. • Used for automatic generation of programs in various fields. • Mostly used for optimization related problems. • Fitness is usually calculated by checking program performance against test cases. • Less used for problems with a strict specification.

Introduction (2) • Model Checking • An automatic formal verification technique used mainly with finite-state software and hardware systems. • Can be used to verify communication and concurrent protocols. • Models are checked against a strict specification. The result is either: • A confirmation that the model satisfies the specification, or • A counterexample showing that it does not.

Introduction (3) • How to construct a model from the spec.? • Synthesis • Transforms spec. directly to a model that satisfies it. • Complicated. • Currently not practical for automatic program generation. • Brute-force enumeration • All possible programs of a specific domain and size are generated and model-checked. • All existing solutions will eventually be found. • Very time-intensive. Not practical for programs with more than a few lines of code.

Our Method: Combining GP & Model Checking • [Figure: interaction between the User, the GP Engine, and the Enhanced Model Checker.] • Labeled steps: 1. Specification, 2. Configuration, 3. Initial population, 4. Verification results, 5. New programs, 6. Final model / results.

Main Steady-state GP Algorithm 1. Create initial program population. 2. Randomly choose μ programs. 3. Create λ new programs by applying genetic operations to the above μ programs. 4. Calculate the fitness function for the μ + λ programs, and use it to select μ new programs. 5. Replace the old μ programs by the selected ones. 6. Repeat steps 2-5 until either: • a perfect solution is found, or • the maximum allowed number of iterations is reached.
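The loop above can be sketched as follows. This is a minimal illustration, not the authors' engine: the function name, the `mutate(program, rng)` and `fitness(program)` callbacks, and the use of simple truncation (keep the best μ) in step 4 are all assumptions made for brevity; the slides describe a fitness-proportional selection instead.

```python
import random

def steady_state_gp(init_population, mutate, fitness, mu=5, lam=150,
                    max_iters=2000, perfect=100.0, rng=None):
    """Steady-state loop: pick mu parents, breed lam children, keep mu survivors."""
    rng = rng or random.Random(0)
    population = list(init_population)          # step 1 (population supplied by caller)
    for _ in range(max_iters):
        # Step 2: randomly choose mu programs (by index, so they can be replaced).
        idx = rng.sample(range(len(population)), mu)
        parents = [population[i] for i in idx]
        # Step 3: create lam new programs from the chosen parents.
        children = [mutate(rng.choice(parents), rng) for _ in range(lam)]
        # Step 4: score all mu + lam candidates and select mu of them.
        # Assumption: truncation selection, used here only to keep the sketch short.
        pool = sorted(parents + children, key=fitness, reverse=True)
        survivors = pool[:mu]
        # Step 5: replace the old mu programs by the selected ones.
        for i, prog in zip(idx, survivors):
            population[i] = prog
        # Step 6: stop early when a perfect solution is found.
        if fitness(survivors[0]) >= perfect:
            return survivors[0]
    return max(population, key=fitness)
```

In a toy setting where "programs" are integers, mutation adds or subtracts 1, and fitness is the negated distance to 10, the loop converges on 10 within a few dozen iterations.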

Program Representation • Programs are represented as trees. • Internal nodes represent expressions or instructions with parameters (assignment, while, if, block). • Terminal nodes represent constants or expressions without any parameter (0, 1, 2, me, other). • Strongly-typed GP is used [Montana 95]. • Example: the program while (A[2] != 0) A[me] = 1 is the tree while(!=(A[2], 0), assign(A[me], 1)).
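A minimal sketch of such a tree, built for the slide's example program. The `Node` class, the operator names (`"A[]"`, `"assign"`), and the `render` helper are illustrative assumptions, not the authors' data structures.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    op: str                                   # e.g. "while", "assign", "!=", "A[]", "0", "me"
    children: List["Node"] = field(default_factory=list)

def render(n: Node) -> str:
    """Print a tree back as program text."""
    c = [render(k) for k in n.children]
    if n.op == "while":
        return f"while ({c[0]}) {c[1]}"
    if n.op == "assign":
        return f"{c[0]} = {c[1]}"
    if n.op == "!=":
        return f"{c[0]} != {c[1]}"
    if n.op == "A[]":
        return f"A[{c[0]}]"
    return n.op                               # terminal node

def leaf(s: str) -> Node:
    return Node(s)

# Internal nodes: while, !=, assign, A[]; terminals: 2, 0, me, 1.
example = Node("while", [
    Node("!=", [Node("A[]", [leaf("2")]), leaf("0")]),
    Node("assign", [Node("A[]", [leaf("me")]), leaf("1")]),
])
```

Calling `render(example)` gives back the program text `while (A[2] != 0) A[me] = 1`.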

Initial Population Creation • Population usually contains 100 to 1000 programs. • Programs are created recursively using the “grow” method [Koza 92]. • The root is randomly selected from instruction nodes. • Offspring are randomly selected from allowed nodes or terminals as long as rules are preserved. • If max allowed tree depth is reached, a terminal must be chosen.
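The "grow" method with typing rules might look like the sketch below. The grammar, the type names (`stmt`, `bool`, `var`, `int`), and the hard-coded `MIN_HEIGHT` table are assumptions chosen to resemble the slide examples; trees are plain `(op, children)` tuples.

```python
import random

# Hypothetical typed grammar: result type -> list of (operator, argument types).
GRAMMAR = {
    "stmt": [("while", ["bool", "stmt"]), ("assign", ["var", "int"]), ("empty", [])],
    "bool": [("!=", ["int", "int"]), ("==", ["int", "int"])],
    "var":  [("A[]", ["int"])],
    "int":  [("A[]", ["int"]), ("0", []), ("1", []), ("2", []), ("me", []), ("other", [])],
}
# Smallest tree height that can produce each type under the grammar above.
MIN_HEIGHT = {"stmt": 1, "int": 1, "bool": 2, "var": 2}

def grow(typ, max_depth, rng):
    """Randomly build a tree of the given type whose height is at most max_depth."""
    # Only keep productions whose arguments still fit in the remaining depth;
    # when max_depth is exhausted this leaves terminals only.
    options = [(op, args) for op, args in GRAMMAR[typ]
               if all(MIN_HEIGHT[a] <= max_depth - 1 for a in args)]
    op, args = rng.choice(options)
    return (op, [grow(a, max_depth - 1, rng) for a in args])

def height(tree):
    op, kids = tree
    return 1 + max((height(k) for k in kids), default=0)
```

A population would then be something like `[grow("stmt", 4, rng) for _ in range(150)]`; the caller has to pass a `max_depth` at least as large as the minimum height of the requested type.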

Genetic Operations • At each iteration of the GP algorithm, the following genetic operations are applied to the selected programs: • Reproduction – programs are copied without any change • Mutation • Crossover

Mutation Operation • The main operation we use. • Allows performing small modifications to an existing program by the following method: • Randomly choose a program node (internal, or leaf). • According to the node type, apply one of the following operations with respect to the chosen node (strong typing must be kept):

Replacement Mutation type (a) • Replace the sub-tree rooted at the chosen node with a new randomly generated sub-tree. • Can change a single node or an entire sub-tree. • Example: in while (A[2] != 0) A[me] = 1, replacing the node 1 by the new sub-tree A[0] gives while (A[2] != 0) A[me] = A[0].

Insertion Mutation type (b) • Add an immediate parent to the selected node. • Randomly create other offspring to the new parent, if needed. • According to the selected parent type, can cause: • Insertion of code, • Wrapping code with a while loop, • Extending Boolean expressions. • Example: in while (A[2] != 0) A[me] = 1, inserting a block node as the parent of the assignment, with a new randomly created assignment as its other offspring, gives a while (A[2] != 0) loop whose body is A[2] = other followed by A[me] = 1.

Reduction Mutation Type (c) • Replace the selected node by one of its offspring. • Delete the remaining offspring of the node. • Has the opposite effect of the previous insertion mutation, and reduces the program size.

Deletion Mutation Type (d) • Delete the sub-tree rooted by the node. • Update ancestors recursively. • Example: deleting the assignment from while (A[2] != 0) A[me] = 1 leaves while (A[2] != 0) with an empty body.
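Two of the mutation types above, replacement (a) and reduction (c), can be sketched on typed tuple trees as follows. The grammar, the `(type, op, children)` encoding, and all function names are assumptions for illustration; a fuller version would also bound the total tree depth after a replacement.

```python
import random

# Hypothetical typed grammar: result type -> list of (operator, argument types).
GRAMMAR = {
    "stmt": [("while", ["bool", "stmt"]), ("assign", ["var", "int"]), ("empty", [])],
    "bool": [("!=", ["int", "int"]), ("==", ["int", "int"])],
    "var":  [("A[]", ["int"])],
    "int":  [("A[]", ["int"]), ("0", []), ("1", []), ("2", []), ("me", []), ("other", [])],
}
MIN_HEIGHT = {"stmt": 1, "int": 1, "bool": 2, "var": 2}

def grow(typ, depth, rng):
    options = [(op, args) for op, args in GRAMMAR[typ]
               if all(MIN_HEIGHT[a] <= depth - 1 for a in args)]
    op, args = rng.choice(options)
    return (typ, op, tuple(grow(a, depth - 1, rng) for a in args))

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield prefix
    for i, kid in enumerate(tree[2]):
        yield from paths(kid, prefix + (i,))

def get(tree, path):
    for i in path:
        tree = tree[2][i]
    return tree

def put(tree, path, new):
    """Return a copy of tree in which the node at path is replaced by new."""
    if not path:
        return new
    typ, op, kids = tree
    i = path[0]
    return (typ, op, kids[:i] + (put(kids[i], path[1:], new),) + kids[i + 1:])

def replacement_mutation(tree, rng, depth=3):
    # Type (a): swap a random sub-tree for a freshly grown one of the same type.
    path = rng.choice(list(paths(tree)))
    typ = get(tree, path)[0]
    return put(tree, path, grow(typ, max(depth, MIN_HEIGHT[typ]), rng))

def reduction_mutation(tree, rng):
    # Type (c): replace a node by one of its offspring of the same type.
    cands = [(p, k) for p in paths(tree) for k in get(tree, p)[2]
             if k[0] == get(tree, p)[0]]
    if not cands:
        return tree
    path, kid = rng.choice(cands)
    return put(tree, path, kid)

def well_typed(tree):
    typ, op, kids = tree
    sigs = [args for o, args in GRAMMAR[typ] if o == op]
    ok = any(len(a) == len(kids) and all(k[0] == t for k, t in zip(kids, a))
             for a in sigs)
    return ok and all(well_typed(k) for k in kids)
```

Because each replacement grows a sub-tree of the same type as the node it replaces, and each reduction only promotes an offspring of the same type, `well_typed` holds after every mutation.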

Crossover Operation • Creates new programs by merging building blocks of two existing programs. • Crossover steps are: • Randomly choose a node from the 1st program. • Randomly choose a node from the 2nd program, that has the same type as the 1st node. • Exchange between the sub-trees rooted by the two nodes, and use the two newly created programs.

Crossover Example • Parents: if (A[me] != 1) A[2] = me and while (A[me] == other) A[0] = other. • The two assignment sub-trees are exchanged. • Offspring: if (A[me] != 1) A[0] = other and while (A[me] == other) A[2] = me.
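A sketch of typed sub-tree crossover, under the assumption that trees are `(type, op, children)` tuples; the helper names are hypothetical. A node is chosen in the first program, a node of the same type is chosen in the second, and the two sub-trees are exchanged.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield prefix
    for i, kid in enumerate(tree[2]):
        yield from paths(kid, prefix + (i,))

def get(tree, path):
    for i in path:
        tree = tree[2][i]
    return tree

def put(tree, path, new):
    """Return a copy of tree in which the node at path is replaced by new."""
    if not path:
        return new
    typ, op, kids = tree
    i = path[0]
    return (typ, op, kids[:i] + (put(kids[i], path[1:], new),) + kids[i + 1:])

def size(tree):
    return 1 + sum(size(k) for k in tree[2])

def crossover(a, b, rng):
    pa = rng.choice(list(paths(a)))                     # node from the 1st program
    typ = get(a, pa)[0]
    matches = [p for p in paths(b) if get(b, p)[0] == typ]
    if not matches:                                     # no node of that type in b
        return a, b
    pb = rng.choice(matches)                            # same-typed node from the 2nd
    return put(a, pa, get(b, pb)), put(b, pb, get(a, pa))

# Two small hand-built programs: A[2] = me, and while (A[me] == other) <empty>.
prog_a = ("stmt", "assign", (("var", "A[]", (("int", "2", ()),)), ("int", "me", ())))
prog_b = ("stmt", "while", (
    ("bool", "==", (("int", "A[]", (("int", "me", ()),)), ("int", "other", ()))),
    ("stmt", "empty", ()),
))
```

Since sub-trees are only moved between the two programs, the combined number of nodes is the same before and after the operation.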

Crossover (cont.) • Heavily used by traditional GP [Koza]. • Tries to mimic biological sexual recombination, but • Unlike biology (and unlike GA), GP lacks the notion of “genes” [Banzhaf et al. 01]. • Often acts only as a macro-mutation. • Various methods were developed in order to turn it into a more fruitful operation (brood recombination, intelligent crossover). • Still, not a significant operation for small programs like those of Mutual Exclusion.

Selection • At each iteration, selection is applied to all μ + λ programs (over-production selection). • Programs are selected using a fitness-proportional (roulette) method [Holland 92]. • “Elitism” is used to ensure that the best program is always selected. • Similar to Evolution Strategies [Rechenberg 94] and the Brood Recombination method [Tackett 94]: better protection from harmful operations.
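Roulette selection with elitism can be sketched as below. The function name and the small epsilon added to each weight are assumptions; the sketch also assumes non-negative scores and draws with replacement.

```python
import random

def select(programs, scores, mu, rng):
    """Pick mu programs: the best one (elitism) plus fitness-proportional draws."""
    best = max(range(len(programs)), key=lambda i: scores[i])
    chosen = [programs[best]]                 # elitism: the best program always survives
    # A tiny epsilon keeps zero-score programs selectable and avoids a zero total weight.
    weights = [s + 1e-9 for s in scores]
    chosen += rng.choices(programs, weights=weights, k=mu - 1)
    return chosen
```

For example, `select(pool, scores, 5, random.Random(0))` returns five programs, the first of which is always the highest-scoring one in `pool`.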

Program Synthesis • Synthesis of finite-state systems was suggested by Rabin [Rabin, Buchi]. • Machinery includes finite tree automata. • Can be solved by finding game strategies [McNaughton, Emerson-Lei]. • For concurrent and distributed systems, the problem is undecidable [Pnueli-Rosner]. • Decidable for special cases, e.g., pipeline architectures [Pnueli-Rosner], in time doubly exponential in the size of the LTL property!

Model Checking

ω-automata • Run on infinite words, and consist of: • A finite alphabet Σ, • A finite set of states S, • A set of initial states S0 ⊆ S, • A transition relation Δ ⊆ S × S, • A labeling function L : S → Σ, • An acceptance condition Ω. • In this version, the labels are on the states instead of on the arcs.

Acceptance conditions • For a run p, inf(p) denotes the set of states appearing infinitely often on p. • Buchi condition: • A set of states F ⊆ S, • A run p over A is accepted if inf(p) ∩ F ≠ Ø. • Streett condition: • A set of k pairs (Ei, Fi), 1 ≤ i ≤ k, Ei, Fi ⊆ S, • A run p over A is accepted if for all pairs: • inf(p) ∩ Ei ≠ Ø → inf(p) ∩ Fi ≠ Ø.
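Both conditions are easy to evaluate on an ultimately periodic ("lasso") run, i.e. a finite prefix followed by a cycle repeated forever, because inf(p) is then exactly the set of states on the cycle. The sketch below assumes that representation; the function names are illustrative.

```python
def inf_of_lasso(prefix, cycle):
    """inf(p) for the run prefix . cycle^omega: the states visited infinitely often."""
    return set(cycle)

def buchi_accepts(prefix, cycle, F):
    # Accepted if some state of F is visited infinitely often.
    return bool(inf_of_lasso(prefix, cycle) & set(F))

def streett_accepts(prefix, cycle, pairs):
    # Accepted if, for every pair (E, F): inf ∩ E nonempty implies inf ∩ F nonempty.
    inf = inf_of_lasso(prefix, cycle)
    return all(not (inf & set(E)) or bool(inf & set(F)) for E, F in pairs)
```

With the single pair ({s1}, {s2}), a run that loops on s1 alone is rejected, while a run that loops on s3 alone is accepted because the premise of the implication is false.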

ω-automata Closure • Buchi automata can be converted into Streett automata, and vice versa. • Both Buchi and Streett automata are closed under intersection and complement. • Streett automata are less simple to use, but are closed under determinization, while Buchi automata are not.

Building Program’s State-graph • Each state consists of values of variables, program counters, buffers, etc. • Edges represent atomic transitions caused by program instructions. • Can be built by a DFS algorithm. • Can be decomposed into SCCs [Tarjan 72].
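A sketch of both steps: exploring the reachable states depth-first from an initial state, and decomposing the result into SCCs with Tarjan's algorithm. The `successors` callback and the toy four-state system are assumptions standing in for a real program's transition function.

```python
def build_graph(initial, successors):
    """Explore all reachable states depth-first; returns state -> list of successors."""
    graph, stack = {}, [initial]
    while stack:
        s = stack.pop()
        if s in graph:
            continue
        graph[s] = list(successors(s))
        stack.extend(graph[s])
    return graph

def tarjan_scc(graph):
    """Strongly connected components of the graph (recursive Tarjan)."""
    index, low, on_stack, stack, sccs = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:                # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs

# Toy system: states 0 -> 1 -> 2 -> 0 form a loop, and 2 can also move to a sink 3.
TOY = {0: [1], 1: [2], 2: [0, 3], 3: []}
toy_graph = build_graph(0, lambda s: TOY[s])
```

On the toy system, `tarjan_scc(toy_graph)` yields the two components {0, 1, 2} and {3}. The recursive form is fine for small graphs; large state spaces would need an iterative variant.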

Converting Model to ω-automaton • We use the states, initial state and transitions of the program’s state-space. • Acceptance condition can allow all runs, or impose fairness conditions. • Streett automata can be used in order to define various fairness conditions (weak & strong).

Safety Properties • Basic properties can be checked by simply analyzing the state graph: • Invariants– can be checked on every visited state. • Deadlocks– states without outgoing edges. • Unreachable code– instructions that are not represented on any transition. • Liveness properties require a more complicated process.
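The first two checks can be sketched directly on a state graph stored as a dictionary from each state to its successors. The state encoding (a pair of program locations for two processes) and the tiny graph are hypothetical.

```python
def deadlocks(graph):
    """States without outgoing edges."""
    return [s for s, succ in graph.items() if not succ]

def invariant_violations(graph, invariant):
    """Visited states on which the invariant does not hold."""
    return [s for s in graph if not invariant(s)]

# Hypothetical state graph: each state is (location of process 0, location of process 1).
graph = {
    ("ncs", "ncs"): [("cs", "ncs"), ("ncs", "cs")],
    ("cs", "ncs"):  [("ncs", "ncs"), ("cs", "cs")],
    ("ncs", "cs"):  [("ncs", "ncs")],
    ("cs", "cs"):   [],
}

def mutual_exclusion(state):
    return state != ("cs", "cs")
```

Here the state ("cs", "cs") is reported both as a violation of the mutual exclusion invariant and as a deadlock.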

Specification • We use Linear Temporal Logic (LTL) [Pnueli 77] to define specification properties. • LTL formulas are interpreted over infinite sequences of states, and consist of: • Propositional variables, • Logical connectives, such as ¬, ∧, ∨, →, and • Temporal operators, such as: • ◊(p): p will eventually occur. • □(p): p always occurs. • A model M satisfies a formula φ (M ⊨ φ) if every (fair) run of M satisfies φ.

Converting specification to ω-automaton • Every LTL property can be converted into a Buchi automaton whose size is exponential in the size of the LTL formula [Vardi & Wolper 94]. • For deterministic Streett automata, a determinization process is also required [Safra 88]. • May result in a doubly exponential blowup relative to the LTL property.

The Model Checking Process [Vardi & Wolper 86] • Both model and specification are converted to ω-automata over the same alphabet. • The alphabet is 2^AP, where AP denotes a set of atomic propositions that may hold on the system states. • Every word accepted by M (a fair run) should be accepted by the spec, therefore we have to check whether: L(M) ⊆ L(φ).

Model Checking Results • [Figure: diagrams of the languages L(M) and L(φ) for the two cases.] • It’s easier to check whether: • L(M) ∩ L(φ)ᶜ = Ø, where ᶜ denotes complement, or • L(M) ∩ L(¬φ) = Ø. • Case 1: • Intersection is empty. • M satisfies φ. • Case 2: • Intersection is not empty. • Runs contained in the intersection can be used for generating counterexamples.
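For Buchi acceptance, the emptiness check amounts to looking for a reachable accepting state that lies on a cycle. The sketch below assumes the product of the model and the automaton for ¬φ has already been built and is given as a plain successor dictionary; the function names are illustrative, and the quadratic search is chosen for clarity rather than efficiency.

```python
def reachable(graph, sources):
    """All states reachable from the given sources (including the sources)."""
    seen, stack = set(), list(sources)
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(graph.get(s, []))
    return seen

def find_accepting_lasso(graph, initial, accepting):
    """Return an accepting state that is reachable and lies on a cycle, else None."""
    for q in reachable(graph, initial):
        # q is on a cycle exactly when it is reachable from its own successors.
        if q in accepting and q in reachable(graph, graph.get(q, [])):
            return q
    return None
```

A result of `None` corresponds to Case 1 (the intersection is empty, so M satisfies φ); any returned state witnesses Case 2, and a path to it plus a cycle through it forms a counterexample.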

Model Checking and GP • Can standard model checking results be used as a GP fitness function? • Yes, but so far this has been done with limited success [Johnson 07]. • A fitness function with just two values is a poor one. • We wish to analyze the model checking graph in order to quantify the level of satisfaction. • We have a specialized model checking algorithm.

Detour: Discriminative Model Checking [Niebert, Peled, Pnueli]

Linear Temporal Logic • Temporal operators: □ (always), ◊ (eventually), O (next), U (until).

Computation Tree Logic • EG p • AF p • [Figure: computation trees illustrating EG p and AF p.]

Our point of view • Linear time is sufficient for specifying most properties. • A counterexample is often not enough: • Gives very little clue about the location of the error. • Does not give information about how good and bad executions are related to each other. • Thus, for analysis beyond finding the existence of an error, we promote a “deeper” search.

Our suggestion • Primary or base specification φ in LTL, for the base property. • Analysis specification, which quantifies over executions that satisfy or do not satisfy the base specification. • Syntax: atomic propositions p, disjunction (\/), and further operators, including two existential path quantifiers (and others). • Semantics of the two existential quantifiers: • The first: there exists a continuation satisfying the base property φ, along which the quantified subformula holds from the beginning. • The second: there exists a continuation not satisfying the base property φ, along which the quantified subformula holds from the beginning.

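Semantics illustration • The first existential quantifier: there exists a continuation satisfying the base property φ, along which the quantified subformula holds from the beginning. • The second existential quantifier: there exists a continuation not satisfying the base property φ, along which the quantified subformula holds from the beginning. • [Figure: two computation trees, one per quantifier, marking the continuation along which the quantified subformula holds.]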

Examples for specifications • Bad executions depend on infinitely many “bad choices”: ¬<>true • Before executing a, there are good and bad executions. Once a is executed, things are persistently bad: ((¬Execa/\true)W(Execa/\false)) • Properties such as “from some point all continuations are good/bad”.

How to do model checking? • We need to remember some information about the path so far, in order to verify that together with the rest of the computation it does (or does not) satisfy φ. • Suppose we ran a Buchi automaton for φ; because of nondeterminism, it might be running on the wrong branch and be unable to complete the run. • Thus, we run a subset construction (determinization) of the Buchi automaton. • At the point of branching, we continue with a state consistent with one of the Buchi states in the current subset. • Apply CTL* model checking to this structure.

Complexity • EXPSPACE-complete even for AG true • Reduction shown for the related logic mCTL* [KV LICS 2006] (this logic has different semantics, where quantification always starts from the initial state). • But: EXPSPACE-complete in the size of the LTL formula, PSPACE-complete in the size of the branching formula and the verified code.

Fitness Levels • Each level is characterized by an analysis formula over the base property. • Level 0: All executions are bad! • Level 1: There are good and bad executions. • Level 2: Each bad execution can turn into a good one = there are infinitely many bad choices. • Level 3: All executions are good!
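Read as a decision procedure, the four levels can be sketched as below. The three Boolean inputs are assumed to come from the enhanced model checker; their names, and the idea of passing them as flags, are illustrative assumptions rather than the authors' interface.

```python
def fitness_level(has_good, has_bad, bad_can_recover):
    """Map analysis results for one property to a fitness level 0..3.

    has_good:        some execution satisfies the base property
    has_bad:         some execution violates the base property
    bad_can_recover: every bad execution can still turn into a good one
    """
    if not has_good:
        return 0          # all executions are bad
    if not has_bad:
        return 3          # all executions are good
    return 2 if bad_can_recover else 1
```

The sketch leaves open how a level is turned into a numeric score.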

Overall Fitness Function • Fitness levels & scores are calculated for each specification property. • How to merge them into a single fitness function? • Naïve summing can bias the results, since some properties may be trivially satisfied when more basic properties are violated. • Thus, spec. properties are divided into levels, starting from level 1 for the most basic properties. • As long as not all properties at level i are satisfied, properties at higher levels get a fitness of 0. • This algorithm also saves running time by skipping unneeded checks.
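The leveled merge can be sketched as follows. The property names, the 0 to 100 score range, and the final averaging are assumptions; the slides do not state how the per-property scores are combined.

```python
def overall_fitness(levels, score_of):
    """levels: lists of property names, most basic level first.
    score_of(prop) returns a score in [0, 100]; 100 means fully satisfied."""
    scores = {}
    blocked = False
    for props in levels:
        for p in props:
            # Once a lower level has failed, higher properties score 0
            # and their (possibly expensive) checks are skipped entirely.
            scores[p] = 0.0 if blocked else score_of(p)
        if not blocked and any(scores[p] < 100.0 for p in props):
            blocked = True
    return sum(scores.values()) / len(scores)   # assumption: plain average
```

With levels `[["mutex"], ["no_deadlock", "no_starvation"]]` and a mutual exclusion score of 50, the two higher-level properties are never checked and the result is 50/3.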

Parsimony • GP programs tend to grow over time to the maximal allowed tree size (“bloating”). • Large portions of the code become “introns” (junk DNA). • To avoid that, we use parsimony as a secondary fitness measure. • The number of program nodes multiplied by a small factor is subtracted from the fitness score. • The factor should be carefully chosen. • Should encourage programs to reduce their size, but • Should not harm the evolutionary process. • Therefore, programs cannot get a score of 100, but only get close to it. The run can be stopped when all properties are satisfied. • Programs can be reduced either by mutations, or directly by detecting dead code during the model checking process, and then removing it.
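The parsimony penalty is a one-line formula. The factor 0.01 below is an arbitrary illustrative value, not the one used by the authors.

```python
def fitness_with_parsimony(raw_score, node_count, factor=0.01):
    """Subtract a small size penalty so that, at equal raw score, smaller programs win."""
    return raw_score - factor * node_count
```

With this factor, a program of 23 nodes that satisfies every property scores about 99.77 rather than 100, which is why a run is stopped on "all properties satisfied" rather than on reaching 100.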

“Vacuity” • Special care is needed for implication properties of the form (p → q). • Some (or all) executions may be vacuously satisfied if p never happens. • We are usually interested only in runs in which p eventually occurs. • Other runs are neither good nor bad. They are irrelevant. • Thus, in these cases, the program automaton is first intersected with the property ◊p (p eventually occurs). • Some SCCs might be marked irrelevant. • If all SCCs are irrelevant, fitness level 0 is assigned. • A similar mechanism is used for excluding unfair runs.

The Mutual Exclusion Problem • Originally described by [Dijkstra 65]. • Many variants and solutions exist. • Modeled using the following program parts: • Non Critical Section • Pre Protocol • Critical Section • Post Protocol • We wish to automatically generate correct code for the pre and post protocol parts.

Spec. Properties • The specification includes the following LTL properties: • The properties are converted into Streett automata.

Runs Configuration • 3 different sets of runs: • The following parameters were used: • Population size: 150 • Max number of iterations: 2000 • μ: 5 • λ: 150

An Example of a Run (1st variant) • Randomly created. • Does not satisfy mutual exclusion property. • Higher level properties are set to 0. Score: 0.0

An Example of a Run (1st variant) • Randomly created. • While loop guarantees mutual exclusion. • Only process 0 can enter the critical section. Score: 66.77
