Integrated Logic Systems Part 1 WAM and Implementations of Prolog
Aim of the course • Implementation of Logic Based Systems • Implementation of Prolog • Warren’s abstract machine (known as the WAM) as the most widely used basis for implementing Prolog systems • Other Systems • PETISCO (and connection to databases and web) • InterProlog (and interface Java/Prolog) • Gnu-Prolog (and contexts and constraints)
Why? • Learn how to implement logic systems • Understand specific behavior of Prolog and relate to efficient execution of Prolog programs • Get acquainted with systems that integrate Prolog and other systems (databases, constraints, Java, web interfaces).
The WAM • David H. D. Warren’s Abstract Machine is the standard basis for implementing Prolog systems. • Hassan Aït-Kaci. Warren’s Abstract Machine – A Tutorial Reconstruction. MIT Press, 1991.
WAM’s Basic Picture • Prolog programs are compiled into WAM code. • Prolog queries are compiled into WAM code. • The code for queries (and for program clauses) is executed and calls the code for program clauses.
Structure of the course • WAM is presented starting from very simple programs: • L0: Program is simply one fact, and the query is atomic (learn unification) • L1: Program is a set of facts (only one per predicate), and the query is atomic • L2: Programs are sets of rules but only one per predicate – Prolog without backtracking (learn execution of clauses) • L3: Pure Prolog Programs (learn backtracking) • Optimizing the design • Efficiency issues
The Language L0 • A program is a term • A query is a term • Semantics: • The query q fails wrt program f iff q and f do not unify • Otherwise q succeeds, binding its variables • Example: the program p(f(X),h(Y,f(a)),Y). and the query ?- p(Z,h(Z,W),f(W)) give Z = f(f(a)), W = f(a)
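The L0 semantics can be checked on the example above with a small unifier. This is a minimal Python sketch on nested tuples, not on the WAM heap. The term encoding (capitalized strings as variables, tuples as structures, lowercase strings as constants) and the helper names `walk`, `unify` and `resolve` are assumptions made for illustration. As in Prolog, no occurs check is performed.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst):
    """Return an extended substitution, or None if a and b do not unify."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

def resolve(t, subst):
    """Apply the substitution exhaustively to a term."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(x, subst) for x in t[1:])
    return t

# Program p(f(X),h(Y,f(a)),Y) and query p(Z,h(Z,W),f(W)) from the slide.
fact = ("p", ("f", "X"), ("h", "Y", ("f", "a")), "Y")
query = ("p", "Z", ("h", "Z", "W"), ("f", "W"))
s = unify(query, fact, {})
print(resolve("Z", s), resolve("W", s))
```

The WAM reaches the same bindings, but by running compiled instructions over heap cells instead of walking term trees.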
The Heap in WAM0 • WAM0 contains an addressable Heap (an array of data cells) for storing terms • Terms are either: • Variables • Identified by a reference pointer to a single heap cell containing <REF,k> where k is the address • Structures of the form f(@1,…, @n) • Represented by n+2 cells • The 1st is a reference to the 2nd (stating it is a structure STR) • The 2nd contains the functor and arity • Each of the others contains one argument
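Heap representation of terms • Unbound variable at address i: cell i contains <REF,i> • Bound variable at address i: cell i contains <REF,j>, with i ≠ j • Term f(t1,…,tn) at address i: cell i contains <STR,i+1>, cell i+1 contains f/n, and cells i+2, …, i+n+1 contain the arguments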
Compiling an L0 query • Preparing one side of the (unification) equation to be solved • A query is translated into a sequence of instructions to build its heap representation • For storing intermediate values, WAM0 has registers X1, X2,… • Register X1 is assigned to the outermost term • A term is viewed as a set of flattened equations: Xi = f(Xi1,…, Xin)
Compiling an L0 query (cont) • Equations are then sorted so that every register on a right-hand side is defined before it is used • Equations that simply assign a register to a variable are omitted • Example: equations for p(Z,h(Z,W),f(W)), with their position after sorting:
X1 = p(X2,X3,X4) (3rd)
X2 = Z
X3 = h(X2,X5) (1st)
X4 = f(X5) (2nd)
X5 = W
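The flattening and sorting steps can be sketched in Python as follows. The encoding is an assumption made for illustration: variables are capitalized strings, structures are tuples (functor, arg1, …, argn), and a constant would be a 1-tuple such as ("a",). The function names `flatten_query` and `ordered` are hypothetical. Registers are numbered breadth-first, which reproduces X1 to X5 of the example.

```python
import itertools

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def flatten_query(term):
    """Return (variable -> register, register -> (functor, argument registers))."""
    counter = itertools.count(1)
    var_reg = {}
    equations = {}
    pending = []                      # structures waiting to be flattened

    def reg_for(t):
        if is_var(t):                 # a variable always maps to one register
            if t not in var_reg:
                var_reg[t] = next(counter)
            return var_reg[t]
        r = next(counter)             # each structure occurrence gets a fresh register
        pending.append((r, t))
        return r

    reg_for(term)
    while pending:
        r, t = pending.pop(0)
        equations[r] = (t[0], [reg_for(a) for a in t[1:]])
    return var_reg, equations

def ordered(equations, root=1):
    """Order structure registers so that arguments are built before their parent."""
    out = []

    def visit(r):
        if r in equations:            # variable registers have no equation
            for a in equations[r][1]:
                visit(a)
            out.append(r)

    visit(root)
    return out

var_reg, equations = flatten_query(("p", "Z", ("h", "Z", "W"), ("f", "W")))
print(var_reg, equations, ordered(equations))
```

The resulting order X3, X4, X1 matches the (1st), (2nd), (3rd) annotations of the example.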
Compiling an L0 query (cont) • Equation Xi = f(Xi1,…, Xin) is translated into • put_structure f/n, Xi • set_variable Xi1 or set_value Xi1 • … • set_variable Xin or set_value Xin • Compilation of p(Z,h(Z,W),f(W))
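A minimal Python sketch of the three query instructions, run on the code one obtains for p(Z,h(Z,W),f(W)) from the sorted equations. It assumes that set_variable is used for the first occurrence of a register and set_value for later ones, that the heap is a Python list with 0-based addresses, and that the H register is simply the current list length. The cell encoding ("REF", k), ("STR", k), ("FUN", name, arity) is an illustrative choice.

```python
heap = []   # the Heap: a list of cells
X = {}      # registers X1, X2, ... (register number -> cell)

def put_structure(f, n, i):
    h = len(heap)                  # H register
    heap.append(("STR", h + 1))    # 1st cell: reference to the functor cell
    heap.append(("FUN", f, n))     # 2nd cell: functor and arity
    X[i] = heap[h]

def set_variable(i):
    h = len(heap)
    heap.append(("REF", h))        # new unbound variable: points to itself
    X[i] = heap[h]

def set_value(i):
    heap.append(X[i])              # copy an already built register

# Code for the query p(Z,h(Z,W),f(W)), following the order X3, X4, X1.
put_structure("h", 2, 3)
set_variable(2)                    # Z
set_variable(5)                    # W
put_structure("f", 1, 4)
set_value(5)
put_structure("p", 3, 1)
set_value(2)
set_value(3)
set_value(4)

for addr, cell in enumerate(heap):
    print(addr, cell)
```

Each structure occupies n+2 cells in total only when its STR cell is counted together with the functor cell and the n argument cells that directly follow it.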
Compiling an L0 program • Executing a fact given a query amounts to checking whether the term in the heap unifies with the fact • Proceed by checking whether the fact matches the heap – read mode • get_structure rather than put_structure • unify_variable rather than set_variable • unify_value rather than set_value • If an unbound variable (REF) is found in the heap, proceed by writing the unifiable term of the fact – write mode, similar to the processing of the query
Compiling an L0 program (cont) • The fact is transformed into a sequence of flat equations on registers, where the order is the reverse of that for the query. • Example: p(f(X),h(Y,f(a)),Y).
unify instructions • unify_variable Xi • In write mode, behaves as set_variable Xi • In read mode, just sets Xi to the next subterm to be matched (whose heap address is stored in global register S). • unify_value Xi • In write mode, behaves as set_value Xi • In read mode, checks whether the term referenced by Xi unifies with that referenced by S • A function for dereferencing chains of variable references is needed.
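Dereferencing can be sketched in Python as below, assuming the heap is a list of cells ("REF", k), ("STR", k) or ("FUN", name, arity) and that an unbound variable is a REF cell pointing to its own address. The name `deref` and the example store are illustrative.

```python
def deref(store, addr):
    """Follow a chain of bound REF cells; return the address where it ends."""
    tag, value = store[addr][0], store[addr][1]
    while tag == "REF" and value != addr:   # bound variable: keep following
        addr = value
        tag, value = store[addr][0], store[addr][1]
    return addr

store = [
    ("REF", 1),        # 0: bound to cell 1
    ("REF", 2),        # 1: bound to cell 2
    ("REF", 2),        # 2: unbound variable
    ("STR", 4),        # 3: structure a/0
    ("FUN", "a", 0),   # 4
]
print(deref(store, 0), deref(store, 3))
```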
get_structure f/n, Xi • If Xi references an unbound variable, then switch to write mode so as to store in the heap the new term (in the fact) • If Xi references a structure for the same functor then, in read mode, start matching subterms (setting S to the first one) • Otherwise, fail unification.
get_structure f/n, Xi • For now, consider that bind(addr1,addr2) stores <REF,addr2> in the cell at address addr1
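The three cases of get_structure can be sketched as a small Python class. This is an illustration under assumptions, not the slide's own pseudo-code: registers hold copies of heap cells, the heap is a 0-based list, H is the list length, and the class name `WAM0` and the attribute `fail` are hypothetical.

```python
class WAM0:
    def __init__(self):
        self.heap, self.X = [], {}
        self.S, self.mode, self.fail = 0, "read", False

    def deref(self, cell):
        """Follow bound REF cells starting from a cell value."""
        while cell[0] == "REF" and self.heap[cell[1]] != cell:
            cell = self.heap[cell[1]]
        return cell

    def bind(self, addr1, addr2):
        self.heap[addr1] = ("REF", addr2)     # simplified bind, as on the slide

    def get_structure(self, f, n, i):
        cell = self.deref(self.X[i])
        if cell[0] == "REF":                  # unbound variable: build the term
            h = len(self.heap)
            self.heap.append(("STR", h + 1))
            self.heap.append(("FUN", f, n))
            self.bind(cell[1], h)
            self.mode = "write"
        elif cell[0] == "STR" and self.heap[cell[1]] == ("FUN", f, n):
            self.S = cell[1] + 1              # first subterm to be matched
            self.mode = "read"
        else:
            self.fail = True                  # different functor or arity

# Case 1: X1 is an unbound variable, so f/1 is written to the heap.
m = WAM0()
m.heap = [("REF", 0)]
m.X[1] = ("REF", 0)
m.get_structure("f", 1, 1)
print(m.mode, m.heap)
```

In write mode the following unify_variable and unify_value instructions append the arguments right after the functor cell, which is why no S register update is needed in that branch.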
unify instructions • The unify(addr1,addr2) operation remains to be explained. It uses a unification stack, the Push-Down List (PDL).
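A minimal Python sketch of unify(addr1,addr2) with an explicit PDL, in the style of the Aït-Kaci tutorial. It assumes a list-based heap with cells ("REF", k), ("STR", k) and ("FUN", name, arity); here `bind` binds whichever of the two cells is an unbound variable, which is a slight refinement of the simplified bind described earlier.

```python
def deref(heap, addr):
    while heap[addr][0] == "REF" and heap[addr][1] != addr:
        addr = heap[addr][1]
    return addr

def bind(heap, a1, a2):
    if heap[a1][0] == "REF":
        heap[a1] = ("REF", a2)
    else:
        heap[a2] = ("REF", a1)

def unify(heap, a1, a2):
    pdl = [a1, a2]                            # Push-Down List of address pairs
    while pdl:
        d1 = deref(heap, pdl.pop())
        d2 = deref(heap, pdl.pop())
        if d1 == d2:
            continue
        (t1, v1), (t2, v2) = heap[d1][:2], heap[d2][:2]
        if t1 == "REF" or t2 == "REF":
            bind(heap, d1, d2)
        else:
            f1, f2 = heap[v1], heap[v2]
            if f1 != f2:                      # functor or arity mismatch
                return False
            for i in range(1, f1[2] + 1):     # push the argument pairs
                pdl += [v1 + i, v2 + i]
    return True

heap = [
    ("FUN", "a", 0),   # 0: constant a
    ("FUN", "b", 0),   # 1: constant b
    ("STR", 3),        # 2: f(X, a)
    ("FUN", "f", 2),   # 3
    ("REF", 4),        # 4: X, unbound
    ("STR", 0),        # 5: a
    ("STR", 7),        # 6: f(b, Y)
    ("FUN", "f", 2),   # 7
    ("STR", 1),        # 8: b
    ("REF", 9),        # 9: Y, unbound
]
ok = unify(heap, 2, 6)
print(ok, heap[4], heap[9])
```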
Summary structure of WAM0 • X1, …, Xn registers • Special registers S and H • Heap (for storing terms) • PDL for use in unification • Instruction set: • put_structure f/n, Xi • set_variable Xi • set_value Xi • get_structure f/n, Xi • unify_variable Xi • unify_value Xi
The Language L1 • A query is an atom (term) • The program is now a set of atoms • Semantics: • The query q fails wrt program P iff there exists no atom f in P such that q and f unify • Otherwise q succeeds, binding its variables • Extends L0 in that more than one (compiled) fact, for more than one predicate, is stored in the code area • Code for fact p/n is stored at an address labeled @(p/n). • By calling the code at the appropriate address, there is no need to represent the p/n functor in the Heap
Calling code in L1 • There must be a register P with a pointer to the next instruction to be executed • Every instruction, unless otherwise stated, increments P • There exists an unconditional jump to call the code for a predicate: • call p/n simply defined by P ← @(p/n) • There exists a control instruction proceed to denote the end of the code for p/n; for now it is simply a no-op code terminator
Root arguments and registers • The first n registers will be used to store the n arguments of the predicate • To differentiate, they will be called argument registers A1, …, An. • Other registers are used as before • The predicate functor is not stored • Example: equations for p(Z,h(Z,W),f(W)):
A1 = Z
A2 = h(A1,X4)
A3 = f(X4)
X4 = W
Handling of root variables • Now variables must be loaded into argument registers for queries, and extracted from argument registers for facts • New instructions: • put_variable Xn, Ai • (for handling the first occurrence of a root variable in a query) • put_value Xn, Ai • (for subsequent occurrences of a root variable in a query) • get_variable Xn, Ai • (for handling the first occurrence of a root variable in a fact) • get_value Xn, Ai • (for subsequent occurrences of a root variable in a fact)
Summary structure of WAM1 • X1, …, Xn and A1, …, An registers • Special registers P, S and H • Heap • PDL • Code area • Instruction set: • put_structure f/n, Xi • set_variable Xi • set_value Xi • get_structure f/n, Xi • unify_variable Xi • unify_value Xi • call p/n • proceed • put_variable Xn,Ai • put_value Xn,Ai • get_variable Xn,Ai • get_value Xn,Ai
The Language L2 • Besides facts, programs may now have rules • At most one rule for each predicate: • No backtracking yet!! • Formally, an L2 program is a set of rules: a0 :- a1, …, an (n ≥ 0, and the ai are atoms) • Rules with one body goal are called chain rules, and with more, deep rules • L2 queries are sequences of goals: ?- a1, …, ak (k ≥ 0, and the ai are atoms) • Called the empty query, if k = 0.
Execution of L2 programs • Executing ?- a1, …, ak amounts to repeated application of leftmost resolution, until the empty query or failure is reached. • Start by unifying a1 with the head of the (only) rule for the predicate • If none exists, then fail • Otherwise, replace a1 by the body of the rule, after unification • Proceed until empty query (or failure) • The result of a successful query is the (dereferenced) bindings of the original variables in the query. • We can start to view it as concatenation of compiled WAM1 code for each goal in the query • Care must be taken wrt: • Continuation of execution of a goal sequence • Avoiding conflicting use of argument registers
Compiling facts in L2 • Very similar to compiling an L1 fact, but… • In the end, proceed can no longer stop execution • It has to return to the calling procedure (query) • WAM2 has an extra global register CP, holding the continuation point to return to
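The interplay of P and CP can be sketched with a toy interpreter loop in Python. It assumes that call saves the address of the following instruction in CP before jumping, and that proceed jumps back to CP. The `halt` instruction, the `labels` table and the `run` function are hypothetical devices used only to make the example terminate.

```python
code = [
    ("call", "q/0"),   # 0: query  ?- q.
    ("halt",),         # 1: hypothetical end-of-query marker
    ("proceed",),      # 2: code for the fact  q.
]
labels = {"q/0": 2}    # @(q/0)

def run(code, labels):
    P, CP, trace = 0, None, []
    while True:
        instr = code[P]
        trace.append(P)
        if instr[0] == "call":
            CP = P + 1             # remember where to continue
            P = labels[instr[1]]   # P <- @(p/n)
        elif instr[0] == "proceed":
            P = CP                 # return to the caller
        elif instr[0] == "halt":
            return trace
        else:
            P += 1                 # default: next instruction

print(run(code, labels))
```

A single CP register is enough for facts; nested calls in rule bodies are what make it necessary to save CP somewhere else.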
Compiling rules in L2 • General structure is similar to a combination of compiling facts and queries into WAM1: • get arguments of the head (as in a WAM1 fact) • put arguments of the body atoms (as in a sequence of WAM1 queries) • Roughly, a0 :- a1, …, an translates into: • Get arguments of a0 (from A registers) • Put arguments of a1 (in A registers) • Call a1 • … • Put arguments of an (in A registers) • Call an
Permanent variables • Variables that occur in more than one goal of a rule (permanent variables) cannot be kept in argument registers only • They would be overwritten! • In a rule a0 :- a1, …, an a variable Y is permanent iff • Y occurs in more than one of the sets: Vars(a0) U Vars(a1), Vars(a2), … Vars(an) • Other variables are called temporary • There is no need to store these elsewhere!
Permanent variables example • Consider rule: p(X,Y) :- q(X,Z), r(Z,Y), s(Z,W) • Permanent variables are: • Z: Need to store it to pass from q to r and then to s • Y: Need to store it from the call to pass to r • Temporary variables are: • X: No need to store, since it is only used immediately after the call • W: No need to store, since it is never used elsewhere
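The definition of permanent variables translates directly into a short Python sketch. Atoms are encoded as tuples (predicate, arg1, …, argn) with capitalized strings as variables; the function names are illustrative assumptions.

```python
def variables(atom):
    """Variables of an atom or nested term given as (functor, arg1, ..., argn)."""
    out = set()
    for a in atom[1:]:
        if isinstance(a, tuple):
            out |= variables(a)
        elif a[:1].isupper():
            out.add(a)
    return out

def permanent_variables(head, body):
    # The head is grouped with the first body goal: Vars(a0) U Vars(a1).
    groups = [variables(head) | (variables(body[0]) if body else set())]
    groups += [variables(goal) for goal in body[1:]]
    seen, permanent = set(), set()
    for group in groups:
        permanent |= group & seen   # already seen in an earlier group
        seen |= group
    return permanent

# p(X,Y) :- q(X,Z), r(Z,Y), s(Z,W).
head = ("p", "X", "Y")
body = [("q", "X", "Z"), ("r", "Z", "Y"), ("s", "Z", "W")]
print(sorted(permanent_variables(head, body)))
```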
WAM2 Stack • Permanent variables must be stored (environment). • get_variable/put_variable instructions must get/put these variables in the environment • Since the same predicate can be called more than once • permanent variables must be stored in a Stack of environments. • Continuation point must also be stored in environment • Global register E stores top of Stack • allocate/deallocate instructions for building environment before executing rule, and freeing it after execution
Environment frame • For each rule with n permanent variables, an environment frame is pushed onto the Stack; it holds the previous environment, the continuation point and the n permanent variables.
Allocate/deallocate environments • allocate fills in a new environment frame; deallocate frees it.
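A minimal Python sketch of allocate and deallocate. The frame layout is an assumption borrowed from the Aït-Kaci tutorial: the previous environment at E, the continuation point at E+1, the number n of permanent variables at E+2, followed by Y1 to Yn. It also assumes that deallocate restores CP, so that a following proceed returns to the caller. The class name `Machine` and the sparse dictionary used as Stack are illustrative.

```python
class Machine:
    def __init__(self):
        self.stack = {0: None, 1: None, 2: 0}   # initial frame with no variables
        self.E = 0                              # current environment
        self.CP = None                          # continuation point

    def allocate(self, n):
        new_e = self.E + self.stack[self.E + 2] + 3   # just above the current frame
        self.stack[new_e] = self.E                    # previous environment
        self.stack[new_e + 1] = self.CP               # continuation point
        self.stack[new_e + 2] = n                     # number of permanent variables
        self.E = new_e

    def permanent(self, i):
        """Stack address of permanent variable Yi in the current frame."""
        return self.E + 2 + i

    def deallocate(self):
        self.CP = self.stack[self.E + 1]
        self.E = self.stack[self.E]

m = Machine()
m.CP = 10
m.allocate(2)          # rule with permanent variables Y1, Y2
m.CP = 20              # a call inside the body overwrites CP
m.allocate(1)          # nested rule with one permanent variable
inner = m.E
m.deallocate()
print(inner, m.E, m.CP)
```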
Compiling an L2 query • Just as compiling a rule, but without the part of the head • It must allocate an environment, with an initial top of stack, an initial continuation point, and the n permanent variables of the query (in case of a conjunction).
Example of compilation into WAM2 • Compilation of p(X,Y) :- q(X,Z), r(Z,Y).
Summary structure of WAM2 • X1, …, Xn and A1, …, An registers • Special registers E, CP, P, S, and H • Code area • Heap • Stack • PDL • Instruction set, as for WAM1 except: • call p/n redefined • proceed redefined • New instruction allocate N • New instruction deallocate
The Language L3 • L3 simply is Pure Prolog • In other words, it is as L2 but now with possibly more than one rule per predicate • It has to deal with backtracking! • Failure no longer means aborting the whole process • Alternatives must be considered • Uses top-down, leftmost resolution • Chronological backtracking: upon failure, the latest choice is reexamined first.
Choice points • If a predicate has alternative rules, the state of computation must be saved • Only in this way can we guarantee that it is possible to restore it if backtracking is needed • Such a state is called a choice point • First idea: store them in a separate Stack • Environments: AND-Stack • Choice points: OR-Stack
Environment Protection
a :- b(X), c(X).
b(X) :- e(X).
c(1).
e(X) :- f(X).
e(X) :- g(X).
f(2).
g(1).
[Figure: AND-Stack holding the environments for a, b, c, e and f; OR-Stack holding the choice point for e]
• Oops! Where is now the state of the arguments before the call of e/1? • The choice point must protect from deallocation all environments whose creation precedes that of the existing choice point.
Environment Protection • Use only one stack (both for environments and choice points). • E register points to the top of the stack. • New B register points to the last choice point.
a :- b(X), c(X).
b(X) :- e(X).
c(1).
e(X) :- f(X).
e(X) :- g(X).
f(2).
g(1).
[Figure: the single Stack at successive execution steps, holding the environments for a, b, e, f and c together with the choice point for e; E and B mark the registers]
What to store in a choice point? • Information for unbinding variables. • I.e. making variables free again. • Which variables? • Put this information in another stack, and put in the choice point the pointer to the top of that stack at the time the choice point is created. • New (and last!) data area: Trail • New register TR: Trail pointer • New register HB to store the value of H at the last choice point: • Only bindings of heap variables older than the choice point, i.e. at addresses less than HB, need to be stored in the Trail.
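Conditional trailing and its use on backtracking can be sketched in Python as follows. The sketch covers heap variables only and assumes a list-based heap where an unbound variable is a REF cell pointing to itself; the names `bind`, `unwind_trail` and `saved_tr` are illustrative.

```python
def bind(heap, trail, hb, addr, target):
    """Bind the variable at addr, trailing it only if it predates the choice point."""
    if addr < hb:
        trail.append(addr)
    heap[addr] = ("REF", target)

def unwind_trail(heap, trail, saved_tr):
    """Reset every variable trailed since the choice point was created."""
    while len(trail) > saved_tr:
        addr = trail.pop()
        heap[addr] = ("REF", addr)      # unbound again

heap = [("REF", 0), ("REF", 1)]         # two variables older than the choice point
trail = []
hb, saved_tr = len(heap), len(trail)    # choice point created here: HB = H, TR saved

heap.append(("REF", 2))                 # variable created after the choice point
bind(heap, trail, hb, 0, 1)             # old variable: recorded in the Trail
bind(heap, trail, hb, 2, 1)             # new variable: not recorded
trailed = list(trail)

# Backtracking: undo trailed bindings, then discard the newer heap cells.
unwind_trail(heap, trail, saved_tr)
del heap[hb:]
print(trailed, heap)
```

Variables created after the choice point need no trail entry because their cells are discarded anyway when H is reset to HB.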
Choice point frame • Each choice point pushes onto the Stack: • The argument registers: needed because put instructions below may overwrite them • The current environment: for protecting it (and then freeing it) • The continuation point: where to continue upon proceed • The previous choice point (B): what to fall back on in case all rules fail • The code pointer of the next rule of this predicate • The trail pointer (TR): when backtracking, remove bindings at the trail, up to here • The heap pointer (H): needed for freeing Heap space after failure
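New allocate/deallocate instructions • They must also take care of choice points: a new environment is allocated above both the current environment and the last choice point, so that protected environments are not overwritten.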
Compiling rules in L3 • Extends that of WAM2, now with instructions for dealing with choice points • Roughly, a set of rules is translated into: • In the first rule (if more exist) create choice point • Allocate environment (if it is a rule, i.e. not a fact) • Compilation of 1st rule as before • End with deallocate (if needed) and proceed • In the second rule (if more exist) reset information from first rule • Compilation of 2nd rule as before • … • In the last rule reset information from previous and reset B to previous (remove choice point) • Compilation of last rule as before
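The layout described above can be sketched as a Python function that chains the compiled clauses of one predicate. The instruction names try_me_else, retry_me_else and trust_me are those used in the Aït-Kaci tutorial and are an assumption here, since the slide does not name them. The `label` pseudo-instruction, the label naming scheme and the opaque ("clause", k) placeholders are hypothetical.

```python
def link_clauses(name, clauses):
    """Prefix each clause's code with the choice point instruction it needs."""
    if len(clauses) == 1:
        return list(clauses[0])                # single clause: no choice point
    code = []
    last = len(clauses) - 1
    for k, body in enumerate(clauses):
        next_label = f"{name}_{k + 2}"
        if k == 0:
            code.append(("try_me_else", next_label))    # create choice point
        else:
            code.append(("label", f"{name}_{k + 1}"))
            if k < last:
                code.append(("retry_me_else", next_label))  # reset, keep choice point
            else:
                code.append(("trust_me",))              # reset, remove choice point
        code.extend(body)
    return code

# e(X) :- f(X).   e(X) :- g(X).   (clause bodies left as opaque placeholders)
e_code = link_clauses("e/1", [[("clause", 1)], [("clause", 2)]])
for instr in e_code:
    print(instr)
```

Each placeholder stands for the WAM2-style code of the clause, including its own allocate, deallocate and proceed where needed.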