Environments for Tabled Prolog: Progress and Open Issues
Terrance Swift
CENTRIA, Universidade Nova de Lisboa

Motivation
* Tabling has proven its usefulness over the past decade
  -- cf. "How tabling solves real problems", www.cs.sunysb.edu/~tswift/talks.html
* Tabling is supported in XSB, YAP, B Prolog; packages have been developed or are being developed for Mercury, ALS, Ciao
* User environment issues have received some attention, but there are many open issues in analysis and language design
* Unlike tabling implementation, environment issues do not depend on engine-level coding, and so results may be applicable to several systems and may require less specialized knowledge to implement
* Talk is high-level, but a general knowledge of tabling is assumed

XSB’s tabling methodology as discussed in this talk was developed and implemented by (in alphabetical order) Luis de Castro, Baoqiu Cui, Steve Dawson, Ernie Johnson, Juliana Freire, Michael Kifer, Rui F. Marques, C.R. Ramakrishnan, I.V. Ramakrishnan, Prasad Rao, Konstantinos Sagonas, Diptikalyan Saha, Terrance Swift, David S. Warren
What environmental issues are important? It depends on what a future tabling system might look like

[Figure: diagram relating logic programming, non-monotonic reasoning, and deductive databases]
* Most work in tabling has been from the LP perspective (including environment support)
* Some work has been done to support NMR with residual programs and XASP
* Less has been done to support deductive database features

Two Premises
1) Deductive databases are not dead (they’re just asleep)
* Logic programming can be a basis for (mostly) in-memory DDBs using
  -- Tabling, which can help with querying, given analysis support for optimization
  -- Multi-threading as in SWI, Ciao, YAP and XSB
  -- Adaptive indexing as in YAP [CSL07] (or perhaps even good analysis-based indexing with XSB)
* For deductive database applications, queries may rely less on user programming and more on compiler analysis and transformations

Two Premises
2) The well-founded semantics is important for knowledge representation
* Some cognitive scientists model human cognition as a logic program under a 3-valued semantics [SvL08]
* Other advantages of WFS
  -- Relevancy
  -- Cumulativity
  -- Polynomial complexity (in practice usually linear)
  -- ... and it can be used as a step toward a (partial) stable model
* But like stable models, WFS may be difficult for a programmer to understand

10 Topics in Environments for Tabled Logic Programs
What topics might an ideal compiler address to support all the functionality we’d like?
#1 Managing table space
#2 Updating tables when dynamic code changes
#3 Automatically deciding which predicates to table
#4 Deciding what kind of tabling to use
#5 Optimizing transformations
#6 Choosing a scheduling strategy
#7 Debugging
#8 Support for transformation-based semantic extensions
#9 Better ASP Integration
#10 Making answer subsumption usable by real programmers

#1 Managing table space
* Tables can consume a lot of space, and so must be reclaimed
  -- do we want tables to amortize time for multiple queries?
  -- do we want to reclaim tables after each top-level query?
  -- do we want to reclaim tables within a query?
* Reclamation can be for
  -- All tables (XSB, YAP, B Prolog)
  -- All tables for a given predicate (XSB, YAP, B Prolog)
  -- All tables for predicates in a given module (XSB)
  -- All thread-private/thread-shared tables (XSB)
  -- A single table for a given call (XSB)
* Reclamation can be automatic, based on an LRU algorithm [Roc07] (YAP)
* Safe reclamation requires garbage collection for tables.
  -- XSB has full garbage collection for thread-private tables
  -- Space for thread-shared tables can only be reclaimed when there is a single active thread

#1 Managing table space
* Space reclamation becomes more complicated when supporting well-founded reducts
* XSB maintains the well-founded reduct in its tables, where it is called the residual program
  -- e.g. the table for p(X) may depend on the table for b(X) through the answer b(2), which is undefined under WFS
* Abolishing b(2) but not p(X) changes the residual program, with semantic implications when integrating with ASP solvers, not to mention core dumps.
* XSB has a flag indicating whether table abolishes are to be cascaded
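
What a cascaded abolish has to compute can be sketched as a reachability problem over inter-table dependencies. The following Python model is only an illustration, not XSB code: the `depends_on` map and the function name `tables_to_abolish` are assumptions made for the example, and tables are named by plain strings.

```python
from collections import defaultdict, deque

def tables_to_abolish(depends_on, target):
    """Return target plus every table that (transitively) depends on it."""
    # Invert the edges: dependents[b] = tables with answers conditional on b.
    dependents = defaultdict(set)
    for table, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(table)
    doomed, queue = {target}, deque([target])
    while queue:
        current = queue.popleft()
        for table in dependents[current]:
            if table not in doomed:
                doomed.add(table)
                queue.append(table)
    return doomed

# Hypothetical dependencies: p(X) has an answer conditional on b(2),
# q(X) depends on p(X), and r(X) is unrelated.
depends_on = {"p(X)": {"b(X)"}, "q(X)": {"p(X)"}, "r(X)": set()}
print(sorted(tables_to_abolish(depends_on, "b(X)")))
```

Abolishing only the first element of this set is the unsafe case described above; abolishing the whole set keeps the residual program consistent.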

#1 Managing table space
* These approaches are already sophisticated, but how should space be reclaimed?
  -- Ignore for a minute reclaiming tables that depend on dynamic clauses that have changed
* A combined approach that
  -- Automatically reclaims
  -- Reclaims safely with no possibility of backtracking into a reclaimed table
  -- Takes into account inter-table dependencies for WFS
* Allow users (or a compiler) to specify the level of persistence of a table
  -- Never reclaimable
  -- Automatically reclaimable at system threshold
  -- Reclaimable at end of query
  -- Reclaimable after last use within a query (determined by analysis)
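
The persistence levels listed above could be modelled roughly as follows. This is a hypothetical Python sketch, not an existing interface of XSB or YAP: `Persistence`, `TableStore`, and `max_tables` are names invented for the example. It covers the first three levels only; "after last use within a query" needs program analysis, and neither safe reclamation nor WFS dependencies are modelled.

```python
from collections import OrderedDict
from enum import Enum

class Persistence(Enum):
    NEVER = "never reclaimable"
    THRESHOLD = "automatically reclaimable at system threshold"
    END_OF_QUERY = "reclaimable at end of query"

class TableStore:
    def __init__(self, max_tables):
        self.max_tables = max_tables
        self.tables = OrderedDict()  # call -> (persistence, answers), least recently used first

    def use(self, call, persistence=Persistence.THRESHOLD, answers=()):
        """Look up or create the table for a call and mark it most recently used."""
        if call in self.tables:
            self.tables.move_to_end(call)
        else:
            self.tables[call] = (persistence, list(answers))
            self._reclaim_at_threshold(keep=call)
        return self.tables[call][1]

    def _reclaim_at_threshold(self, keep):
        # Evict least recently used THRESHOLD tables until under the limit.
        for call in list(self.tables):
            if len(self.tables) <= self.max_tables:
                break
            if call != keep and self.tables[call][0] is Persistence.THRESHOLD:
                del self.tables[call]

    def end_of_query(self):
        for call in list(self.tables):
            if self.tables[call][0] is Persistence.END_OF_QUERY:
                del self.tables[call]

store = TableStore(max_tables=2)
store.use("path(a,Y)", Persistence.NEVER)
store.use("path(b,Y)")                          # THRESHOLD by default
store.use("tmp(X)", Persistence.END_OF_QUERY)   # exceeds the limit, evicts path(b,Y)
after_threshold = list(store.tables)
store.end_of_query()
print(after_threshold, list(store.tables))
```

In this sketch a table marked NEVER can push the store over its threshold; whether that should be allowed is one of the design questions the slide leaves open.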

#2 Updating tables when dynamic code changes
* If a table depends on dynamic clauses, the table should be updated when those clauses are altered
* In XSB this works for definite programs under call variance [Sah06], but the logical view of updates is not supported.
* Multiple declarations are required
  -- a tabled predicate must be declared as incremental, as must the relevant dynamic predicates.
  -- special assert/retract predicates must be used. incr_assert(p(a)) asserts p(a) and updates the incremental tables that depend on p/1. Similarly, incr_assert_inval(p(a)) asserts p(a) but only invalidates the tables that depend on p/1
* Analysis work is needed to determine whether a predicate should be tabled as incremental (easy), and whether the table should be incrementally updated or invalidated (hard).

Global Analysis for Tabling
* The most common way to table is to choose predicates to table by hand
* This may be difficult to do in certain cases (e.g. generated code)
* As tabling systems mature, different tables may have different attributes, increasing the complexity
  -- Call subsumption vs. call variance (XSB)
  -- Incremental vs. non-incremental (XSB)
  -- Thread-private vs. thread-shared (XSB)
  -- Tabling with scheduling strategies (YAP)

Example: A Fragment of the OWL Wine Ontology
<owl:Class rdf:ID="PinotBlanc">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasColor" />
      <owl:hasValue rdf:resource="#White" />
    </owl:Restriction>
  </rdfs:subClassOf>
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#Wine" />
    <owl:Restriction>
      <owl:onProperty rdf:resource="#madeFromGrape" />
      <owl:hasValue rdf:resource="#PinotBlancGrape" />
    </owl:Restriction>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#madeFromGrape" />
      <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxCardinality>
    </owl:Restriction>
  </owl:intersectionOf>
</owl:Class>

The ontology is translated by KAON2 to a definite program with about 1000 clauses
    pinotblanc(X) :- q24(X).
    pinotblanc(X) :- pinotblanc(Y), kaon2equal(X, Y).
    pinotblanc(X) :- wine(X), madefromgrape(X, Y), ot____nom21(Y).

    madefromgrape(Y, X) :- madeintowine(X, Y).
    madefromgrape(X, X) :- riesling(X), kaon2namedobjects(X).
    madefromgrape(X, X) :- wine(X), kaon2namedobjects(X).
    % 18 others

    wine(X) :- q14(X).
    wine(X) :- texaswine(X).
    % 24 others
    wine(X) :- q24(X).
    % 31 others

    q24(X) :- pinotblanc(X).
    q24(X) :- muscadet(X).
    q24(X) :- q24(Y), kaon2equal(X, Y).

Global Analysis for Tabling
* Regardless of whether this program can be optimized, it is highly recursive
  -- pinotblanc(yellowTail) depends on pinotblanc(X), which depends on wine(X), and on wine(yellowTail)
* Nearly every concept depends on nearly every other concept (more or less)
* Each predicate is called with multiple instantiations

#3 Automatically deciding which predicates to table
* The autotable declaration can be used to automatically select predicates to table (XSB). This declaration has proven useful for certain problems
  -- All loops in the predicate dependency graph are broken
  -- No effort is made to determine a minimal set of predicates to table (a previous version did, but it was too expensive)
  -- No effort is made to distinguish structural recursion (e.g. append/3) from datalog recursion
* Analysis routines are needed to distinguish structural from datalog recursion, and then derive a good approximation of a minimal set of predicates to table
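
One simple way to break every loop in a predicate dependency graph is to table each predicate that is the target of a back edge in a depth-first traversal. The Python sketch below illustrates that idea only; it is not claimed to be the algorithm behind XSB's autotable declaration. The `calls` map is a hand-written approximation of the ontology fragment shown earlier, and the function name is an assumption.

```python
def predicates_to_table(calls):
    """Pick predicates so that every cycle in the call graph contains a tabled one."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the current DFS path, finished
    colour = {pred: WHITE for pred in calls}
    tabled = set()

    def visit(pred):
        colour[pred] = GREY
        for callee in calls.get(pred, ()):
            state = colour.get(callee, WHITE)
            if state == GREY:
                tabled.add(callee)    # back edge: callee closes a loop
            elif state == WHITE:
                visit(callee)
        colour[pred] = BLACK

    for pred in list(calls):
        if colour[pred] == WHITE:
            visit(pred)
    return tabled

# Approximate call graph for part of the KAON2 output shown earlier.
calls = {
    "pinotblanc": ["q24", "pinotblanc", "kaon2equal", "wine", "madefromgrape"],
    "q24": ["pinotblanc", "muscadet", "q24", "kaon2equal"],
    "wine": ["q14", "texaswine", "q24"],
    "madefromgrape": ["madeintowine", "riesling", "kaon2namedobjects", "wine"],
}
print(sorted(predicates_to_table(calls)))
```

Like the declaration described above, this sketch makes no attempt at minimality and would also table a structurally recursive predicate such as append/3, since its self-loop is a cycle too. The result depends on traversal order.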

#4 Deciding what kind of tabling to use
Kind of tabling     Supported in
Call Variance       XSB, YAP, B Prolog
Call Subsumption    XSB
Global Tables       YAP

#4 Deciding what kind of tabling to use
Call subsumption vs. call variance
* Call subsumption [JRRR99,Swi09] (e.g. the goal p(a,Y) can use the table for the goal p(X,Y)) can greatly increase efficiency for some programs by sharing more computations
* Call subsumption incurs about a 25% overhead if not needed
* Call subsumption may not be desired when tabling a meta-interpreter
* Call subsumption with term depth abstraction can ensure that any program with a finite model terminates
  -- e.g. p(X):- p(s(X)) can terminate if abstraction occurs at a depth of, say, 10. Then a subgoal such as
     p(s(s(s(s(s(s(s(s(s(s(X)))))))))))
     is abstracted to
     p(s(s(s(s(s(s(s(s(s(X))))))))))
* Within XSB, sharing is only performed if the subsumed call occurs after the subsuming call (p(a,Y) after p(X,Y))
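
Term depth abstraction can be illustrated with a few lines of Python. This is a sketch under assumptions, not the XSB implementation: compound terms are modelled as tuples `(functor, arg, ...)`, variables and atoms as strings, and the fresh-variable naming `_G0, _G1, ...` is invented for the example. With a depth of 10 it reproduces the abstraction shown above, up to the name of the variable.

```python
import itertools

def abstract(term, depth, fresh=None):
    """Replace every compound subterm nested `depth` levels down by a fresh variable."""
    fresh = fresh if fresh is not None else itertools.count()
    if isinstance(term, str):      # variables and atoms are left alone
        return term
    if depth == 0:
        return f"_G{next(fresh)}"
    functor, *args = term
    return (functor, *(abstract(arg, depth - 1, fresh) for arg in args))

def s_tower(n, inner="X"):
    """Build s(s(...s(inner)...)) with n applications of s/1."""
    term = inner
    for _ in range(n):
        term = ("s", term)
    return term

goal = ("p", s_tower(10))          # p(s^10(X))
print(abstract(goal, 10) == ("p", s_tower(9, "_G0")))
```

Because abstraction bounds the depth of every call, only finitely many distinct calls (up to variance) can be made over a finite set of function symbols, which is where the termination guarantee comes from.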

#4 Deciding what kind of tabling to use
Global tables vs. call variance
* Similar to call subsumption, YAP’s global tables [CR09] allow answers to be shared between different calls
* Unlike call subsumption, global tables can share answer information for two calls that unify even if neither subsumes the other (e.g. p(a,X) and p(X,b))
* On the other hand, unlike call subsumption, computation is not shared between two calls with global tables
* However, global tables incur a time cost over call variance, as well as a space cost if they are not used.

#4 Deciding what kind of tabling to use
* So in addition to deciding what predicates to table, our perfect compiler decides between call variance, call subsumption and global tables (and BDDs and other data structures)
* Some work has been done on analyzing when to use call subsumption [RRR96], but it has not been put into a compiler

#5 Optimizing Transformations
* Let’s say you have a fully connected graph of N nodes and you want to find all nodes reachable from a given node a (e.g. your goal is ?- path(a,Y)). You can use:
  -- left recursion: path(X,Y):- path(X,Z),edge(Z,Y) which creates one table with N-1 nodes and returns N-1 answers
  -- right recursion: path(X,Y):- edge(X,Z),path(Z,Y) which
     -- creates N tables each with N-1 nodes and returns (N-1)^2 answers under call variance
     -- would create 1 table under call subsumption if ?- path(X,Y) were used, but still would return (N-1)^2 answers
  -- double recursion: path(X,Y):- path(X,Z),path(Z,Y) which returns N(N-1) answers under call variance
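
The difference in the number of tables can be checked with a small Python model. It tracks only which distinct calls are made under call variance; it does not evaluate the program or count answers, and the function names are assumptions made for this sketch.

```python
def right_recursive_tables(edges, start):
    """Distinct calls path(n,Y) made while evaluating ?- path(start,Y) with
    path(X,Y) :- edge(X,Z), path(Z,Y) under call variance."""
    called, todo = set(), [start]
    while todo:
        node = todo.pop()
        if node not in called:
            called.add(node)
            # edge(node,Z) binds Z, so each successor yields a new call path(Z,Y)
            todo.extend(z for (x, z) in edges if x == node)
    return {f"path({n},Y)" for n in called}

def left_recursive_tables(start):
    """With path(X,Y) :- path(X,Z), edge(Z,Y) the recursive call is a variant
    of the original goal, so only one table is created."""
    return {f"path({start},Y)"}

nodes = ["a", "b", "c", "d", "e"]
edges = [(x, y) for x in nodes for y in nodes if x != y]   # fully connected, N = 5
print(len(left_recursive_tables("a")), len(right_recursive_tables(edges, "a")))
```

For this five-node graph the model reports one table for left recursion and five (that is, N) for right recursion, in line with the table counts stated above.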

#5 Optimizing Transformations
* So it’s usually best to transform into left recursion (if the bindings can be maintained)
* However, not all recursion can be transformed into left recursion without losing bindings
    sg(X,X).
    sg(X,Y):- p(X,Z),sg(Z,Z1),p(Z1,Y).
* When is it useful to transform recursion into left recursion? Does call subsumption affect the decision of when to transform?
    sg(X,X).
    sg(X,Y):- sg(Z,Z1),p(X,Z),p(Z1,Y).

Example: compiling away recursion [Nau89]
    buys(X,Y):- likes(X,Y).
    buys(X,Y):- trendy(X),buys(Z,Y).
is equivalent to
    buys(X,Y):- likes(X,Y).
    buys(X,Y):- trendy(X),likes(Z,Y).
but
    buys(X,Y):- likes(X,Y).
    buys(X,Y):- knows(X,Z),buys(Z,Y).
is not equivalent to
    buys(X,Y):- likes(X,Y).
    buys(X,Y):- knows(X,Z),likes(Z,Y).

#5 Optimizing Transformations
* How do these tabling optimizations relate to optimizations for
  -- Prolog (e.g. transformations to get rid of existential variables [PP95])
  -- Deductive databases (e.g. separable recursion [Nau88])
  -- Grammars (e.g. the complexity reduction of Earley parsing when grammars are translated into Chomsky Normal Form [Ear70])
     -- XSB has a declaration for this transformation, called suppl_table, but no support for deciding whether it should be used.

#6 Choosing a scheduling strategy
* Tabled evaluations can differ in their scheduling strategy [FSW98], i.e. the decision about when answers are to be returned to (possibly suspended) subgoals
* The two most popular strategies are
  -- local evaluation, which is efficient in terms of stack usage and efficient for returning all answers. It can also reduce the complexity when answer subsumption is used (e.g. for finding shortest paths)
  -- batched evaluation, which is efficient for finding the first answer for a subgoal, and for parallelizing tabled evaluations
* The scheduling strategy can be dynamically selected in YAP [RSC05], but is only a configuration option in XSB
* Altering a scheduling strategy may be a factor in extracting certain types of table parallelism that are based on threads communicating through tables (XSB) or in a full or-parallel tabling system (OptYAP)

#7 Debugging
* Tabling may need to repeatedly suspend computation of a tabled subgoal S and resume the computation when further answers are derived for S
* This change in search strategy can break the model of the 4-port Prolog debugger, i.e. “skip” may have no meaning
  -- at the least, a trace-based debugger becomes more complex for a user
* XSB has a justification package [PGD+04]; similar packages have been created for ASP systems [PSE09]
  -- after computation has terminated, produce a representation of the search tree that justifies the result

#7 Debugging
* Justification-based debugging works as follows:
  1) Assertions are made of the form justify_pred(Pred)
  2) The annotated program is transformed
  3) A call is made to just_true(Goal,JustTrue) or just_false(Goal,JustFalse)
     -- JustTrue unifies with a proof of Goal
     -- JustFalse unifies with witnesses of failure for all clauses for Goal
     -- If Goal is undefined, it will have both a true and a false justification
* Currently, justification does not appear to be heavily used, due in part to the fact that it is not well integrated with XSB (a program has to be transformed and then reloaded to run the justifier)

#7 Debugging
* Thus, to support debugging of tabled programs either
  -- justification must be performed (via a meta-interpreter or pre-processor); or
  -- some sort of trace-based debugging needs to be developed for tabling; or
  -- some other approach must be developed
* No solution is currently satisfactory, so debugging tabling remains an open issue, especially with non-stratified programs

#8 Support for transformation-based language extensions
* Consider well-founded atom-based preferences. A preference clause has the form prefer(A1,A2):- Body, which means that if Body is true, a derivation of A2 will be true only if a derivation of A1 is false (if A1 is undefined, A2 will be also)
* This has been used for
  -- Amalgamating psychiatric rules [GST+00]
  -- Disambiguating grammars [GJM95,CS02]
  -- Adding local policies to workflows
* Implemented through a (fairly) simple transformation: prefer(A1,A2) adds a literal tnot(overridden(A2)) to clauses that may derive A2
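
The clause-rewriting step of this transformation can be sketched in Python as follows. Several simplifications are assumptions of the sketch rather than part of the XSB implementation: clauses are pairs of a head string and a list of body-literal strings, "may derive A2" is approximated by comparing predicate names instead of unifying, and the definition of overridden/1 is not generated because its form is not given here. The `treat/2` and `indicated/2` predicates are invented for the example.

```python
def functor(atom):
    """Predicate name of an atom written as a string, e.g. 'a(X)' -> 'a'."""
    return atom.split("(", 1)[0].strip()

def add_preference_guards(clauses, preferences):
    """clauses: list of (head, [body literals]); preferences: list of (A1, A2)
    pairs taken from prefer(A1,A2) clauses."""
    less_preferred = {functor(a2) for (_a1, a2) in preferences}
    transformed = []
    for head, body in clauses:
        if functor(head) in less_preferred:
            # Guard the clause: its head holds only if it is not overridden.
            body = body + [f"tnot(overridden({head}))"]
        transformed.append((head, body))
    return transformed

clauses = [
    ("treat(X,drug_b)", ["indicated(X,drug_b)"]),
    ("indicated(X,D)", ["symptom(X,S)", "treats(D,S)"]),
]
preferences = [("treat(X,drug_a)", "treat(X,drug_b)")]
for head, body in add_preference_guards(clauses, preferences):
    print(f"{head} :- {', '.join(body)}.")
```

Matching on predicate names is deliberately conservative: it may guard clauses whose heads could never unify with A2, which is the kind of unnecessary literal that specialization would need to remove.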

#8 Support for transformation-based extensions
* Work is being done to integrate preferences into XSB’s compiler; however, this gives rise to questions
  -- how to debug in a cognitively meaningful manner
  -- specialization to ensure that overridden/1 literals are not added to bodies unnecessarily
* Explicit negation under WFS [ADP94] is another transformation-based extension
* In general a compiler must be architected so that the analysis and optimizations mentioned above happen “at the right time”
  -- e.g. recursion separation, determination of what to table, etc. should happen after the semantic transformations
  -- optimizations specific to preferences, explicit negation, etc. may determine the exact form of the transformation

#8 Support for transformation-based extensions
* How can we ensure that a sequence of transformations is semantically valid?
* How can we ensure that a sequence of transformations is operationally valid (e.g. optimizations work, debugging meaningful)?

#9 Better ASP integration
* The XSB distribution now includes Smodels, and the XASP package provides a way to evaluate the partial stable models of a residual program (the well-founded reduct of the portion of a program relevant to a query).
  -- init_smodels(Query) sends the residual of a query to Smodels
  -- in_all_stable_models(Goal) and pstable_model(Goal,Model) provide a means to query Smodels
  -- each XSB thread can have its own instance of Smodels
  -- XASP has been used for agent programs [PL07,LMP08,PA09] and has been extended to handle Plog-style probabilities [PR09]

#9 Better ASP integration
* Using XASP, query evaluation may perform grounding, and may identify relevant portions of the program for solving
  -- e.g. for inferences from ontologies, not all of the A-box or T-box isa hierarchy may need to be materialized
* At the same time, a programmer has to have a priori knowledge of
  -- what parts of the program require a full ASP semantics
  -- whether the partial stable model obtained from the query reduct is semantically valid (e.g. can be extended into one or more total stable models)
* How can modules (or some other constructs) be made to support the notion that various queries have useful partial stable models?
* Currently, XSB does not evaluate cardinality or weight constraints, and so rules with such constraints must be passed to Smodels separately

#10 Making answer subsumption usable by real programmers
* Suppose we want to annotate an answer with an explicit truth value
* If p(a):true and p(a):false are both derived, then the table should only contain p(a):top, which wasn’t directly derived
* This can be performed through a mechanism called answer subsumption (for want of a better name)

#10 Making answer subsumption usable
* Consider a model of quantitative degrees of belief [van86]. An annotated atom $A:[E_T,E_F]$ is an atom $A$ annotated with
  -- $E_T$, a number between 0 and 1 indicating a measure of evidence that $A$ is true; and
  -- $E_F$, a number between 0 and 1 indicating a measure of evidence that $A$ is false.
  -- The join of two annotations is $[E_T^1,E_F^1] \sqcup [E_T^2,E_F^2] = [\max(E_T^1,E_T^2), \max(E_F^1,E_F^2)]$
* Resolution for these annotated literals can be defined. The main idea is:
  -- A goal $A:[E_T,E_F]$ is true in an interpretation $J$ of a program $P$ if there are rules $A_i:[E_T^i,E_F^i]$ :- $Body_i$ in $P$ such that each $A_i$ unifies with $A$, each $Body_i$ is true in $J$, $[E'_T,E'_F] = \bigsqcup_i [E_T^i,E_F^i]$, $E'_T \geq E_T$ and $E_F \leq E'_F$

#10 Making answer subsumption usable
* Generalizing this approach to upper semi-lattices, you get Generalized Annotated Programs (GAPs) [KS92] that can model many kinds of quantitative, paraconsistent, and temporal reasoning.
* From our perspective, GAPs can be implemented using answer subsumption [Swi99]. To illustrate on ground programs, when an answer A:[an_new] is derived:
  -- Add A:[an_new] if the table does not have an answer with substitution A; or
  -- Add A:[an_join], the join of an_new and an_old, where A:[an_old] is the answer for A in the table.
* This formalism is similar to others, such as Residuated Lattice Programs [DP01]
* XSB has a meta-interpreter for stratified GAPs in its gap library
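
The two cases for adding a ground answer can be written down directly. The following Python sketch models the table as a dictionary from ground atoms to [ET,EF] annotations and uses the componentwise-maximum join of the quantitative belief model; the names `join` and `add_answer` are assumptions for the example and do not correspond to XSB predicates.

```python
def join(a, b):
    """Join of two [ET, EF] annotations: componentwise maximum."""
    return (max(a[0], b[0]), max(a[1], b[1]))

def add_answer(table, atom, annotation):
    """Insert a ground annotated answer, replacing any old annotation by the join.
    Returns True when the table changed."""
    old = table.get(atom)
    table[atom] = annotation if old is None else join(old, annotation)
    return table[atom] != old

table = {}
changes = [
    add_answer(table, "p(a)", (0.7, 0.1)),   # no answer for p(a) yet: add it
    add_answer(table, "p(a)", (0.4, 0.3)),   # joined with the old annotation
    add_answer(table, "p(a)", (0.5, 0.2)),   # already subsumed by the stored answer
]
print(table, changes)
```

The returned flag matters in a real evaluation: only when the stored annotation changes does a new answer need to be returned to consuming subgoals.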

#10 Making answer subsumption usable
* Answer subsumption essentially depends on a destructive tabling operation, so it may be fairly easily implementable in YAP or B Prolog
* Answer subsumption can also handle some lattices used for program analysis, so that perhaps tabling can be used for some of the previous analysis techniques
* Constraints can be tabled, so that the join of different constraints for an answer can be maintained
  -- This works in theory, but needs more testing before I can recommend people use it.
* But right now answer subsumption in XSB is ugly and hard to use

#10 Making answer subsumption usable
    reachable(InConf,NewConf):-
        filterPOA(reachable(InConf),Conf,gte_omega,omega_abstr,call_abstr),
        hasTransition(Conf,NewConf).
    reachable(InConf,NewConf):-
        hasTransition(InConf,NewConf).
* filterPOA/5 is somewhat arcane.
  -- 1st argument: call (minus argument to be subsumed)
  -- 2nd argument: argument to be subsumed
  -- 3rd argument: comparison function (for subsumption)
  -- 4th argument: abstraction function for answer
  -- 5th argument: abstraction function to find relevant answers from subgoal
* This is the most complicated of the XSB answer subsumption predicates. Other predicates do not require the 4th and 5th arguments

Conclusions
* Because tables require space and are based on a given state of the environment, issues arise for
  -- table management
  -- automatic table updating
* It should be possible to partially automate or assist choices of
  -- what to table
  -- what data structure to use
  -- what search strategy to use
  -- how to transform a program to one that is efficiently executed
* Tabling can support more powerful semantics than Prolog alone, leading to questions of how to
  -- debug
  -- implement transformation-based semantics and constructs in an easy-to-use, cognitively clear manner
  -- unite a query-based system with stable model generation
  -- allow users to easily exploit answer subsumption’s ability to join the results of derivations and to abstract answers

References
[ADP94] J. Alferes, C. V. Damásio, and L. M. Pereira. A top-down query evaluation for well-founded semantics with explicit negation. In European Conference on Artificial Intelligence, pages 140–144. Morgan Kaufmann, 1994.
[CR09] J. Costa and R. Rocha. One table fits all. In Practical Applications of Declarative Languages, pages 195–208, 2009.
[CS02] B. Cui and T. Swift. Preference logic programs: Fixed-point semantics and application to data standardization. Artificial Intelligence, 138:117–147, 2002.
[CSL07] V. Santos Costa, K. Sagonas, and R. Lopes. Demand-driven indexing of Prolog clauses. In Proceedings of the 23rd International Conference on Logic Programming, pages 395–409. Springer, 2007.
[DP01] C. V. Damásio and L. Pereira. Monotonic and residuated logic programs. In ECSQARU, pages 748–759, 2001.
[Ear70] J. Earley. An efficient context-free parsing algorithm. Communications of the ACM, 13(2):94–102, 1970.
[FSW98] J. Freire, T. Swift, and D. S. Warren. Beyond depth-first: Improving tabled logic programs through alternative scheduling strategies. Journal of Functional and Logic Programming, 1998(3), 1998.
[GJM95] K. Govindarajan, B. Jayaraman, and S. Mantha. Preference logic programming. In International Conference on Logic Programming, pages 731–746, 1995.
[GST+00] J. Gartner, T. Swift, A. Tien, L. M. Pereira, and C. Damásio. Psychiatric diagnosis from the viewpoint of computational logic. In 8th International Workshop on Non-Monotonic Reasoning, 2000.
[JRRR99] E. Johnson, C. R. Ramakrishnan, I. V. Ramakrishnan, and P. Rao. A space-efficient engine for subsumption based tabled evaluation of logic programs. In 4th International Symposium on Functional and Logic Programming, 1999.
[KS92] M. Kifer and V. S. Subrahmanian. Theory of generalized annotated logic programming and its applications. Journal of Logic Programming, 12(4):335–368, 1992.
[LMP08] L. M. Pereira, P. Dell’Acqua, and G. Lopes. On preferring and inspecting abductive models. In Practical Applications of Declarative Languages, 2008.
[Nau88] J. Naughton. Compiling separable recursions. In SIGMOD, pages 312–319, 1988.
[Nau89] J. Naughton. Data independent recursion in deductive databases. Journal of Computer and System Sciences, 38:259–289, 1989.
[PA09] L. M. Pereira and H. T. Anh. Evolution prospection. In KES-IDT, 2009.
[PGD+04] G. Pemmasani, H. Guo, Y. Dong, C. R. Ramakrishnan, and I. V. Ramakrishnan. Online justification for tabled logic programs. In FLOPS, 2004.
[PL07] L. M. Pereira and G. Lopes. Prospective logic agents. In EPIA, pages 73–86, 2007.
[PP95] M. Proietti and A. Pettorossi. Unfolding - definition - folding, in this order, for avoiding unnecessary variables in logic programs. Theoretical Computer Science, 142:89–124, 1995.
[PR09] L. M. Pereira and C. Ramli. Modelling probabilistic causation in decision making. In KES-IDT, 2009.
[PSE09] E. Pontelli, T. C. Son, and O. Elkatib. Justifications for logic programs under the answer set semantics. Theory and Practice of Logic Programming, 9:1–56, 2009.
[Roc07] R. Rocha. On improving the efficiency and robustness of table storage mechanisms for tabled evaluation. In Practical Applications of Declarative Languages, pages 155–169, 2007.
[RRR96] P. Rao, C. R. Ramakrishnan, and I. V. Ramakrishnan. A thread in time saves tabling time. In Joint International Conference/Symposium on Logic Programming, 1996.
[RSC05] R. Rocha, F. Silva, and V. Santos Costa. Dynamic mixed-strategy evaluation of tabled logic programs. In International Conference on Logic Programming, pages 250–264, 2005.
[Sah06] D. Saha. Incremental Evaluation of Tabled Logic Programs. PhD thesis, SUNY Stony Brook, 2006.
[SvL08] K. Stenning and M. van Lambalgen. Human Reasoning and Cognitive Science. MIT Press, 2008.
[Swi99] T. Swift. Tabling for non-monotonic programming. Annals of Mathematics and Artificial Intelligence, 25(3-4):201–240, 1999.
[Swi09] T. Swift. An engine for efficiently computing (sub-)models. In International Conference on Logic Programming, 2009.
[van86] M. van Emden. Quantitative deduction and its fixpoint theory. Journal of Logic Programming, 4:37–53, 1986.

#8 Treating the table as (readable) data
* Given a term T, users can inspect tabled subgoals that are variants of T, subsume T, are subsumed by T, or unify with T; users may want to determine whether these tables are complete, and associate answers with the subgoal
* Given a set of subgoals S, users may wish to find the residual subprogram reachable from S, as well as its strongly connected components
  -- Reachability can be from heads to body literals, or
  -- Reachability can be from body literals to heads
* Mostly handled in XSB, but not always efficiently or cleanly (e.g. attributed variables in tables are not always passed back by table inspection routines)
* Users may wish to ground a residual program (no interface yet)