Towards Model-lite Planning
A Proposal For Learning & Planning with Incomplete Domain Models
Sungwook Yoon, Subbarao Kambhampati
Supported by DARPA Integrated Learning Program
A Planning Problem
• Suppose you have a super-fast planner and a target application.
• What is the first problem you have to solve? Is it a problem from the application?
• Domain engineering is hard, which motivates model-lite planning.
Towards Model-lite Planning - Sungwook Yoon
Snapshot of the talk
• This is a proposal. We formulate learning and planning problems for incomplete domain models and propose solution methods for them. We have tested the idea on a few problems, but verification is still ongoing.
• We propose:
  • A representation for model-lite planning
    • Probabilistic logic: incompleteness is quantified
    • Explicit consideration of domain invariants
  • Learning of the domain model
    • Updating the probabilities and finding new axioms
  • Planning with the model
    • With an incomplete model, even a deterministic planning domain needs probabilistic planning
    • Find the most plausible plan that respects the current domain model
Representation
• Precondition axiom: $p_{A_i},\; A \rightarrow \mathit{pre}_i$
  • Uncertainty is quantified as a probability
• Effect axiom: $e_{A_i},\; A \rightarrow \mathit{effect}_i$
  • Facilitates learning
Domain Model - Blocksworld
• Precondition axioms relate actions to current-state facts:
  • 0.9, Pickup(x) -> armempty()
  • 1, Pickup(x) -> clear(x)
  • 1, Pickup(x) -> ontable(x)
• Effect axioms relate actions to next-state facts:
  • 0.8, Pickup(x) -> holding(x)
  • 0.8, Pickup(x) -> not armempty()
  • 0.8, Pickup(x) -> not ontable(x)
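The axiom list above can be held in a small data structure. The following is a minimal sketch, not the authors' implementation: the class `WeightedAxiom` and the helpers `uncertain_axioms` and `format_axiom` are hypothetical names introduced only for illustration, while the probabilities and axioms are the ones shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightedAxiom:
    """One weighted axiom 'probability, action -> literal' (hypothetical type)."""
    probability: float
    action: str
    literal: str
    negated: bool = False
    kind: str = "precondition"  # "precondition" or "effect"


# The Pickup axioms from the Blocksworld slide.
PICKUP_AXIOMS = [
    WeightedAxiom(0.9, "Pickup(x)", "armempty()", kind="precondition"),
    WeightedAxiom(1.0, "Pickup(x)", "clear(x)", kind="precondition"),
    WeightedAxiom(1.0, "Pickup(x)", "ontable(x)", kind="precondition"),
    WeightedAxiom(0.8, "Pickup(x)", "holding(x)", kind="effect"),
    WeightedAxiom(0.8, "Pickup(x)", "armempty()", negated=True, kind="effect"),
    WeightedAxiom(0.8, "Pickup(x)", "ontable(x)", negated=True, kind="effect"),
]


def uncertain_axioms(axioms):
    """Return the axioms with probability below 1 (the uncertain part of the model)."""
    return [a for a in axioms if a.probability < 1.0]


def format_axiom(axiom):
    """Render an axiom in the slide's 'p, Action -> fact' notation."""
    head = ("not " if axiom.negated else "") + axiom.literal
    return f"{axiom.probability:g}, {axiom.action} -> {head}"


for axiom in PICKUP_AXIOMS:
    print(axiom.kind, "|", format_axiom(axiom))
```

Keeping the axiom kind as a field mirrors the slide's split between precondition and effect axioms, so each group can be learned or encoded separately.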
Representation
• Static properties relate facts within a state:
  • 1, holding(x) -> not armempty()
  • 1, holding(x) -> not ontable(x)
• Effect axioms relate actions to next-state facts:
  • 0.8, Pickup(x) -> holding(x)
  • 0.8, Pickup(x) -> not armempty()
  • 0.8, Pickup(x) -> not ontable(x)
• One modeling problem
  • A conjunction of effects has different semantics if the probability of each effect is specified independently
  • Fix 1: add a hidden variable O with the axiom (e, A → O), then add a deterministic axiom for each effect: (1, O → eff1), (1, O → eff2), …
  • Fix 2: the problem can also be alleviated with explicit domain invariant properties
  • Writing explicit domain invariant properties is easier than writing an initial-state generator and a set of operators that respect those properties
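A small numeric illustration of the modeling problem, under the assumption that "independently specified" means the three Pickup effects are treated as independent events with probability 0.8 each. The function names are hypothetical and the calculation is only a sketch of why the hidden variable O changes the semantics.

```python
import math

# Effect probabilities of Pickup(x) as given on the slide.
EFFECT_PROBABILITIES = {
    "holding(x)": 0.8,
    "not armempty()": 0.8,
    "not ontable(x)": 0.8,
}


def joint_if_independent(effect_probs):
    """P(all effects hold) when each effect is an independent event."""
    return math.prod(effect_probs.values())


def joint_with_hidden_outcome(outcome_prob):
    """P(all effects hold) when one hidden variable O (probability outcome_prob)
    deterministically implies every effect: (1, O -> eff_i)."""
    return outcome_prob


def holding_but_arm_still_empty(effect_probs):
    """Under independence, the probability of the inconsistent outcome
    'holding(x) holds but armempty() still holds'."""
    return effect_probs["holding(x)"] * (1.0 - effect_probs["not armempty()"])


print(joint_if_independent(EFFECT_PROBABILITIES))         # roughly 0.512
print(joint_with_hidden_outcome(0.8))                     # 0.8
print(holding_but_arm_still_empty(EFFECT_PROBABILITIES))  # roughly 0.16
```

Under independence the action succeeds completely with probability of only about 0.512, and partially applied outcomes receive nonzero probability. With the hidden variable the effects occur together, which is presumably the intended reading of a conjunctive effect.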
Learning the domain model
• Given a trajectory of states and actions, S1, A1, S2, A2, …, Sn, An, Sn+1:
  • We can learn precondition axioms from (S1, A1), (S2, A2), …, (Sn, An)
  • We can learn effect axioms from (A1, S2), (A2, S3), …, (An, Sn+1)
  • We can learn domain invariant properties from each state (S1), …, (Sn+1)
• The weights (probabilities) of the axioms can be updated with a simple perceptron update
• Packages for weighted logic learning are readily available
  • Alchemy (MLN)
  • ProbLog
• Structure learning
  • Alchemy provides structure learning too
  • We can also enumerate all possible axioms (very costly for planning)
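The way a trajectory splits into training data can be sketched as follows. This is an illustration only: the trajectory is made up, `putdown_a` is a hypothetical action name, and the relative-frequency estimate stands in for the perceptron update or an Alchemy run mentioned above; it is not the authors' learner.

```python
def split_trajectory(states, actions):
    """Split S1, A1, ..., Sn, An, Sn+1 into the three training sets.

    states:  list of n+1 sets of ground facts
    actions: list of n ground action names
    """
    assert len(states) == len(actions) + 1
    precondition_pairs = list(zip(states[:-1], actions))  # (Si, Ai)
    effect_pairs = list(zip(actions, states[1:]))         # (Ai, Si+1)
    invariant_states = list(states)                       # S1 .. Sn+1
    return precondition_pairs, effect_pairs, invariant_states


def axiom_frequency(action_state_pairs, action, fact):
    """Fraction of occurrences of `action` whose paired state contains `fact`.

    A plain frequency estimate used here in place of a perceptron update.
    Returns None if the action never occurs.
    """
    paired_states = [s for a, s in action_state_pairs if a == action]
    if not paired_states:
        return None
    return sum(fact in s for s in paired_states) / len(paired_states)


# Made-up trajectory: the second pickup_a fails and leaves the state unchanged.
on_table = {"clear_a", "ontable_a", "armempty"}
states = [set(on_table), {"holding_a"}, set(on_table), set(on_table)]
actions = ["pickup_a", "putdown_a", "pickup_a"]

pre_pairs, eff_pairs, inv_states = split_trajectory(states, actions)

# Precondition axiom pickup_a -> armempty, estimated from (Si, Ai) pairs.
pre_estimate = axiom_frequency([(a, s) for s, a in pre_pairs], "pickup_a", "armempty")
# Effect axiom pickup_a -> holding_a, estimated from (Ai, Si+1) pairs.
eff_estimate = axiom_frequency(eff_pairs, "pickup_a", "holding_a")
print(pre_estimate, eff_estimate)
```

On this toy data the precondition axiom is supported in every occurrence of `pickup_a`, while the effect axiom holds in only half of them, so the effect axiom would receive the lower weight.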
Model-lite Planning: Probabilistic Planning
• As stated before, with incomplete domain knowledge, a deterministic planning domain should be treated as a probabilistic domain
• The resulting plan should be maximally consistent with the current domain model
• We develop a planning technique for this purpose
  • Find a plan that is maximally plausible given the probabilistic axioms, the initial state and the goal
  • This is the MPE solution to a Bayes net problem
  • The approach builds on the plangraph
Probabilistic Plangraph
[Figure: probabilistic plangraph for a two-block Blocksworld problem (blocks A and B). Proposition layers contain clear_a, clear_b, armempty, ontable_a, ontable_b, holding_a, holding_b, on_a_b and on_b_a. Action layers contain pickup_a, pickup_b, stack_a_b, stack_b_a and the corresponding noop actions. Some edges are annotated with probability 0.8. Red lines indicate mutexes.]
• A domain invariant property can be asserted too
• How do we generate a weighted clause? Example: 0.95, pickup_b' v holding_b
Can we view the probabilistic plangraph as a Bayes net?
[Figure: the probabilistic plangraph for the two-block Blocksworld problem (blocks A and B), with proposition layers (clear_a, clear_b, armempty, ontable_a, ontable_b, holding_a, holding_b, on_a_b, on_b_a) and action layers (pickup_a, pickup_b, stack_a_b, stack_b_a and noop actions). Annotations in the figure: 0.9, 0.5 and 0.8 (twice); some nodes are marked as evidence variables.]
• A domain invariant property can be asserted too
• How do we find a solution? MPE (most probable explanation)
• Existing MPE solvers can be used
MPE as MaxSat
• Intuitive explanation: violating a clause is cheaper for high-probability instantiations, so the MaxSat solution gives the highest-probability instantiation
• Weighted clauses:
    Weight      Clause
    -log 0.7    -A v -B
    -log 0.3     A v -B
    -log 0.2    -A v  B
    -log 0.8     A v  B
• Deterministic axiom A -> B: the instantiation (A = T, B = T) has probability 1 and (A = T, B = F) has probability 0, so the clause -A v B gets infinite weight, which complies with our intuitive understanding
• This follows earlier work by James D. Park (AAAI 2002): set -log(P) as the weight of each clause
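The four weighted clauses on this slide are small enough to check by exhaustive search. The sketch below is a brute-force illustration rather than a real MaxSat solver; `violated_weight` and `brute_force_maxsat` are hypothetical helper names. It minimizes the total weight of violated clauses, which is equivalent to maximizing the weight of satisfied ones.

```python
import itertools
import math

# (probability, clause); a literal is (variable, required truth value).
# Each clause has weight -log(probability) and is violated by exactly the
# instantiation whose probability it encodes.
CLAUSES = [
    (0.7, [("A", False), ("B", False)]),  # -A v -B
    (0.3, [("A", True), ("B", False)]),   #  A v -B
    (0.2, [("A", False), ("B", True)]),   # -A v  B
    (0.8, [("A", True), ("B", True)]),    #  A v  B
]


def violated_weight(assignment, clauses):
    """Sum of -log(p) over all clauses the assignment violates."""
    total = 0.0
    for prob, literals in clauses:
        satisfied = any(assignment[var] == value for var, value in literals)
        if not satisfied:
            total += -math.log(prob)
    return total


def brute_force_maxsat(variables, clauses):
    """Try every assignment and keep the one with the least violated weight."""
    best = None
    for values in itertools.product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        cost = violated_weight(assignment, clauses)
        if best is None or cost < best[1]:
            best = (assignment, cost)
    return best


best_assignment, best_cost = brute_force_maxsat(["A", "B"], CLAUSES)
print(best_assignment, math.exp(-best_cost))
```

Every assignment violates exactly one clause here, so the cheapest violation belongs to the instantiation with the largest probability (0.8, at A = F and B = F). Exponentiating the negated cost recovers that probability.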
Probabilistic Plangraph to MaxSat
[Figure: the probabilistic plangraph for the two-block Blocksworld problem (blocks A and B), with proposition layers (clear_a, clear_b, armempty, ontable_a, ontable_b, holding_a, holding_b, on_a_b, on_b_a) and action layers (pickup_a, pickup_b, stack_a_b, stack_b_a and noop actions). The probability annotations are replaced by weights: -log0.9, -log0.5 and -log0.8 (twice); some nodes are marked as evidence variables.]
• A domain invariant property can be asserted too
• For each probabilistic axiom with probability p, the corresponding clause gets weight -log(1-p), because the clause is violated exactly when the axiom fails, which happens with probability 1-p
• That is the whole encoding
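The weighting rule can be sketched for a single ground axiom. This assumes an axiom `p, action -> effect` is encoded as the clause `-action v effect`; the function names are hypothetical and this is not the authors' encoder.

```python
import math


def clause_weight(p):
    """Weight -log(1 - p) for the clause of an axiom with probability p.

    A deterministic axiom (p = 1) yields an infinite weight, i.e. a hard clause.
    """
    if p >= 1.0:
        return math.inf
    return -math.log(1.0 - p)


def axiom_to_clause(p, action, effect):
    """Encode 'p, action -> effect' as the weighted clause (-action v effect).

    A literal is (variable, required truth value).
    """
    return clause_weight(p), [(action, False), (effect, True)]


soft_weight, soft_clause = axiom_to_clause(0.8, "pickup_a", "holding_a")
hard_weight, hard_clause = axiom_to_clause(1.0, "pickup_a", "clear_a")
print(soft_weight, soft_clause)
print(hard_weight, hard_clause)
```

The more certain an axiom is, the more expensive it becomes to violate its clause, and fully certain axioms become hard constraints, which matches the infinite weight given to deterministic axioms in the MPE-as-MaxSat slide.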
Exploding Blocksworld
Current Status (ongoing)
• Learning test
  • Generated Blocksworld random-wandering data and fed it to Alchemy together with correct and incorrect axioms
  • Alchemy assigned higher weights to the correct axioms and lower weights to the incorrect ones
• Planning test: tested on probabilistic planning problems
  • Slippery Gripper domain
    • Hand-tested on a couple of instances
    • Hand-encoded the clauses and assigned the weights
    • Passed the resulting clauses to a MaxSat solver
    • Got the desired results
  • Exploding Blocksworld
    • Implemented a generic MaxSat encoder for probabilistic planning problems
    • Tested on a couple of problems from Exploding Blocksworld
    • Finds the desired output frequently (not always)
Summary
• We can learn precondition axioms and effect axioms separately
  • A -> Prec, A -> Effect
  • This facilitates learning
• Domain axioms or invariant properties can be provided, learned and used explicitly
  • This is easier for the domain modeler
• For planning, we can apply the probabilistic plangraph approach
• We proposed using MaxSat to solve probabilistic planning problems
  • An interesting parallel to solving deterministic planning as SAT
Domain Learning – Related Work
• Logical Filtering (Chang & Amir, ICAPS '06)
  • Updates the belief state and the domain transition model
  • Experiments involved planning
• Probabilistic operator learning (Zettlemoyer, Pasula and Kaelbling, AAAI '05)
  • Experiments involved planning
• ARMS (Yang, Wu and Jiang, ICAPS '05)
  • No observations besides the initial state and goal
Probabilistic Planning in Plangraph – Related Work
• PGraphplan, Paragraph
  • Both search for plans within the Graphplan framework
• PGraphplan searches for a consistent plan that maximizes the goal-reaching probability
  • Forward probability propagation
• Paragraph searches for a plan that minimizes the cost to reach the goal
  • Backward plan search