
Towards Model-lite Planning A Proposal For Learning & Planning with Incomplete Domain Models

Towards Model-lite Planning A Proposal For Learning & Planning with Incomplete Domain Models. Sungwook Yoon Subbarao Kambhampati. Supported by DARPA Integrated Learning Program. A Planning Problem. Suppose you have a super fast planner and a target application.


Presentation Transcript


  1. Towards Model-lite Planning: A Proposal for Learning & Planning with Incomplete Domain Models. Sungwook Yoon, Subbarao Kambhampati. Supported by DARPA Integrated Learning Program.

  2. A Planning Problem
     Suppose you have a super fast planner and a target application. What is the first problem you have to solve? Is it a problem from the application?
     Domain engineering is hard → Model-lite planning
     Towards Model-lite Planning - Sungwook Yoon

  3. Snapshot of the talk
     • This is a proposal. We formulate learning and planning problems and solution methods for them. We have tested the idea on some problems, but verification is still ongoing.
     • We propose:
       • A representation for model-lite planning
         • Probabilistic logic; incompleteness is quantified
         • Explicit consideration of domain invariants
       • Learning of the domain model
         • Updating the probabilities and finding new axioms
       • Planning with the model
         • A deterministic planning domain needs probabilistic planning
         • The most plausible plan that respects the current domain model

  4. Representation
     • Precondition axiom: $p_{Ai},\; A \rightarrow \mathrm{pre}_i$
     • Effect axiom: $e_{Ai},\; A \rightarrow \mathrm{effect}_i$
     • Uncertainty is quantified as a probability
     • Facilitates learning

  5. Domain Model - Blocksworld
     Precondition axioms (relate actions to current-state facts):
       0.9, Pickup(x) -> armempty()
       1,   Pickup(x) -> clear(x)
       1,   Pickup(x) -> ontable(x)
     Effect axioms (relate actions to next-state facts):
       0.8, Pickup(x) -> holding(x)
       0.8, Pickup(x) -> not armempty()
       0.8, Pickup(x) -> not ontable(x)
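One way to hold such weighted axioms in code is sketched below. This is only an illustration, not the authors' implementation: the `Axiom` class, the `{x}` template syntax, and the `violated_preconditions` helper are hypothetical names introduced here. The probabilities are the ones listed on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axiom:
    """A weighted axiom 'prob, action -> fact' (hypothetical representation)."""
    prob: float            # confidence that the implication holds
    action: str            # action schema name
    fact: str              # fact template; {x} is the action argument
    kind: str              # "pre" (current state) or "eff" (next state)
    negated: bool = False  # True for 'not fact'


# The Blocksworld axioms from the slide.
BLOCKSWORLD = [
    Axiom(0.9, "pickup", "armempty()", "pre"),
    Axiom(1.0, "pickup", "clear({x})", "pre"),
    Axiom(1.0, "pickup", "ontable({x})", "pre"),
    Axiom(0.8, "pickup", "holding({x})", "eff"),
    Axiom(0.8, "pickup", "armempty()", "eff", negated=True),
    Axiom(0.8, "pickup", "ontable({x})", "eff", negated=True),
]


def violated_preconditions(state, action, x, axioms=BLOCKSWORLD):
    """Return (fact, prob) for each precondition axiom the state does not satisfy."""
    violated = []
    for ax in axioms:
        if ax.kind != "pre" or ax.action != action:
            continue
        fact = ax.fact.format(x=x)
        holds = (fact in state) != ax.negated
        if not holds:
            violated.append((fact, ax.prob))
    return violated
```

Calling `violated_preconditions({"clear(a)", "ontable(a)"}, "pickup", "a")` reports the missing `armempty()` fact together with its 0.9 weight, which a planner could treat as a soft rather than hard violation.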

  6. Representation
     Static properties (relate facts within a state):
       1, holding(x) -> not armempty()
       1, holding(x) -> not ontable(x)
     Effect axioms (relate actions to next-state facts):
       0.8, Pickup(x) -> holding(x)
       0.8, Pickup(x) -> not armempty()
       0.8, Pickup(x) -> not ontable(x)
     • One modeling problem:
       • A conjunction of effects has different semantics if the probability of each effect is specified independently
       • Fix: add a hidden variable O with axiom (e, A → O), then add a deterministic axiom for each effect: (1, O → eff1), (1, O → eff2), …
     • We can also alleviate this problem with explicit domain invariant properties
     • Writing explicit domain invariant properties is easier than writing an initial-state generator and a set of operators that respect those properties
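The modeling problem can be made concrete with a small calculation. The sketch below assumes that "independently specified" means the three Pickup effects are treated as independent events, which is one reading of the slide. The function names are hypothetical.

```python
import math


def joint_if_independent(effect_probs):
    """Probability that all effects occur when each is an independent event."""
    return math.prod(effect_probs)


def joint_with_hidden_outcome(outcome_prob, n_effects):
    """Probability that all effects occur when a single hidden outcome O
    (probability outcome_prob) deterministically implies every effect."""
    return outcome_prob * (1.0 ** n_effects)


pickup_effects = [0.8, 0.8, 0.8]  # holding(x), not armempty(), not ontable(x)
independent = joint_if_independent(pickup_effects)
shared = joint_with_hidden_outcome(0.8, len(pickup_effects))
```

Under the independence reading, the three effects occur together with probability 0.8³ = 0.512, whereas with the hidden variable O they occur together with probability 0.8, which is presumably what the modeler intended.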

  7. Learning the domain model
     • Given a trajectory of states and actions, S1, A1, S2, A2, …, Sn, An, Sn+1:
       • We can learn precondition axioms from (S1,A1), (S2,A2), …, (Sn,An)
       • We can learn effect axioms from (A1,S2), (A2,S3), …, (An,Sn+1)
       • We can learn domain invariant properties from each state (S1), …, (Sn+1)
     • The weights (probabilities) of the axioms can be updated with a simple perceptron update
     • There are readily available packages for weighted logic learning
       • Alchemy (MLN)
       • ProbLog
     • Structure learning
       • Alchemy provides structure learning too
       • We can also enumerate all the possible axioms (very costly for planning)
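The way a trajectory splits into the three kinds of training data can be sketched as follows. This is an illustration only: the weight estimate uses plain frequency counting as a stand-in, not the perceptron update or the Alchemy/ProbLog learners named on the slide, and all function names are hypothetical.

```python
from collections import Counter


def training_pairs(trajectory):
    """Split [S1, A1, S2, A2, ..., Sn, An, Sn+1] into training data.

    Returns (precondition pairs (Si, Ai), effect pairs (Ai, Si+1), all states).
    """
    states = trajectory[0::2]
    actions = trajectory[1::2]
    pre_pairs = list(zip(states[:-1], actions))
    eff_pairs = list(zip(actions, states[1:]))
    return pre_pairs, eff_pairs, states


def estimate_precondition_probs(pre_pairs):
    """Frequency estimate of P(fact holds | action executed).

    An assumption for illustration; not the slide's perceptron update.
    """
    executed = Counter()
    holds = Counter()
    for state, action in pre_pairs:
        executed[action] += 1
        for fact in state:
            holds[(action, fact)] += 1
    return {key: n / executed[key[0]] for key, n in holds.items()}


s1 = frozenset({"armempty", "clear_a", "ontable_a"})
s2 = frozenset({"holding_a"})
trajectory = [s1, "pickup_a", s2, "putdown_a", s1]
pre_pairs, eff_pairs, states = training_pairs(trajectory)
probs = estimate_precondition_probs(pre_pairs)
```

Effect axioms could be estimated the same way from `eff_pairs`, and invariant properties from `states` alone.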

  8. Model-lite planning: Probabilistic Planning
     • As stated before, with incomplete domain knowledge, a deterministic planning domain should be treated as a probabilistic domain
     • The resulting plan should be maximally consistent with the current domain model
     • We develop a planning technique for this purpose
       • A plan that is maximally plausible, given the probabilistic axioms, initial state and goal
       • MPE solution to a Bayes net problem
       • Built on the plangraph

  9. Probabilistic Plangraph
     [Figure: plangraph for a Blocksworld problem with blocks A and B. Proposition nodes: clear_a, clear_b, armempty, ontable_a, ontable_b, holding_a, holding_b, on_a_b, on_b_a. Action nodes: pickup_a, pickup_b, stack_a_b, stack_b_a, and noop actions for the propositions. The weight 0.8 appears twice in the figure. Red lines indicate mutexes.]
     • A domain invariant property can be asserted too
     • How do we generate a weighted clause? 0.95, pickup_b’ v holding_b

  10. Can we view the probabilistic plangraph as a Bayes net?
     [Figure: the same Blocksworld plangraph (blocks A and B; proposition nodes clear_a, clear_b, armempty, ontable_a, ontable_b, holding_a, holding_b, on_a_b, on_b_a; action nodes pickup_a, pickup_b, stack_a_b, stack_b_a, and noop actions), annotated with the weights 0.9, 0.5, 0.8 and 0.8 and with evidence variables marked.]
     • A domain invariant property can be asserted too
     • How do we find a solution? MPE (most probable explanation)
     • There are some solvers out there

  11. MPE as Maxsat Intuitive explanation Violating the clause is easier for High probability instances Thus the MaxSat Problem Gives you the highest probability instantiations Weighted Clauses -log0.7 -A v –B -log0.3 A V –B -log0.2 –A v B -log0.8 A v B A->B, T T 1, T F 0, InfinityWeight for –A v B, (complies with our intuitive understanding) There has been a work by James D. Park, AAAI 2002 Set –log(P) as the weight of the clauses Towards Model-lite Planning - Sungwook Yoon

  12. Probabilistic Plangraph to MaxSat
     [Figure: the same Blocksworld plangraph (blocks A and B; proposition nodes clear_a, clear_b, armempty, ontable_a, ontable_b, holding_a, holding_b, on_a_b, on_b_a; action nodes pickup_a, pickup_b, stack_a_b, stack_b_a, and noop actions), annotated with the weights -log0.9, -log0.5, -log0.8 and -log0.8 and with evidence variables marked.]
     • A domain invariant property can be asserted too
     • For each probabilistic weight p, we give -log(1-p). That’s it.
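The -log(1-p) rule can be read as follows: an axiom "p, A -> B" becomes the clause -A v B, which is violated exactly when A holds and B does not, an event of probability 1-p. The sketch below applies that reading. The function name and the clause representation are hypothetical, and treating p = 1 as an infinite (hard) weight is an assumption that matches the deterministic A -> B example on the MPE slide.

```python
import math


def axiom_to_clause(p, antecedent, consequent):
    """Turn the weighted axiom 'p, antecedent -> consequent' into a
    weighted clause (weight, [(var, required_value), ...]).

    The clause '-antecedent v consequent' is violated with probability 1 - p,
    so its weight is -log(1 - p); p = 1 gives a hard clause (assumed here).
    """
    weight = math.inf if p >= 1.0 else -math.log(1.0 - p)
    return weight, [(antecedent, False), (consequent, True)]


soft_weight, soft_literals = axiom_to_clause(0.8, "pickup_a", "holding_a")
hard_weight, _ = axiom_to_clause(1.0, "holding_a", "not_armempty")
```

A more trusted axiom therefore costs more to violate: 0.8 maps to a weight of about 1.61, 0.95 to about 3.0, and 1 to a clause that can never be violated.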

  13. Exploding Blocksworld

  14. Current Status (ongoing)
     • Learning test
       • Generated Blocksworld random-wandering data and fed it to Alchemy together with correct and incorrect axioms
       • Alchemy assigned higher weights to the correct axioms and lower weights to the incorrect axioms
     • Planning test: tested on probabilistic planning problems
       • Hand-tested on a couple of instances of the Slippery Gripper domain
         • Hand-encoded the clauses and assigned the weights
         • Gave the resulting clauses to a MaxSat solver
         • Got the desired results
       • On Exploding Blocksworld
         • Implemented a generic MaxSat encoder for probabilistic planning problems
         • Tested on a couple of problems from Exploding Blocksworld
         • Finds the desired output frequently (not always)

  15. Summary
     • We can learn precondition axioms and effect axioms separately
       • A -> Prec, A -> Effect
       • Facilitates learning
     • Domain axioms or invariant properties can be provided, learned, and used explicitly
       • This is easier for the domain modeler
     • For planning, we can apply the probabilistic plangraph approach
       • We proposed using MaxSat to solve probabilistic planning problems
       • An interesting parallel to solving deterministic planning with SAT

  16. Domain Learning: Related Work
     • Logical Filtering (Chang & Amir, ICAPS’06)
       • Updates the belief state and the domain transition model
       • Experiments involved planning
     • Probabilistic operator learning (Zettlemoyer, Pasula and Kaelbling, AAAI’05)
       • Experiments involved planning
     • ARMS (Yang, Wu and Jiang, ICAPS’05)
       • No observations besides the initial state and goal

  17. Probabilistic Planning in Plangraph: Related Work
     • PGraphplan, Paragraph
       • Both search for plans in the Graphplan framework
       • PGraphplan searches for a consistent plan that maximizes the goal-reaching probability
         • Forward probability propagation
       • Paragraph searches for a plan that minimizes the cost to reach the goal
         • Backward plan search
