
Learning control knowledge and case-based planning

Jim Blythe, with additional slides from presentations by Manuela Veloso. Motivation: planning is hard (PSPACE-hard), but this is a worst-case result. In many domains there may exist efficient strategies for planning.


Presentation Transcript


  1. Learning control knowledge and case-based planning Jim Blythe, with additional slides from presentations by Manuela Veloso

  2. Motivation • Planning is hard: PSPACE-hard • But this is a worst-case result • In many domains there may exist efficient strategies for planning • We may be able to derive them automatically from experience

  3. Controlling search • Every planning algorithm does search • At a choice point, if the planner makes an incorrect choice, it needs to backtrack and try other choices • If we can make the right choice the first time, no backtracking is needed
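The effect of making the right choice first can be sketched on a toy search problem. This is a minimal illustration, not Prodigy's algorithm: the graph, the node names, and the `prefer_b` rule are all hypothetical, and the "control rule" here only reorders the candidates at one choice point.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical search space: "a" leads only to dead ends, "b" leads to the goal.
GRAPH: Dict[str, List[str]] = {
    "start": ["a", "b"],
    "a": ["a1", "a2"],
    "a1": [],
    "a2": [],
    "b": ["goal"],
    "goal": [],
}

Ordering = Callable[[str, List[str]], List[str]]


def default_order(node: str, choices: List[str]) -> List[str]:
    """No control knowledge: try the choices in the order given."""
    return choices


def prefer_b(node: str, choices: List[str]) -> List[str]:
    """Hypothetical control rule: at 'start', try 'b' before anything else."""
    if node == "start":
        return sorted(choices, key=lambda c: c != "b")
    return choices


def search(node: str, goal: str, order: Ordering,
           visited: List[str]) -> Optional[List[str]]:
    """Depth-first search that records every node it visits."""
    visited.append(node)
    if node == goal:
        return [node]
    for child in order(node, GRAPH[node]):
        path = search(child, goal, order, visited)
        if path is not None:
            return [node] + path
    return None  # dead end: the caller backtracks to its next choice


blind_visits: List[str] = []
blind_path = search("start", "goal", default_order, blind_visits)

guided_visits: List[str] = []
guided_path = search("start", "goal", prefer_b, guided_visits)
```

Both runs return the same path, but the unguided run visits the dead-end branch under `a` before backtracking, while the guided run goes straight to the goal.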

  4. Prodigy • Explicit search control rules can apply to any decision point • Many different learning approaches have been implemented • Relatively old planning approach

  5. Learning methods in Prodigy

  6. Overview of Prodigy planning algorithm

  7. Prodigy algorithm

  8. Prodigy algorithm, part II

  9. Decision points in Prodigy

  10. Example domain: process planning

  11. Example control rules in Prodigy

  12. Review of explanation-based learning MV Inputs: • Target concept definition • Training example • Domain theory • Operationality criterion Output: • Generalization of the training example that is • Sufficient to describe the target concept, and • Satisfies the operationality criterion

  13. The safe-to-stack example MV Input: • Target concept: safe-to-stack(x,y) • Training example: on(obj1, obj2) isa(obj1, box) isa(obj2, endtable) color(obj1, red) color(obj2, blue) volume(obj1, 1) density(obj1, 0.1), …

  14. The safe-to-stack example, cont. MV Input: Domain theory: • Not(fragile(y)) or lighter(x, y) => safe-to-stack(x,y) • Volume(x,v) and density(x,d) => weight(x, v*d) • Weight(x1, w1) and weight(x2, w2) and less(w1, w2) => lighter(x1, x2) • Isa(x, endtable) => weight(x, 5) • Less(0.1, 5), … Operationality criterion: Learned description should use terms that describe objects directly, or are ‘easy’ to evaluate, e.g. ‘less’
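The domain theory above can be written out as a few small functions to see how the training example is explained. This is a sketch under assumptions: the `facts` dictionary layout and the `not_fragile` set are hypothetical encodings, and the default weight of 5 for an endtable is taken directly from the rule on the slide.

```python
from typing import Dict, Optional

# Training example from the slides, encoded as plain dictionaries (assumed layout).
facts: Dict[str, dict] = {
    "isa": {"obj1": "box", "obj2": "endtable"},
    "volume": {"obj1": 1},
    "density": {"obj1": 0.1},
    "not_fragile": {},  # nothing is known to be non-fragile in the example
}


def weight(obj: str) -> Optional[float]:
    """Volume(x,v) and density(x,d) => weight(x, v*d); Isa(x, endtable) => weight(x, 5)."""
    if obj in facts["volume"] and obj in facts["density"]:
        return facts["volume"][obj] * facts["density"][obj]
    if facts["isa"].get(obj) == "endtable":
        return 5
    return None  # weight cannot be derived from the theory


def lighter(x: str, y: str) -> bool:
    """Weight(x1,w1) and weight(x2,w2) and less(w1,w2) => lighter(x1,x2)."""
    wx, wy = weight(x), weight(y)
    return wx is not None and wy is not None and wx < wy


def safe_to_stack(x: str, y: str) -> bool:
    """Not(fragile(y)) or lighter(x,y) => safe-to-stack(x,y)."""
    return y in facts["not_fragile"] or lighter(x, y)


result = safe_to_stack("obj1", "obj2")
```

The call succeeds through the `lighter` branch: obj1 weighs 1 * 0.1 and obj2 gets the endtable weight of 5. The colors and the fact that obj1 is a box are never consulted, which is the sense in which the proof isolates the relevant features.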

  15. The safe-to-stack example MV • Explain why obj1 is safe-to-stack on obj2 • Construct a proof • Do goal regression: regress target concept through the proof structure • Proof isolates relevant features

  16. Generating operational knowledge MV • Generalize proof • Sometimes, simply replace constants by variables • Prove that all identified relevant features are necessary in general • Output: volume(x,v1) and density(x,d1) and isa(y, endtable) and less(v1*d1, 5) => safe-to-stack(x,y)
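The learned rule can be read as a single operational test that needs no further inference. A minimal sketch, assuming a hypothetical function name and argument layout, with the rule's conditions copied from the output above:

```python
def learned_safe_to_stack(volume_x: float, density_x: float, isa_y: str) -> bool:
    """volume(x,v1) and density(x,d1) and isa(y, endtable) and less(v1*d1, 5)."""
    return isa_y == "endtable" and volume_x * density_x < 5


# The original training example satisfies the rule.
training_case = learned_safe_to_stack(1, 0.1, "endtable")

# A heavier object on an endtable does not.
heavy_case = learned_safe_to_stack(10, 1.0, "endtable")

# The rule is sufficient, not necessary: it says nothing about other supports.
other_support = learned_safe_to_stack(1, 0.1, "box")
```

A `False` result only means this rule does not fire; safe-to-stack might still be provable from the full domain theory by another route.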

  17. Using EBL to improve plan quality • Given: planning domain, evaluation function, planner’s plan, a better plan • Learn: control knowledge to produce the better plan • Explanation used: explain why the alternative plan is better • Target concept: control rules that make choices based on the planner state and meta-state

  18. EBL in Prodigy • Used by Minton (88) to improve efficiency of planning • Version used in Quality (95) to improve quality of solution

  19. Architecture of Quality system

  20. Explaining better plans recursively

  21. Explaining better plans recursively: target concept: shared subgoal

  22. Example from process planning

  23. Learned rules

  24. Discussion • EBL is always correct, but Quality isn’t – only learns why plan B is better than plan A • No guarantee of optimality • Linear additive evaluation function – how well does this model metrics we care about? • Generality of control rules
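A linear additive evaluation function scores a plan as the sum of fixed per-step costs. The sketch below illustrates that idea only; the operator names and cost values are hypothetical and are not taken from the Quality system.

```python
from typing import Dict, List

# Hypothetical per-operator costs, for illustration only.
OPERATOR_COST: Dict[str, float] = {
    "set-up": 4.0,
    "drill": 2.0,
    "mill": 3.0,
}


def plan_cost(plan: List[str]) -> float:
    """Linear additive evaluation: add up the cost of each step independently."""
    return sum(OPERATOR_COST[step] for step in plan)


plan_a = ["set-up", "drill", "set-up", "mill"]  # repeats the set-up
plan_b = ["set-up", "drill", "mill"]            # shares one set-up

cost_a = plan_cost(plan_a)
cost_b = plan_cost(plan_b)
b_is_better = cost_b < cost_a
```

Because each step contributes independently, such a function cannot express costs that depend on interactions between steps, which is one way to read the slide's question about how well it models the metrics of interest.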
