Flexible Planning. Ian Miguel, AI Group, Department of Computer Science, University of York.
AI Planning • Plan: Course of action to achieve pre-specified goals. • Components of a planning problem: • Plan objects. • Initial state. • Goal state. • Operators.
Example – Initial State • Goals: • Both packages to c4. • Guard to c3. • pkg1 is valuable. • pkg2 is not. • ci: Cities. • ri: Major roads. • mi: Mountainous roads. [Figure: road map of cities c1, c2, c3, c4 with major roads r1, r2, r3 and mountainous roads m1, m2, showing guard1, pkg1 and pkg2.]
Example – Operators • Load-truck (guard present if pkg valuable). • Unload-truck (guard present if pkg valuable). • Guard-boards-vehicle. • Guard-leaves-vehicle. • Drive-truck (main roads only).
Example – Solution • Drive-truck c1 to c2 via r1. • Load-truck pkg2. • Guard-boards-vehicle truck.
Example – Solution • Drive-truck c1 to c2 via r1. • Load-truck pkg2.
Example – Solution • Drive-truck c1 to c2 via r1. • Drive-truck c2 to c3 via r2. • Drive-truck c3 to c4 via r3. • Unload-truck pkg1. • Unload-truck pkg2.
Example – Solution • Drive-truck c4 to c3 via r3. • Guard-leaves-vehicle truck.
Solving Planning Problems • Many and varied approaches (see [Weld99]). • Focus here on Graphplan [Blum, Furst 97]. • Sound/complete. • Optimal in the length of the plan (number of steps). • Constructs a planning graph, of which a valid plan is a sub-graph. • Easy to translate the search for a consistent sub-graph into a constraint satisfaction problem.
The Planning Graph • Divided into levels, each equivalent to a plan step. • Each level contains action and proposition nodes. • Level 0 contains propositions that capture the initial problem state. • The graph is extended by instantiating operators whose preconditions are met by propositions in the previous level. [Figure: Initial Conditions, Actions 1, Propositions 1, …, Goals.]
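The level-by-level extension described on this slide can be sketched as follows. This is a minimal illustration, not the planner's actual code: the `Action` type and `expand_level` function are hypothetical names, propositions are plain strings, and carrying every proposition forward unchanged stands in for the no-op (persistence) actions of standard Graphplan, which the slide does not spell out.

```python
from typing import FrozenSet, List, NamedTuple, Set, Tuple


class Action(NamedTuple):
    """A ground (fully instantiated) operator. Hypothetical structure."""
    name: str
    preconditions: FrozenSet[str]
    effects: FrozenSet[str]


def expand_level(props: Set[str],
                 ground_actions: List[Action]) -> Tuple[List[Action], Set[str]]:
    """Build the next action level and proposition level from `props`."""
    # An action is instantiated only if all its preconditions
    # appear in the previous proposition level.
    applicable = [a for a in ground_actions if a.preconditions <= props]
    # Assumption: existing propositions persist (no-op actions).
    next_props = set(props)
    for a in applicable:
        next_props |= a.effects
    return applicable, next_props


drive = Action("Drive-truck c1 to c2 via r1",
               frozenset({"truck at c1"}), frozenset({"truck at c2"}))
load = Action("Load-truck pkg2",
              frozenset({"truck at c2", "pkg2 at c2"}),
              frozenset({"pkg2 in truck"}))
level1_actions, level1_props = expand_level(
    {"truck at c1", "pkg2 at c2"}, [drive, load])
```

Here only the drive action enters level 1, because the truck is not yet at c2; the load action becomes applicable one level later.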
Mutual Exclusion Constraints • Record that a pair of actions or propositions cannot occur together in this level of a valid plan. • Restricts the set of sub-graphs that must be considered.
Mutual Exclusion Constraints • Exclusive actions: • Inconsistent effects: • Drive-truck c1 to c2 vs. Drive-truck c2 to c3. • 1st action: truck at c2. 2nd action: truck not at c2. • Interference: • Between an effect and a precondition. • Drive-truck c1 to c2 vs. Load-truck at c1. • Truck no longer at c1 after 1st action. • Competing needs: • Between preconditions. • Drive-truck c1 to c3 vs. Guard-boards-truck at c2. • Truck cannot be at c1 and c2 at the same time.
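The three tests for exclusive actions can be sketched as below. This is an illustrative encoding under stated assumptions: effects are split into hypothetical `add` and `delete` sets, and `prop_mutex` is an assumed set of unordered proposition pairs already known to be exclusive in the previous level.

```python
from typing import FrozenSet, NamedTuple


class Action(NamedTuple):
    """Hypothetical ground action with explicit add/delete effects."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def actions_exclusive(a: Action, b: Action, prop_mutex=frozenset()) -> bool:
    """True if a and b cannot both occur in the same level."""
    # Inconsistent effects: one action asserts what the other negates.
    inconsistent_effects = bool(a.add & b.delete or b.add & a.delete)
    # Interference: one action negates a precondition of the other.
    interference = bool(a.delete & b.pre or b.delete & a.pre)
    # Competing needs: some pair of preconditions is itself exclusive.
    competing_needs = any(frozenset((p, q)) in prop_mutex
                          for p in a.pre for q in b.pre)
    return inconsistent_effects or interference or competing_needs


fs = frozenset
drive_12 = Action("Drive c1-c2", fs({"truck at c1"}),
                  fs({"truck at c2"}), fs({"truck at c1"}))
drive_23 = Action("Drive c2-c3", fs({"truck at c2"}),
                  fs({"truck at c3"}), fs({"truck at c2"}))
load_c1 = Action("Load at c1", fs({"truck at c1", "pkg1 at c1"}),
                 fs({"pkg1 in truck"}), fs({"pkg1 at c1"}))
drive_13 = Action("Drive c1-c3", fs({"truck at c1"}),
                  fs({"truck at c3"}), fs({"truck at c1"}))
board_c2 = Action("Guard boards at c2", fs({"truck at c2", "guard at c2"}),
                  fs({"guard in truck"}), fs({"guard at c2"}))
truck_mutex = fs({fs({"truck at c1", "truck at c2"})})
```

The last pair from the slide is exclusive only through competing needs, so it depends on the proposition-level exclusion between "truck at c1" and "truck at c2" being supplied.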
Mutual Exclusion Constraints • Exclusive propositions: • Negation: • Truck at c2 vs. ¬(Truck at c2). • Inconsistent support: • Every way of supporting proposition a is exclusive of every way of supporting proposition b.
Finding a Valid Plan • Search for a consistent sub-graph connecting goal propositions and initial conditions. • If no such sub-graph, expand planning graph by one level and try again. • Approaches: • Translate into a propositional satisfiability (SAT) problem, use a specialised SAT solver. • Translate into a constraint satisfaction problem (CSP): • Either one large problem, or connected sub-problems.
The Constraint Satisfaction Problem • Given: • A finite set of variables. • Each variable has an associated finite domain of potential values. • A set of constraints over these variables, which specify allowed combinations of assignments of values to variables. • Find: • A complete assignment of values to variables that satisfies all constraints.
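As a concrete reading of this definition, here is a minimal chronological backtracking solver. It is a generic sketch rather than the solver used in this work; `solve_csp` is a hypothetical name, and each constraint is assumed to be a predicate over a partial assignment that returns True while it is not yet violated.

```python
def solve_csp(variables, domains, constraints, assignment=None):
    """Return a complete consistent assignment, or None if none exists."""
    assignment = dict(assignment or {})
    if len(assignment) == len(variables):
        return assignment
    # Pick the first unassigned variable (no ordering heuristic).
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        # Extend the search only if no constraint is violated so far.
        if all(check(assignment) for check in constraints):
            result = solve_csp(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]
    return None


variables = ["x", "y"]
domains = {"x": [1, 2, 3], "y": [1, 2, 3]}
constraints = [
    # x < y, checked once both variables are assigned.
    lambda a: "x" not in a or "y" not in a or a["x"] < a["y"],
    # x + y == 5, checked once both variables are assigned.
    lambda a: "x" not in a or "y" not in a or a["x"] + a["y"] == 5,
]
solution = solve_csp(variables, domains, constraints)
```

With these toy constraints the only solution is x = 2, y = 3.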
The CSP Viewpoint • Variables: proposition nodes. • Domains: actions that assert these propositions as effects. • Each sub-problem is a small CSP. • Goals and their domains form a first sub-problem. • Action preconditions specify new sub-problems… [Figure: goal sub-problem.]
Memoisation • Goal: Solve as few sub-problems as possible. • Generate memosets from unsolvable sub-problems. • New sets of mutually exclusive propositions. • If a memoset matches the propositions of a sub-problem, prune the search branch immediately. [Figure: goal sub-problem.]
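A memoset lookup might be organised as in the sketch below. The slide does not say exactly what "matches" means; this sketch assumes a memoset prunes a sub-problem whenever it is a subset of that sub-problem's propositions at the same level. `MemoTable` and its methods are hypothetical names.

```python
from collections import defaultdict


class MemoTable:
    """Per-level store of proposition sets known to be unsolvable together."""

    def __init__(self):
        self._memos = defaultdict(list)  # level -> list of frozensets

    def record(self, level, props):
        """Remember that `props` cannot all be achieved at `level`."""
        self._memos[level].append(frozenset(props))

    def prunes(self, level, props):
        """True if some memoset at `level` is contained in `props`.

        Assumption: subset matching. An exact-match policy would
        compare with == instead of <=.
        """
        props = frozenset(props)
        return any(memo <= props for memo in self._memos[level])


memos = MemoTable()
memos.record(2, {"truck at c1", "truck at c2"})
```

Under subset matching, a smaller memoset rules out more sub-problems than a larger one.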
Memoset Propagation • Memosets are propagated forwards. • If a parent sub-problem has no child leading to a solution, the propagated information is used to create a memoset for it. [Figure: goal sub-problem.]
A Weakness of Classical Planning • Inability to compromise: • All goals must be satisfied. • Applicability of an operator in a particular situation is Boolean. • Solution: • Introduce flexibility into planning. • Support compromise.
Flexible Planning Problems • Incorporate preferences into operators and goals. • Describe both as fuzzy relations. • Map from precondition combinations onto L, a totally ordered satisfaction scale. • Load-truck • Truck and (valuable) package in same place: l1 • Guard also present: l2 • Can relax some preconditions with associated damage to the satisfaction degree of resultant plan.
Truth Degree • From a scale, K. Endpoints indicate total truth/falsehood. • Attached to each proposition. • For example, can express how valuable a package is: • valuable pkga k⊥: equivalent to ¬(valuable pkga). • valuable pkgb k1, valuable pkgb k2, … • Operators and goals can identify ranges of acceptable truth degrees in their preconditions.
Flexible Plan Quality • Plan satisfaction degree is combination of all action/goal satisfaction degrees. • Via min. • Plan quality: • Length combined with plan satisfaction degree • With same satisfaction, shorter preferred. • Trade length of plan against number and severity of compromises made.
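The quality measure can be illustrated with a small sketch. The scale L is encoded as integers, `Plan`, `satisfaction` and `dominates` are hypothetical names, and treating the length/satisfaction trade-off as Pareto dominance is an assumption of this example; the slide only states that shorter is preferred at equal satisfaction. The degree lists are the action degrees shown on the two compromise-plan slides.

```python
from typing import NamedTuple, Tuple

# Assumed integer encoding of the totally ordered scale L.
L_BOTTOM, L1, L2, L_TOP = range(4)


class Plan(NamedTuple):
    steps: int                # plan length
    degrees: Tuple[int, ...]  # satisfaction degrees of actions/goals


def satisfaction(plan: Plan) -> int:
    """Plan satisfaction degree: min over all action/goal degrees."""
    return min(plan.degrees)


def dominates(p: Plan, q: Plan) -> bool:
    """True if p is at least as satisfying and no longer than q,
    and strictly better on one of the two (assumed Pareto reading)."""
    sp, sq = satisfaction(p), satisfaction(q)
    return sp >= sq and p.steps <= q.steps and (sp > sq or p.steps < q.steps)


short_plan = Plan(4, (L2, L_TOP, L1, L2))
long_plan = Plan(6, (L2, L_TOP, L_TOP, L_TOP, L_TOP, L2, L_TOP))
padded_plan = Plan(5, (L2, L_TOP, L1, L2, L_TOP))
```

Neither the 4-step nor the 6-step plan dominates the other, which is why both are offered to the user; a 5-step plan with the same degree as the 4-step one is dominated by it.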
Flexible Example • L = {l⊥, l1, l2, lT} • Flexible goals: • Both packages to c4. • pkg2 is worth less, don’t deliver: l1 • Guard to c3. • Can also leave guard at c2 or c4: l2
Flexible Example • L = {l⊥, l1, l2, lT} • Operators: • Drive-truck. • Avoid mountains or: l1 • Load/Unload-truck. • For valuable package, guard present or: l2 • Guard-boards/leaves-truck.
Flexible Planning Graph • Actions annotated with their satisfaction degrees. • CSP variable domains expressed as unary fuzzy constraints. • Prefer to assign an element with lT, then lT-1, … [Figure: Actions 1 and Propositions 1 levels, with actions labelled l2, l3 and l1.]
Finding Valid Flexible Plans: Flexible Graphplan (FGP) • Same basic process. • Sub-problems are now fuzzy CSPs. • Overall search is branch and bound: • Find a plan with a higher satisfaction degree than the highest currently known. [Figure: goal sub-problem.]
Short Compromise Plan: 4 steps (l1) • L = {l⊥, l1, l2, lT} • Load-truck pkg1 truck: l2. • Drive-truck truck c1 to c2 via r1: lT. • Drive-truck truck c2 to c4 via m2: l1. • Unload-truck pkg1 truck: l2.
Longer Plan, Fewer Compromises: 6 steps (l2) • L = {l⊥, l1, l2, lT} • Load-truck pkg1 truck: l2. • Drive-truck truck c1 to c2 via r1: lT. • Load-truck pkg2 truck: lT. • Drive-truck truck c2 to c3 via r2: lT. • Drive-truck truck c3 to c4 via r3: lT. • Unload-truck pkg1 truck: l2; Unload-truck pkg2 truck: lT.
Limited Graph Expansion • A plan with satisfaction degree la has been found. • Because of min aggregation: • A plan with a higher satisfaction degree than la cannot contain any action with satisfaction degree ≤ la. • When expanding the graph, do not add actions with satisfaction degrees ≤ la. • Effect: • Reduce size of planning graph/sub-problems.
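The pruning rule follows directly from min aggregation, and can be sketched as a filter applied while instantiating actions. The function name and the integer encoding of the scale (0 for l⊥ up to 3 for lT) are assumptions made for illustration.

```python
def actions_worth_adding(candidates, best_found):
    """Keep only actions that could appear in a strictly better plan.

    candidates: iterable of (name, satisfaction_degree) pairs.
    best_found: satisfaction degree of the best plan found so far.
    Under min aggregation, any plan containing an action of degree
    <= best_found has plan degree <= best_found, so it cannot improve.
    """
    return [(name, deg) for name, deg in candidates if deg > best_found]


candidates = [
    ("Drive-truck c2 to c4 via m2", 1),  # mountain road: l1
    ("Load-truck pkg1 (no guard)", 2),   # guard absent: l2
    ("Drive-truck c1 to c2 via r1", 3),  # no compromise: lT
]
# After the 4-step plan of degree l1 (encoded as 1) has been found:
remaining = actions_worth_adding(candidates, best_found=1)
```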
Satisfaction Degree Propagation • Action2 has a single precondition, an effect of Action1. • The only way to support selection of Action2 at level a+1 is by also selecting Action1 at level a. • If this is known when solving the sub-problem at level a+1, the branch can possibly be pruned earlier. • So propagate satisfaction degrees forwards as the graph is expanded. [Figure: Action1 at level a and Action2 at level a+1, with label l1.]
Satisfaction Degree Propagation • Stage 1: • Label each proposition node with the max satisfaction degree of those attached to all actions that assert it as an effect. [Figure: Action1 and Action3 at level a, with labels l2, l1, l2.]
Satisfaction Degree Propagation • Stage 2: • Action satisfaction degree = min(own satisfaction degree, min(satisfaction degrees attached to each precondition)). [Figure: Action1 and Action3 at level a, with labels l2, l2, l1, l2.]
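The two stages can be sketched for a single level as below. This is an illustrative reading of the slides: `propagate_level` is a hypothetical name, degrees are encoded as integers (0 for l⊥ up to 3 for lT), and persistence of propositions is assumed to be modelled by ordinary actions if it is needed.

```python
BOTTOM = 0  # assumed integer encoding of l-bottom


def propagate_level(actions, prev_prop_degree):
    """Propagate satisfaction degrees through one graph level.

    actions: list of (name, own_degree, preconditions, effects).
    prev_prop_degree: degree label of each proposition in the
    previous level. Returns (action labels, proposition labels).
    """
    # Stage 2: an action is only as good as its own degree and the
    # worst of its preconditions.
    action_degree = {}
    for name, own, pre, _ in actions:
        action_degree[name] = min([own] + [prev_prop_degree[p] for p in pre])
    # Stage 1: a proposition takes the best degree among the actions
    # that assert it as an effect.
    prop_degree = {}
    for name, _, _, effects in actions:
        for p in effects:
            prop_degree[p] = max(prop_degree.get(p, BOTTOM),
                                 action_degree[name])
    return action_degree, prop_degree


level_a = [
    ("Action1", 2, ["p"], ["r"]),
    ("Action3", 3, ["q"], ["r"]),
]
act_a, props_a = propagate_level(level_a, {"p": 3, "q": 1})
level_a1 = [("Action2", 3, ["r"], ["goal"])]
act_a1, props_a1 = propagate_level(level_a1, props_a)
```

In this toy graph Action2 has no compromise of its own, yet its label drops to 2 because its only precondition cannot be supported any better, so a search looking for a plan of degree 3 can skip it.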
Results: FGP vs Boolean Solving • Short compromise plans can often be found very quickly.
Utility of Limited Graph Expansion/Satisfaction Propagation • Limited graph expansion and satisfaction propagation are complementary.
Flexible Graphplan: Observations • It is more expensive to search for a range of plans than for one compromise-free plan. • But it is often possible to find short compromise plans quickly. • Supports anytime behaviour. • The range of plans trades length against the number and severity of the compromises made.
Drowning and Leximin Ordering • Low satisfaction degree from one action: • Drowns the others because of min aggregation. • Leximin ordering: • Sort the satisfaction degree vector associated with a solution. • Compare lexicographically: • {l2, l3, l3} >lex {l2, l2, l3} • Finds compromise plans that min-based search misses.
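The leximin comparison can be sketched as follows. Degrees are encoded as integers, the function names are hypothetical, and padding the shorter vector with the top degree so that vectors of different lengths can be compared is an assumption of this sketch, not something stated on the slide.

```python
def leximin_key(degrees, length, top):
    """Sorted (ascending) degree vector, padded with `top` to `length`."""
    padded = list(degrees) + [top] * (length - len(degrees))
    return tuple(sorted(padded))


def leximin_better(a, b, top=3):
    """True if vector a is strictly preferred to b under leximin."""
    n = max(len(a), len(b))
    # Python compares tuples lexicographically, element by element.
    return leximin_key(a, n, top) > leximin_key(b, n, top)


# The slide's example: {l2, l3, l3} >lex {l2, l2, l3}.
first, second = [2, 3, 3], [2, 2, 3]
same_under_min = min(first) == min(second)
first_preferred = leximin_better(first, second)
```

Both vectors have the same minimum, so min aggregation cannot tell them apart; the leximin comparison prefers the first because its second-worst degree is higher.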
Solution (Leximin): 5 steps (l1, l2, l2, l2) • L = {l⊥, l1, l2, lT} • Load-truck pkg1 truck: l2. • Drive-truck truck c1 to c2 via r1: lT. • Load-truck pkg2 truck: lT. • Drive-truck truck c2 to c4 via m2: l1. • Unload-truck pkg1 truck: l2; Unload-truck pkg2 truck: lT. • Also finds 2 more plans that FGP misses.
Finding Leximin-optimal Plans: Leximin FGP • Again, same basic search process. • Sub-problems are now leximin fuzzy CSPs. • Overall search is branch and bound: • Find a plan with a higher satisfaction degree vector than the highest currently known. [Figure: goal sub-problem.]
Enhancements • Limited graph expansion works in the same way. • Must now propagate satisfaction degree vectors. • Stage 1: • Label proposition nodes with the max satisfaction degree vector of those attached to all actions that assert it as an effect. [Figure: Action1 and Action3 at level a, with vectors {l2, l3}, {l1}, {l1, l2}, {l2, l3}.]
Satisfaction Degree Vector Propagation • Stage 2: • Action satisfaction degree vector combines the satisfaction degree of this action, and • Satisfaction degree vectors of each precondition proposition. [Figure: Action1, Action2 and Action3 across levels a and a+1, with vectors {l2, l3}, {l1, l2, l3}, {l1, l2}, {l2, l3}.]
Removing Duplicates • Not correct: {l1, l1, l3}. • Only one instance of l1 is guaranteed by selecting Action2. • Solution: make satisfaction degrees unique. • Composite objects referring to the action that created them. • Simple matter to remove duplicates. [Figure: Action1 at level a and Action2 at level a+1, with vectors {l1}, {?, l3}, {l1}, {l1}.]
Results: BBFGP vs. LFGP • More compromise plans found effectively.
Results: BBFGP vs. LFGP • Effectiveness of satisfaction vector propagation problem-dependent.
Results: BBFGP vs. LFGP • Larger |L| can mean many possible compromises.
Results: Flexible Logistics • Explains time difference. • LFGP* in particular is solving many more sub-problems.
Utility of Satisfaction Degree Vector Propagation • Never degrades performance. • Overhead of propagation and duplicate removal is compensated for by performance gain. • Sometimes hugely improves performance: • 17 times is best result so far. • Propagation allows branches of search to be pruned much earlier.
Leximin FGP: Observations • More costly than FGP (efficiency is being improved). • But effectively produces a greater range of compromise solutions. • Removes drowning. • Not limited by size of L. • In the min version, there can be only one plan of satisfaction degree l1, one of satisfaction degree l2, …
Conclusions • Flexible planning overcomes the inability to compromise in classical AI planning. • Flexible planners produce a range of solutions from a given input problem from which the user can select. • Trade length versus compromises made. • FGP and LFGP planners effectively solve these problems using hierarchical decomposition of the planning graph.
Related Work • Conformant Planning: • Knowledge about possible initial states, and possible outcomes of each action. • Contingent Planning: • Sensing actions to detect the state of the world during execution. • Numerically weighted constraints: • Quantitative means of differentiating plans. • Pyrrhus: • Replaces goal formulae with utility models. • Does not associate utilities with individual actions.
Future Work • Reasoning about: • Time. • Resources. • Truth degrees go some way towards this. • Further efficiency improvements: • Improve quality of memoisation. • Smaller memosets are better.