IAIP Week 11 Planning II: Planning in the real world RN12 (except 12.1, conditional planning in partially observable environments, 12.6, 12.7) (partly based on Tom Lenaerts' slide set). Talk in AI Game course. Coming Friday, 18.11.2005, from 09:00 till 11:00, Peter Andreasen from IO
IAIP Week 11
Planning II: Planning in the real world
RN12 (except 12.1, conditional planning in partially observable environments, 12.6, 12.7)
(partly based on Tom Lenaerts' slide set)
Talk in AI Game course
• Coming Friday, 18.11.2005, from 09:00 till 11:00, Peter Andreasen from IO Interactive [www.ioi.dk] will hold a guest lecture in Auditorium 1.
• This lecture is the second of two invited lectures that are part of the AI Game course.
• If you are interested you are most welcome to attend it.
• 'Top 5 Reasons Not to use AI in Computer Games'
RMJ Computational Logic and Algorithms Group at ITU
Last time
Planning as state space search
• Forward
• Backward
Heuristics for state-space search
• Neither progression nor regression is very efficient without a good heuristic
• How many actions are needed to achieve the goal?
• The exact solution is NP-hard, so find a good estimate
• HSPr ADD and HSPr MAX heuristics (h_add, h_max)
  • ignore delete lists and perform a relaxed forward search
  • The cost of solving a conjunction of subgoals is approximated by the sum/max of the costs of solving the subproblems independently.
[Figure: example with estimated atom costs a = 2, c = 1, p = 0]
h_add = 2 + 1 + 0 = 3
h_max = 2
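The sum/max approximation can be sketched in a few lines of Python. This is only an illustration under assumptions: the two actions are hypothetical and were chosen so that the atom costs come out as on the slide (a = 2, c = 1, p = 0), and every action is assumed to cost 1.

```python
import math

def relaxed_costs(state, actions, combine):
    """Estimate the cost of each reachable atom, ignoring delete lists.

    actions: list of (preconditions, add_list) pairs of sets.
    combine: sum for h_add, max for h_max.
    """
    cost = {atom: 0 for atom in state}
    changed = True
    while changed:
        changed = False
        for pre, add in actions:
            if any(p not in cost for p in pre):
                continue  # not yet applicable in the relaxed problem
            c = 1 + (combine(cost[p] for p in pre) if pre else 0)
            for atom in add:
                if c < cost.get(atom, math.inf):
                    cost[atom] = c
                    changed = True
    return cost

def heuristic(state, actions, goal, combine):
    """Combine the independent subgoal costs with sum (h_add) or max (h_max)."""
    cost = relaxed_costs(state, actions, combine)
    if any(g not in cost for g in goal):
        return math.inf  # some subgoal is unreachable even when relaxed
    return combine(cost[g] for g in goal) if goal else 0

# Hypothetical actions (assumption): p enables c, c enables a.
ACTIONS = [({"c"}, {"a"}), ({"p"}, {"c"})]
STATE, GOAL = {"p"}, {"a", "c", "p"}

h_add = heuristic(STATE, ACTIONS, GOAL, sum)
h_max = heuristic(STATE, ACTIONS, GOAL, max)
```

With these assumed actions, h_add and h_max reproduce the values 3 and 2 from the slide.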
Graph-plan heuristic
• A mutex relation holds between two actions (they are not applicable in parallel) when there are:
  • Inconsistent effects: one action negates the effect of another.
  • Interference: one of the effects of one action is the negation of a precondition of the other.
  • Competing needs: one of the preconditions of one action is mutually exclusive with a precondition of the other.
• A mutex relation holds between two literals (inconsistent support) when:
  • one is the negation of the other, OR
  • each possible pair of actions that could achieve the two literals is mutex.
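The three action-mutex tests can be written down directly. The sketch below assumes a simple encoding that is not from the slides: literals are strings, a leading "~" marks negation, and the example actions (Eat, Bake, NoOp, Wash) are hypothetical.

```python
from dataclasses import dataclass

def neg(lit):
    """Negate a literal encoded as a string ("~" prefix = negation)."""
    return lit[1:] if lit.startswith("~") else "~" + lit

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset
    eff: frozenset

def inconsistent_effects(a, b):
    # one action negates an effect of the other
    return any(neg(e) in b.eff for e in a.eff)

def interference(a, b):
    # an effect of one action negates a precondition of the other
    return (any(neg(e) in b.pre for e in a.eff)
            or any(neg(e) in a.pre for e in b.eff))

def competing_needs(a, b, literal_mutex=frozenset()):
    # a precondition of one is mutex with a precondition of the other;
    # literal_mutex holds frozenset pairs found mutex at the previous level
    return any(p == neg(q) or frozenset((p, q)) in literal_mutex
               for p in a.pre for q in b.pre)

def actions_mutex(a, b, literal_mutex=frozenset()):
    return (inconsistent_effects(a, b) or interference(a, b)
            or competing_needs(a, b, literal_mutex))

# Hypothetical example actions (assumption, not from the slides).
eat = Action("Eat", frozenset({"Have"}), frozenset({"~Have", "Eaten"}))
bake = Action("Bake", frozenset({"~Have"}), frozenset({"Have"}))
keep = Action("NoOp(Have)", frozenset({"Have"}), frozenset({"Have"}))
wash = Action("Wash", frozenset(), frozenset({"CleanHands"}))
```

Here Eat and Bake are mutex through inconsistent effects and competing needs, Eat and NoOp(Have) through interference, and Wash is mutex with none of them.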
Planning as plan-space search
[Figure: partial-order plan for the shoes-and-socks problem with the steps Start, LeftSock, LeftShoe, RightShoe and Finish]
Actions: {RightShoe, LeftSock, LeftShoe, Start, Finish}
Orderings: {LeftSock < LeftShoe, Start < X, X < Finish}
Causal links: {LeftSock -leftSockOn-> LeftShoe, RightShoe -rightShoeOn-> Finish, LeftShoe -leftShoeOn-> Finish}
Open preconditions: {rightSockOn}
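The open preconditions of such a partial plan are the preconditions that no causal link supports yet. A minimal sketch, assuming the usual shoes-and-socks preconditions (the slide shows them only in the figure) and a tuple encoding (producer, proposition, consumer) for causal links:

```python
# Preconditions per action (assumed from the standard shoes-and-socks problem).
PRECONDS = {
    "Start": set(),
    "LeftSock": set(),
    "LeftShoe": {"leftSockOn"},
    "RightShoe": {"rightSockOn"},
    "Finish": {"leftShoeOn", "rightShoeOn"},
}

# Causal links from the slide, encoded as (producer, proposition, consumer).
CAUSAL_LINKS = [
    ("LeftSock", "leftSockOn", "LeftShoe"),
    ("RightShoe", "rightShoeOn", "Finish"),
    ("LeftShoe", "leftShoeOn", "Finish"),
]

def open_preconditions(preconds, causal_links):
    """Return (action, proposition) pairs not yet supported by a causal link."""
    supported = {(consumer, prop) for _, prop, consumer in causal_links}
    return {(action, prop)
            for action, props in preconds.items()
            for prop in props
            if (action, prop) not in supported}

OPEN = open_preconditions(PRECONDS, CAUSAL_LINKS)
```

For the plan above, the only open precondition is rightSockOn on RightShoe, which a RightSock step would have to supply.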
Planning as SAT checking
• Planning can be done by proving a theorem in situation calculus
• Situation calculus: each ground atom is timestamped, e.g. At(P1,SFO)^0, At(P2,JFK)^0, At(P1,JFK)^1
• Idea:
  • define a Boolean variable for each ground atom and instantiated action
  • timestamp the variables with stamps 0 to i
  • add the constraints of the actions
  • add the constraints of the goal and the initial state
  • check whether the corresponding formula is satisfiable; if so, a plan of length i exists
[Figure: Boolean variables for time steps 0, 1, ..., i. Atom variables (atP1SFO, atP2JFK, atP1PIT) alternate with action variables (Fly(P1,JFK,SFO), Fly(P1,SFO,JFK)). Init atoms must be true at step 0, goal atoms must be true at step i, and the action variables are tied to the atoms by the action constraints.]
Today's program
• Hierarchical task network (HTN) planning
• Non-deterministic domains
• Conditional planning
• And-Or search
• Strong planning
• Execution monitoring and replanning
• Continuous planning
Hierarchical task network planning
• Reduce complexity by hierarchical decomposition
• At each level of the hierarchy a computational task is reduced to a small number of activities at the next lower level.
• Hierarchical task network (HTN) planning uses a refinement of actions through decomposition.
[Figure: decomposition tree rooted at "Take over the world", with subtasks such as "Cause conflict in EU", "Stop oil to USA" and "Provoke China", and "Make riots against Japan" at a lower level]
Representation of decomposition
• General descriptions are stored in a plan library.
• Decompose(a,d), a = action, d = partial-order plan (POP)
• Decomposition start action
  • pre = all preconditions of actions not supplied by other actions (external preconditions)
• Decomposition finish action
  • eff = all effects of actions not negated by other actions (external effects)
BuildHouse example
[Figure: decomposition of BuildHouse with its external preconditions and external effects marked]
BuildHouse example
Action(BuyLand, PRECOND: Money, EFFECT: Land ∧ ¬Money)
Action(GetLoan, PRECOND: GoodCredit, EFFECT: Money ∧ Mortgage)
Action(BuildHouse, PRECOND: Land, EFFECT: House)
Action(GetPermit, PRECOND: Land, EFFECT: Permit)
Action(HireBuilder, EFFECT: Contract)
Action(Construction, PRECOND: Permit ∧ Contract, EFFECT: HouseBuilt ∧ ¬Permit)
Action(PayBuilder, PRECOND: Money ∧ HouseBuilt, EFFECT: ¬Money ∧ House ∧ ¬Contract)
Decompose(BuildHouse,
  Plan1 :: STEPS: {S1: GetPermit, S2: HireBuilder, S3: Construction, S4: PayBuilder}
  ORDERINGS: {Start < S1 < S3 < S4 < Finish, Start < S2 < S3}
  LINKS: {...})
Decompose(BuildHouse, Plan2 :: ...
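The external preconditions and effects of Plan1 can be computed by walking through its steps. The sketch below is a simplification under stated assumptions: it uses one linearization of the partial order (S1, S2, S3, S4), encodes negation with a "~" prefix, and the helper name external_interface is hypothetical.

```python
def neg(lit):
    """Negate a literal encoded as a string ("~" prefix = negation)."""
    return lit[1:] if lit.startswith("~") else "~" + lit

# Steps of Plan1 as (name, preconditions, effects), in one valid linearization.
PLAN1 = [
    ("GetPermit", {"Land"}, {"Permit"}),
    ("HireBuilder", set(), {"Contract"}),
    ("Construction", {"Permit", "Contract"}, {"HouseBuilt", "~Permit"}),
    ("PayBuilder", {"Money", "HouseBuilt"}, {"~Money", "House", "~Contract"}),
]

def external_interface(steps):
    """Return (external preconditions, external effects) of a step sequence."""
    needed = set()     # preconditions no earlier step supplies
    produced = set()   # effects that no later step has negated so far
    for _, pre, eff in steps:
        needed |= {p for p in pre if p not in produced}
        produced = {l for l in produced if neg(l) not in eff} | eff
    return needed, produced

EXTERNAL_PRE, EXTERNAL_EFF = external_interface(PLAN1)
```

Under these assumptions the external preconditions are Land and Money, and the external effects include House and ¬Money, while Permit and Contract are internal because later steps negate them.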
Properties of decomposition
• High-level action
  • pre = external preconditions
  • eff = external effects
• Correct decomposition of action a
  • plan d is a complete and consistent POP for the problem of achieving the effects of a given the preconditions of a (pre(start) corresponds to pre(a), eff(finish) to eff(a))
• The high-level action is abstract:
  • It only includes some preconditions and effects of its decompositions
  • It ignores all internal effects of its decompositions
  • It does not specify the intervals inside the activity during which preconditions and effects must hold.
Recapitulation of POP (1)
• Assume propositional planning problems:
  • The initial plan contains Start and Finish, the ordering constraint Start < Finish, no causal links, and all the preconditions of Finish are open.
• Successor function:
  • picks one open precondition p on an action B and
  • generates a successor plan for every possible consistent way of choosing an action A that achieves p.
• Goal test
Recapitulation of POP (2)
• When generating a successor plan:
  • The causal link A -p-> B and the ordering constraint A < B are added to the plan.
  • If A is new, also add Start < A and A < Finish to the plan
  • Resolve conflicts between the new causal link and all existing actions
  • Resolve conflicts between action A (if new) and all existing causal links.
Adapting POP to HTN planning
• Idea: modify the successor function: apply decomposition to the current plan
• NEW successor function:
  • Select a non-primitive action a' in P
  • For any Decompose(a,d) method in the library where a and a' unify with substitution θ
  • Replace a' with d' = subst(θ, d)
POP+HTN example
[Figure: plan P containing the non-primitive action a']
POP+HTN example
[Figure: a' replaced in P by its decomposition d']
Decomposition is like transplant surgery!
How to hook up d' in place of a'?
• Remove action a' from P and replace it with d'
  • For each step s in d', select an action that will play the role of s (either a new s or an existing s' from P)
  • Possibility of subtask sharing
• Connect the ordering constraints for a' to the steps in d'
  • Add constraints so that those of the form B < a' and a' < B are maintained by the most relaxed new orderings (i.e., orderings reflecting the cause of the original constraints)
• Connect the causal links
  • If B -p-> a' is a causal link in P, replace it by a set of causal links from B to all steps in d' with precondition p that were supplied by the start step
  • Idem for a' -p-> C
Other changes to POP: a new conflict resolution technique for high-level actions: decompose the action to remove the conflict
Efficiency of HTN
• BAD news: pure HTN planning is undecidable due to recursive decomposition actions.
  • Walk = make one step and walk
• Resolve problems by:
  • Ruling out recursion
  • Bounding the length of relevant solutions
  • Hybridizing HTN with POP
Efficiency of HTN
• How many decompositions exist for a plan with n actions?
• Assume
  • b: applicable actions in each state
  • d: # of decompositions
  • k: # of actions in each decomposition
• Progression planning: O(b^n)
Efficiency of HTN
• Assume
  • n: # steps in plan
  • d: # of decompositions
  • k: # of actions in each decomposition
• # actions at each decomposition level
  • level 1: 1
  • level 2: k
  • level 3: k^2
  • ...
  • level log_k n: n/k
  • level log_k n + 1: n
• Decomposable actions = 1 + k + k^2 + ... + n/k = (n-1)/(k-1)
• Complexity of HTN planning: O(d^((n-1)/(k-1)))
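The count of decomposable actions is the number of internal nodes in a full k-ary tree with n leaves. A small numeric check, assuming n is a power of k (the function names are illustrative only):

```python
def decomposable_actions(n, k):
    """Sum the level sizes 1 + k + k^2 + ... + n/k (assumes n is a power of k, k > 1)."""
    total, level_size = 0, 1
    while level_size < n:
        total += level_size
        level_size *= k
    return total

def htn_decomposition_trees(n, k, d):
    """Number of ways to choose one of d decompositions for every decomposable action."""
    return d ** decomposable_actions(n, k)
```

For example, n = 8 and k = 2 give 1 + 2 + 4 = 7 = (8-1)/(2-1) decomposable actions, so with d = 3 there are 3^7 decomposition trees to consider.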
Nondeterministic planning
Non-deterministic domains
• So far: fully observable, static and deterministic domains.
  • Agent can plan first and then execute the plan with eyes closed
• Uncertain environment:
  • Partially observable
  • Nondeterministic effects
• Degree of effect uncertainty
  • Bounded: finite number of possible effects.
  • Unbounded: infinite number of possible effects.
Handling indeterminacy
• Sensorless planning (conformant planning)
  • Find a plan that achieves the goal in all possible circumstances (regardless of initial state and action effects).
• Conditional planning (contingency planning)
  • Construct a conditional plan with different branches for the possible contingencies.
• Execution monitoring and replanning
  • While executing the plan, judge whether the plan requires revision.
• Continuous planning
  • The planner stays active for a lifetime: it adapts to changed circumstances and reformulates goals if necessary.
Sensorless planning
Conditional planning
• Deal with uncertainty by checking the environment to see what is really happening.
• Used in fully observable and nondeterministic environments:
  • The outcome of an action is unknown.
  • Conditional steps will check the state of the environment.
  • How to construct a conditional plan?
Example: the vacuum world
Conditional planning
• Actions: Left, Right, Suck
• Propositions to define states: AtL, AtR, CleanL, CleanR
• How to include nondeterminism?
  • Actions can have disjunctive effects
    E.g. moving left sometimes fails
    Action(Left, PRECOND: AtR, EFFECT: AtL ∨ AtR)
  • Actions can have conditional effects
    Action(Left, PRECOND: AtR, EFFECT: AtL ∨ (AtL ∧ when CleanL: ¬CleanL))
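One way to read the two Left actions is as functions from a state to the set of states that may result. The sketch below assumes an encoding that is not from the slides: a state is a frozenset of the propositions that hold, and a negated proposition is simply absent.

```python
def left_may_fail(state):
    """Left with disjunctive effect AtL v AtR: the move may fail."""
    if "AtR" not in state:
        return set()  # precondition AtR not satisfied
    moved = (state - {"AtR"}) | {"AtL"}
    return {frozenset(moved), frozenset(state)}

def left_may_dirty(state):
    """Left with effect AtL v (AtL and when CleanL: not CleanL)."""
    if "AtR" not in state:
        return set()  # precondition AtR not satisfied
    moved = (state - {"AtR"}) | {"AtL"}
    outcomes = {frozenset(moved)}
    if "CleanL" in moved:
        # the conditional effect only fires when CleanL held before
        outcomes.add(frozenset(moved - {"CleanL"}))
    return outcomes

START = frozenset({"AtR", "CleanL", "CleanR"})
```

Applied to START, left_may_fail yields the moved state or the unchanged state, while left_may_dirty always moves but may leave the left square dirty.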
Conditional planning
• Conditional plans require conditional steps:
  • if <test> then plan_A else plan_B
    e.g.: if AtL ∧ CleanL then Right else Suck
  • Plans become trees
• Games against nature:
  • Find conditional plans that work regardless of which action outcomes actually occur.
• Example: assume the vacuum world
  Initial state = AtR ∧ CleanL ∧ CleanR
  Double Murphy: possibility of depositing dirt when moving to the other square, and possibility of depositing dirt when the action is Suck and there is no dirt.
Game tree
[Figure: game tree with state nodes (OR nodes) and chance nodes (AND nodes)]
Conditional plans
• A solution is a subtree that
  • Has a goal node at every leaf
  • Specifies one action at each of its state nodes (OR nodes)
  • Includes every outcome branch at each of the chance nodes (AND nodes)
Conditional plan
Init: AtR ∧ CleanL ∧ CleanR
Goal: AtL ∧ CleanL ∧ CleanR
And-Or-search algorithm
function AND-OR-GRAPH-SEARCH(problem) returns a conditional plan or failure
  return OR-SEARCH(INITIAL-STATE[problem], problem, [])

function OR-SEARCH(state, problem, path) returns a conditional plan or failure
  if GOAL-TEST[problem](state) then return the empty plan
  if state is on path then return failure
  for action, state_set in SUCCESSORS[problem](state) do
    plan ← AND-SEARCH(state_set, problem, [state | path])
    if plan ≠ failure then return [action | plan]
  return failure

function AND-SEARCH(state_set, problem, path) returns a conditional plan or failure
  for each s_i in state_set do
    plan_i ← OR-SEARCH(s_i, problem, path)
    if plan_i = failure then return failure
  return [if s_1 then plan_1 else if s_2 then plan_2 else … if s_n-1 then plan_n-1 else plan_n]
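A runnable sketch of the same search on the double-Murphy vacuum world follows. Several things here are assumptions rather than slide content: a state is a tuple (location, left clean, right clean), a plan is either the empty list or [action, {outcome state: subplan}], failure is None, and moving is assumed to deposit dirt on the destination square.

```python
def and_or_search(initial, goal_test, successors):
    return or_search(initial, goal_test, successors, ())

def or_search(state, goal_test, successors, path):
    if goal_test(state):
        return []                      # the empty plan
    if state in path:
        return None                    # cycle: give up on this branch
    for action, outcomes in successors(state):
        plan = and_search(outcomes, goal_test, successors, path + (state,))
        if plan is not None:
            return [action, plan]
    return None

def and_search(states, goal_test, successors, path):
    branches = {}
    for s in states:                   # every outcome must be handled
        plan = or_search(s, goal_test, successors, path)
        if plan is None:
            return None
        branches[s] = plan
    return branches                    # "if s1 then plan1 else if s2 ..."

def successors(state):
    """Double-Murphy vacuum world (assumed encoding, see lead-in)."""
    loc, clean_l, clean_r = state
    result = []
    if loc == "R":   # moving may deposit dirt on the destination square
        result.append(("Left", {("L", clean_l, clean_r), ("L", False, clean_r)}))
    else:
        result.append(("Right", {("R", clean_l, clean_r), ("R", clean_l, False)}))
    if loc == "L":   # sucking cleans, but may deposit dirt on a clean square
        outcomes = {("L", True, clean_r)}
        if clean_l:
            outcomes.add(("L", False, clean_r))
    else:
        outcomes = {("R", clean_l, True)}
        if clean_r:
            outcomes.add(("R", clean_l, False))
    result.append(("Suck", outcomes))
    return result

def is_goal(state):
    return state == ("L", True, True)

PLAN = and_or_search(("R", True, True), is_goal, successors)
```

Under these assumptions the search returns the plan "Left; if the left square is clean then done, else Suck", which matches the Init and Goal of the conditional-plan slide.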
And-Or-search algorithm
• How does it deal with cycles?
  • When a state that is already on the path appears, return failure
  • Nothing is lost: any non-cyclic solution is also reachable from the earlier occurrence of that state
  • Ensures that the algorithm terminates
• The algorithm does not check whether some state is already on some other path from the root.
Strong Planning
Strong planning
• Strong planning is optimal
  • worst-case execution length is shortest
• Compact BDD representation
• All frontier layers represented = universal plan
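The slides that showed the algorithm itself did not survive as text, so the following is only a sketch of one common formulation, assumed rather than taken from the slides: work backward from the goal in layers, adding every state that has an action whose outcomes all lie in the states already solved. It uses explicit sets instead of BDDs, and the small domain is hypothetical.

```python
def strong_plan(states, actions, result, goal):
    """Return a state -> action policy that reaches the goal whatever the outcomes.

    result(s, a) gives the set of possible next states (empty if a is not applicable).
    """
    policy = {}
    solved = set(goal)
    while True:
        layer = {}
        for s in states:
            if s in solved:
                continue
            for a in actions:
                outcomes = result(s, a)
                if outcomes and outcomes <= solved:
                    layer[s] = a       # every outcome is already solved
                    break
        if not layer:
            return policy              # fixpoint: no new frontier layer
        policy.update(layer)
        solved.update(layer)

# Hypothetical domain (assumption): goal state 3, action "b" may loop in state 0.
OUTCOMES = {
    (0, "a"): {1, 2}, (0, "b"): {0, 3},
    (1, "a"): {3},
    (2, "a"): {3}, (2, "b"): {2},
}

def result(s, a):
    return OUTCOMES.get((s, a), set())

POLICY = strong_plan([0, 1, 2, 3], ["b", "a"], result, {3})
```

Because states are added layer by layer, a state in layer j is at most j steps from the goal in the worst case, and the union of all layers acts as a universal plan for every state from which the goal can be guaranteed.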
Monitoring and replanning
• Execution monitoring: check whether everything is going as planned.
  • Unbounded indeterminacy: some unanticipated circumstances will arise.
  • A necessity in realistic environments.
• Kinds of monitoring:
  • Action monitoring: verify whether the next action will work.
  • Plan monitoring: verify the entire remaining plan.
Monitoring and replanning
• When something unexpected happens: replan
• To avoid spending too much time on planning, try to repair the old plan.
• Can be applied in both fully and partially observable environments, and to a variety of planning representations.
Repair example
Plan monitoring
• Check the preconditions for success of the entire plan.
  • Except those which are achieved by another step in the plan.
• Execution of a doomed plan is cut off earlier.
• Limitation of the replanning agent:
  • It cannot formulate new goals or accept new goals in addition to the current one
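The difference between action monitoring and plan monitoring can be illustrated by simulating the remaining plan from the observed state. This is a minimal sketch under assumptions: steps are STRIPS-like with add and delete sets, the plan is totally ordered, and the names Step, action_monitor and plan_monitor are illustrative.

```python
from typing import NamedTuple

class Step(NamedTuple):
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset = frozenset()

def action_monitor(state, plan):
    """Action monitoring: will the next action work in the observed state?"""
    return not plan or plan[0].pre <= state

def plan_monitor(state, plan):
    """Plan monitoring: will the entire remaining plan work?

    Preconditions achieved by an earlier step of the plan are satisfied
    by the simulation and therefore need not hold in the observed state.
    """
    holds = set(state)
    for step in plan:
        if not step.pre <= holds:
            return False
        holds = (holds - step.delete) | step.add
    return True

# Hypothetical two-step plan: S2 needs B (supplied by S1) and C (supplied by nobody).
REMAINING = [
    Step("S1", frozenset({"A"}), frozenset({"B"})),
    Step("S2", frozenset({"B", "C"}), frozenset({"G"})),
]
```

In the state {A}, action monitoring lets S1 run although the plan is already doomed because C is missing, whereas plan monitoring reports the problem immediately, so replanning can start one step earlier.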