Extending Graphplan to handle Resources Presenter: Pham Van Cuong Department of Computer Science New Mexico State University
Motivation • Planning with Resources is ubiquitous in real life. • Actions in the real world often need resources to execute.
Motivation • Our approach to planning with Resources is based on Graphplan, a well-known planning algorithm. • Techniques that make Graphplan attractive: • Polynomial time construction of planning Graph. • Use of mutexes to enhance the search for a plan.
Outline • Graphplan background: STRIPS language; planning Graph & mutexes. • Planning with Resources. • GPR - A Graphplan with Resources: input language (PDDL 2.1 level 2); data structures; mutexes; algorithm. • Experimental Results.
STRIPS language • A STRIPS action a is specified by an expression of the form action a :Pre Pre(a) :Add Add(a) :Del Del(a) For example, Action Drive(Car,LC,EP) :Pre {At(Car,LC),Has-fuel(Car)} :Add {At(Car,EP)} :Del {At(Car, LC),Has-fuel(Car)}
STRIPS language • The result of executing an action a in a state s is Res(a,s) = (s ∪ Add(a)) \ Del(a) if a is executable in s (that is, Pre(a) ⊆ s), and Res(a,s) = ⊥ (undefined) otherwise. • The result of executing a sequence of actions [a1, a2, …, an] in a state s is • Res([ ],s) = s • Res([a1, a2, …, an],s) = Res(an, Res([a1, a2, …, an-1],s)), where Res(a,⊥) = ⊥ for every a.
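The Res function above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: states are frozensets of fluent names, `None` stands in for the undefined result, and the names `Action`, `res` and `res_seq` are hypothetical, not part of any planner's code.

```python
from typing import FrozenSet, NamedTuple, Optional, Sequence

State = FrozenSet[str]


class Action(NamedTuple):
    """A STRIPS action: preconditions, add-effects and del-effects."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def res(action: Action, state: Optional[State]) -> Optional[State]:
    """Res(a, s); None plays the role of the undefined state (assumption)."""
    if state is None or not action.pre <= state:
        return None  # a is not executable in s
    return (state | action.add) - action.delete


def res_seq(actions: Sequence[Action], state: Optional[State]) -> Optional[State]:
    """Res([a1, ..., an], s), computed by applying the actions left to right."""
    for action in actions:
        state = res(action, state)
    return state


# The Drive example from the STRIPS slide.
drive = Action(
    name="Drive(Car,LC,EP)",
    pre=frozenset({"At(Car,LC)", "Has-fuel(Car)"}),
    add=frozenset({"At(Car,EP)"}),
    delete=frozenset({"At(Car,LC)", "Has-fuel(Car)"}),
)
initial = frozenset({"At(Car,LC)", "Has-fuel(Car)"})
```

Executing `drive` once in `initial` leaves only `At(Car,EP)`; executing it a second time is undefined, because the car is no longer at LC and has no fuel.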
STRIPS language • A planning problem is a tuple <P,A,I,G>, where P is a finite set of fluents, A is a finite set of actions, I (the initial state) is a set of fluents, and G (the goal) is a finite set of fluent literals. • Given a planning problem Q=<P,A,I,G>, a sequence of actions [a1, a2 …, an] is a solution (plan) to Q if Res([a1, a2 …, an],I) is defined and G holds in Res([a1, a2 …, an],I).
Graphplan – Planning Graph • a directed, leveled graph with a set of nodes and a set of edges. • The levels alternate between proposition levels and action levels. • The proposition levels contain proposition nodes, each of which is labeled with a fluent literal. • The action levels contain action nodes, each has an action as its label.
Graphplan – Planning Graph • An edge represents the relation between an action and a proposition. • At time t, action nodes are connected to: • their preconditions in proposition level t by precondition edges. • their add-effects and del-effects in proposition level t+1 by add-edges and del-edges, respectively.
Graphplan – mutex • Two actions A and B are mutex with each other if: • action A deletes a precondition or an add-effect of B, or vice versa; or • a precondition of action A and a precondition of action B are mutex in the previous proposition level. • Two propositions p and q are mutex if all ways of creating p are mutex with all ways of creating q.
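The two mutex tests above could be coded roughly as follows. This is a minimal sketch, not Graphplan's actual implementation: mutex relations are assumed to be stored as sets of unordered pairs (`frozenset`s), the function names are hypothetical, and both propositions are assumed to have at least one creator in the previous action level.

```python
from itertools import product
from typing import FrozenSet, List, NamedTuple, Set


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def actions_mutex(a: Action, b: Action, prop_mutex: Set[FrozenSet[str]]) -> bool:
    """True if a and b are mutex according to the two conditions on the slide."""
    # One action deletes a precondition or an add-effect of the other.
    if a.delete & (b.pre | b.add) or b.delete & (a.pre | a.add):
        return True
    # Some precondition of a is mutex with some precondition of b.
    return any(frozenset({p, q}) in prop_mutex for p, q in product(a.pre, b.pre))


def props_mutex(p: str, q: str, actions: List[Action],
                action_mutex: Set[FrozenSet[str]]) -> bool:
    """True if every way of creating p is mutex with every way of creating q."""
    makers_p = [a.name for a in actions if p in a.add]
    makers_q = [a.name for a in actions if q in a.add]
    return all(frozenset({x, y}) in action_mutex
               for x, y in product(makers_p, makers_q))


# Two hypothetical drives from LC: each deletes the other's precondition.
go_ep = Action("go_ep", frozenset({"At(Car,LC)"}),
               frozenset({"At(Car,EP)"}), frozenset({"At(Car,LC)"}))
go_home = Action("go_home", frozenset({"At(Car,LC)"}),
                 frozenset({"At(Car,Home)"}), frozenset({"At(Car,LC)"}))
```

Here `go_ep` and `go_home` are mutex, and so are `At(Car,EP)` and `At(Car,Home)`, since their only creators are mutex with each other.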
Current approaches to Planning with Resources- some characteristics • State based search (Metric FF, LGP…). • Using heuristic function to guide search (Sapa, TP4 …) • Forward chaining approach (TLPlan …) • No existing planner uses mutexes of planning Graph to guide the search.
GPR- A Graphplan with Resources • Based on the Graphplan algorithm. • Uses mutexes to direct the search for a plan. • Generates a concurrent plan.
GPR- Input language (syntax) • F = FB ∪ FN, where FB is the set of boolean fluents and FN is the set of numeric fluents. • An assignment is of the form f=v, where f ∈ F and v ∈ Df (the domain of f). • A set δ of assignments is: • consistent if for every fluent f ∈ F there exists at most one assignment of the form f=v in δ. • complete if for every fluent f ∈ F there exists at least one assignment of the form f=v in δ.
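As a small illustration of the two properties, assignments can be modelled as (fluent, value) pairs. The representation and the function names below are assumptions made for this sketch only.

```python
from typing import Set, Tuple, Union

Value = Union[bool, float]
Assignment = Tuple[str, Value]  # the pair (f, v) stands for f=v


def is_consistent(delta: Set[Assignment]) -> bool:
    """At most one assignment f=v per fluent f."""
    fluents = [f for f, _ in delta]
    return len(fluents) == len(set(fluents))


def is_complete(delta: Set[Assignment], all_fluents: Set[str]) -> bool:
    """At least one assignment f=v for every fluent f in F."""
    return all_fluents <= {f for f, _ in delta}


F = {"At(plane,EP)", "fuel(plane)"}
delta = {("At(plane,EP)", True), ("fuel(plane)", 500.0)}
clash = delta | {("fuel(plane)", 300.0)}
```

`delta` is both consistent and complete, so it qualifies as a state; `clash` is complete but not consistent, because it assigns two values to `fuel(plane)`.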
GPR- Input language (syntax) • A numeric constraint is a triple (exp1, comp, exp2), where comp ∈ {>, =, <, ≥, ≤} is a comparator. • A numeric effect is a tuple of the form (f, aop, exp), where f ∈ FN and aop ∈ {assign, increase, decrease, scale-up, scale-down} is an assignment operator. • A condition con is a pair (b(con), n(con)), where b(con) ⊆ FB and n(con) is a set of numeric constraints.
GPR- Input language (syntax) • An action a is a pair (Pre(a),Eff(a)), where • Pre(a) is a condition • Eff(a) is a triple (b-add(Eff(a)), b-del(Eff(a)), n(Eff(a))) with b-add(Eff(a)), b-del(Eff(a)) ⊆ FB; n(Eff(a)) is a set of numeric effects that contains no two effects (f,aop,exp) and (f,aop’,exp’) on the same fluent f. • For example, action FLY(plane,EP,LAX) Pre: ({At(plane,EP)}, {(> (fuel plane) 300)}) Eff: ({At(plane,LAX)}, {At(plane,EP)}, {(decrease (fuel plane) 200)}).
GPR- Input language (semantics) • A state is a consistent and complete set of assignments. • An assignment f=v holds in a state s, denoted by s ╞ (f=v), if (f=v) ∈ s. A set of assignments δ holds in s, denoted by s ╞ δ, if s ╞ (f=v) for all (f=v) ∈ δ. • A numeric constraint (exp1, comp, exp2) holds in a state s, denoted by s ╞ (exp1, comp, exp2), if both exp1 and exp2 are defined in s and exp1 comp exp2 holds.
GPR- Input language (semantics) • A set of numeric constraints C holds in a state s if s ╞ c for every numeric constraint c ∈ C. • A condition con=(b(con),n(con)) holds in a state s (s ╞ con) if s ╞ b(con) and s ╞ n(con). • An action a is executable in a state s if s ╞ Pre(a).
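The satisfaction relation for conditions can be sketched as below. Several simplifications are assumed here: a state is a dict from fluent names to values, an expression is either a numeric constant or a single fluent name, a boolean fluent in b(con) holds when it is true in the state, and all identifiers are hypothetical.

```python
import operator
from typing import Dict, Iterable, Optional, Set, Tuple, Union

Value = Union[bool, float]
State = Dict[str, Value]
Expr = Union[str, float]            # a fluent name or a constant (simplification)
Constraint = Tuple[Expr, str, Expr]  # (exp1, comp, exp2)

COMPARATORS = {">": operator.gt, "=": operator.eq, "<": operator.lt,
               ">=": operator.ge, "<=": operator.le}


def evaluate(exp: Expr, state: State) -> Optional[Value]:
    """Value of exp in the state, or None if exp is undefined there."""
    return state.get(exp) if isinstance(exp, str) else exp


def holds_constraint(c: Constraint, state: State) -> bool:
    """s |= (exp1, comp, exp2): both sides defined and the comparison true."""
    exp1, comp, exp2 = c
    v1, v2 = evaluate(exp1, state), evaluate(exp2, state)
    if v1 is None or v2 is None:
        return False
    return COMPARATORS[comp](v1, v2)


def holds_condition(b_con: Set[str], n_con: Iterable[Constraint],
                    state: State) -> bool:
    """s |= con for con = (b(con), n(con))."""
    return (all(state.get(f) is True for f in b_con)
            and all(holds_constraint(c, state) for c in n_con))


# Precondition of FLY(plane,EP,LAX) from the syntax slide.
fly_pre = ({"At(plane,EP)"}, [("fuel(plane)", ">", 300)])
s = {"At(plane,EP)": True, "fuel(plane)": 500.0}
```

With 500 units of fuel the precondition of FLY holds in `s`; with 250 units, or with the `fuel(plane)` fluent missing, it does not.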
GPR- Input language (semantics) • If a is executable in s, the resulting state Res(a,s) contains the following assignments, where s(exp) denotes the value of exp in s: • f=true if f ∈ b-add(Eff(a)) • f=false if f ∈ b-del(Eff(a)) • f=s(exp) if (f,assign,exp) ∈ n(Eff(a)) • f=s(f)+s(exp) if (f,increase,exp) ∈ n(Eff(a)) • f=s(f)-s(exp) if (f,decrease,exp) ∈ n(Eff(a)) • f=s(f)*s(exp) if (f,scale-up,exp) ∈ n(Eff(a)) • f=s(f)/s(exp) if (f,scale-down,exp) ∈ n(Eff(a)) • f=s(f) if none of the cases above assigns a value to f.
GPR- Input language (semantics) • If a is not executable in s, then Res(a,s) = ⊥ (undefined). • For a sequence [a1, a2, …, an] of actions, Res([a1, a2, …, an],s) = Res(an, Res([a1, a2, …, an-1],s)) and Res([ ],s) = s, where Res(a,⊥) = ⊥ for every a.
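The numeric state transition can be illustrated with the FLY action. The sketch below makes several assumptions that are not in the slides: states are dicts, `None` represents the undefined state, numeric expressions are constants, executability is supplied as a Python predicate standing in for s ╞ Pre(a), and the initial fuel level of 500 is invented for the example.

```python
from typing import Callable, Dict, List, NamedTuple, Optional, Sequence, Set, Tuple, Union

Value = Union[bool, float]
State = Dict[str, Value]

# One function per assignment operator: (old value, value of exp) -> new value.
AOPS = {
    "assign": lambda old, x: x,
    "increase": lambda old, x: old + x,
    "decrease": lambda old, x: old - x,
    "scale-up": lambda old, x: old * x,
    "scale-down": lambda old, x: old / x,
}


class Action(NamedTuple):
    name: str
    executable: Callable[[State], bool]   # stands in for s |= Pre(a)
    b_add: Set[str]
    b_del: Set[str]
    n_eff: List[Tuple[str, str, float]]   # (f, aop, exp) with constant exp


def res(a: Action, s: Optional[State]) -> Optional[State]:
    """Res(a, s) for a single action; None means undefined."""
    if s is None or not a.executable(s):
        return None
    nxt = dict(s)                  # untouched fluents keep their value
    for f in a.b_add:
        nxt[f] = True
    for f in a.b_del:
        nxt[f] = False
    for f, aop, exp in a.n_eff:
        nxt[f] = AOPS[aop](s.get(f), exp)
    return nxt


def res_seq(actions: Sequence[Action], s: Optional[State]) -> Optional[State]:
    """Res([a1, ..., an], s), applied left to right."""
    for a in actions:
        s = res(a, s)
    return s


fly = Action(
    name="FLY(plane,EP,LAX)",
    executable=lambda s: s.get("At(plane,EP)") is True and s.get("fuel(plane)", 0) > 300,
    b_add={"At(plane,LAX)"},
    b_del={"At(plane,EP)"},
    n_eff=[("fuel(plane)", "decrease", 200)],
)
s0 = {"At(plane,EP)": True, "At(plane,LAX)": False, "fuel(plane)": 500.0}
```

Flying once moves the plane to LAX and leaves 300 units of fuel; a second FLY from EP is not executable, so the result of the two-action sequence is undefined.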
GPR- Input language (semantics) • A planning problem is a tuple (F,A,I,G), where F = FB ∪ FN, A is a finite set of actions, I (the initial state) is a set of assignments, and G (the goal) is a condition. • A sequence [a1, a2, …, an] of actions is a solution (plan) to a numeric planning problem if Res([a1, a2, …, an],I) is defined and Res([a1, a2, …, an],I) ╞ G. • The semantics can be extended to allow parallel actions.
GPR - Planning Graph • A directed, leveled graph with a set of nodes and a set of edges. • The levels alternate between fluent levels and action levels. • The action levels contain action nodes, each of which is labeled with an action that is executable in that level. • The fluent levels contain fluent nodes, each of which is labeled with an assignment.
GPR - Planning Graph • An edge represents the relation between an action and an assignment. • At time t, each action node a is connected: • by incoming edges to the assignments f=v in fluent level t that make Pre(a) hold, denoted by Pre(a,t) ╞ (f=v). • by outgoing edges to the assignments created by a's effects in fluent level t+1.
GPR– mutex between actions A and B at level t • Inconsistent effects: • An add-effect of A is negated by B or vice versa. • Interference : • One of the del-effects of A is a precondition of B or vice versa. • There exist two mutexed assignments f1=v1 and f2=v2 in level t and Pre(A,t)╞(f1=v1) and Pre(B,t)╞(f2=v2).
GPR – mutex between assignments • Two assignments f1=v1 and f2=v2 are mutex at time t if: • f1 = f2 and v1≠ v2 ; or, • all ways of creating f1=v1 are mutex with all ways of creating f2=v2.
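The assignment-mutex test can be sketched as follows. This is an illustration only: the `creators` map, the pair-set encoding of action mutexes, and the second action `FLY(plane,EP,SFO)` are hypothetical, and the sketch additionally assumes that assignments without any creating action (for instance those in the first fluent level) are never mutex by the second rule.

```python
from typing import Dict, FrozenSet, Set, Tuple

Assignment = Tuple[str, object]  # (f, v) stands for f=v


def assignments_mutex(x: Assignment, y: Assignment,
                      creators: Dict[Assignment, Set[str]],
                      action_mutex: Set[FrozenSet[str]]) -> bool:
    """True if x and y are mutex at the current fluent level."""
    (f1, v1), (f2, v2) = x, y
    # Rule 1: same fluent, different values.
    if f1 == f2 and v1 != v2:
        return True
    # Rule 2: every creator of x is mutex with every creator of y.
    cx, cy = creators.get(x, set()), creators.get(y, set())
    if not cx or not cy:
        return False  # assumption: no creators means no rule-2 mutex
    return all(frozenset({a, b}) in action_mutex for a in cx for b in cy)


at_lax = ("At(plane,LAX)", True)
at_sfo = ("At(plane,SFO)", True)
creators = {at_lax: {"FLY(plane,EP,LAX)"}, at_sfo: {"FLY(plane,EP,SFO)"}}
# Both flights delete At(plane,EP), which the other needs, so they are mutex.
action_mutex = {frozenset({"FLY(plane,EP,LAX)", "FLY(plane,EP,SFO)"})}
```

Two different fuel values for the same plane are mutex by the first rule; `at_lax` and `at_sfo` are mutex by the second, since their only creators exclude each other.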
GPR – Algorithm description • The GPR algorithm alternates between two phases: constructing the planning Graph and extracting a solution. • The planning Graph is extended until it levels off or a valid plan is found. • The solution-extraction phase starts whenever the goal is reached.
GPR – Constructing planning Graph • Fluent level 1 consists of all assignments in the initial state. • Once an executable action A is found at time t, GPR does the following: • creates an action node in level t and labels it with A. • for each assignment f=v in fluent level t s.t. Pre(A,t) ╞ (f=v), adds an edge connecting it to A. • for each assignment f=v created by some effect of A, creates a fluent node with the label f=v, adds it to fluent level t+1, and then inserts an edge from A to this node. • finds all action nodes in level t that are mutex with A and updates the mutex list.
GPR- Extracting a solution • Given a goal G_t at time t, GPR non-deterministically selects a set of actions A_{t-1} and computes the goal G_{t-1} as follows. • A_{t-1} is a set of actions in level t-1 s.t. for each g ∈ G_t there exists an edge from some a ∈ A_{t-1} to g. • G_{t-1} is the set of assignments in level t-1 s.t. every action a ∈ A_{t-1} is executable in G_{t-1}. • If t=0 and G_t ⊆ I, a solution has been found.
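One backward step of the extraction phase could look like the sketch below, which replaces the non-deterministic choice by enumerating all candidates. It is not GPR's actual code: the `creators` and `needs` maps are hypothetical encodings of the outgoing and incoming edges, each action is assumed to have a single fixed set of supporting assignments, skipping mutex action pairs is the usual Graphplan practice assumed here, and the fuel level of 500 is invented for the example.

```python
from itertools import product
from typing import Dict, FrozenSet, Iterator, Set, Tuple

Assignment = Tuple[str, object]  # (f, v) stands for f=v


def candidate_steps(
    goal: FrozenSet[Assignment],
    creators: Dict[Assignment, Set[str]],          # edges action -> assignment
    needs: Dict[str, FrozenSet[Assignment]],       # edges assignment -> action
    action_mutex: Set[FrozenSet[str]],
) -> Iterator[Tuple[FrozenSet[str], FrozenSet[Assignment]]]:
    """Yield every pair (A_{t-1}, G_{t-1}) that supports the goal G_t."""
    goal_list = sorted(goal, key=repr)
    seen = set()
    # Pick one creating action per goal assignment.
    for choice in product(*(sorted(creators.get(g, ())) for g in goal_list)):
        chosen = frozenset(choice)
        if chosen in seen:
            continue
        seen.add(chosen)
        # Skip action sets that contain a mutex pair.
        if any(frozenset({a, b}) in action_mutex
               for a in chosen for b in chosen if a < b):
            continue
        # G_{t-1}: everything the chosen actions need in order to execute.
        subgoal = frozenset().union(*(needs[a] for a in chosen))
        yield chosen, subgoal


goal = frozenset({("At(plane,LAX)", True)})
creators = {("At(plane,LAX)", True): {"FLY(plane,EP,LAX)"}}
needs = {"FLY(plane,EP,LAX)": frozenset({("At(plane,EP)", True),
                                         ("fuel(plane)", 500.0)})}
initial = frozenset({("At(plane,EP)", True), ("fuel(plane)", 500.0)})
steps = list(candidate_steps(goal, creators, needs, set()))
```

In this one-level example the only candidate is the FLY action, and its subgoal is contained in the initial state, which is the termination test described above.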
GPR- Experimental Results • GPR generates a concurrent plan for the Rocket domain in 3 time steps with. • For the Rocket domain with renewable Resources, GPR also generates a good-quality plan. • GPR is available at www.cs.nmsu.edu/~cvan
Future work • GPR is a first step towards a planner for domains with resources. It can be improved by: • Finding a plan that satisfies additional constraints (e.g. minimal resource consumption). • Considering actions with duration.