Effective Approaches for Partial Satisfaction (Over-subscription) Planning
Romeo Sanchez *, Menkes van den Briel **, Subbarao Kambhampati *
* Department of Computer Science and Engineering
** Department of Industrial Engineering
Arizona State University, Tempe, Arizona
Outline
• Background
• Example
• Approaches
  • Optiplan
  • AltAltps
  • Sapaps
• Planning graph heuristics
• Results
Background
• [Cartoon speech bubbles]
  • "In one day achieve the following 100 goals: RockData at WP 1, high-res pics at WP 2 & 3, …, SoilData at WP 100"
  • "For all your demands, you could've bought me a better flash memory stick at least!"
  • "No way I can achieve that many goals in one day"
  • "It's hard but here is the best I can do: Goal1, Goal5, Goal99"
• Given: actions with costs and goals with utilities, find a plan with the highest net benefit (utility – cost)
• Previous approaches:
  • Highest-utility goal first
  • Estimating the set of most beneficial goals
Background
• Complete satisfaction (traditional) planning
  • Goal state G is a conjunction of fluents: $G = g_1 \wedge g_2 \wedge \ldots \wedge g_n$
  • A plan that achieves n – 1 goal fluents is as good as a plan that achieves 0 goal fluents
• Partial satisfaction planning (PSP)
  • Goal state G is a set of fluents: $G = \{g_1, g_2, \ldots, g_n\}$
  • Goal fluents may have utilities and actions may have costs, so achieving only part of the goal set may be more beneficial than the "null" plan
  • Achieving all goal fluents might be impossible:
    • The goal state G may contain logically conflicting fluents, e.g. (:goal (and (pointing satellite1 moon) (pointing satellite1 mars)))
    • There might not be enough resources to achieve all fluents in G, e.g. (:goal (and (have_rock rover1 waypoint1) (have_rock rover1 waypoint2)))
PSP problems
• PSP NET BENEFIT:
  • Given a planning problem $P = (F, A, I, G)$, a "cost" $c_a \ge 0$ for each action $a$, a "utility" $u_f \ge 0$ for each goal fluent $f \in G$, and a positive number $k$: is there a finite sequence of actions $\Delta = (a_1, a_2, \ldots, a_n)$ that, starting from $I$, leads to a state $S$ with net benefit $\sum_{f \in (S \cap G)} u_f - \sum_{a \in \Delta} c_a \ge k$?
• [Figure: diagram of the problem classes PLAN EXISTENCE, PLAN LENGTH, PSP GOAL, PSP GOAL LENGTH, PLAN COST, PSP UTILITY, PSP NET BENEFIT, PSP UTILITY COST]
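The net benefit in the definition above can be checked on a toy instance. The following is a minimal sketch, assuming a STRIPS-style action with add and delete lists; the `Action` type, the `net_benefit` helper, and the rover facts and numbers are all hypothetical illustrations, not taken from the slides.

```python
from typing import Dict, FrozenSet, Iterable, List, NamedTuple, Optional


class Action(NamedTuple):
    """Hypothetical STRIPS-style action: preconditions, add/delete effects, cost."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]
    cost: float


def net_benefit(init: Iterable[str], plan: List[Action],
                utility: Dict[str, float]) -> Optional[float]:
    """Utility of goal fluents true in the final state minus total action cost.

    Returns None when some action in the plan is not applicable.
    """
    state = set(init)
    total_cost = 0.0
    for action in plan:
        if not action.pre <= state:
            return None  # plan is not executable from the initial state
        state = (state - action.delete) | action.add
        total_cost += action.cost
    achieved = sum(u for fluent, u in utility.items() if fluent in state)
    return achieved - total_cost


# Made-up instance: one rover, one rock sample worth 10.
init = {"at_wp0"}
move = Action("move", frozenset({"at_wp0"}), frozenset({"at_wp1"}),
              frozenset({"at_wp0"}), 2.0)
sample = Action("sample", frozenset({"at_wp1"}), frozenset({"have_rock"}),
                frozenset(), 1.0)
utility = {"have_rock": 10.0}

example_benefit = net_benefit(init, [move, sample], utility)
print(example_benefit)  # 10 - (2 + 1) = 7.0
```

The empty plan has net benefit 0 under this definition, which is why the "null" plan is the baseline that any partial plan has to beat.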
Example
• Getting from Las Vegas (LV) to San Jose (SJ)
• Legend: C: action cost; U(G): utility of goal G; G1, G2, G3, G4: goals
• P = {travel(LV,DL), travel(DL,SJ), travel(SJ,SF)} achieves G1, G2, G3
Approaches
• Optiplan
  • Integer programming based STRIPS planner
  • Solves the PSP problem by encoding it as an integer program
• AltAltps
  • Heuristic regression planner
  • Solves the PSP problem through a goal selection heuristic
• Sapaps
  • Heuristic forward state space planner
  • Solves the PSP problem using an anytime A* algorithm
Optiplan
• Optiplan planning system:
  • Combines Graphplan (Blum & Furst, 1995) with State Change Encoding (Vossen et al., 1999)
  • As in the Blackbox planning system, Graphplan is used to reduce the size of the encoding that Optiplan generates
  • Computes optimal plans for a given parallel length
• Objective:
  • $\sum_{f \in G} U_f \, (x^{add}_{f,n} + x^{preadd}_{f,n} + x^{maintain}_{f,n}) - \sum_{l \in L} \sum_{a \in A} C_a \, y_{a,l}$
  • Sum of goal utilities – sum of action costs
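To make the objective concrete, here is a small sketch that evaluates it for a fixed 0/1 assignment. It only illustrates the arithmetic and is not Optiplan's encoding or solver; the function name, the dictionary layout, and the numbers are assumptions made for this example.

```python
from typing import Dict, Tuple


def optiplan_objective(utility: Dict[str, float],
                       cost: Dict[str, float],
                       x_add: Dict[str, int],
                       x_preadd: Dict[str, int],
                       x_maintain: Dict[str, int],
                       y: Dict[Tuple[str, int], int]) -> float:
    """Goal utilities collected at the final level n minus cost of chosen actions.

    x_add, x_preadd and x_maintain hold the 0/1 state-change variables of each
    goal fluent at level n; y[(a, l)] is 1 when action a is used at level l.
    """
    goal_term = sum(
        u * (x_add.get(f, 0) + x_preadd.get(f, 0) + x_maintain.get(f, 0))
        for f, u in utility.items()
    )
    action_term = sum(cost[a] * used for (a, _level), used in y.items())
    return goal_term - action_term


# Made-up assignment: goal g1 is added at the last level, g2 is not achieved,
# and only action a1 is used (at level 1).
objective_value = optiplan_objective(
    utility={"g1": 10.0, "g2": 4.0},
    cost={"a1": 3.0, "a2": 5.0},
    x_add={"g1": 1},
    x_preadd={},
    x_maintain={},
    y={("a1", 1): 1, ("a2", 2): 0},
)
print(objective_value)  # 10 - 3 = 7.0
```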
Optiplan and partial satisfaction
• Total satisfaction planning: goal satisfaction is treated as a hard constraint
  • Objective: 0 / minimize #actions
  • Constraints: fluent changes; satisfy initial state; satisfy goal; fluent implications; action implications
• Partial satisfaction planning: goal satisfaction is treated as a soft constraint
  • Objective: maximize net benefit (goal utility – action cost)
  • Constraints: fluent changes; satisfy initial state; fluent implications; action implications
AltAltps
• AltAlt planning system
  • Heuristic state-space search planner (Nguyen, Kambhampati & Sanchez, 2002)
  • Combines Graphplan (Blum & Furst, 1995) with heuristic state-space search techniques (Bonet, Loerincs & Geffner, 1997; Bonet & Geffner, 1999; McDermott, 1999)
• AltAltps planning system
  • Total enumeration of the $2^n$ goal subsets is too costly
  • Selects a promising subset of the top-level goals upfront
  • Searches for a plan using regression state-space search combined with cost-sensitive planning graph heuristics
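AltAltps cost propagation
• Using a planning graph structure
  • Propositions in the initial state come for free (they have zero cost)
  • Other propositions have costs computed as follows, where $h_l(p)$ is the cost of proposition $p$ at level $l$ and $a$ is an action that supports $p$:
    $h_l(p) = \begin{cases} 0 & \text{if } p \in I \\ \min\{h_{l-1}(p),\ cost(a) + C_l(a)\} & \text{if } l > 0 \\ \infty & \text{otherwise} \end{cases}$
• Propagation procedures
  • Max-propagation: $C_l(a) = \max\{h_{l-1}(q) : q \in prec(a)\}$
  • Sum-propagation: $C_l(a) = \sum_{q \in prec(a)} h_{l-1}(q)$
• [Figure: planning graph over levels l=0, l=1, l=2 annotated with propagated proposition costs]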
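The propagation rules above can be sketched in a few lines. This is a simplified illustration, assuming delete effects and mutexes are ignored and that propositions missing from a level have infinite cost; the `Action` type, `propagate_costs`, and the toy facts are hypothetical names, not AltAltps code.

```python
import math
from typing import Dict, FrozenSet, Iterable, List, NamedTuple


class Action(NamedTuple):
    """Hypothetical relaxed action: preconditions, add effects, execution cost."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    cost: float


def propagate_costs(init: Iterable[str], actions: List[Action],
                    levels: int, mode: str = "max") -> List[Dict[str, float]]:
    """Return h[l][p], the propagated cost of proposition p at level l.

    mode="max" uses C_l(a) = max of precondition costs,
    mode="sum" uses C_l(a) = sum of precondition costs.
    """
    if mode not in ("max", "sum"):
        raise ValueError("mode must be 'max' or 'sum'")
    combine = max if mode == "max" else sum
    h = [{p: 0.0 for p in init}]          # level 0: initial facts are free
    for _ in range(levels):
        prev = h[-1]
        cur = dict(prev)                   # h_l(p) never exceeds h_{l-1}(p)
        for a in actions:
            if all(q in prev for q in a.pre):   # a is applicable at this level
                c_a = combine(prev[q] for q in a.pre) if a.pre else 0.0
                for p in a.add:
                    cur[p] = min(cur.get(p, math.inf), a.cost + c_a)
        h.append(cur)
    return h


# Made-up chain: reach at_b, then at_c, then sample using both.
toy_actions = [
    Action("move_ab", frozenset({"at_a"}), frozenset({"at_b"}), 5.0),
    Action("move_bc", frozenset({"at_b"}), frozenset({"at_c"}), 3.0),
    Action("sample", frozenset({"at_b", "at_c"}), frozenset({"have"}), 1.0),
]
h_max = propagate_costs({"at_a"}, toy_actions, levels=3, mode="max")
h_sum = propagate_costs({"at_a"}, toy_actions, levels=3, mode="sum")
print(h_max[3]["have"], h_sum[3]["have"])  # 9.0 14.0
```

In this toy case sum-propagation counts the cost of reaching at_b twice (once directly, once inside the cost of at_c), which is why it gives the larger estimate.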
AltAltps goal set selection
• Main idea
  • Start with the original goal set G and an empty goal set G’
  • Iteratively add goals to G’ as long as the estimated NET BENEFIT increases
  • The cost of adding another goal g to G’ depends on the goals that are already in G’
• [Figure: G’ and G’ ∪ {g}. Cost for achieving G’: relaxed plan for G’ (R’p). Residual cost for g: Rp for G’ ∪ {g}, biased to re-use actions in R’p]
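The greedy loop described above might look as follows. This is a sketch under stated assumptions: `relaxed_cost` stands in for the relaxed-plan cost estimate and is supplied by the caller, and the toy estimator simply charges each needed action once, which mimics the re-use of actions already in the relaxed plan. None of the names or numbers come from AltAltps itself.

```python
from typing import Callable, Dict, FrozenSet, Set, Tuple


def select_goals(goals: Set[str], utility: Dict[str, float],
                 relaxed_cost: Callable[[FrozenSet[str]], float]
                 ) -> Tuple[Set[str], float]:
    """Greedily grow G' while the estimated net benefit strictly increases."""
    selected: Set[str] = set()
    best = 0.0                                  # net benefit of the empty goal set
    while True:
        candidate, candidate_benefit = None, best
        for g in goals - selected:
            trial = frozenset(selected | {g})
            benefit = sum(utility[x] for x in trial) - relaxed_cost(trial)
            if benefit > candidate_benefit:
                candidate, candidate_benefit = g, benefit
        if candidate is None:                   # no goal improves the estimate
            return selected, best
        selected.add(candidate)
        best = candidate_benefit


# Hypothetical estimator: each goal needs some actions; shared actions are paid once.
needs = {"g1": {"move", "drill"}, "g2": {"move", "photo"}, "g3": {"fly"}}
action_cost = {"move": 5.0, "drill": 2.0, "photo": 1.0, "fly": 20.0}


def toy_relaxed_cost(goal_set: FrozenSet[str]) -> float:
    used = set().union(*(needs[g] for g in goal_set)) if goal_set else set()
    return sum(action_cost[a] for a in used)


chosen_goals, estimated_benefit = select_goals(
    {"g1", "g2", "g3"}, {"g1": 8.0, "g2": 3.0, "g3": 10.0}, toy_relaxed_cost)
print(sorted(chosen_goals), estimated_benefit)  # ['g1', 'g2'] 3.0
```

Here g2 alone would lose money (3 – 6), but once g1 is selected its residual cost is only the photo action, so adding it raises the estimate from 1 to 3.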
AltAltps cost-sensitive relaxed plan heuristic
• General procedure
  • States are ranked during search using the relaxed plan heuristic and the propagated costs
  • The idea is to compute the cost of a relaxed plan Rp in terms of the costs of the actions composing it
  • The heuristic value for S is $h(S) = \sum_{a \in R_p} cost(a)$
  • Given a state S, remove the (sub)goal g from S that has the highest $h_l(g)$
  • Select the action a that supports g with the lowest cost ($cost(a) + C_l(a)$)
  • Regress S over a to get $S' = (S \cup prec(a)) \setminus eff(a)$
  • Stop when each proposition $q \in S$ is present in the initial state
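The extraction steps above can be sketched as a short loop. The sketch assumes strictly positive action costs and a converged table `h` of propagated proposition costs (here written by hand and using max-propagation for the support cost); `Action`, `relaxed_plan_cost`, and the toy facts are hypothetical names for illustration only.

```python
import math
from typing import Dict, FrozenSet, Iterable, List, NamedTuple, Tuple


class Action(NamedTuple):
    """Hypothetical relaxed action: preconditions, add effects, execution cost."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    cost: float


def relaxed_plan_cost(state: Iterable[str], init: Iterable[str],
                      actions: List[Action], h: Dict[str, float]
                      ) -> Tuple[float, List[str]]:
    """Regress over cheapest supporters; return (h(S), chosen action names)."""
    init_facts = frozenset(init)
    subgoals = set(state) - init_facts
    chosen: Dict[str, float] = {}              # each action is paid for once
    while subgoals:                             # stop when S is inside the initial state
        g = max(subgoals, key=lambda p: h.get(p, math.inf))   # costliest subgoal first
        supporters = [a for a in actions
                      if g in a.add and all(q in h for q in a.pre)]
        if not supporters:
            return math.inf, []                 # g cannot be supported
        best = min(supporters,
                   key=lambda a: a.cost + max((h[q] for q in a.pre), default=0.0))
        chosen[best.name] = best.cost
        # S' = (S ∪ prec(a)) \ eff(a), dropping facts already true initially
        subgoals = ((subgoals | best.pre) - best.add) - init_facts
    return sum(chosen.values()), list(chosen)


toy_actions = [
    Action("move_ab", frozenset({"at_a"}), frozenset({"at_b"}), 5.0),
    Action("move_bc", frozenset({"at_b"}), frozenset({"at_c"}), 3.0),
    Action("sample", frozenset({"at_b", "at_c"}), frozenset({"have"}), 1.0),
]
toy_h = {"at_a": 0.0, "at_b": 5.0, "at_c": 8.0, "have": 9.0}

rp_cost, rp_actions = relaxed_plan_cost({"have"}, {"at_a"}, toy_actions, toy_h)
print(rp_cost, rp_actions)  # 9.0 ['sample', 'move_bc', 'move_ab']
```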
SAPAPS: a forward A* approach for PSP
• Node evaluation:
  • g(S) = U(S) – C(S)
  • h(S) = U(RP(S)) – C(RP(S))
  • A*: f(S) = g(S) + h(S)
• Beneficial node: g(S) > 0, i.e. U(S) > C(S)
• Termination node: $\forall S'$: g(S) > f(S’)
• Anytime A* algorithm: search through the best beneficial nodes
• Example actions: A1: Navigate(X,Y), A2: SampleSoil(Y), A3: TakePicture, A4: Navigate(Y,Z), A5: SampleRock
  • g(S) = Util(HasSoilData) – Cost(A1,A2)
  • h(S) = Util(Apply(A3,S)) – Cost(A3)
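The search scheme above can be sketched as a best-first loop that remembers the best beneficial node seen so far. This is a simplified illustration, not SAPAPS: in place of the relaxed-plan estimate U(RP(S)) – C(RP(S)) it uses a cruder optimistic bound (the total utility of goals not yet achieved), and the `Action` type, `anytime_search`, and the rover numbers are assumptions made for the example.

```python
import heapq
import itertools
import math
from typing import Callable, Dict, FrozenSet, Iterable, List, NamedTuple, Tuple


class Action(NamedTuple):
    """Hypothetical STRIPS-style action with add/delete effects and a cost."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]
    cost: float


def anytime_search(init: Iterable[str], actions: List[Action],
                   utility: Dict[str, float],
                   h: Callable[[FrozenSet[str]], float]
                   ) -> Tuple[List[str], float]:
    """Forward best-first search on f = g + h, maximizing net benefit g."""
    def g_value(state: FrozenSet[str], cost: float) -> float:
        return sum(u for f, u in utility.items() if f in state) - cost

    start = frozenset(init)
    best_plan, best_g = [], g_value(start, 0.0)      # the null plan is the baseline
    tie = itertools.count()                           # avoids comparing states in the heap
    open_list = [(-(best_g + h(start)), next(tie), start, 0.0, [])]
    cheapest = {start: 0.0}                           # lowest cost found per state
    while open_list:
        neg_f, _, state, cost, plan = heapq.heappop(open_list)
        if best_g > -neg_f:                           # g(S) > f(S') for every open S'
            break
        for a in actions:
            if not a.pre <= state:
                continue
            nxt = (state - a.delete) | a.add
            ncost = cost + a.cost
            if ncost >= cheapest.get(nxt, math.inf):
                continue                              # state already reached as cheaply
            cheapest[nxt] = ncost
            g = g_value(nxt, ncost)
            new_plan = plan + [a.name]
            if g > best_g:                            # better beneficial node found
                best_g, best_plan = g, new_plan
            heapq.heappush(open_list, (-(g + h(nxt)), next(tie), nxt, ncost, new_plan))
    return best_plan, best_g


rover_utility = {"has_soil": 5.0, "has_pic": 3.0}
rover_actions = [
    Action("navigate", frozenset({"at_x"}), frozenset({"at_y"}), frozenset({"at_x"}), 2.0),
    Action("sample_soil", frozenset({"at_y"}), frozenset({"has_soil"}), frozenset(), 1.0),
    Action("take_picture", frozenset({"at_y"}), frozenset({"has_pic"}), frozenset(), 4.0),
]


def optimistic_h(state: FrozenSet[str]) -> float:
    # Upper bound on remaining benefit: pretend every missing goal is free.
    return sum(u for f, u in rover_utility.items() if f not in state)


search_result = anytime_search({"at_x"}, rover_actions, rover_utility, optimistic_h)
print(search_result)  # (['navigate', 'sample_soil'], 2.0)
```

In this made-up instance the picture costs more than it is worth (3 < 4), so the search settles on collecting the soil sample only; a real anytime planner would also report each improved plan as soon as it is found.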
SAPAPS: heuristic
• Heuristic: variation of SAPA’s approach
  • Heuristically extract the least-cost relaxed plan using cost functions
  • Remove "unbeneficial" goals and related actions
• [Figure: a relaxed plan over goals G1, G2, G3 and actions A1 to A4 is pruned (→) to one over G1 and G2; annotation: C(A1) + C(A2) > U(G3)]