Planning Graph-based Heuristics for Cost-sensitive Temporal Planning
Minh B. Do & Subbarao Kambhampati
CSE Department, Arizona State University
{binhminh,rao}@asu.edu
Motivation
• Multi-dimensional nature of plan quality in metric temporal planning:
  • Temporal quality (e.g. makespan, slack)
  • Plan cost (e.g. cumulative action cost, resource consumption)
• Necessitates multi-objective optimization:
  • Modeling objective functions
  • Tracking different quality metrics and heuristic estimation
Challenge: There may be inter-dependent relations between the different quality metrics
Example
[Figure: map of Tempe, Phoenix, and Los Angeles]
• Option 1: Tempe → Phoenix (Bus) → Los Angeles (Airplane)
  • Less time: 3 hours; more expensive: $200
• Option 2: Tempe → Los Angeles (Car)
  • More time: 12 hours; less expensive: $50
• Given a deadline constraint (6 hours) ⇒ only option 1 is viable
• Given a money constraint ($100) ⇒ only option 2 is viable
General Problem
[Figure: Problem specification and objective function → Planner → Good quality solution]
We do not investigate:
• How to design the objective function?
  • User defined
  • Learning the user's utility model
We investigate:
• Given an objective function that involves both time and cost quality, finding heuristics that are sensitive to the cost function
Our approach
• Use the Temporal Planning Graph (Smith & Weld) structure to track the time-sensitive cost function:
  • Estimation of the earliest time (makespan) to achieve all goals
  • Estimation of the lowest cost to achieve the goals
  • Estimation of the cost to achieve the goals given a specific makespan value
• Use this information to calculate the heuristic value for an objective function involving both time and cost
Outline
• Action representation and Temporal Planning Graph
• Time-sensitive cost functions:
  • Cost propagation using the temporal planning graph
  • Termination criteria for the cost propagation process
• Deriving heuristic values from cost functions
  • Direct calculation
  • Heuristic by relaxed plan extraction
• Empirical evaluation
• Conclusion and future work
Action Representation
• Similar to PDDL2.1 Level 3:
  • Actions have non-uniform durations and may consume resources
  • Preconditions are true at the start point or hold true for the action duration
  • Effects occur at the start or end points
[Figure: action Load(package,truck,place) shown with the facts At(package,place), At(truck,place), and In(package,truck)]
The (Relaxed) Temporal PG
[Figure: relaxed temporal planning graph over Tempe, Phoenix, and Los Angeles with the actions Shuttle(T,P), Heli(T,P), Airplane(P,LA), and Drive-car(Tempe,LA) on a timeline marked t = 0, 0.5, 1, 1.5, 10]
Time-sensitive Cost Function
• A standard (temporal) planning graph (TPG) shows time-related estimates, e.g. the earliest time to achieve a fact or to execute an action
• The TPG does not show the cost estimates for achieving facts or executing actions
Actions in the example:
• Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
• Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
• Car(Tempe,LA): Cost: $100; Time: 10 hours
• Airplane(Phx,LA): Cost: $200; Time: 1.0 hour
[Figure: cost plotted against time for reaching L.A.; cost axis ticks $100, $220, $300; time axis ticks 0, 1.5, 2, 10; drawn above the temporal planning graph with Shuttle(T,P), Heli(T,P), Airplane(P,LA), and Drive-car(Tempe,LA)]
Estimating the Cost Function
• Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
• Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
• Car(Tempe,LA): Cost: $100; Time: 10 hours
• Airplane(Phx,LA): Cost: $200; Time: 1.0 hour
[Figure: step plots of Cost(At(LA)) and of Cost(At(Phx)) = Cost(Flight(Phx,LA)) against time; cost axis ticks $20, $100, $220, $300; time axis ticks 0, 1, 1.5, 2, 10; drawn above the temporal planning graph with timeline marks t = 0, 0.5, 1, 1.5, 2.0, 10]
Cost Propagation
• Issues:
  • At a given time point, each fact is supported by multiple actions
  • Each action has more than one precondition
• Propagation rules:
  • Cost(f,t) = min {Cost(A,t) : f ∈ Effect(A)}
  • Cost(A,t) = Aggregate(Cost(f,t) : f ∈ Pre(A))
    • Sum-propagation: Σ Cost(f,t)
    • Max-propagation: Max {Cost(f,t)}
    • Combination: 0.5 Σ Cost(f,t) + 0.5 Max {Cost(f,t)}
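The propagation rules can be tried out on the Tempe/Phoenix/Los Angeles example. The following is a minimal sketch, not the planner's actual implementation: the function name `propagate`, the tuple layout of `ACTIONS`, and the event queue are illustrative assumptions. It also assumes that an action's own execution cost is added on top of the aggregated precondition costs, which is what reproduces the $220 = $20 + $200 value in the example.

```python
import heapq
from math import inf

# Hypothetical encoding of the example:
# (name, preconditions, effects, duration in hours, execution cost in $)
ACTIONS = [
    ("Shuttle(T,P)", ["At(Tempe)"], ["At(Phx)"], 1.0, 20),
    ("Heli(T,P)", ["At(Tempe)"], ["At(Phx)"], 0.5, 100),
    ("Drive-car(Tempe,LA)", ["At(Tempe)"], ["At(LA)"], 10.0, 100),
    ("Airplane(P,LA)", ["At(Phx)"], ["At(LA)"], 1.0, 200),
]


def propagate(actions, initial_facts, aggregate=sum):
    """Return {fact: [(time, cost), ...]}, a decreasing step function per fact.

    aggregate combines precondition costs: sum gives sum-propagation,
    max gives max-propagation.
    """
    steps = {}  # fact -> list of (time, cost) breakpoints
    best = {}   # fact -> cheapest cost found so far
    queue = [(0.0, 0.0, f) for f in initial_facts]
    heapq.heapify(queue)
    while queue:
        # Events come out in time order (ties: cheapest first).
        t, cost, fact = heapq.heappop(queue)
        if cost >= best.get(fact, inf):
            continue  # no improvement, so no new breakpoint
        best[fact] = cost
        steps.setdefault(fact, []).append((t, cost))
        # Re-evaluate every action that uses this fact as a precondition.
        for _name, pre, eff, dur, exec_cost in actions:
            if fact in pre and all(p in best for p in pre):
                total = exec_cost + aggregate(best[p] for p in pre)
                for e in eff:
                    heapq.heappush(queue, (t + dur, total, e))
    return steps
```

Calling `propagate(ACTIONS, ["At(Tempe)"])` yields the breakpoints (1.5, 300), (2.0, 220), (10.0, 100) for At(LA), matching the plotted cost function. The loop stops when no event can lower any cost, which corresponds to fix-point termination.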
Termination Criteria
• Deadline termination: terminate at time point t if:
  • ∀ goal G: Deadline(G) ≤ t, or
  • ∃ goal G: (Deadline(G) < t) ∧ (Cost(G,t) = ∞)
• Fix-point termination: terminate at the time point t where we cannot improve the cost of any proposition
• K-lookahead approximation: at the time point t where Cost(g,t) < ∞, repeat k times the process of applying the (set of) actions that can improve the cost functions
[Figure: cost plotted against time with the earliest time point and the cheapest cost marked; cost axis ticks $100, $220, $300; time axis ticks 0, 1.5, 2, 10; drawn above the planning graph with Shuttle(T,P), H(T,P), Plane(P,LA), and Drive-car(Tempe,LA)]
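The deadline termination test can be written down directly. This is a sketch under assumptions: cost functions are stored as sorted (time, cost) breakpoint lists, and the names `cost_at` and `deadline_terminate` are hypothetical, not taken from the planner.

```python
from math import inf

# Breakpoints of Cost(At(LA)) from the running example.
COST_AT_LA = [(1.5, 300), (2.0, 220), (10.0, 100)]


def cost_at(steps, t):
    """Cost of a fact at time t; infinite before the first breakpoint."""
    cost = inf
    for start, c in steps:
        if start <= t:
            cost = c
    return cost


def deadline_terminate(t, deadlines, cost_functions):
    """True if propagation may stop at time t under deadline termination.

    deadlines maps each goal to its deadline; cost_functions maps each
    goal to its breakpoint list.
    """
    # Case 1: every goal deadline has been reached.
    all_passed = all(d <= t for d in deadlines.values())
    # Case 2: some goal is past its deadline and still unreachable.
    hopeless = any(
        d < t and cost_at(cost_functions[g], t) == inf
        for g, d in deadlines.items()
    )
    return all_passed or hopeless
```

With a 6-hour deadline on At(LA), the test is false at t = 2 and true at t = 6, so propagation never reaches the $100 breakpoint at t = 10.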
Heuristic estimation using the cost functions
The cost functions carry the information needed to track both the temporal and the cost metrics of the plan, and their inter-dependent relations.
• If the objective function is to minimize time: h = t0
• If the objective function is to minimize cost: h = CostAggregate(G, t∞)
• If the objective function is a function of both time and cost, O = f(time,cost), then: h = min f(t, Cost(G,t)) s.t. t0 ≤ t ≤ t∞
  • E.g. if f(time,cost) = 100·makespan + Cost, then h = 100×2 + 220 at t0 ≤ t = 2 ≤ t∞
[Figure: plot of Cost(At(LA)) against time; cost axis ticks $100, $220, $300; time axis ticks 0, t0 = 1.5, 2, t∞ = 10. Earliest achievement time: t0 = 1.5. Lowest cost time: t∞ = 10]
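The three cases can be checked against the Cost(At(LA)) function from the example. This is a small sketch with hypothetical function names. It assumes the cost function is stored as sorted (time, cost) breakpoints and that f does not decrease as time grows; since the cost is constant between breakpoints, it is then enough to evaluate f at the breakpoints only.

```python
# Breakpoints of Cost(At(LA)) from the running example:
# $300 from t0 = 1.5, $220 from t = 2, $100 from t = 10 (the lowest-cost time).
COST_AT_LA = [(1.5, 300), (2.0, 220), (10.0, 100)]


def h_min_time(steps):
    """Objective is time only: the earliest time the goal is achievable."""
    return steps[0][0]


def h_min_cost(steps):
    """Objective is cost only: the cost at the lowest-cost time."""
    return steps[-1][1]


def h_combined(steps, f):
    """Objective f(time, cost): minimize over the breakpoints."""
    return min(f(t, c) for t, c in steps)
```

For f(time,cost) = 100·makespan + Cost, `h_combined(COST_AT_LA, lambda t, c: 100 * t + c)` evaluates 450, 420, and 1100 at the three breakpoints and returns 420, the value 100×2 + 220 given on the slide.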
Heuristic estimation by extracting the relaxed plan
• A relaxed plan (Hoffmann) satisfies all the goals while ignoring negative interactions:
  • It takes positive interactions into account
  • It provides a base set of actions that can be adjusted according to the neglected (relaxed) information (e.g. negative interactions, resource usage)
Need to find a good relaxed plan (among multiple ones) according to the objective function
Heuristic estimation by extracting the relaxed plan
• General algorithm: traverse backward, searching for actions that support all the goals. When A is added to the relaxed plan RP, then:
  • Supported facts: SF = SF ∪ Effects(A)
  • Goals: G = (G ∪ Precond(A)) \ SF
• Temporal planning with cost: if the objective function is f(time,cost), then A is selected such that f(t(RP+A), C(RP+A)) + f(t(Gnew), C(Gnew)) is minimal, where Gnew = (G ∪ Precond(A)) \ Effects(A)
• Finally, use mutexes to set the orderings between A and the actions in RP so that fewer causal constraints are violated
[Figure: cost plotted against time for the Tempe, Phoenix, L.A. example with f(t,c) = 100·makespan + Cost; cost axis ticks $100, $220, $300; time axis ticks 0, t0 = 1.5, 2, t∞ = 10]
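The backward traversal with its two set updates can be sketched on the running example. This is a deliberately simplified illustration, not the selection rule from the slide: each open goal greedily picks the achiever that minimizes f over the propagated (time, cost) pair, rather than minimizing f(t(RP+A), C(RP+A)) + f(t(Gnew), C(Gnew)), and the mutex-based ordering step is omitted. The tables `ACTIONS` and `ACHIEVERS` and the function name are hypothetical; the achiever values are derived from the example's action costs and durations.

```python
# name -> (preconditions, effects)
ACTIONS = {
    "Shuttle(T,P)": ({"At(Tempe)"}, {"At(Phx)"}),
    "Heli(T,P)": ({"At(Tempe)"}, {"At(Phx)"}),
    "Airplane(P,LA)": ({"At(Phx)"}, {"At(LA)"}),
    "Drive-car(Tempe,LA)": ({"At(Tempe)"}, {"At(LA)"}),
}

# fact -> (time, cost, supporting action), as recorded during propagation
ACHIEVERS = {
    "At(Phx)": [(0.5, 100, "Heli(T,P)"), (1.0, 20, "Shuttle(T,P)")],
    "At(LA)": [
        (1.5, 300, "Airplane(P,LA)"),
        (2.0, 220, "Airplane(P,LA)"),
        (10.0, 100, "Drive-car(Tempe,LA)"),
    ],
}


def extract_relaxed_plan(goals, initial_facts, f):
    """Greedy backward extraction; returns actions in selection order."""
    supported = set(initial_facts)           # SF
    open_goals = set(goals) - supported      # G
    plan = []                                # RP
    while open_goals:
        g = min(open_goals)  # fixed pick order keeps the result deterministic
        _t, _c, name = min(ACHIEVERS[g], key=lambda a: f(a[0], a[1]))
        pre, eff = ACTIONS[name]
        plan.append(name)
        supported |= eff                                # SF = SF ∪ Effects(A)
        open_goals = (open_goals | pre) - supported     # G = (G ∪ Precond(A)) \ SF
    return plan
```

With f(t,c) = 100·makespan + Cost the sketch selects Airplane(P,LA) and then Shuttle(T,P); with a cost-only objective it selects Drive-car(Tempe,LA) alone.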
Empirical evaluation
• Objective:
  • Demonstrate that a metric temporal planner armed with our approach is able to produce plans that satisfy a variety of cost/makespan tradeoffs
• Testing problems:
  • Randomly generated logistics problems from TP4 (Haslum & Geffner)
  • Load/unload(package,location): Cost = 1; Duration = 1
  • Drive-inter-city(location1,location2): Cost = 4.0; Duration = 12.0
  • Flight(airport1,airport2): Cost = 15.0; Duration = 3.0
  • Drive-intra-city(location1,location2,city): Cost = 2.0; Duration = 2.0
Empirical Results
Results over 20 randomly generated temporal logistics problems that involve moving 4 packages between different locations in 3 cities:
O = f(time,cost) = α·Makespan + (1 − α)·TotalCost
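To see how a weight between makespan and total cost shifts the preferred tradeoff, the weighted-sum objective can be evaluated on the Cost(At(LA)) function from the earlier Tempe example. This only illustrates the objective and is not a reproduction of the logistics experiments; the weight is called `alpha` here and the function name is hypothetical.

```python
# Breakpoints of Cost(At(LA)) from the Tempe example: (time in hours, cost in $).
COST_AT_LA = [(1.5, 300), (2.0, 220), (10.0, 100)]


def best_tradeoff(steps, alpha):
    """Return the (time, cost) point minimizing alpha*time + (1 - alpha)*cost."""
    return min(steps, key=lambda p: alpha * p[0] + (1 - alpha) * p[1])
```

With alpha = 0 (cost only) the 10-hour, $100 option wins; with alpha = 1 (makespan only) the 1.5-hour, $300 option wins; alpha = 0.95 favors the intermediate 2-hour, $220 option.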
Empirical Results (cont.)
• A higher look-ahead option generally produces better results in terms of solving time and plan quality
• The relaxed plan heuristic is generally more informative than the direct-calculation heuristic
Related Work
• TGP and TP4 aim at makespan optimization (they do not consider cost)
• MO-GRT does multi-criteria search, but does not exploit the inter-dependent relations between the criteria
• ASPEN (JPL) uses an iterative repair technique to improve multi-dimensional plan quality
Conclusion
• Introduced time-sensitive cost functions to guide the heuristic search according to objective functions involving both time (makespan) and monetary action cost:
  • Propagating the cost functions while building the temporal planning graph
  • Extracting the heuristic values using the cost functions
• Preliminary experimental results with Sapa show the utility of the time-sensitive cost functions
Future Work
• Experiments with domains and problems from the planning competition
• Improving the cost functions through better propagation rules and mutex information when building the temporal planning graph (TGP approach)
• Heuristics for tracking other types of plan quality, such as execution flexibility
• Multi-objective search involving non-combinable criteria