SAPA: A Domain-independent Heuristic Temporal Planner Minh B. Do & Subbarao Kambhampati Arizona State University
Good morning, friends. Obviously this is Binh Minh's paper. In any case, I convinced him that it would be better for him to spend his time working on another upcoming paper rather than visiting Toledo, a midwestern town in Ohio. I understand this is basically the same strategy Malik used to present Romain's paper as well.
Talk Outline • Temporal Planning and SAPA • Action representation and search algorithm • Objective functions and heuristics • Admissible/Inadmissible • Resource adjustment • Empirical results • Related & future work
Planning
• Most academic research has been done in the context of classical planning:
• Already PSPACE-complete
• Useful techniques are likely to be applicable to more expressive planning problems
• Real-world applications normally have more complex requirements:
• Non-instantaneous actions
• Temporal constraints on goals
• Resource consumption
Classical planning has recently been able to scale up to big problems. Can its winning strategies be applied in these more expressive environments?
Related Work
• Planners that can handle similar types of temporal and resource constraints: TLPlan, HSTS, IxTeT, Zeno
• Cannot scale up without domain knowledge
• Planners that can handle a subset of constraints:
• Only temporal: TGP
• Only resources: LPSAT, GRT-R
• Subset of temporal and resource constraints: TP4, Resource-IPP
SAPA
• Forward state-space planner
• Based on [Bacchus & Ady]
• Makes resource reasoning easier
• Handles temporal constraints:
• Actions with static and dynamic durations
• Temporal goals with deadlines
• Continuous resource consumption and production
• Heuristic functions to support a variety of objective functions
Action Representation
• Durative, with E_A = S_A + D_A
• Instantaneous effects e occur at time t_e = S_A + d, 0 ≤ d ≤ D_A
• Preconditions need to be true at the starting point, and protected during a period of time d, 0 ≤ d ≤ D_A
• An action can consume or produce continuous amounts of some resource
Action conflicts:
• Consuming the same resource
• One action's effect conflicting with another's precondition or effect
[Figure: the Flying action — preconditions (in-city ?airplane ?city1) and (fuel ?airplane) > 0; effects (in-city ?airplane ?city2), consume (fuel ?airplane).]
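As an illustration, a durative action of this kind can be sketched as a small Python structure. This is a minimal sketch, not SAPA's actual data model, and the Flying example's arguments and numbers (plane1, cityA, 4.0 hours, 30 units of fuel) are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class DurativeAction:
    """Durative action with duration D_A; an effect with delay d
    (0 <= d <= duration) fires at start time S_A + d."""
    name: str
    duration: float
    preconditions: set = field(default_factory=set)   # hold at start, protected afterwards
    effects: list = field(default_factory=list)       # (delay d, effect literal) pairs
    resource_use: dict = field(default_factory=dict)  # resource -> amount consumed

# Hypothetical Flying action in the slide's logistics flavor
fly = DurativeAction(
    name="Fly(plane1, cityA, cityB)",
    duration=4.0,
    preconditions={"(in-city plane1 cityA)", "(fuel plane1) > 0"},
    effects=[(0.0, "not (in-city plane1 cityA)"),
             (4.0, "(in-city plane1 cityB)")],
    resource_use={"(fuel plane1)": 30.0},
)
start = 10.0
end = start + fly.duration  # E_A = S_A + D_A
```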
Searching time-stamped states
Search through the space of time-stamped states S = (P, M, Π, Q, t):
• P: set of pairs ⟨p_i, t_i⟩ of predicates p_i and the times t_i < t of their last achievement.
• M: set of functions representing resource values.
• Π: set of protected persistent conditions.
• Q: event queue.
• t: time stamp of S.
Search Algorithm (cont.)
For a time-stamped state S = (P, M, Π, Q, t):
• Goal Satisfaction: S satisfies G if for each ⟨p_i, t_i⟩ ∈ G either:
• ∃⟨p_i, t_j⟩ ∈ P with t_j < t_i and no event in Q deletes p_i, or
• ∃ an event e ∈ Q that adds p_i at time t_e < t_i.
• Action Application: Action A is applicable in S if:
• All instantaneous preconditions of A are satisfied by P and M.
• A's effects do not interfere with Π and Q.
• No event in Q interferes with the persistent preconditions of A.
• When A is applied to S:
• S is updated according to A's instantaneous effects.
• Persistent preconditions of A are put in Π.
• Delayed effects of A are put in Q.
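The state representation and the goal-satisfaction / advance-time steps above can be sketched in Python. This is an illustrative simplification, not SAPA's implementation: goals are hypothetical (predicate, deadline) pairs, and protected conditions are tracked only as a predicate-to-end-time map:

```python
import heapq
from dataclasses import dataclass, field

@dataclass
class TimeStampedState:
    """S = (P, M, Pi, Q, t): predicates with achievement times, resource
    values, protected conditions, a queue of delayed effects, and a clock."""
    P: dict = field(default_factory=dict)   # predicate -> last achievement time
    M: dict = field(default_factory=dict)   # resource -> current value
    Pi: dict = field(default_factory=dict)  # protected predicate -> protection end time
    Q: list = field(default_factory=list)   # heap of (time, 'add'/'del', predicate)
    t: float = 0.0

def satisfies(state, goals):
    """Each goal <p, deadline> must either already hold in P (with no queued
    delete event) or be added by a queued event no later than its deadline."""
    for p, deadline in goals:
        achieved = (p in state.P and state.P[p] <= deadline and
                    not any(kind == 'del' and q == p for _, kind, q in state.Q))
        pending = any(kind == 'add' and q == p and te <= deadline
                      for te, kind, q in state.Q)
        if not (achieved or pending):
            return False
    return True

def advance_time(state):
    """Pop the earliest event from Q, apply it, and move the clock forward."""
    te, kind, p = heapq.heappop(state.Q)
    state.t = te
    if kind == 'add':
        state.P[p] = te
    else:
        state.P.pop(p, None)
    return state
```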
Heuristic Control
Objective functions in classical planning: number of actions, parallel execution time, solving time.
Objective functions in temporal/resource planning: number of actions, makespan, resource consumption, slack, …
• Temporal planners have to deal with more branching possibilities, so good heuristic guidance is more critical.
• The design of heuristics depends on the objective function; in temporal planning, heuristics focus on richer objective functions that guide both planning and scheduling.
Objectives in Temporal Planning • Number of actions: Total number of actions in the plan. • Makespan: The shortest duration in which we can possibly execute all actions in the solution. • Resource Consumption: Total amount of resource consumed by actions in the solution. • Slack: The duration between the time a goal is achieved and its deadline. • Optimize max, min or average slack values
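Two of these objectives, makespan and slack, are easy to compute from a finished schedule. A minimal sketch (the schedule and goal-time structures here are hypothetical, chosen only for illustration):

```python
def makespan(schedule):
    """Makespan: earliest start to latest finish.
    schedule: list of (start time, duration) pairs."""
    starts = [s for s, _ in schedule]
    ends = [s + d for s, d in schedule]
    return max(ends) - min(starts)

def slacks(goal_times, deadlines):
    """Slack per goal: deadline minus achievement time.
    Optimize the min, max, or average of these values."""
    return [deadlines[g] - t for g, t in goal_times.items()]

plan = [(0.0, 3.0), (1.0, 4.0), (3.0, 2.0)]
sl = slacks({"g1": 4.0, "g2": 5.0}, {"g1": 10.0, "g2": 6.0})
```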
Deriving heuristics for SAPA
We use a phased relaxation approach to derive different heuristics:
1. Relax the negative logical and resource effects to build the Relaxed Temporal Planning Graph (RTPG) [AltAlt, AIJ 2001].
2. Find a relaxed solution, which is used as a distance heuristic.
3. Adjust the heuristic values using resource consumption information.
4. Adjust the heuristic values using negative interactions (future work).
The RTPG also supports:
• Pruning a bad state while preserving completeness.
• Deriving admissible heuristics: to minimize the solution's makespan, or to maximize slack-based objective functions.
Relaxed Temporal Planning Graph
Heuristics in Sapa are derived from the Graphplan-style bi-level relaxed temporal planning graph (RTPG).
• Relaxed actions: no delete effects, no resource consumption.
[Figure: bi-level RTPG for a logistics example with actions Load(P,A), Unload(P,A), Fly(A,B), Fly(B,A), Unload(P,B) between t = 0 and the goal deadline t_g.]
RTPG expansion:
while(true)
  forall A ≠ advance-time applicable in S
    S = Apply(A, S)
    if S ⊨ G then Terminate{solution}
  S' = Apply(advance-time, S)
  if ∃(p_i, t_i) ∈ G such that t_i < Time(S') and p_i ∉ S
    then Terminate{non-solution}
  else S = S'
end while;
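The expansion loop above can be sketched in Python: apply every applicable relaxed action (no deletes, no resource use), advance time to the next queued effect, and stop when all goals have appeared or a deadline has passed. A simplified sketch, not SAPA's code; the action and goal encodings are hypothetical:

```python
import heapq

def build_rtpg(init_preds, init_time, actions, goals):
    """Expand a relaxed temporal planning graph.
    actions: (name, precondition set, [(delay, added predicate)]) triples.
    goals: predicate -> deadline.
    Returns predicate -> earliest appearance time, or None on failure."""
    achieved = {p: init_time for p in init_preds}
    queue = []           # heap of (event time, predicate)
    applied = set()
    t = init_time
    while True:
        progress = True
        while progress:  # apply all newly applicable relaxed actions at t
            progress = False
            for name, pre, effs in actions:
                if name not in applied and pre <= achieved.keys():
                    applied.add(name)
                    progress = True
                    for delay, p in effs:
                        heapq.heappush(queue, (t + delay, p))
        if all(g in achieved for g in goals):
            return achieved          # all goals have appeared
        if not queue:
            return None              # some goal can never appear
        t, p = heapq.heappop(queue)  # advance-time: jump to next event
        if any(dl < t for g, dl in goals.items() if g not in achieved):
            return None              # a pending goal's deadline has passed
        achieved.setdefault(p, t)
```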
Admissible heuristics directly from RTPG
• For makespan: the distance from a state S to the goals equals the duration between time(S) and the time the last goal appears in the RTPG.
• For min/max/sum slack: the distance from a state to the goals equals the minimum, maximum, or sum of the slack estimates for all individual goals using the RTPG.
Proof of admissibility: all goals appear in the RTPG at times no later than their actual achievable times.
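Given the earliest appearance times from the RTPG, these estimates can be read off directly. A sketch under the assumption that `achieved` maps each predicate to its RTPG appearance time (as in the expansion above):

```python
def makespan_heuristic(achieved, goals, t_now):
    """Admissible makespan estimate: time the last goal appears in the
    RTPG, minus the state's time stamp."""
    return max(achieved[g] for g in goals) - t_now

def slack_estimates(achieved, deadlines):
    """Per-goal slack estimates: deadline minus RTPG appearance time.
    Take the min, max, or sum of these for slack-based objectives."""
    return {g: deadlines[g] - achieved[g] for g in deadlines}
```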
Heuristics from a solution extracted from the RTPG
[Figure: relaxed plan extracted from the RTPG — Load(P,A), Fly(A,B), Unload(P,B) between t = 0 and the goal deadline t_g.]
The RTPG can be used to find a relaxed solution, which is then used to estimate the distance from a given state to the goals:
• Sum-action: the distance from a state S to the goals equals the number of actions in the relaxed plan.
• Sum-duration: the distance from a state S to the goals equals the sum of the action durations in the relaxed plan.
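Both estimates fall out of the relaxed plan directly. A sketch, with the relaxed plan encoded as hypothetical (action name, duration) pairs:

```python
def sum_action(relaxed_plan):
    """Sum-action heuristic: number of actions in the relaxed plan."""
    return len(relaxed_plan)

def sum_duration(relaxed_plan):
    """Sum-duration heuristic: total duration of relaxed-plan actions."""
    return sum(d for _, d in relaxed_plan)

# hypothetical relaxed plan for the logistics example
rp = [("Load(P,A)", 1.0), ("Fly(A,B)", 4.0), ("Unload(P,B)", 1.0)]
```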
Resource-based Adjustments to Heuristics
Resource-related information, ignored originally, can be used to improve the heuristic values:
• Adjusted sum-action: h ← h + Σ_R [(Con(R) − (Init(R) + Pro(R))) / Δ_R]
• Adjusted sum-duration: h ← h + Σ_R [(Con(R) − (Init(R) + Pro(R))) / Δ_R] · Dur(A_R)
Here Con(R) is the amount of resource R consumed by the relaxed plan, Init(R) its initial level, Pro(R) the amount produced, and Δ_R the largest amount of R a single action can produce.
These adjustments do not preserve admissibility.
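The sum-action adjustment can be sketched as follows. A simplified sketch, not SAPA's code: it rounds the per-resource term up to a whole number of extra producing actions, and the resource figures in the test are hypothetical:

```python
import math

def adjust_sum_action(h, plan_resources, init_levels, max_produce):
    """Resource-based adjustment (inadmissible): for each resource R whose
    consumption Con(R) exceeds Init(R) + Pro(R), add an estimate of the
    extra producing actions needed, ceil(shortfall / Delta_R), where
    Delta_R is the most of R a single action can produce.
    plan_resources: R -> (Con(R), Pro(R))."""
    for r, (con, pro) in plan_resources.items():
        shortfall = con - (init_levels[r] + pro)
        if shortfall > 0:
            h += math.ceil(shortfall / max_produce[r])
    return h
```

The sum-duration variant would scale each added term by the duration of the producing action Dur(A_R).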
Aims of Empirical Study • Evaluate the effectiveness of the different heuristics. • Ablation studies: • Test if the resource adjustment technique helps different heuristics. • Compare with other temporal planning systems.
Empirical Results

       | Adjusted Sum-Action           | Sum-Duration
Prob   | time    #act  nodes     dur   | time    #act  nodes      dur
Zeno1  | 0.317   5     14/48     320   | 0.35    5     20/67      320
Zeno2  | 54.37   23    188/1303  950   | -       -     -          -
Zeno3  | 29.73   13    250/1221  430   | 6.20    13    60/289     450
Zeno9  | 13.01   13    151/793   590   | 98.66   13    4331/5971  460
Log1   | 1.51    16    27/157    10.0  | 1.81    16    33/192     10.0
Log2   | 82.01   22    199/1592  18.87 | 38.43   22    61/505     18.87
Log3   | 10.25   12    30/215    11.75 | -       -     -          -
Log9   | 116.09  32    91/830    26.25 | -       -     -          -

• Sum-action finds solutions faster than sum-duration.
• Admissible heuristics do not scale up to bigger problems.
• Sum-duration finds shorter-duration solutions in most cases.
• Resource-based adjustment helps sum-action, but not sum-duration.
• Very few irrelevant actions; better quality than TemporalTLPlan, and so (transitively) better than LPSAT.
Comparison to other planners
• Planners with similar capabilities:
• IxTeT, Zeno: poor scale-up
• HSTS, TLPlan: domain-dependent search control
• Planners with limited capabilities: TGP and TP4
• Compared on a set of random temporal logistics problems:
• Domain specification and problems defined by TP4's creator (Patrik Haslum)
• No resource requirements
• No deadline constraints or actions with dynamic durations
Empirical Results (cont.)
Logistics domain with driving restricted to intra-city (the traditional logistics domain). Sapa is the only planner that can solve all 80 problems.
Empirical Results (cont.)
Logistics domain with inter-city driving actions. The "sum-action" heuristic used as the default in Sapa can be misled by the long-duration actions. Future work: fixed-point time/level propagation.
Conclusion • Presented SAPA, a domain-independent forward temporal planner that can handle: • Durative actions • Deadline goals • Continuous resources • Developed different heuristic functions based on the relaxed temporal planning graph to address both satisficing and optimizing search • Method to improve heuristic values by resource reasoning • Promising initial empirical results
Future Work • Exploit mutex information in: • Building the temporal planning graph • Adjusting the heuristic values in the relaxed solution • Relevance analysis • Improving solution quality • Relaxing constraints and integrating with full-scale scheduler