CSE 574: Automated Planning
Daniel S. Weld
Agent Control: The Problem
[Diagram: the agent receives Percepts from the Environment and emits Actions back into it; the agent's question: What action next?]
Applications (Current & Potential) • Scheduling problems with action choices as well as resource handling requirements • Problems in supply chain management • HSTS (Hubble Space Telescope scheduler) • Workflow management • Autonomous agents • RAX/PS (The NASA Deep Space planning agent) • Software module integrators • VICAR (JPL image enhancing system); CELWARE (CELCorp) • Test case generation (Pittsburgh) • Interactive decision support • Monitoring subgoal interactions • Optimum AIV system • Plan-based interfaces • E.g. NLP to database interfaces • Plan recognition • Web-service composition
• Significant scale-up in the last 4-5 years
  • Before, we could synthesize plans of about 5-6 actions in minutes
  • Now, we can synthesize optimal 100-action durative plans in minutes
  • Further scale-up with domain-specific control
• Significant strides in our understanding
  • Rich connections between planning and CSP (SAT) and OR (ILP)
  • Vanishing separation between planning & scheduling
  • New ideas for heuristic control of planners
  • Wide array of approaches for customizing planners with domain-specific knowledge
• Lots of activity...
Today’s Agenda • Administrivia • Approaches to agent control • Simplifying assumptions • STRIPS – the easiest case • Overview of methods
Administrivia • Class Goals • Learn about planning • Practice critical reading • Gain experience as reviewers • Gain experience leading discussion (PC Mtg) • Extensible projects • Experience • [ Quals / Generals ] • [ AAAI / ICAPS Papers ]
Grading
• 20% paper summaries
• 25% organization of your class
• 20% class participation
• 35% project
Paper Summary Process • One [ two ] papers / class • Post reviews to class b-board • One-line summary • The most important ideas in the paper, and why • The one or two largest flaws in the paper • Identify two important, open research questions on the topic, and why they matter • Stimulate class discussion • Encourage detailed reading
Process for Student Class-Leads • You pick an area • Read papers ahead of time (I provide list) • Decide on best paper for class to read • We meet to discuss strategy • T- (1 week) for an hour • Class • We meet to discuss +/-
Project • Not too big • But potential for quals or publication • Individual or group • Lots of software available for quick start • More soon…
Administrivia • Add yourself to mailing list • Anything else?
Agent-Control Approaches • Reactive Control • Set of situation-action rules • E.g. • 1) if dog-is-behind-me then run-forward • 2) if food-is-near then eat • Planning • Reason about effect of combinations of actions • “Planning ahead” • Avoiding “painting oneself into a corner”
Ways to make “plans” • Generative Planning • Reason from first principles (knowledge of actions) to generate plan • Requires formal model of actions • Case-Based Planning • Retrieve old plan which worked for similar problem • Revise retrieved plan for this problem • Reinforcement Learning • Act randomly, noticing effects • Learn reward, action models, policy
Generative Planning • Input • Description of (initial state of) world (in some KR) • Description of goal (in some KR) • Description of available actions (in some KR) • Output • Controller • E.g. Sequence of actions • E.g. Plan with loops and conditionals • E.g. Policy = f: states -> actions
Input Representation
• Description of initial state of world
  • E.g., set of propositions:
  • ((block a) (block b) (block c) (on-table a) (on-table b) (clear a) (clear b) (clear c) (arm-empty))
• Description of goal (i.e. set of desired worlds)
  • E.g., logical conjunction
  • Any world that satisfies the conjunction is a goal
  • (and (on a b) (on b c))
• Description of available actions
One concrete encoding of states and goals is sketched below.
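To make the representation concrete, here is a minimal sketch (mine, not from the slides) of one common Python encoding: a world state is a frozenset of proposition tuples, and a goal conjunction is satisfied by any state that contains all of its propositions.

    # World state = set of true propositions; goal = conjunction of propositions.
    initial_state = frozenset({
        ("block", "a"), ("block", "b"), ("block", "c"),
        ("on-table", "a"), ("on-table", "b"),
        ("clear", "a"), ("clear", "b"), ("clear", "c"),
        ("arm-empty",),
    })

    goal = frozenset({("on", "a", "b"), ("on", "b", "c")})

    def satisfies(state, goal):
        # Any world containing every goal proposition is a goal state.
        return goal <= state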
Simplifying Assumptions
• Static vs. Dynamic environment
• Instantaneous vs. Durative actions
• Deterministic vs. Stochastic effects
• Fully Observable vs. Partially Observable environment
• Perfect vs. Noisy percepts
• Full vs. Partial goal satisfaction
[Agent/environment diagram as before: Percepts in, Actions out, "What action next?"]
Classical Planning
• Assumes: Static environment, Instantaneous actions, Deterministic effects, Perfect percepts, Fully Observable, Full satisfaction
• I = initial state, G = goal state
[Diagram: each operator Oi has (preconditions) and (effects); a plan is a sequence of operators Oi, Oj, Ok, Om leading from [I] to [G]]
Real Class Focus
• Durative Actions
  • Simultaneous actions, events, deadline goals
• Planning Under Uncertainty
  • Modeling a robot's (or softbot's) sensors
[Diagram: from [I], operator Oi reaches an uncertain outcome (?) with alternative continuations Oj, Ob, Oa, Oc, Ok]
Relaxing the classical assumptions, and the machinery each relaxation calls for:
• Deterministic, Static, Observable, Instantaneous, Propositional: "Classical Planning"
• Dynamic: Replanning / Situated Plans
• Stochastic: MDP Policies; Contingent/Conformant Plans, Interleaved execution
• Partially Observable: POMDP Policies; Contingent/Conformant Plans, Interleaved execution
• Durative: Temporal Reasoning; Semi-MDP Policies
• Continuous: Numeric Constraint reasoning (LP/ILP)
Neo-Classical Planning: Broad Aims & Biases
• AIM: We will concentrate on planning in deterministic, quasi-static, and fully observable worlds
  • Will start with "classical" domains, but discuss handling durative actions and numeric constraints, as well as replanning
• BIAS: To the extent possible, we shall shun brand names and concentrate on unifying themes
  • Better understanding of existing planners
  • Normalized comparisons between planners
  • Evaluation of trade-offs provided by various design choices
  • Better understanding of inter-connections
    • Hybrid planners using multiple refinements
  • Explication of the connections between planning, CSP, SAT, and ILP
Today’s Agenda • Administrivia • Approaches to agent control • Simplifying assumptions • STRIPS – the easiest case • Overview of methods
Why care about "classical" planning?
• Most advances seen first in classical planning
• Many stabilized environments approximately satisfy the classical assumptions
• Minor violations of the assumptions can be handled through replanning and execution monitoring
  • "This form of solution has the advantage of relying on widely-used (and often very efficient) classical planning technology" (Boutilier, 2000)
• Techniques developed for classical planning often shed light on effective ways of handling non-classical planning worlds
  • Most of the efficient techniques for handling non-classical scenarios are based on classical ideas/advances
How to Represent Actions?
• Simplifying assumptions
  • Atomic time
  • Agent is omniscient (no sensing necessary)
  • Agent is the sole cause of change
  • Actions have deterministic effects
• STRIPS representation
  • World = set of true propositions
  • Actions:
    • Precondition: (conjunction of literals)
    • Effects: (conjunction of literals)
[Diagram: actions north11 and north12 map between worlds W0, W1, W2, moving agent a]
STRIPS Actions
• Action = function from world-state to world-state
• Precondition says where the function is defined
• Effects say how to change the set of propositions
  north11
    precond: (and (agent-at 1 1) (agent-facing north))
    effect: (and (agent-at 1 2) (not (agent-at 1 1)))
• Note: STRIPS doesn't allow derived effects; you must be complete!
[Diagram: north11 maps world W0 to world W1]
A small executable rendering of this reading follows.
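The function-on-states view is easy to operationalize. Below is a sketch (an illustration, not the course's code; the Action type and apply helper are my own names) treating a STRIPS action as a partial function: undefined where the precondition fails, otherwise delete-then-add on the proposition set.

    from typing import FrozenSet, NamedTuple

    class Action(NamedTuple):
        name: str
        preconds: FrozenSet[tuple]
        adds: FrozenSet[tuple]
        deletes: FrozenSet[tuple]

    def apply(action, state):
        # STRIPS semantics: the function is defined only where the
        # precondition holds; the result drops deletes, then adds adds.
        if not action.preconds <= state:
            return None
        return (state - action.deletes) | action.adds

    north11 = Action(
        name="north11",
        preconds=frozenset({("agent-at", 1, 1), ("agent-facing", "north")}),
        adds=frozenset({("agent-at", 1, 2)}),
        deletes=frozenset({("agent-at", 1, 1)}),
    )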
Action Schemata
• Instead of defining pickup-A and pickup-B and ..., define a schema:
  (:operator pick-up
    :parameters ((block ?ob1))
    :precondition (and (clear ?ob1) (on-table ?ob1) (arm-empty))
    :effect (and (not (clear ?ob1)) (not (on-table ?ob1))
                 (not (arm-empty)) (holding ?ob1)))
• Note: STRIPS doesn't allow derived effects; you must be complete!
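A schema is just a recipe for generating ground actions. Continuing the Python sketch above (ground_pick_up is a hypothetical helper, not from the deck), grounding pick-up means instantiating ?ob1 with every block:

    def ground_pick_up(blocks):
        # One ground action per binding of ?ob1 (reuses Action from above).
        for b in blocks:
            yield Action(
                name=f"pick-up({b})",
                preconds=frozenset({("clear", b), ("on-table", b), ("arm-empty",)}),
                adds=frozenset({("holding", b)}),
                deletes=frozenset({("clear", b), ("on-table", b), ("arm-empty",)}),
            )

    actions = list(ground_pick_up(["a", "b", "c"]))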
Planning as Search
• Nodes: World states
• Arcs: Actions
• Initial State: The state satisfying the complete description of the initial conditions
• Goal State: Any state satisfying the goal propositions
Forward-Chaining World-Space Search
[Blocks-world diagram: action arcs lead from the Initial State (blocks A, B, C) to the Goal State (the blocks stacked)]
A breadth-first sketch of this search appears below.
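As a sketch of what forward-chaining world-space search amounts to (again using the hypothetical apply/Action helpers from earlier), breadth-first search over states returns a shortest action sequence:

    from collections import deque

    def forward_search(initial_state, goal, actions):
        # Nodes are world states, arcs are actions; BFS finds a shortest plan.
        frontier = deque([(initial_state, [])])
        seen = {initial_state}
        while frontier:
            state, plan = frontier.popleft()
            if goal <= state:
                return plan
            for a in actions:
                nxt = apply(a, state)      # None if precondition unsatisfied
                if nxt is not None and nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [a.name]))
        return None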
Backward-Chaining Search Through the Space of Partial World-States
• Problem: Many possible goal states are equally acceptable
  • From which one does one search?
• The Initial State, by contrast, is completely defined
[Blocks-world diagram: several acceptable goal states (marked *) regressing toward the single initial state]
Plan Space • Forward chaining thru world-states • Backward chaining thru world-states
"Causal Link" Planning
• Nodes: Partially specified plans
• Arcs: Adding or deleting actions or constraints (e.g. <) to the plan
• Initial State: The empty plan (actually two dummy actions...)
• Goal State: A plan which, when simulated, achieves the goal
• Need an efficient way to evaluate the quality (percentage of preconditions satisfied) of a partial plan... hence causal-link data structures (one rendering is sketched below)
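One plausible rendering of the partial-plan node, with the quality measure the slide mentions (fraction of preconditions already supported by causal links). The field names here are my own, not a standard API:

    from dataclasses import dataclass, field

    @dataclass
    class PartialPlan:
        steps: set = field(default_factory=set)         # action instances
        orderings: set = field(default_factory=set)     # pairs (a, b) meaning a < b
        causal_links: set = field(default_factory=set)  # (producer, prop, consumer)
        open_preconds: set = field(default_factory=set) # (prop, consumer) unsupported

        def quality(self):
            # Percentage of preconditions satisfied: supported / (supported + open).
            supported = len(self.causal_links)
            total = supported + len(self.open_preconds)
            return supported / total if total else 1.0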
Plan-Space Search
• How to represent plans?
• How to test whether a plan is a solution?
[Diagram: partial plans built from steps such as pick-from-table(B), pick-from-table(C), put-on(C,B)]
Planning as Search 3: Graphplan
• Phase 1 - Graph Expansion
  • Necessary (but insufficient) conditions for plan existence
  • Local consistency of plan-as-CSP
• Phase 2 - Solution Extraction
  • Variables: action execution at a time point
  • Constraints: goals/subgoals achieved; no side-effects between actions
Planning Graph
[Diagram: alternating layers: Proposition layer (Init State), Action layer (Time 1), Proposition layer (Time 1), Action layer (Time 2), ...]
Constructing the planning graph...
• Initial proposition layer: just the initial conditions
• Action layer i: if all of an action's preconditions are in layer i-1, then add the action to layer i
• Proposition layer i+1: for each action at layer i, add all its effects at layer i+1
(One expansion round is sketched below.)
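A sketch of one expansion round in the Python style used above (persistence "no-op" actions are included so propositions carry forward; mutex bookkeeping is omitted here and sketched after the next slide):

    def expand_graph(prop_layers, actions):
        # Action layer i: actions whose preconditions all appear at layer i-1,
        # plus one no-op per proposition to carry it forward.
        current = prop_layers[-1]
        layer = [a for a in actions if a.preconds <= current]
        layer += [Action(f"noop-{p}", frozenset({p}), frozenset({p}), frozenset())
                  for p in current]
        # Proposition layer i+1: the union of all effects at layer i.
        prop_layers.append(frozenset().union(*(a.adds for a in layer)))
        return layer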
Mutual Exclusion • Actions A,B exclusive (at a level) if • A deletes B’s precond, or • B deletes A’s precond, or • A & B have inconsistent preconds • Propositions P,Q inconsistent (at a level) if • all ways to achieve P exclude all ways to achieve Q
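In code, the first two cases are direct set tests; the precondition-inconsistency case needs the previous layer's proposition mutexes, so a test function is passed in. A sketch under the same assumptions as above:

    def actions_mutex(a, b, prop_mutex=None):
        # A deletes B's precond, or B deletes A's precond...
        if a.deletes & b.preconds or b.deletes & a.preconds:
            return True
        # ...or A & B have mutually inconsistent preconditions.
        if prop_mutex is not None:
            return any(prop_mutex(p, q) for p in a.preconds for q in b.preconds)
        return False

    def props_mutex(p, q, producers):
        # P,Q inconsistent if every way to achieve P excludes every way to
        # achieve Q (interference check only, for brevity).
        return all(actions_mutex(a, b) for a in producers[p] for b in producers[q])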
Graphplan
• Create level 0 in planning graph
• Loop
  • If goal ⊆ contents of highest level (nonmutex)
    • Then search graph for solution
    • If a solution is found, return it and terminate
  • Else extend graph one more level
A kind of double search: the forward direction checks necessary (but insufficient) conditions for a solution; the backward search verifies them.
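Putting the two phases together, the outer loop might look like this sketch (expand_graph is from above; extract is the backward search sketched after the next slide; max_levels is my own cutoff, not part of the algorithm):

    def graphplan(initial_state, goal, actions, max_levels=20):
        prop_layers = [frozenset(initial_state)]
        action_layers = []
        for _ in range(max_levels):
            # Forward direction: necessary (but insufficient) condition.
            if goal <= prop_layers[-1]:
                plan = extract(goal, prop_layers, action_layers)
                if plan is not None:
                    return plan            # backward search verified a solution
            # Otherwise extend the graph one more level.
            action_layers.append(expand_graph(prop_layers, actions))
        return None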
Searching for a Solution
• For each goal G at time t
  • For each action A making G true at t
    • If A isn't mutex with a previously chosen action, select it
  • If no actions work, back up to the last G (breadth-first search)
• Recurse on the preconditions of the selected actions at t-1
[Diagram: the planning-graph layers, searched backward from the final proposition level]
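A compact (and deliberately naive) sketch of that backward search, using the actions_mutex test from above; it returns a list of action-name sets, one per time step:

    def extract(goals, prop_layers, action_layers, t=None):
        if t is None:
            t = len(action_layers)
        if t == 0:
            return [] if goals <= prop_layers[0] else None

        def assign(pending, chosen):
            if not pending:
                # Recurse on the preconditions of the selected actions at t-1.
                subgoals = frozenset().union(*(a.preconds for a in chosen))
                rest = extract(subgoals, prop_layers, action_layers, t - 1)
                return None if rest is None else rest + [{a.name for a in chosen}]
            g, *more = pending
            for a in action_layers[t - 1]:
                # Pick an action making g true, not mutex with prior choices.
                if g in a.adds and not any(actions_mutex(a, b) for b in chosen):
                    plan = assign(more, chosen | {a})
                    if plan is not None:
                        return plan
            return None  # no action works: backtrack to the previous goal

        return assign(sorted(goals), frozenset())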
Dinner Date
Initial Conditions: (:and (cleanHands) (quiet))
Goal: (:and (noGarbage) (dinner) (present))
Actions:
  (:operator carry :precondition ()
             :effect (:and (noGarbage) (:not (cleanHands))))
  (:operator dolly :precondition ()
             :effect (:and (noGarbage) (:not (quiet))))
  (:operator cook :precondition (cleanHands)
             :effect (dinner))
  (:operator wrap :precondition (quiet)
             :effect (present))
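Encoded with the Action sketch from earlier (my encoding, not the course's), the problem and a call to the graphplan sketch look like this:

    dinner_actions = [
        Action("carry", frozenset(), frozenset({("noGarbage",)}),
               frozenset({("cleanHands",)})),
        Action("dolly", frozenset(), frozenset({("noGarbage",)}),
               frozenset({("quiet",)})),
        Action("cook", frozenset({("cleanHands",)}),
               frozenset({("dinner",)}), frozenset()),
        Action("wrap", frozenset({("quiet",)}),
               frozenset({("present",)}), frozenset()),
    ]
    dinner_init = frozenset({("cleanHands",), ("quiet",)})
    dinner_goal = frozenset({("noGarbage",), ("dinner",), ("present",)})

    plan = graphplan(dinner_init, dinner_goal, dinner_actions)
    # One valid result: wrap (plus a no-op) at step 1, then cook and dolly
    # at step 2, matching the two-level graph on the slides that follow.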
Planning Graph (levels: 0 Prop, 1 Action, 2 Prop, 3 Action, 4 Prop)
• Prop level 0: cleanH, quiet
• Action level 1: carry, dolly, cook, wrap
• Prop level 2: noGarb, cleanH, quiet, dinner, present
Are there any exclusions?
• Prop level 0: cleanH, quiet
• Action level 1: carry, dolly, cook, wrap
• Prop level 2: noGarb, cleanH, quiet, dinner, present
Do we have a solution?
• Prop level 0: cleanH, quiet
• Action level 1: carry, dolly, cook, wrap
• Prop level 2: noGarb, cleanH, quiet, dinner, present
Extend the Planning Graph
• Prop level 0: cleanH, quiet
• Action level 1: carry, dolly, cook, wrap
• Prop level 2: noGarb, cleanH, quiet, dinner, present
• Action level 3: carry, dolly, cook, wrap
• Prop level 4: noGarb, cleanH, quiet, dinner, present
One (of 4) possibilities
[The two-level planning graph as above, with one consistent choice of actions at each level highlighted]