Learn about sensing actions, belief states, observability modeling, and progression algorithms in conditional planning systems with detailed explanations and examples.
4/29: Conditional Planning
• No Final. Instead we will have a last homework
• Midterm to be returned Thursday; Homework reached Hanoi
• Extra class on Thursday?
Conformant Heuristics
Sensing: General observations
• Sensing can be thought of in terms of
  • Specific state variables whose values can be found
  • OR sensing actions that evaluate the truth of some boolean formula over the state variables.
    • Sense(p) ; Sense(p V (q & r))
• A general action may have both causative effects and sensing effects
  • A sensing effect changes the agent's knowledge, and not the world
  • A causative effect changes the world (and may give certain knowledge to the agent)
  • A pure sensing action only has sensing effects; a pure causative action only has causative effects.
• When applied to a belief state, the sensing effects of an action wind up reducing the cardinality of that belief state
  • basically by removing all states that are not consistent with the sensed effects
• Sensing actions introduce branches into the plans
  • If you apply Sense-A? to a belief state B, you get a partition of B: B_A and B_~A
  • You will have to make a plan for both branches: And/Or search in the space of belief states
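The partition described on this slide can be sketched in a few lines of Python. This is only an illustration under assumed conventions: a state is represented as a frozenset of the variables that are true in it, a belief state is a set of such states, and the function name `sense` and the four example worlds are made up for the sketch.

```python
def sense(belief, formula):
    """Partition a belief state by the truth value of a sensed formula.

    belief  -- set of states; each state is a frozenset of the true variables
    formula -- function mapping a state to True or False
    Returns (B_f, B_~f): the states where the formula holds, and the rest.
    """
    b_true = {s for s in belief if formula(s)}
    b_false = belief - b_true
    return b_true, b_false


# Hypothetical belief state over the variables p, q, r (four possible worlds).
belief = {
    frozenset(),
    frozenset({"p"}),
    frozenset({"q"}),
    frozenset({"q", "r"}),
}


def f(state):
    """The formula p V (q & r) from the slide."""
    return "p" in state or ("q" in state and "r" in state)


# Sense(p V (q & r)) splits the belief state into two smaller ones.
b_f, b_not_f = sense(belief, f)
```

Each branch is no larger than the original belief state, and together the two branches cover it exactly, which is why a plan is needed for both.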
• First we will see a model where observability is in terms of state variables.
• Next we shall see a model where observability can be in terms of formulas [Bonet&Geffner].
Modeling observability in terms of observation actions that give values of single state variables
• If a state variable p is in B, then there is some action A_p that can sense whether p is true or false
A Simple Progression Algorithm in the presence of pure sensing actions
• Call the procedure Plan(B_I, G, nil), where
• Procedure Plan(B, G, P)
  • If G is satisfied in all states of B, then return P
  • Non-deterministically choose:
    • I. Non-deterministically choose a causative action a that is applicable in B.
      • Return Plan(a(B), G, P+a)
    • II. Non-deterministically choose a sensing action s that senses a formula f (could be a single state variable)
      • Let p' = Plan(B_f, G, nil); p'' = Plan(B_~f, G, nil)
      • /* B_f is the set of states of B in which f is true */
      • Return P+(s?:p';p'')
• If we always pick I and never do II, then we will produce conformant plans (if we succeed).
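One way to make the procedure concrete is to replace the non-deterministic choices with a depth-bounded, depth-first search. The sketch below is an illustration, not the algorithm as implemented in any particular planner. Its assumptions: states are frozensets of true variables, actions have no preconditions, the function returns the plan suffix instead of threading the prefix P through the recursion, and the flag `sense_first` (a made-up parameter) decides whether option II is tried before option I. The toy domain is the p/r/g example used elsewhere in these slides, with "p => r, ~p" read as a conditional effect.

```python
def plan(belief, goal, causative, sensing, depth, sense_first=False):
    """Return a plan that reaches the goal from every state in belief, or None.

    A plan is a list whose items are action names or branch triples
    (sensor_name, plan_if_true, plan_if_false).
    """
    if all(goal(s) for s in belief):
        return []
    if depth == 0:
        return None

    def try_causative():  # option I on the slide
        for name, apply in causative:
            nxt = frozenset(apply(s) for s in belief)
            if nxt == belief:
                continue  # the action changes nothing; skip it
            rest = plan(nxt, goal, causative, sensing, depth - 1, sense_first)
            if rest is not None:
                return [name] + rest
        return None

    def try_sensing():  # option II on the slide
        for name, formula in sensing:
            b_t = frozenset(s for s in belief if formula(s))
            b_f = belief - b_t
            if not b_t or not b_f:
                continue  # sensing would give no new information
            p_t = plan(b_t, goal, causative, sensing, depth - 1, sense_first)
            p_f = plan(b_f, goal, causative, sensing, depth - 1, sense_first)
            if p_t is not None and p_f is not None:
                return [(name, p_t, p_f)]
        return None

    options = [try_sensing, try_causative] if sense_first else [try_causative, try_sensing]
    for option in options:
        result = option()
        if result is not None:
            return result
    return None


# Toy domain (assumed reading of the conditional effects).
def a1(s):  # A1: p => r, ~p
    return (s - {"p"}) | {"r"} if "p" in s else s

def a2(s):  # A2: ~p => r, p
    return s | {"r", "p"} if "p" not in s else s

def a3(s):  # A3: r => g
    return s | {"g"} if "r" in s else s

CAUSATIVE = [("A1", a1), ("A2", a2), ("A3", a3)]
SENSING = [("O5", lambda s: "p" in s)]
INIT = frozenset({frozenset({"p"}), frozenset()})  # p is unknown

def goal(s):
    return "g" in s

conditional = plan(INIT, goal, CAUSATIVE, SENSING, depth=3, sense_first=True)
conformant = plan(INIT, goal, CAUSATIVE, [], depth=3)
```

With sensing tried first, the search returns the branching plan O5:p? [A1 A3] [A2 A3]. With no sensing actions available it returns the sensor-free sequence A1, A2, A3, which is a conformant plan under this reading of the effects. The search is depth-first and makes no attempt at optimality.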
Remarks on progression with sensing actions
• Progression is implicitly finding an AND subtree of an AND/OR graph
  • If we look for AND subgraphs, we can represent looping plans.
• The amount of sensing done in the eventual solution plan is controlled by how often we pick step I vs. step II (if we always pick I, we get conformant solutions).
• Progression is as clueless about whether to do sensing, and which sensing to do, as it is about which causative action to apply
  • Need heuristic support
5/1: Conditional Planning (contd)
• Midterms returned
• Monty Python on Conformant Planning
• Next class *REQUIRED READING* assigned
• Can we meet sometime on Monday (instead of reading class?)
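Very simple example
• Problem:
  • Init: don't know p
  • Goal: g
• Actions:
  • A1: p => r, ~p
  • A2: ~p => r, p
  • A3: r => g
  • O5: observe(p)
• Plan: O5:p? [A1 A3] [A2 A3]
[Figure: plan tree. O5:p? branches on Y to A1 then A3, and on N to A2 then A3.]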
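To check that the plan above works in both possible initial worlds, it can be executed once with p true and once with p false. The sketch below assumes that "p => r, ~p" is a conditional effect (when p holds, r becomes true and p becomes false) and represents a state as a frozenset of true variables. The names `ACTIONS`, `SENSORS` and `execute` are illustrative.

```python
# Causative actions as functions from state to state (assumed reading
# of the conditional effects on the slide).
ACTIONS = {
    "A1": lambda s: (s - {"p"}) | {"r"} if "p" in s else s,  # p => r, ~p
    "A2": lambda s: s | {"r", "p"} if "p" not in s else s,   # ~p => r, p
    "A3": lambda s: s | {"g"} if "r" in s else s,            # r => g
}

# Sensing actions report the truth of a formula in the actual state.
SENSORS = {"O5": lambda s: "p" in s}


def execute(plan, state):
    """Run a conditional plan from one concrete state and return the result.

    A plan is a list of action names and branch triples
    (sensor_name, plan_if_true, plan_if_false).
    """
    for step in plan:
        if isinstance(step, tuple):
            sensor, yes_branch, no_branch = step
            branch = yes_branch if SENSORS[sensor](state) else no_branch
            state = execute(branch, state)
        else:
            state = ACTIONS[step](state)
    return state


# Plan from the slide: O5:p? [A1 A3] [A2 A3]
PLAN = [("O5", ["A1", "A3"], ["A2", "A3"])]

# The agent does not know p, so both initial worlds must be covered.
final_states = [execute(PLAN, s) for s in (frozenset({"p"}), frozenset())]
```

Both executions end in a state containing g, so the plan is a valid conditional plan for this problem under the stated reading of the actions.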
nuPDDL: not yet a standard…
http://sra.itc.it/tools/mbp/NuPDDL.html

(define (domain d) ...
  (:predicates (P1) (P2) (P3) (P4) (P5) (P6))
  ...)

(define (problem p) ....
  (:init (and (P1) (oneof (and (P2) (P3)) (P4)) (unknown (P5))))
  ....)

(define (problem p) ....
  (:effect (and (P1) (oneof (and (P2) (P3)) (P4)) (unknown (P5))))
  ....)

(:observation wall_north - boolean
  :parameters ()
  (iff (= wall_north 1) (or (= (robot_y) north) (= (robot_x) west))))
;....
(:observation wall_east - boolean
  :parameters ()
  (imply (= wall_east 0) (= (robot_x) west))
  (imply (= wall_east 1) (true)))

Goal types:
• :weakgoal : it is required that the plan may reach the goal.
• :stronggoal : it is required that every execution of the plan reaches the goal.
• :strongcyclicgoal : it is required that every execution of the plan either reaches the goal, or at least always has a chance to do it
• :postronggoal : it is required that every execution of the plan reaches the goal, using only the observations described in the domain
• :conformantgoal : it is required that every execution of the plan reaches the goal, without ever observing
• :ctlgoal : it is required that the CTL formula expressed as a goal is satisfied throughout every possible execution of the plan.

Some examples of typical extended goals follow:
• Do Reach p ("strong goal"): (af p)
• Try Reach p ("weak goal"): (ef p)
• Keep Trying Reach p ("strong cyclic goal"): (aw (ef p) p)
A Simple Progression Algorithm in the presence of pure sensing actions
• Call the procedure Plan(B_I, G, nil), where
• Procedure Plan(B, G, P)
  • If B is a subset of B_G (or of any B' in P that is marked "solved"), return P (propagate the "solved" marking upwards)
  • Non-deterministically choose:
    • I. Non-deterministically choose a causative action a that is applicable in B.
      • Return Plan(a(B), G, P+a)
    • II. Non-deterministically choose a sensing action s that senses a formula f (could be a single state variable)
      • Let p' = Plan(B_f, G, nil); p'' = Plan(B_~f, G, nil)
      • /* B_f is the set of states of B in which f is true */
      • Return P+(s?:p';p'')
• If we always pick I and never do II, then we will produce conformant plans (if we succeed).
Remarks on progression with sensing actions
• Progression is implicitly finding an AND subgraph of an AND/OR graph
• The amount of sensing done in the eventual solution plan is controlled by how often we pick step I vs. step II (if we always pick I, we get conformant solutions).
• Progression is as clueless about whether to do sensing, and which sensing to do, as it is about which causative action to apply
  • Need heuristic support
Cost models of conditional plans
• The execution cost of a conditional plan is
  Cost(O5) + [ Prob(p=T) * {Cost(A1) + Cost(A3)} + Prob(p=F) * {Cost(A2) + Cost(A3)} ]
  • Can take max(Cost(A1) + Cost(A3); Cost(A2) + Cost(A3)) in place of the expectation
• The planning cost of a conditional plan, however, is proportional to the total size of the plan (number of actions)
• Need to estimate cost of leaf belief states
[Figure: two plan trees branching on O5:p? (Y/N); the first shows A1, A2 and A3, the second only A1 and A2.]
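The three cost notions on this slide can be computed directly from a nested plan structure. The sketch below is illustrative only: the action costs and the probability that O5 reports p as true are made-up numbers, and attaching a single fixed probability to each sensor is a simplifying assumption.

```python
def expected_cost(plan, cost, prob_true):
    """Expected execution cost: branches are weighted by their probability."""
    total = 0.0
    for step in plan:
        if isinstance(step, tuple):
            sensor, yes_branch, no_branch = step
            p = prob_true[sensor]
            total += cost[sensor]
            total += p * expected_cost(yes_branch, cost, prob_true)
            total += (1 - p) * expected_cost(no_branch, cost, prob_true)
        else:
            total += cost[step]
    return total


def worst_case_cost(plan, cost):
    """Execution cost when the more expensive branch is always taken."""
    total = 0
    for step in plan:
        if isinstance(step, tuple):
            sensor, yes_branch, no_branch = step
            total += cost[sensor] + max(worst_case_cost(yes_branch, cost),
                                        worst_case_cost(no_branch, cost))
        else:
            total += cost[step]
    return total


def plan_size(plan):
    """Total number of actions in the plan, counting every branch."""
    size = 0
    for step in plan:
        if isinstance(step, tuple):
            _, yes_branch, no_branch = step
            size += 1 + plan_size(yes_branch) + plan_size(no_branch)
        else:
            size += 1
    return size


# Plan O5:p? [A1 A3] [A2 A3] with hypothetical costs and probability.
PLAN = [("O5", ["A1", "A3"], ["A2", "A3"])]
COST = {"O5": 1, "A1": 2, "A2": 4, "A3": 1}
PROB_TRUE = {"O5": 0.5}

exp_cost = expected_cost(PLAN, COST, PROB_TRUE)  # 1 + 0.5*3 + 0.5*5
max_cost = worst_case_cost(PLAN, COST)           # 1 + max(3, 5)
size = plan_size(PLAN)                           # O5, A1, A3, A2, A3
```

Execution cost only charges for the branch that is actually followed, whereas plan size counts every branch, which is why the planning effort can be much larger than the cost of any single execution.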
Similar processing can be done for regression (PO planning is nothing but least-committed regression planning)
Sensing: General observations
• Sensing can be thought of in terms of
  • Specific state variables whose values can be found
  • OR sensing actions (with preconditions and causative effects) that evaluate the truth of some boolean formula over the state variables.
    • Sense(p) ; Sense(p V (q & r))
• A general action may have both causative effects and sensing effects
  • A sensing effect changes the agent's knowledge, and not the world
  • A causative effect changes the world (and may give certain knowledge to the agent)
  • A pure sensing action only has sensing effects; a pure causative action only has causative effects.
  • Recent work on conditional planning has mostly considered simplistic sensing actions that have no preconditions and only pure sensing effects.
• When applied to a belief state, the sensing effects of an action wind up reducing the cardinality of that belief state
  • basically by removing all states that are not consistent with the sensed effects
• Sensing actions introduce branches into the plans
  • If you apply Sense-A? to a belief state B, you get a partition of B: B_A and B_~A
  • You will have to make a plan for both branches: And/Or search in the space of belief states
Sensing: More things under the mat
• Sensing extends the notion of goals too.
  • Check if Rao is awake vs. Wake up Rao
  • Presents some tricky issues in terms of goal satisfaction…!
• Handling quantified effects and preconditions in the presence of sensing actions
  • rm * can satisfy the effect forall files remove(file) without KNOWING what the files in the directory are!
• Sensing actions can have preconditions (as well as other causative effects)
• The problem of OVER-SENSING (sort of like the initial driver; also Sphexishness) [XII/Puccini project]
  • Handling over-sensing using local closed-world assumptions
  • Listing a file doesn't destroy your knowledge about the size of a file, but compressing it does. If you don't recognize this, you will always be checking the size of the file after each and every action
• A general action may have both causative effects and sensing effects
  • A sensing effect changes the agent's knowledge, and not the world
  • A causative effect changes the world (and may give certain knowledge to the agent)
  • A pure sensing action only has sensing effects; a pure causative action only has causative effects.
  • Recent work on conditional planning has mostly considered simplistic sensing actions that have no preconditions and only pure sensing effects.
• Sensing has cost!
Sensing: Limited contingency planning
• In many real-world scenarios, having a plan that works in all contingencies is too hard
• An idea is to make a plan for some of the contingencies, and monitor/replan as necessary.
• Qn: What contingencies should we plan for?
  • The ones that are most likely to occur… (need likelihoods)
• Qn: What do we do if an unexpected contingency arises?
  • Monitor (the observable parts of the world)
  • When the world goes outside what was expected, replan starting from that state.
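The monitor-and-replan idea can be sketched as a simple execution loop. Everything in the example is a made-up illustration: the world is a position on a number line, `make_plan` plans only for the expected outcome of each step, and `disturbance` is a hypothetical table of unexpected pushes indexed by time step.

```python
def make_plan(state, goal):
    """Plan for the expected contingency only: unit steps toward the goal."""
    step = 1 if goal > state else -1
    return [step] * abs(goal - state)


def run_with_monitoring(start, goal, disturbance):
    """Execute a plan, monitor the outcome of each step, replan on surprise.

    disturbance -- dict mapping a time step to an unexpected extra
                   displacement (hypothetical stand-in for contingencies
                   that were not planned for)
    Returns (final_state, number_of_replans, steps_executed).
    """
    state = start
    plan = make_plan(state, goal)
    replans = 0
    t = 0
    while state != goal:
        action = plan.pop(0)
        expected = state + action                    # what the plan predicts
        state = expected + disturbance.get(t, 0)     # what the world does
        if state != expected:                        # monitoring detects a surprise
            plan = make_plan(state, goal)            # replan from the actual state
            replans += 1
        t += 1
    return state, replans, t


# The agent is pushed back two cells at time step 1.
final_state, replans, steps = run_with_monitoring(0, 3, {1: -2})
```

In this run the agent still reaches the goal, at the price of one replanning episode and two extra steps compared with an undisturbed execution.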