Learning HTN Method Preconditions and Action Models from Partial Observations

Hankz Hankui Zhuo (a), Derek Hao Hu (a), Chad Hogg (b), Qiang Yang (a) and Hector Munoz-Avila (b) • a: Hong Kong University of Science & Technology • b: Lehigh University

Motivation • Process planning and workflow: what are the applicability conditions? • Goal: to simultaneously learn • applicability conditions for methods (method preconditions) • preconditions and effects of actions (action models)

HTN Planning Problem • An HTN planning problem: (s0, T, M, A) • s0: an initial state • T: a list of tasks • M: a list of methods, e.g. (name, t, PRE, SUBTASKS) • A: a list of actions, e.g. (o, PRE, ADD, DEL) • A solution: an HTN that decomposes T into actions applicable from s0 (i.e., a decomposition tree)
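The tuple definitions above can be written down as plain data structures. The following is a minimal sketch, not the authors' implementation: the class names, the `apply_action` helper, and the ground blocks-world instance are assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

Atom = Tuple[str, ...]  # e.g. ("on", "a", "b"); representation is an assumption


@dataclass
class Action:
    """Action model (o, PRE, ADD, DEL)."""
    name: Atom
    pre: FrozenSet[Atom]
    add: FrozenSet[Atom]
    delete: FrozenSet[Atom]


@dataclass
class Method:
    """Method (name, t, PRE, SUBTASKS)."""
    name: Atom
    task: Atom
    pre: FrozenSet[Atom]
    subtasks: List[Atom]


@dataclass
class HTNProblem:
    """HTN planning problem (s0, T, M, A)."""
    s0: FrozenSet[Atom]
    tasks: List[Atom]
    methods: List[Method]
    actions: List[Action]


def apply_action(state: FrozenSet[Atom], action: Action) -> FrozenSet[Atom]:
    """Hypothetical helper: apply an action if its preconditions hold."""
    if not action.pre <= state:
        raise ValueError("preconditions not satisfied")
    return (state - action.delete) | action.add


# Illustrative ground instance (block "a" on the table, hand empty).
pick_up_a = Action(
    name=("pick-up", "a"),
    pre=frozenset({("clear", "a"), ("ontable", "a"), ("handempty",)}),
    add=frozenset({("holding", "a")}),
    delete=frozenset({("clear", "a"), ("ontable", "a"), ("handempty",)}),
)
problem = HTNProblem(
    s0=frozenset({("clear", "a"), ("ontable", "a"), ("handempty",)}),
    tasks=[("stack_from_table", "a", "b")],
    methods=[],
    actions=[pick_up_a],
)
next_state = apply_action(problem.s0, pick_up_a)
```

Here `next_state` contains only `("holding", "a")`, since the three precondition atoms are also deleted by the action.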

Learning Problem • Our learning problem: • Input: a set of decomposition trees with partial observations • Output: action models and method preconditions

Algorithm: HTN-learner • Pipeline: decomposition trees → HTN schemata → build constraints → solve constraints → HTN model • HTN schemata: names, parameters, tasks' relations • State constraints: relation information between states and methods/actions • Decomposition constraints: relation information between methods and actions • Action constraints: constraints imposed on action models • Solve constraints: solving the constraints using weighted MAXSAT

HTN schemata • Predicate list: • E.g. {(on ?x ?y), …} • Action schema list: • E.g. {(pick-up ?x), …} • Method-structure list: • E.g. { • Method: <(makestack_from_table_iter ?x ?y ?z), • task: (stack_from_table ?x ?y) • subtasks: ((clean ?x ?z) (pick-up ?x) (stack ?x ?y))>; • Preconditions: nil … • }
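One way to picture the schemata step is a traversal that collects action schemas and method structures from a decomposition tree. This is a sketch under stated assumptions: the dictionary-based tree encoding and the function `extract_schemata` are hypothetical, names are assumed to be already lifted to variables, and the predicate list (which would come from observed states) is left out.

```python
def extract_schemata(node, action_schemas=None, method_structures=None):
    """Collect action schemas and method structures from one tree (hypothetical encoding)."""
    if action_schemas is None:
        action_schemas = set()
    if method_structures is None:
        method_structures = {}
    if "action" in node:
        # Leaf node: a primitive action.
        action_schemas.add(node["action"])
        return action_schemas, method_structures
    children = node.get("children", [])
    if node.get("method") is not None:
        # Each child contributes either its action name or its task name.
        subtasks = tuple(c.get("action", c.get("task")) for c in children)
        method_structures[node["method"]] = {
            "task": node["task"],
            "subtasks": subtasks,
            "preconditions": None,  # unknown at this stage (nil on the slide)
        }
    for child in children:
        extract_schemata(child, action_schemas, method_structures)
    return action_schemas, method_structures


# Tree mirroring the slide's example; the (clean ?x ?z) subtask is left unexpanded.
tree = {
    "task": "(stack_from_table ?x ?y)",
    "method": "(makestack_from_table_iter ?x ?y ?z)",
    "children": [
        {"task": "(clean ?x ?z)", "method": None, "children": []},
        {"action": "(pick-up ?x)"},
        {"action": "(stack ?x ?y)"},
    ],
}
actions, methods = extract_schemata(tree)
```

The resulting `methods` dictionary holds one entry whose task and subtasks match the method-structure example, with preconditions still to be learned.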

State constraints • Generally, if a proposition p frequently appears in the state before an action a, it is probably a precondition of a. • The weights of all the constraints are calculated by counting their occurrences in all the decomposition trees.
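The counting idea can be sketched as follows. This is a simplified illustration, not the constraint encoding used by HTN-learner: the trace format, the function name `state_constraint_weights`, and the toy traces are assumptions, and unobserved states (partial observations) are represented as `None` and skipped.

```python
from collections import Counter


def state_constraint_weights(traces):
    """Count how often each proposition is observed right before each action.

    `traces` is a list of action sequences; each step is a pair
    (observed_state_or_None, action_name). This format is hypothetical.
    """
    weights = Counter()
    for trace in traces:
        for state, action in trace:
            if state is None:
                continue  # state not observed before this action
            for proposition in state:
                weights[(proposition, action)] += 1
    return weights


# Toy traces with partially observed states (illustrative data only).
traces = [
    [({"clear", "ontable"}, "pick-up"), (None, "stack")],
    [({"clear"}, "pick-up")],
]
weights = state_constraint_weights(traces)
```

A pair with a higher count, such as `("clear", "pick-up")` here, would yield a more heavily weighted candidate precondition constraint.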

Decomposition constraints • Generally, a subtask may provide a condition for its subsequent subtasks. • Then, if the relation p between an action a and a method m is as shown in the slide's figure, we obtain a constraint relating p, a and m. • The weights of all the constraints are also calculated by counting their occurrences.

Action constraints • Action constraints are imposed to ensure that the learned action models are succinct. • These constraints are assigned the maximal weight among all the state constraints and decomposition constraints.

Solve constraints • To control the relative importance of the three kinds of constraints, we introduce three new parameters, one for each kind. • The weight wi of the ith kind of constraint is replaced by a function of wi and the ith parameter. • The weighted constraints are solved by running a weighted MAXSAT solver. • The result is converted to an HTN model.
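To make the weighted MAXSAT step concrete, here is a brute-force sketch that only works for tiny instances. It stands in for a real solver and is not the one used in the experiments; the function name, the clause encoding, and the variable names (`pre_clear`, `pre_holding`, `add_holding`) are hypothetical, and the weights are made-up values rather than counts from real decomposition trees.

```python
from itertools import product


def solve_weighted_maxsat(clauses):
    """Exhaustively find the assignment maximizing the total weight of satisfied clauses.

    Each clause is (weight, literals); each literal is (variable, polarity).
    Exponential in the number of variables, so suitable for illustration only.
    """
    variables = sorted({var for _, literals in clauses for var, _ in literals})
    best_assignment, best_weight = None, float("-inf")
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        total = sum(
            weight
            for weight, literals in clauses
            if any(assignment[var] == polarity for var, polarity in literals)
        )
        if total > best_weight:
            best_assignment, best_weight = assignment, total
    return best_assignment, best_weight


# Hypothetical constraints: two candidate preconditions, one candidate add
# effect, and one clause forbidding "holding" as both precondition and add effect.
clauses = [
    (3, [("pre_clear", True)]),
    (1, [("pre_holding", True)]),
    (3, [("pre_holding", False), ("add_holding", False)]),
    (2, [("add_holding", True)]),
]
assignment, weight = solve_weighted_maxsat(clauses)
```

In this toy instance the best assignment keeps `pre_clear` and `add_holding` and drops `pre_holding`, for a total satisfied weight of 8; the true variables would then be read off as the learned preconditions and effects.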

Experimental Result • Example result: • (:method makestack_from_table_iter (?x - block ?y - block ?z - block) :task (stack_from_table ?x - block ?y - block) :preconditions (and (ontable ?x) (clear ?z) (clear ?y) (on ?z ?x) (holding ?z)) :subtasks (and (clean ?x ?z) (pick-up ?x) (stack ?x ?y))) • Other methods … • (:action pick-up (?x - block) :precondition (and (clear ?x) (ontable ?x) (handempty)) :effect (and (not (ontable ?x)) (not (clear ?x)) (clear ?x) (not (handempty)) (handempty) (holding ?x))) • Other action models … • Slide annotations: extra condition, missing condition
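The "extra condition" and "missing condition" annotations can be read as set differences between a learned model and a reference model. The sketch below illustrates that reading only; the exact error measure used in the experiments is not stated on the slides, and the function name and both condition sets are hypothetical.

```python
def compare_conditions(learned, reference):
    """Return (extra, missing): conditions only in the learned set, and only in the reference set."""
    learned, reference = set(learned), set(reference)
    extra = learned - reference
    missing = reference - learned
    return extra, missing


# Hypothetical precondition sets, chosen only to show both kinds of error.
learned_pre = {"(ontable ?x)", "(clear ?y)", "(holding ?z)"}
reference_pre = {"(ontable ?x)", "(clear ?y)", "(clear ?x)"}
extra, missing = compare_conditions(learned_pre, reference_pre)
```

With these sets, `(holding ?z)` is reported as an extra condition and `(clear ?x)` as a missing one.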

Experimental Result • We compared HTN-learner with ARMS+, which first uses ARMS to learn the action models and then uses the method-based constraints to learn the method preconditions. • The error of ARMS+ is higher than that of HTN-learner. • Errors decrease as the number of observations increases.

Experimental Result • The error first decreases and then increases as a weight-control parameter increases. • That means the parameter can be neither too small nor too large, i.e., all three kinds of constraints are needed.

CPU Time • We measured the running time for different numbers of decomposition trees and found that it grows polynomially, based on a curve fitted to the running times on htn-driverlog.

Related works • Hogg et al., 2008: learn structures for new tasks • Ilghami et al., 2005; Xu and Munoz-Avila, 2005: learn preconditions for given methods • Nejati et al., 2006; Reddy and Tadepalli, 1997: learn structures for goals

Conclusion • Solving the constraints simultaneously reduces the error in the learned HTN model compared to first learning the action models and then learning the method preconditions. • By imposing the three kinds of constraints, we can effectively reduce the hypothesis space of possible action models and method preconditions while using the input plan traces as an empirical basis. • The running time of our algorithm increases polynomially, and the error rate decreases, with the size of the input.

Thank You & Questions?
