Capturing knowledge about the instances behaviour in probabilistic domains Sergio Jiménez Celorrio, Fernando Fernández, Daniel Borrajo Departamento de Informática Universidad Carlos III de Madrid
Outline • Motivation • The System • Experiments • Conclusions
Motivation • Planning in Probabilistic domains • Without having a Probabilistic representation of the domain • Acquiring Probabilistic Information automatically • Repeating cycles of planning, execution and learning • Using the Probabilistic Information • Generating Control Knowledge
The Planning Execution Learning Architecture
• IPSS Planner: given a new problem, the Deterministic Domain and the Control Knowledge, produces a plan (a1, a2, ..., an).
• Plan Executor: sends each action (ai) to the Real World and observes the resulting state (si).
• Learning: receives the execution information (ai, si, s'i, gi) and updates the Robustness Table, which is used to update the Control Knowledge.
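The planning-execution-learning cycle can be sketched in Python. Everything below is an illustrative assumption: `make_plan` and `execute_in_world` are stand-ins for the IPSS planner and the real-world executor, not their actual interfaces.

```python
import random

def make_plan(problem, domain, robustness_table):
    # Stand-in for the IPSS planner: returns a fixed action sequence.
    # The real planner would also use control knowledge derived from the table.
    return problem["plan"]

def execute_in_world(action, p_success=0.7):
    # Stand-in for real-world execution: the action succeeds with
    # probability p_success, mimicking a probabilistic environment.
    return random.random() < p_success

def planning_execution_learning(problem, domain, robustness, n_cycles=10):
    """Repeated cycles of planning, execution and learning."""
    for _ in range(n_cycles):
        for action in make_plan(problem, domain, robustness):
            ok = execute_in_world(action)
            succ, total = robustness.get(action, (0, 0))
            robustness[action] = (succ + int(ok), total + 1)  # learn
            if not ok:
                break  # execution failed: stop this episode and replan
    return robustness
```

Each cycle refines the robustness counts; in the real architecture those counts feed back into the control knowledge used by the planner on the next problem.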
The Planning Execution Learning Architecture
Example (Blocksworld): executing Pick-up on block C in State1 (C on B on A) sometimes succeeds, reaching State2 with the robot holding C, and sometimes fails, leaving the blocks in an unintended configuration. The same action instance thus behaves probabilistically.
Updating the Robustness Table
Each execution of an instantiated action is logged as a success or a failure; here, three successes followed by three failures are recorded for pick-up-block-from (block0 block1).
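A minimal sketch of such a table in Python; the class name and the success-ratio definition of robustness are assumptions for illustration.

```python
class RobustnessTable:
    """Per instantiated action, count successful vs. total executions."""

    def __init__(self):
        self.counts = {}  # action signature -> (successes, executions)

    def update(self, action, succeeded):
        succ, total = self.counts.get(action, (0, 0))
        self.counts[action] = (succ + int(succeeded), total + 1)

    def robustness(self, action):
        succ, total = self.counts.get(action, (0, 0))
        return succ / total if total else None  # unknown before any execution

# Three successes and three failures for the same instantiated action:
table = RobustnessTable()
for outcome in (True, True, True, False, False, False):
    table.update("pick-up-block-from (block0 block1)", outcome)
print(table.robustness("pick-up-block-from (block0 block1)"))  # → 0.5
```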
The Control Knowledge
(control-rule prefer-operator-for-holding
  (if (and (current-goal (holding <robot> <block>))
           (candidate-operator <op1>)
           (candidate-operator <op2>)
           (robustness-op-more-than <op1> <op2>)))
  (then prefer operator <op1> <op2>))
The Control Knowledge
(control-rule prefer-bindings-for-put-down-block-on-for-on-top-of
  (if (and (current-goal (on-top-of <block-1> <object-1>))
           (current-operator put-down-block-on)
           (candidate-bindings ((<robot> . <robot-1>) (<top> . <block-1>) (<bottom> . <object-1>)))
           (candidate-bindings ((<robot> . <robot-2>) (<top> . <block-1>) (<bottom> . <object-1>)))
           (robustness-bindings-more-than put-down-block-on1
             ((<robot> . <robot-1>) (<top> . <block-1>) (<bottom> . <object-1>))
             ((<robot> . <robot-2>) (<top> . <block-1>) (<bottom> . <object-1>)))))
  (then prefer bindings
    ((<robot> . <robot-1>) (<top> . <block-1>) (<bottom> . <object-1>))
    ((<robot> . <robot-2>) (<top> . <block-1>) (<bottom> . <object-1>))))
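Both rules express the same idea: among candidates that achieve the current goal, prefer the one with higher estimated robustness. A hypothetical sketch of the resulting ordering (the function name, the neutral prior and the robustness values are all made up for illustration):

```python
def prefer_by_robustness(candidates, robustness, prior=0.5):
    # Order candidate operators (or bindings) by estimated robustness,
    # highest first; candidates never executed fall back to a neutral prior.
    return sorted(candidates, key=lambda c: robustness.get(c, prior), reverse=True)

estimates = {"pick-up-block-from": 0.9, "unstack-block": 0.4}
ordering = prefer_by_robustness(["unstack-block", "pick-up-block-from"], estimates)
print(ordering)  # → ['pick-up-block-from', 'unstack-block']
```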
The Experiments
• The Real World is replaced by the IPC4-Simulator: it receives each action (ai), applies the Probabilistic Domain definition, and returns the resulting state (si).
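A probabilistic simulator of this kind samples one outcome per action from the domain's effect distribution. A minimal illustration; the effect lists are made-up Blocksworld outcomes, not the IPC4 domain definition.

```python
import random

def sample_effect(effects, rng=random):
    """Sample one outcome of a probabilistic action: `effects` is a list
    of (probability, effect) pairs; leftover probability means no change."""
    r = rng.random()
    cumulative = 0.0
    for p, effect in effects:
        cumulative += p
        if r < cumulative:
            return effect
    return None  # residual probability mass: state unchanged

# pick-up C succeeds 75% of the time and drops the block otherwise:
pickup_effects = [(0.75, "holding C"), (0.25, "C on table")]
random.seed(42)
print(sample_effect(pickup_effects))  # → 'holding C'
```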
The Experiments: architecture with the simulator
• The IPSS Planner produces a plan (a1, a2, ..., an) for each new problem using the Deterministic Domain.
• The Plan Executor sends each action (ai) to the IPC4-Simulator, which applies the Probabilistic Domain and returns the resulting state (si).
• Learning updates the Robustness Table from the execution information (ai, si, s'i, gi).
The Experiments: example Blocksworld problem with blocks A, B and C.
The Experiments • 5 blocks • 8 blocks • 11 blocks
Conclusions • Current Work • Capturing instances' probabilistic behaviour • Generating Control Knowledge • Future Work • Capturing state-dependent behaviour • Generating state-dependent Control Knowledge
State-Dependent Control Knowledge
(control-rule prefer-operator-for-holding
  (if (and (current-goal (holding <robot> <block>))
           (true-in-state (<state>))
           (candidate-operator <op1>)
           (candidate-operator <op2>)
           (robustness-op-more-than <state> <op1> <op2>)))
  (then prefer operator <op1> <op2>))
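The state-dependent variant keys robustness on the state as well as the action, which is also why the table can grow quickly: one entry per distinct (state, action) pair. A sketch, where the frozenset state encoding and class name are illustrative assumptions:

```python
class StateDependentRobustnessTable:
    """Robustness counts keyed by (state, action) instead of action alone."""

    def __init__(self):
        self.counts = {}  # (state, action) -> (successes, executions)

    def update(self, state, action, succeeded):
        key = (state, action)
        succ, total = self.counts.get(key, (0, 0))
        self.counts[key] = (succ + int(succeeded), total + 1)

    def robustness(self, state, action):
        succ, total = self.counts.get((state, action), (0, 0))
        return succ / total if total else None

t = StateDependentRobustnessTable()
s1 = frozenset({"(on C B)", "(on B A)"})
s2 = frozenset({"(holding D)"})
# The same action can be reliable in one state and fragile in another:
t.update(s1, "pick-up C", True)
t.update(s1, "pick-up C", True)
t.update(s2, "pick-up C", False)
print(t.robustness(s1, "pick-up C"))  # → 1.0
print(t.robustness(s2, "pick-up C"))  # → 0.0
```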
Conclusions • Current Work • Capturing instances' probabilistic behaviour • Generating Control Knowledge • Future Work • Capturing state-dependent behaviour • Generating state-dependent Control Knowledge • Scalability problem: Robustness Table size