Robust Belief-based Execution of Manipulation Programs
Kaijen Hsiao, Tomás Lozano-Pérez, Leslie Pack Kaelbling
MIT CSAIL
Achieving Goals under Uncertainty
• Two kinds of uncertainty:
  • current state: need to plan in information space
  • results of future actions: search branches on outcomes as well as actions
• Choice of action must depend on the current information state
Discrete POMDP Formulation
• states S
• actions A
• observations O
• transition model P(s′ | s, a)
• observation model P(o | s′, a)
• reward R(s, a)
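A minimal sketch of how such a discrete POMDP could be represented in code (an illustrative structure, not the authors' implementation; field names are assumptions):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class DiscretePOMDP:
    """Illustrative container for a discrete POMDP: states S, actions A,
    observations O, transition model T, observation model Z, reward R."""
    n_states: int
    n_actions: int
    n_obs: int
    T: np.ndarray  # T[a, s, s2] = P(s2 | s, a)
    Z: np.ndarray  # Z[a, s2, o] = P(o | s2, a)
    R: np.ndarray  # R[a, s]     = expected reward for action a in state s
```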
POMDP Controller
• State estimation is a discrete Bayesian filter (one update step is sketched below)
• Policy maps belief states to actions
[Figure: control loop — environment → sensing → state estimator (SE) → belief → policy → action → environment]
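The discrete Bayesian filter is standard; a minimal sketch of one update step (the generic Bayes filter recursion, not the authors' code):

```python
import numpy as np

def belief_update(b, a, o, T, Z):
    """One step of a discrete Bayes filter:
    b'(s2) ∝ Z[a, s2, o] * sum_s T[a, s, s2] * b(s)."""
    predicted = T[a].T @ b            # prediction through the transition model
    updated = Z[a, :, o] * predicted  # correction by the observation likelihood
    norm = updated.sum()
    if norm == 0:
        raise ValueError("observation has zero probability under this belief")
    return updated / norm
```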
Action selection in POMDPs
• Off-line optimal policy generation: intractable for large spaces
• On-line search: finite-depth expansion of the belief-space tree from the current belief state to select a single action; tractable in a broad subclass of problems
Challenges for action selection
• Continuous state spaces
• Requirement to select an action for any belief state
• Long horizon
• Action branching factor
• Outcome branching factor
• Computationally complex observation and transition models
Grasping in uncluttered environments
Points of leverage:
• Robot pose is approximately observable
• Robot dynamics are nearly deterministic
• Bounded uncertainty over unobserved object parameters
• Room to maneuver
Online belief-space search
• Continuous state space: discretize object state space
Discretize object configuration space
[Figure: workspace, configuration space, and the discretized belief state]
Online belief-space search
• Continuous state space: discretize object state space
• Action for any belief: search forward from current belief state
Search forward from current belief
• Low-entropy belief states enable a reliable grasp
• Use entropy as the static evaluation function at the leaves (see the sketch below)
• Actions can be useful for information gathering
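For concreteness, a static evaluation based on belief entropy might look like this (a generic sketch; the paper's exact evaluation function is not reproduced here):

```python
import numpy as np

def belief_entropy(b):
    """Shannon entropy of a discrete belief vector; low entropy means the
    object pose is well localized, so a reliable grasp is possible."""
    nz = b[b > 0]                 # ignore zero-probability states
    return -float(np.sum(nz * np.log(nz)))
```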
Online belief-space search
• Continuous state space: discretize object state space
• Action for any belief: search forward from current belief state
• Long horizon: use temporally extended actions
Use temporally extended actions
• Primitive actions → entire trajectories
• Reduces the horizon
• Observations arrive at the end of each trajectory
Online belief-space search
• Continuous state space: discretize object state space
• Action for any belief: search forward from current belief state
• Long horizon: use temporally extended actions
• Large action branching factor: parameterize small set of action types by current belief
Parameterize actions with belief
• Actions are entire world-relative trajectories (WRTs)
• In the current belief state (sketched below):
  • execute with respect to the most likely object configuration
  • terminate on contact or at the end of the trajectory
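Grounding a world-relative trajectory in the most likely object configuration can be sketched as follows (planar (x, y, θ) poses, consistent with the tabletop setting; function names are hypothetical):

```python
import numpy as np

def most_likely_pose(belief, poses):
    """Return the discretized object pose with the highest belief mass."""
    return poses[int(np.argmax(belief))]

def ground_wrt(wrt_waypoints, pose):
    """Map waypoints defined relative to the object frame into the world
    frame using an (x, y, theta) object pose."""
    x, y, th = pose
    c, s = np.cos(th), np.sin(th)
    R = np.array([[c, -s], [s, c]])
    return [R @ np.asarray(p) + np.array([x, y]) for p in wrt_waypoints]
```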
Online belief-space search
• Continuous state space: discretize object state space
• Action for any belief: search forward from current belief state
• Long horizon: use temporally extended actions
• Large action branching factor: parameterize small set of action types by current belief
• Computationally complex observation and transition models: precompute models
Precompute models
• Execute each WRT with respect to estimated state e, in world state w
• Expected observation and transition computed for each (e, w) pair
• Based on geometric simulation (see the sketch below)
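Precomputation can be sketched as a sweep over (estimated state, world state) pairs, with a geometric simulator standing in for real execution (`simulate_wrt` is a hypothetical stand-in for the simulation):

```python
def precompute_models(wrts, estimated_states, world_states, simulate_wrt):
    """For each WRT and each (e, w) pair, cache the simulated observation
    and resulting outcome; states are assumed hashable (e.g., tuples)."""
    models = {}
    for i, wrt in enumerate(wrts):
        for e in estimated_states:
            for w in world_states:
                obs, outcome = simulate_wrt(wrt, e, w)
                models[(i, e, w)] = (obs, outcome)
    return models
```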
Online belief-space search
• Continuous state space: discretize object state space
• Action for any belief: search forward from current belief state
• Long horizon: use temporally extended actions
• Large action branching factor: parameterize small set of action types by current belief
• Computationally complex observation and transition models: precompute models
• Large observation branching factor: canonicalize observations for each discrete state and action
Canonicalize observations
• Any (e, w) pair with the same relative transformation has the same world-relative outcomes and observations
• So it suffices to sample for a single e, with w varying within the initial range of uncertainty
• Cluster the observations and represent each bin of object configurations by a single representative (one possible clustering is sketched below)
• Branch only on canonical observations
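One simple way to realize the clustering step is greedy threshold-based grouping (a sketch only; the paper's actual clustering criterion may differ):

```python
def canonicalize(observations, dist, threshold):
    """Greedily cluster observations: each joins the first canonical
    representative within `threshold` under metric `dist`, otherwise it
    becomes a new representative. Returns representatives and assignments."""
    canonical, assignment = [], []
    for o in observations:
        for i, rep in enumerate(canonical):
            if dist(o, rep) < threshold:
                assignment.append(i)
                break
        else:
            canonical.append(o)
            assignment.append(len(canonical) - 1)
    return canonical, assignment
```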
Algorithm
Off-line:
• plan WRTs for grasping and information gathering
• compute models
On-line (see the sketch below):
• while the current belief state does not satisfy the goal:
  • compute the expected information gain of each WRT
  • execute the best WRT until termination
  • use the observation to update the current belief
  • return to the initial pose
• execute the final grasp trajectory
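The on-line phase amounts to a greedy 1-step-lookahead loop; a sketch under the assumption that the helpers below exist (all names are illustrative):

```python
def online_grasp(belief, wrts, goal_satisfied, expected_info_gain,
                 execute_wrt, update_belief, execute_final_grasp):
    """Repeatedly execute the WRT with the highest expected information
    gain until the belief satisfies the goal, then grasp."""
    while not goal_satisfied(belief):
        best = max(wrts, key=lambda w: expected_info_gain(belief, w))
        obs = execute_wrt(best, belief)   # runs w.r.t. most likely state;
                                          # terminates on contact or at end,
                                          # then returns to the initial pose
        belief = update_belief(belief, best, obs)
    execute_final_grasp(belief)
    return belief
```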
Application to grasping with a simulated robot arm
• Initial conditions (ultimately from vision):
  • object shape is roughly known (contacted vertices should be within ~1 cm of actual positions)
  • object is on a table and its pose (x, y, rotation) is roughly known (center-of-mass std ~5 cm, 30 deg)
• Goal: achieve a specific grasp of the object
Observations
• Fingertips: 6-axis force/torque sensors give contact position and normal
• Additional contact sensors: binary contact only
• The swept, non-colliding path rules out poses that would have generated contact (sketched below)
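The swept-path observation can be folded into the belief update by zeroing out inconsistent poses (a sketch with a hypothetical `would_have_contacted` predicate):

```python
import numpy as np

def prune_by_swept_path(belief, poses, swept_path, would_have_contacted):
    """Zero out any object pose that would have produced contact along a
    path that was actually swept without contact, then renormalize."""
    mask = np.array([not would_have_contacted(p, swept_path) for p in poses])
    pruned = belief * mask
    total = pruned.sum()
    return pruned / total if total > 0 else belief
```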
Grasping a Box
[Figure: most likely robot-relative object position vs. where the object actually is]
Trying again, with a new belief
[Figure: back up, then try again]
Final state and observation
[Figure: final grasp and the resulting observation probabilities]
Updated belief state: Success!
Goal: variance < 1 cm in x, 15 cm in y, 6 deg in theta
Simulation Experiments
Methods tested:
• Single open-loop execution of the goal-achieving WRT with respect to the most likely state
• Repeated execution of the goal-achieving WRT with respect to the most likely state
• Online selection of information-gathering and goal-achieving grasps (1-step lookahead)
Box experiments
• Allowed variation in goal grasp: 1 cm, 1 cm, 5 deg
• Initial uncertainty: 5 cm, 5 cm, 30 deg
Cup experiments
• Goal: 1 cm x, 1 cm y; rotation doesn't matter (no info-grasps used)
• Start uncertainty: 30 deg theta (x, y varies)
[Figure: results with increasing uncertainty]
Grasping a Brita Pitcher
Target grasp: put one finger through the handle and grasp
Brita Pitcher results
[Figure: results with increasing uncertainty]
Other recent probabilistic approaches to manipulation
• Off-line POMDP solution for grasping (Hsiao et al. 2007)
• Bayesian state estimation using tactile sensors to locate the object before grasping (Petrovskaya et al. 2006)
• Finding a fixed trajectory that is most likely to succeed under uncertainty (Alterovitz et al. 2007; Burns and Brock 2007)
Timing for Brita Pitcher
(2.16 GHz processor, 3.24 GB RAM, running Python; times in seconds)
Creating Information-gain Trajectories
• Trajectory generation:
  • generate endpoints, then use a randomized planner (such as OpenRAVE) to find a nominal collision-free path
  • sweep through the entire workspace
• Choose a small set of trajectories based on information gain from the start uncertainty