
Hierarchical Mechanisms for Robot Programming


Presentation Transcript


  1. Hierarchical Mechanisms for Robot Programming Shiraj Sen Stephen Hart Rod Grupen Laboratory for Perceptual Robotics University of Massachusetts Amherst May 30, 2008 NEMS ‘08

  2. Outline Hierarchical mechanisms for robot programming: (1) representation, covering actions (potential functions and value functions) and state representation, and (2) programming, via reinforcement learning with user-defined (extrinsic) and intrinsic reward.

  3. Hierarchical Actions Closed-loop primitive actions descend potential fields (ϕ) driven by feedback signals and force/velocity references; programs greedily traverse learned value functions (Φ), and greedy traversal avoids local minima. [Block diagram of the nested G/Σ/H closed loops omitted.] A gradient-descent sketch follows.
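The closed-loop primitives above can be read as greedy gradient descent on a potential function. Below is a minimal sketch, assuming a quadratic spring potential and a proportional velocity reference; the function names and gain are illustrative, not from the deck:

```python
import numpy as np

def spring_potential(x, x_ref):
    # Quadratic "spring" potential: minimized when x reaches x_ref.
    e = x - x_ref
    return 0.5 * e @ e

def primitive_step(x, x_ref, gain=0.5):
    # One closed-loop update: follow the negative potential gradient.
    grad = x - x_ref            # d(phi)/dx for the quadratic potential
    return x - gain * grad      # velocity reference proportional to -grad

x, x_ref = np.array([1.0, -2.0]), np.zeros(2)
for _ in range(20):
    x = primitive_step(x, x_ref)
print(spring_potential(x, x_ref))   # approaches 0: a convergence event
```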

  4. Primitive Action Programming Interface Sensory Error (σ): visual (u_ref), tactile (f_ref), configuration variables (θ_ref), operational space (x_ref). Potential Functions (ϕ): spring potential fields (ϕ_h), collision-free motion fields (ϕ_c), kinematic conditioning fields (ϕ_cond). Motor Variables (τ): subsets of the configuration and operational space variables. A primitive action combines one choice of each; composite actions a = a1 ◁ a2 combine two primitives by nullspace projection, running the subordinate action only where it cannot disturb the dominant one (see the sketch below).
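The nullspace composition can be sketched with the standard projector N = I - pinv(J)J applied to the subordinate controller's output. The 3-DOF Jacobian and velocities below are toy values for illustration, not Dexter's actual controllers:

```python
import numpy as np

def nullspace_projector(J):
    # N = I - pinv(J) @ J projects onto the nullspace of the dominant task.
    return np.eye(J.shape[1]) - np.linalg.pinv(J) @ J

def compose(dq_dominant, dq_subordinate, J_dominant):
    # "Subject-to" composition: the subordinate action contributes only
    # motion that cannot disturb the dominant controller's task.
    return dq_dominant + nullspace_projector(J_dominant) @ dq_subordinate

J1  = np.array([[1.0, 0.0, 0.0]])    # dominant task constrains joint 1
dq1 = np.array([0.5, 0.0, 0.0])      # dominant controller's joint velocity
dq2 = np.array([0.2, -0.3, 0.1])     # subordinate (e.g., conditioning) velocity
print(compose(dq1, dq2, J1))         # [0.5, -0.3, 0.1]: dq2's joint-1 part is removed
```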

  5. State Representation Discrete abstraction of action dynamics: 4-level logic in the control predicate p_i, with '-' (no reference), '0' (convergence unknown), 'X' (descending the gradient), and '1' (converged). A sketch of this quantization follows.
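A minimal sketch of that quantization, assuming convergence is judged from the sign and size of the potential's time derivative; the threshold below is an illustrative assumption:

```python
def control_predicate(has_reference, phi_dot, eps=1e-3):
    # 4-level logic for a controller's state predicate p_i.
    if not has_reference:
        return '-'          # no reference: dynamics undefined
    if phi_dot is None:
        return '0'          # controller not yet evaluated: convergence unknown
    if phi_dot < -eps:
        return 'X'          # descending the potential gradient (transient)
    return '1'              # quiescent: converged

print(control_predicate(True, -0.2))   # 'X'
print(control_predicate(True, 0.0))    # '1'
```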

  6. Hierarchical Programming A program is defined as an MDP over a vector of controller predicates, S = (p1, ..., pN). Value functions are learned using reinforcement learning, and absorbing states in the value function capture the "convergence" of programs. A learning sketch follows.
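A tabular Q-learning sketch over such predicate-vector states. The `env` interface (reset/step/absorbing) is a hypothetical stand-in for the robot's runtime, not an API from the talk:

```python
import random
from collections import defaultdict

def q_learn(env, actions, episodes=25, alpha=0.2, gamma=0.9, eps=0.1):
    # States are tuples of controller predicates, e.g. ('X', '0', '-').
    Q = defaultdict(float)
    for _ in range(episodes):
        s = env.reset()
        while not env.absorbing(s):         # absorbing state: program converged
            a = (random.choice(actions) if random.random() < eps
                 else max(actions, key=lambda b: Q[s, b]))
            s2, r = env.step(a)             # run controller a, observe reward
            Q[s, a] += alpha * (r + gamma * max(Q[s2, b] for b in actions)
                                - Q[s, a])
            s = s2
    return Q
```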

  7. Intrinsic Reward Goal: build deep control knowledge by rewarding controllable interaction with the world, i.e., convergence events in controllers with direct feedback from the external world. This yields a developmental progression of skills: Catalog, Track, Touch, Grasp, Insert, Stack. A reward sketch follows.
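A sketch of such a reward signal, assuming we are given the previous and current predicate vectors and the index set of controllers with direct environmental feedback; all names here are hypothetical:

```python
def intrinsic_reward(prev_p, curr_p, feedback_controllers):
    # Reward 1 the moment a controller that senses the external world
    # transitions into convergence ('1'): a controllable-interaction event.
    return float(any(prev_p[i] != '1' and curr_p[i] == '1'
                     for i in feedback_controllers))

print(intrinsic_reward(('X', '0'), ('1', '0'), feedback_controllers=[0]))  # 1.0
```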

  8. Experimental Demonstration Dexter. Motor units: two 7-DOF Barrett WAMs, two 4-DOF Barrett Hands, and a 2-DOF pan/tilt stereo head. Sensory feedback: visual (hue, saturation, intensity, texture), tactile (6-axis finger-tip F/T sensors), and proprioceptive.

  9. Sst= psaccade ptrack  atrack atrack X 0 X X X 1 atrack 1 X X - asaccade asaccade 0 X STAGE 1: SaccadeTrack- 25 Learning Episodes Track-saturation rewarding action

  10. Srg= pst preach pgrab  STAGE 2: ReachGrab - 25 Learning Episodes Touch Track-saturation rewarding action

  11. STAGE 2: ReachGrab - 25 Learning Episodes (demonstration of the learned Touch and Track-saturation behavior).

  12. Svi= prg pcond ptrack(blue)  Track-blue STAGE 3: VisualInspect - 25 Learning Episodes Touch Track-saturation rewarding action

  13. STAGE 3: VisualInspect - 25 Learning Episodes (demonstration of the learned Track-blue behavior, with Touch and Track-saturation).

  14. Sgrasp= prg pmoment pforce  Grasp Track-blue STAGE 4: Grasp – User Defined Reward Touch Track-saturation X - - X 1 0 X X X X 1 1 X 0 0 X 0 1 ReachGrab amoment aforce rewarding action - 1 X X X 1 0

  15. STAGE 5: PickAndPlace - User Defined Reward State: S_pnp = (p_g, p_transport, p_moment), with Grasp as a constituent skill and a user-defined rewarding action. [Learned policy table mapping predicate states to a_transport / a_moment omitted.]

  16. Conclusions • Mechanisms for creating hierarchical programs: a recursive formulation of potential functions and value functions, and a control-theoretic representation for action, state, and intrinsic reward. • Experimental demonstration of programming manipulation skills using staged learning episodes. • Intrinsic reward pushes out new behavior and models the affordances of objects.

  17. Thank You
