Using OpenRDK to learn walk parameters for the Humanoid Robot NAO
F. Giannone (presenter), A. Cherubini, L. Iocchi, M. Lombardo, G. Oriolo
Overview: environment
• Robotic agent: the humanoid robot NAO, produced by Aldebaran
• Application: robotic soccer
• SDK
• Simulator
Overview: (sub)tasks
• Vision Module: process raw data from the environment
• Modelling Module: elaborate the raw data to obtain more reliable information
• Behaviour Control Module: decide the best behaviour to accomplish the agent goal
• Motion Control Module: actuate the robot motors accordingly (this is the subtask addressed first; see the sketch below)
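To make the data flow concrete, here is a minimal C++ sketch of the four-module pipeline. It is only an illustration under assumed names; it is not the actual OpenRDK API.

    // Hypothetical interfaces illustrating the data flow between the four modules.
    // A sketch, not the actual OpenRDK API: class and method names are invented.
    #include <vector>

    struct RawImage {};                                   // raw data from the environment (camera)
    struct WorldModel {};                                 // more reliable information built from percepts
    struct VelocityCommand { double v = 0, omega = 0; };  // desired walk command
    struct JointCommands { std::vector<double> angles; }; // motor set-points

    class VisionModule {
    public:
        WorldModel process(const RawImage&) { return {}; }           // process raw data
    };
    class ModellingModule {
    public:
        WorldModel refine(const WorldModel& m) { return m; }         // filter/fuse percepts
    };
    class BehaviourControlModule {
    public:
        VelocityCommand decide(const WorldModel&) { return {}; }     // choose the best behaviour
    };
    class MotionControlModule {
    public:
        JointCommands actuate(const VelocityCommand&) { return {}; } // drive the robot motors
    };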
Make NAO walk… how?
NAO is equipped with a set of motion utilities, including a walk implementation that can be:
• called through an interface (NaoQi Motion Proxy)
• partially customized by tuning some parameters
Main advantage: ready to use (…to be tuned)
…and a drawback: based on an unknown walk model, so no flexibility at all!
For these reasons we decided to develop our own walk model and to tune it using machine learning techniques.
SPQR Walking Library development workflow
1. Develop the walk model (the SPQR walk model) using Matlab
2. Test the walk model on the Webots simulator
3. Design and implement a C++ library (the SPQR Walking Library) for our RDK Soccer Agent
4. Test our walking RDK agent on the Webots simulator and on the real NAO robot
5. Finally, tune the walk parameters (on the Webots simulator and on NAO)
A simple walking RAgent for NAO
• Simple Behaviour Module: switches between two states, walk and stand
• Motion Control Module: uses the SPQR Walking Library
• NaoQi Adaptor: connects to the real NAO (NaoQi) through shared memory (Smemy)
• Webots Client: connects to the Webots simulator through a TCP channel
SPQR Walking Engine Model
NAO model characteristics: 21 degrees of freedom, no actuated trunk, no dynamic model available.
We follow the "static walking pattern": use an a-priori definition of the desired trajectories, obtained by
• choosing a set of output variables: the 3D coordinates of selected points of the robot;
• choosing and parametrizing the desired trajectories for these variables at each phase of the gait.
The walk is driven by velocity commands (v, ω), where v is the linear velocity and ω is the angular velocity.
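As an illustration of the "parametrized trajectory" idea, the sketch below generates a swing-foot trajectory in the xz plane as a function of the normalized gait phase. The trajectory shapes are assumptions for illustration only; the parameter names Xsw0, Xtot, Zsw loosely follow the ones listed later in the slides, but the actual SPQR trajectories may differ.

    #include <cmath>

    // Hypothetical swing-foot trajectory in the sagittal (xz) plane as a function
    // of the normalized gait phase s in [0, 1]. Shapes are illustrative assumptions.
    struct FootPose { double x, z; };

    FootPose swingFootTrajectory(double s, double Xsw0, double Xtot, double Zsw) {
        const double kPi = 3.14159265358979323846;
        FootPose p;
        // move the foot forward from its initial offset Xsw0 to Xsw0 + Xtot
        p.x = Xsw0 + Xtot * s;
        // lift the foot with a half-sine profile of maximum height Zsw
        p.z = Zsw * std::sin(kPi * s);
        return p;
    }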
SPQR velocity commands
The Behavior Control Module issues velocity commands (v, ω) to the Motion Control Module, which outputs the joints matrix. The command selects the gait state:
• (v, 0): Initial Half Step, then Rectilinear Walk Swing
• (v, ω): Curvilinear Walk Swing
• (0, ω): Turn Step
• (0, 0): Final Half Step, back to the Stand Position
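A minimal sketch of this state selection in C++ follows. The enum values mirror the states named on the slide; the selection function and its thresholds are assumptions, not the library's actual types.

    #include <cmath>

    enum class WalkState {
        StandPosition, InitialHalfStep, RectilinearWalkSwing,
        CurvilinearWalkSwing, TurnStep, FinalHalfStep
    };

    // Hypothetical mapping from a velocity command (v, omega) to the swing state,
    // following the transitions on the slide: (v,0) -> rectilinear swing,
    // (v,omega) -> curvilinear swing, (0,omega) -> turn step, (0,0) -> stop.
    WalkState selectSwingState(double v, double omega) {
        const double eps = 1e-6;
        if (std::fabs(v) < eps && std::fabs(omega) < eps) return WalkState::FinalHalfStep;
        if (std::fabs(v) < eps)                           return WalkState::TurnStep;
        if (std::fabs(omega) < eps)                       return WalkState::RectilinearWalkSwing;
        return WalkState::CurvilinearWalkSwing;
    }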
SPQR walking subtasks and parameters
Biped walking alternates a swing (single support) phase and a double support phase; the parameter SS% sets their relative duration.
SPQR walk subtasks and their parameters:
• Foot trajectories in the xz plane: Xtot, Xsw0, Xds, Zst, Zsw
• Arm control: Ks
• Hip yaw/pitch control (turn): Hyp
• Center of mass trajectory in the lateral direction: Yft, Yss, Yds, Kr
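Collected in one place, these parameters could be grouped as in the sketch below. The grouping follows the slide; the struct itself and its zero defaults are hypothetical, not the library's actual definition.

    // Hypothetical container for the SPQR walk parameters listed on the slide.
    // Default values are placeholders, not tuned values.
    struct SpqrWalkParameters {
        // Gait timing: relative duration of single vs. double support (SS%)
        double SSpercent = 0.0;
        // Foot trajectories in the xz plane
        double Xtot = 0.0, Xsw0 = 0.0, Xds = 0.0, Zst = 0.0, Zsw = 0.0;
        // Arm control
        double Ks = 0.0;
        // Hip yaw/pitch control (turn)
        double Hyp = 0.0;
        // Center of mass trajectory in the lateral direction
        double Yft = 0.0, Yss = 0.0, Yds = 0.0, Kr = 0.0;
    };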
Walk tuning: main issues
• Possible choices: by hand, or by using machine learning techniques
• Machine learning seems the best solution: less human interaction, and it explores the search space in a more systematic way
• …but take care of some aspects:
 • you need to define an effective fitness function
 • you need to choose the right algorithm to explore the parameter space
 • only a limited number of experiments can be done on a real robot
SPQR Learning System Architecture
The Learner (which uses the learning library) sends the experiments of each iteration to the RAgent (which uses the walking library). The RAgent executes them on Webots or on the real NAO and returns the data used to evaluate the fitness (position data from GPS).
SPQR Learner
The Learner applies a chosen learning strategy: Policy Gradient (e.g., PGPR), the Nelder-Mead simplex method, or a genetic algorithm.
• First iteration? Yes: return the initial iteration and the iteration information.
• Otherwise: apply the chosen algorithm (strategy) and return the next iteration and the iteration information.
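A minimal sketch of this control flow, using a strategy pattern over the three algorithms, is shown below. The interfaces are assumptions for illustration, not the actual SPQR learning library.

    #include <memory>
    #include <vector>

    // One "iteration" is a set of parameter vectors (policies) to evaluate.
    // This is a simplification of the real iteration information.
    using Policy = std::vector<double>;
    struct Iteration { std::vector<Policy> experiments; };

    // Hypothetical strategy interface: Policy Gradient (e.g., PGPR),
    // Nelder-Mead simplex, and a genetic algorithm would all implement it.
    class LearningStrategy {
    public:
        virtual ~LearningStrategy() = default;
        virtual Iteration initialIteration() = 0;
        virtual Iteration nextIteration(const std::vector<double>& fitnesses) = 0;
    };

    class Learner {
    public:
        explicit Learner(std::unique_ptr<LearningStrategy> s) : strategy_(std::move(s)) {}
        Iteration step(const std::vector<double>& fitnesses) {
            // First iteration: return the initial experiments;
            // afterwards: apply the chosen algorithm to propose the next ones.
            if (first_) { first_ = false; return strategy_->initialIteration(); }
            return strategy_->nextIteration(fitnesses);
        }
    private:
        std::unique_ptr<LearningStrategy> strategy_;
        bool first_ = true;
    };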
Policy Gradient (PG) iteration
1. Given a point p in the parameter space ℝ^K, generate n (n = mK) policies from p: each component p_k is set to p_k, p_k + ε_k, or p_k − ε_k.
2. Evaluate the policies.
3. For each k ∈ {1, …, K}, compute the average fitnesses F_k^+, F_k^0, F_k^−.
4. For each k ∈ {1, …, K}: if F_k^0 > F_k^+ and F_k^0 > F_k^−, then Δ_k = 0; else Δ_k = F_k^+ − F_k^−.
5. Normalize the step, Δ* = η · Δ / ‖Δ‖ (with step size η), and update the policy: p' = p + Δ*.
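The sketch below implements one such PG iteration. It assumes a fixed perturbation eps[k] per parameter, a step size eta, and random +ε/0/−ε perturbations, and it abstracts the experiment behind an evaluate() callback; these choices are consistent with the slide but are not a verbatim copy of the SPQR implementation.

    #include <cmath>
    #include <cstdlib>
    #include <functional>
    #include <vector>

    using Policy = std::vector<double>;

    // One policy-gradient iteration in the spirit of the slide.
    // evaluate(policy) must return the fitness of a policy (e.g., from an experiment).
    Policy pgIteration(const Policy& p,
                       const std::vector<double>& eps,   // per-parameter perturbation
                       double eta, int m,                // step size, policies per parameter
                       const std::function<double(const Policy&)>& evaluate) {
        const std::size_t K = p.size();
        const int n = m * static_cast<int>(K);           // n = mK test policies

        // Generate and evaluate n policies: each component is p[k], p[k]+eps[k], or p[k]-eps[k].
        std::vector<Policy> policies(n, p);
        std::vector<std::vector<int>> signs(n, std::vector<int>(K));
        std::vector<double> fitness(n);
        for (int i = 0; i < n; ++i) {
            for (std::size_t k = 0; k < K; ++k) {
                signs[i][k] = (std::rand() % 3) - 1;     // -1, 0, or +1
                policies[i][k] = p[k] + signs[i][k] * eps[k];
            }
            fitness[i] = evaluate(policies[i]);
        }

        // For each k, average the fitness over policies with +eps, 0, -eps on that component.
        Policy delta(K, 0.0);
        for (std::size_t k = 0; k < K; ++k) {
            double Fp = 0, F0 = 0, Fm = 0;
            int np = 0, n0 = 0, nm = 0;
            for (int i = 0; i < n; ++i) {
                if (signs[i][k] > 0)      { Fp += fitness[i]; ++np; }
                else if (signs[i][k] < 0) { Fm += fitness[i]; ++nm; }
                else                      { F0 += fitness[i]; ++n0; }
            }
            if (np) Fp /= np;
            if (n0) F0 /= n0;
            if (nm) Fm /= nm;
            // If leaving the parameter unchanged is best, do not move along it.
            delta[k] = (F0 > Fp && F0 > Fm) ? 0.0 : (Fp - Fm);
        }

        // Normalize the step and update: p' = p + eta * delta / |delta|.
        double norm = 0.0;
        for (double d : delta) norm += d * d;
        norm = std::sqrt(norm);
        Policy next = p;
        if (norm > 0.0)
            for (std::size_t k = 0; k < K; ++k) next[k] += eta * delta[k] / norm;
        return next;
    }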
Enhancing PG: PGPR
At each iteration i, the gradient estimate Δ(i) can be used to obtain a metric measuring the relevance of each parameter; the relevance is accumulated over iterations with a forgetting factor. Given the relevance and a threshold T, PGPR prunes the less relevant parameters in the next iterations.
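The slide does not preserve the exact relevance formula, so the sketch below only illustrates the idea: an exponentially-weighted accumulation of |Δ_k| with a forgetting factor lambda (this specific formula is an assumption), followed by pruning of parameters whose relevance falls below the threshold T.

    #include <cmath>
    #include <vector>

    // Hypothetical parameter-relevance bookkeeping for PGPR.
    // relevance[k] accumulates |delta[k]| across iterations with forgetting
    // factor lambda in (0,1); the exact formula used by PGPR may differ.
    void updateRelevance(std::vector<double>& relevance,
                         const std::vector<double>& delta, double lambda) {
        for (std::size_t k = 0; k < relevance.size(); ++k)
            relevance[k] = lambda * relevance[k] + std::fabs(delta[k]);
    }

    // Parameters whose relevance is below the threshold T are frozen (pruned)
    // in the next iterations, shrinking the search space.
    std::vector<bool> pruneMask(const std::vector<double>& relevance, double T) {
        std::vector<bool> active(relevance.size());
        for (std::size_t k = 0; k < relevance.size(); ++k)
            active[k] = relevance[k] >= T;
        return active;
    }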
Curvilinear biped walking experiment
The robot moves along a curve with radius R for a time t. The fitness function combines the path length travelled by the robot and the radial error with respect to the desired circle of radius R.
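The exact weighting of the two fitness terms is not preserved in this text, so the sketch below only shows the structure: reward path length, penalize the mean radial error; the linear combination and the weight w are assumptions.

    #include <cmath>
    #include <vector>

    struct Position2D { double x, y; };

    // Hypothetical fitness for the curvilinear walking experiment:
    // F = pathLength - w * meanRadialError. The combination and the weight w
    // are assumptions; the slide only names path length and radial error.
    double curvilinearFitness(const std::vector<Position2D>& traj,
                              Position2D center, double R, double w) {
        if (traj.size() < 2) return 0.0;
        double length = 0.0, radialError = 0.0;
        for (std::size_t i = 0; i < traj.size(); ++i) {
            if (i > 0)
                length += std::hypot(traj[i].x - traj[i-1].x, traj[i].y - traj[i-1].y);
            double r = std::hypot(traj[i].x - center.x, traj[i].y - center.y);
            radialError += std::fabs(r - R);
        }
        radialError /= traj.size();
        return length - w * radialError;
    }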
Simulators in learning tasks
• Advantages: you can test the gait model and the learning algorithm without being biased by noise
• Limits: the results of the experiments on the simulator can be ported to the real robot, but solutions specialized for the simulated model may not be as effective on the real robot (e.g., the simulator does not take asymmetries into account, and its models are not very accurate)
Results (1)
• Five sessions of PG, 20 iterations each, all starting from the same initial configuration
• SS%, Ks, Yft have been set to hand-tuned values
• 16 policies for each iteration
• The fitness increases in a regular way
• Low variance among the five simulations
Results (2)
Final parameter sets for the five PG runs and for five runs of PGPR (plots of the parameters Zsw, Xsw0, Xs, Kr).
Bibliography • A. Cherubini, F. Giannone, L. Iocchi, M. Lombardo, G. Oriolo. “Policy Gradient Learning for a Humanoid Soccer Robot”. Accepted for Journal of Robotics and Autonomous Systems. • A. Cherubini, F. Giannone, L. Iocchi, and P. F. Palamara, “An extended policy gradient algorithm for robot task learning”, Proc. of IEEE/RSJ International Conference on Intelligent Robots and System, 2007. • A. Cherubini, F. Giannone, and L. Iocchi, “Layered learning for a soccer legged robot helped with a 3D simulator”, Proc. of 11th International Robocup Symposium, 2007. • http://openrdk.sourceforge.net • http://www.aldebaran-robotics.com/ • http://spqr.dis.uniroma1.it
Any questions?