Stochastic Optimization of Bipedal Walking using Gyro Feedback and Phase Resetting

King Fahd University of Petroleum and Minerals • COE 584/484: Robotics • Muhammad Al-Nasser, Mohammad Shahab • March 2008

Outline • Problem Definition • Physical Description • Humanoid Walking System • Feedback • Gyroscope • Phase Resetting • Stochastic Optimization • PGRL • Experimentation • Comments

Problem Definition • Authors • Felix Faber & Sven Behnke, University of Freiburg, Germany • Problem Statement: • “to optimize the walking pattern of a humanoid robot for forward speed using suitable metaheuristics”

First Humanoid Robot! • 1206 AD, Ibn Ismail Ibn al-Razzaz Al-Jazari • A boat with four programmable automatic musicians that floated on a lake to entertain guests at royal drinking parties

Problem Definition • Problems? • Sensor Noise: camera, gyroscope, ultrasonic, force, … • Inaccurate Actuators: motors, … • Environment Disturbances: unknown surface, … • Nonlinear Dynamics: i.e. a complex system to control

Physical Description • Jupp, team NimbRo • 60 cm, 2.3 kg • Pocket PC

Physical Description • Pitch joint to bend the trunk • Each leg: 3-DOF hip, knee, 2-DOF ankle • Each arm: 2-DOF shoulder, elbow

Humanoid Walking System • [Diagram: leg motion trajectory → Controller → joint motor positions → robot walks!] • One Approach • Model-Based (Geometric Model) • Needs an accurate model • Solving the motion equations for all joints (offline) • 19 degrees of freedom • Nonlinear model equations • Computational complexity

Humanoid Walking System • [Diagram: Controller → joint motor positions] • 2nd Approach • Central Pattern Generators (CPG) • Sinusoidal joint trajectories are generated • Bio-inspired • No need for a model

Humanoid Walking System • Open-loop (no feedback) Gait • Mechanism • Shifting weight from one leg to the other • Shortening the leg that is not needed for support • Leg motion in the forward direction

Humanoid Walking System • [Plot: trunk phase over time, between −π and π] • Open-loop Gait • Clock-driven, with the trunk phase as the central clock • Trunk phase (advancing with the ‘foot step frequency’) • Right leg motion phase = trunk phase + π/2 • Left leg motion phase = trunk phase − π/2
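The clock-driven phase relation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' controller: the function names, the wrap into [−π, π), and the convention that one full trunk-phase cycle takes 1/frequency seconds are all assumptions made for the example.

```python
import math


def wrap_phase(phase):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def leg_phases(trunk_phase):
    """Return (right, left) leg motion phases for a given trunk phase."""
    right = wrap_phase(trunk_phase + math.pi / 2.0)
    left = wrap_phase(trunk_phase - math.pi / 2.0)
    return right, left


def advance_trunk_phase(trunk_phase, step_frequency_hz, dt):
    """Advance the central clock by one control tick of dt seconds.

    Assumption: one full cycle (2*pi) takes 1 / step_frequency_hz seconds.
    """
    return wrap_phase(trunk_phase + 2.0 * math.pi * step_frequency_hz * dt)
```

Because both leg phases come from the same trunk clock, the two legs always stay half a cycle apart, whatever the step frequency is.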

Humanoid Walking System Leg Left Kinematic Mapping Right  Swing Foot “Human-Like Walking using Toes Joint and Straight Stance Leg” by Behnke  Is leg extension Swingis leg swing amplitude r: Roll p: Pitch y: Yaw (continued)

Feedback • Overall Control System • [Diagram: Controller → Mapping → joint motor positions, with sensor feedback] • Gyroscope: inclination (balance), angular velocity • Force Sensing Resistors: foot-touches-ground trigger (‘High’ or ‘Low’)

Feedback • Gyroscope • device for measuring orientation, based on the principles of conservation of angular momentum • Remember Physics 101!

Feedback • [Diagram: gyro → P-Control → joint motor positions] • P-Control (Proportional Control) • A growing gyro reading means the robot is starting to fall • Reactive action proportional to the ‘error’ (error = sensor value − desired value) • Desired values = zero (i.e. no inclination) • Other: Proportional-Integral Control • Action proportional to the ‘error’ and to the accumulated ‘error’
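As a minimal sketch of the two control laws named on this slide: the names `p_control` and `PIController`, the gains, and the negative sign (the correction opposes the error) are assumptions for illustration and are not taken from the paper.

```python
def p_control(gyro_value, gain, desired_value=0.0):
    """Proportional control: the correction is proportional to the error."""
    error = gyro_value - desired_value  # error = sensor value - desired value
    return -gain * error                # assumption: correction opposes the error


class PIController:
    """Proportional-integral control: also reacts to the accumulated error."""

    def __init__(self, kp, ki):
        self.kp = kp          # proportional gain (hypothetical value)
        self.ki = ki          # integral gain (hypothetical value)
        self.integral = 0.0   # running sum of error * dt

    def update(self, gyro_value, dt, desired_value=0.0):
        error = gyro_value - desired_value
        self.integral += error * dt
        return -(self.kp * error + self.ki * self.integral)
```

With a desired value of zero, a gyro reading of zero produces no correction. The integral term of the PI variant keeps growing while a constant offset persists.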

Feedback • Overall System • [Diagram: P-Control → Mapping → joint motor positions]

Feedback • Overall System • [Diagram: Online Adaptation (Stochastic Optimization) → Controller → joint motor positions] • Adaptive Control • Online tuning of the ‘parameters’ of the controller

Stochastic Optimization Approach • Goal: • Adjust the parameters to achieve a faster and more stable walk • A fitness function (cost function) is used to express the optimization goals (i.e. speed & robustness) • $f(\cdot): \mathbb{R}^N \to \mathbb{R}$, where $N$ is the number of parameters of interest

Stochastic Optimization Approach • The parameters are those of the Kinematic Mapping (Behnke paper)

Stochastic Optimization Approach • We evaluate f for a given set of parameters • x = [x1, x2, ..., xN] (Table 1) • Now, how do we find the parameter values that result in the highest fitness value? • Use a metaheuristic method called PGRL

Policy Gradient Reinforcement Learning (PGRL) • An optimization method to maximize the walking speed • It automatically searches a set of possible parameters aiming to find the fastest walk that can be achieved

Policy Gradient Reinforcement Learning • How does PGRL work? • 1st: randomly generate B test policies {x1, x2, …, xB} • around an initially given parameter vector xπ • (where x = [x1, x2, …, xN]) • Each parameter in a given test policy xi is randomly set to $x^i_j = x^\pi_j + \delta$ with $\delta \in \{-\varepsilon, 0, +\varepsilon\}$ • where 1 ≤ i ≤ B and 1 ≤ j ≤ N • ε is a small constant value

Policy Gradient Reinforcement Learning • 2nd: • Each test policy is evaluated by the ‘fitness function’ • For each parameter j, the test policies are grouped into 3 categories • depending on whether the jth parameter was modified by −ε, 0, or +ε

Policy Gradient Reinforcement Learning • 3rd: construct the vector a = [a1, a2, …, aN] • Each aj is computed from the average fitness of the three categories for parameter j

Policy Gradient Reinforcement Learning • 4th (finally): adjust xπ by a step along a, where η is a scalar step size
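The four steps can be put together in one short Python sketch. The formulas for a and for the update did not survive on the slides, so the versions below are assumptions based on the commonly published form of PGRL (Kohl and Stone): aj is zero when the unperturbed category has the best average fitness, otherwise it is the difference between the +ε and −ε averages, and xπ moves by η along the normalized a. The function name `pgrl_step`, its arguments, and the handling of empty categories are hypothetical.

```python
import math
import random


def pgrl_step(x_pi, fitness, epsilon, eta, num_policies, rng=random):
    """One PGRL update of the parameter vector x_pi (a list of floats)."""
    n = len(x_pi)

    # 1st: B random test policies, each parameter shifted by -epsilon, 0 or +epsilon.
    deltas = [[rng.choice((-epsilon, 0.0, epsilon)) for _ in range(n)]
              for _ in range(num_policies)]
    policies = [[x + d for x, d in zip(x_pi, row)] for row in deltas]

    # 2nd: evaluate every test policy with the fitness function.
    scores = [fitness(p) for p in policies]
    overall = sum(scores) / len(scores)

    def average(values):
        # Assumption: an empty category falls back to the overall mean.
        return sum(values) / len(values) if values else overall

    # 3rd: per parameter, compare the average fitness of the three categories.
    a = []
    for j in range(n):
        avg_minus = average([s for s, row in zip(scores, deltas) if row[j] < 0])
        avg_zero = average([s for s, row in zip(scores, deltas) if row[j] == 0])
        avg_plus = average([s for s, row in zip(scores, deltas) if row[j] > 0])
        if avg_zero > avg_plus and avg_zero > avg_minus:
            a.append(0.0)  # leaving parameter j unchanged looks best
        else:
            a.append(avg_plus - avg_minus)

    # 4th: move x_pi by a step of length eta along the normalized vector a.
    norm = math.sqrt(sum(v * v for v in a))
    if norm == 0.0:
        return list(x_pi)
    return [x + eta * v / norm for x, v in zip(x_pi, a)]
```

On the robot each fitness evaluation is a walking trial, so B fixes the number of trials per update. In a quick check, `fitness` can be any function that maps a parameter list to a number.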

Extension to PGRL • Adaptive step size after g steps, where s is the number of fitness function evaluations so far and S is the maximum allowed number of evaluations

Overall System • [Diagram: PGRL → xπ → Controller → joint motor positions]

Experiment

Results

Results • Initial: speed is 21.3 cm/s, fitness is 1.36 • After 1000 iterations: speed is 34.0 cm/s, fitness is 1.52 • Speed increase of about 60%
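The 60% figure can be checked directly from the two speeds on this slide. A small sketch, where the helper name `relative_gain` is only illustrative:

```python
def relative_gain(initial, final):
    """Relative improvement of `final` over `initial`, as a fraction."""
    return (final - initial) / initial


speed_gain = relative_gain(21.3, 34.0)    # speeds in cm/s, from the slide
fitness_gain = relative_gain(1.36, 1.52)  # fitness values, from the slide
print(f"speed: {speed_gain:.1%}, fitness: {fitness_gain:.1%}")
```

The speed gain is about 59.6%, which the slide rounds to 60%. The fitness value rises by about 11.8% over the same 1000 iterations.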

Parameters

Glossary • Stance leg: • The leg that is on the floor during the walk. • Swing leg: • The leg that is moving during the walk. • Single support: • The case where the robot touches the floor with one leg. • Double support: • The case where the robot touches the floor with both legs.
