S.T.A.N.F.O.R.D. Learning: Helicopter Dynamics Modeling with Accelerations

STANFORD

Models in Prior Work
• Predict velocities and angular rates: [equation not preserved]
• f: learned from data.
• Obtain position and orientation from numerical integration.

Shortcomings
• From physics we have: [equation not preserved; its terms were labeled "Rotation between body coordinate frames at times t and t+1" and "Accelerations"]
• Body coordinate frame is different at every time step. This makes inertia highly non-linear in the state and very difficult to capture/learn from data.
• For most physical systems, forces and torques have a fairly simple relation to inputs and current state. This simplicity is lost by the change of coordinate frame.

First Autonomous Funnel
• Aerobatic maneuver.
• Method: model-based reinforcement learning.
• Simulator:
  • Acceleration prediction.
  • Longer time-scale criterion.
• Acknowledgments: control is joint work with Adam Coates, Ben Tse. (Paper forthcoming.)
• Video available.

Simulator Accuracy

Our acceleration prediction model
• Predict accelerations: [equation not preserved]
• f: learned from data.
• Obtain velocity, angular rates, position and orientation from numerical integration.

Advantages
• No need to learn inertia from data. Constraints from physics are incorporated explicitly.
• The relation between state, inputs and accelerations is not cluttered by the change of coordinate frame, and is thus easier to learn from data.

Standard learning criteria
• Frequency domain fitting: requires a linear model, used in CIFER (industry standard).
• Minimize one-step prediction error: [equation not preserved]
• For f linear in state s and inputs u: f can be found by linear regression.

Longer time-scale criterion
• Accuracy of simulation over longer time-scales is important for control. The following longer time-scale criterion was suggested in [Abbeel & Ng, 2004]: [equation not preserved] (H: time-scale of interest)
• The EM algorithm for maximization is expensive in our continuous state-action space setting. We present a simple and fast algorithm for (approximately) minimizing the average squared error over a certain duration.
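The difference between the one-step criterion and a longer time-scale criterion described above can be illustrated with a short Python sketch. It rests on assumptions that are not in the poster: a scalar linear model s_{t+1} = a*s_t + b*u_t stands in for f, the data are synthetic, and the names `simulate`, `one_step_error`, `multi_step_error` and the numeric values are hypothetical. It does not reproduce the exact criterion of [Abbeel & Ng, 2004]; it only computes an average squared open-loop simulation error over H steps.

```python
def simulate(a, b, s0, inputs):
    """Roll the assumed scalar linear model s_{t+1} = a*s_t + b*u_t forward."""
    states = [s0]
    for u in inputs:
        states.append(a * states[-1] + b * u)
    return states


def one_step_error(a, b, states, inputs):
    """Average squared error when every prediction starts from the recorded state."""
    errors = [
        (a * states[t] + b * inputs[t] - states[t + 1]) ** 2
        for t in range(len(inputs))
    ]
    return sum(errors) / len(errors)


def multi_step_error(a, b, states, inputs, H):
    """Average squared error of open-loop simulation over horizons 1..H.

    From each start time t the model is simulated for H steps using only the
    recorded inputs, and its predictions are compared with the recorded states.
    """
    total, count = 0.0, 0
    for t in range(len(inputs) - H + 1):
        predicted = simulate(a, b, states[t], inputs[t:t + H])
        for h in range(1, H + 1):
            total += (predicted[h] - states[t + h]) ** 2
            count += 1
    return total / count


# Synthetic "recorded" trajectory from a made-up true model (a=0.9, b=0.5).
inputs = [1.0, 0.0, -1.0, 0.5, 0.0, 1.0, -0.5, 0.0]
states = simulate(0.9, 0.5, 0.0, inputs)

# A slightly wrong model (a=0.8): its errors compound when simulated open loop.
err_1 = one_step_error(0.8, 0.5, states, inputs)
err_H = multi_step_error(0.8, 0.5, states, inputs, H=4)
```

On this synthetic trajectory the mis-specified model has a larger error under the H-step measure than under the one-step measure, because each prediction error is fed back into the next simulated step. This is the kind of discrepancy a one-step fit does not penalize.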
• Sketch of algorithmic idea (see paper for full algorithm)
  • Model: [equation not preserved]
  • One step prediction at time t: [equation not preserved]
  • One step prediction at time t+1: [equation not preserved]
  • Two step prediction at time t: [equation not preserved]
  • Therefore, can approximate multiple-step dynamics by linear combination of one-step dynamics.
• Our algorithm iterates the following two steps:
  • Compute estimate of s_{t+1} given s_t, u_t, u_{t+1} for current model A, B.
  • Estimate [remainder of this step not preserved]

[Plots: Bergen Industrial Twin; XCell Tempest]

Observations
• Acceleration prediction model significantly better. Reasons:
  • Captures gravity exactly.
  • Captures inertia, thus side-slip effects in the data.
• Longer time scale criterion outperforms CIFER, which in turn outperforms the one-step criterion.
• Differences more significant for Tempest than for Bergen, since Bergen data is mostly around hover.

Legend
• Linear model, one-step prediction error.
• Linear model, frequency domain fit with CIFER.
• Linear model, longer time scale prediction error.
• Acceleration model, one-step prediction error.
• Acceleration model, longer time scale prediction error.

Helicopter State and Inputs
• 12-D state: position, orientation (roll, pitch, yaw), velocity, angular rates.
• 8-D state: encode symmetries using body (= robot-centric) coordinates.
• u1, u2: The longitudinal (front-back) and latitudinal (left-right) cyclic pitch controls cause the helicopter to pitch forward/backward or roll sideways.
• u3: The tail rotor collective pitch control affects tail rotor thrust, and can be used to yaw (turn) the helicopter.
• u4: The main rotor collective pitch control affects the main rotor's thrust.
[Figure: body coordinate frame attached to helicopter]

STANFORD
Learning Vehicular Dynamics, with Application to Modeling Helicopters

Overview
• Model-based reinforcement learning has been very successful.
• State-of-the-art:
  • Reinforcement learning returns policies that fly well in simulation.
  • Remaining helicopter failures typically caused by inaccurate simulation.
• Key technical challenge: Building an accurate simulator.
• Our approach:
  • Encode all constraints known from physics (gravity, inertia, etc.). Learn only parts of model not determined by physics.
  • Explicitly learn simulation that is predictive at long time-scales.
• Result
  • Significantly improved helicopter model.
  • First autonomous funnel (aerobatic maneuver) using our model.

Pieter Abbeel, Varun Ganapathi, Andrew Y. Ng

RC Helicopters
• Bergen Industrial Twin
• XCell Tempest

Conclusion
• Key technical challenge for model-based reinforcement learning applied to helicopters: building an accurate simulator.
• Our approach
  • By using an acceleration-based approach, we can encode all constraints known from physics (gravity, inertia, etc.). Learn only parts of model not determined by physics.
  • Explicitly learn simulation that is predictive at long time-scales.
• Result
  • Significantly improved helicopter model.
  • First autonomous funnel (aerobatic maneuver) using our model.
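The acceleration-based approach summarized in the conclusion above can be sketched in Python. This is a simplified illustration under stated assumptions, not the authors' model: motion is restricted to a vertical plane (world axes x forward, z up, pitch angle theta), integration is plain forward Euler, and `learned_accelerations` is a hypothetical placeholder for the learned function f. Gravity and the change of body coordinate frame between time steps are written out explicitly instead of being learned.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def learned_accelerations(state, inputs):
    """Hypothetical stand-in for the learned part f.

    Returns body-frame accelerations (ax, az) excluding gravity, plus the
    angular acceleration alpha. Assumption: the collective input acts as
    thrust along body z and the cyclic input as a pitch torque.
    """
    cyclic, collective = inputs
    return 0.0, collective, cyclic


def step(state, inputs, dt, f=learned_accelerations):
    """One Euler step; state = (x, z, theta, vx, vz, q), velocities in body frame."""
    x, z, theta, vx, vz, q = state
    ax, az, alpha = f(state, inputs)
    # Gravity is known from physics: express (0, -G) in the current body frame.
    ax += -G * math.sin(theta)
    az += -G * math.cos(theta)
    # Position update: rotate the body-frame velocity into the world frame.
    x_new = x + (math.cos(theta) * vx - math.sin(theta) * vz) * dt
    z_new = z + (math.sin(theta) * vx + math.cos(theta) * vz) * dt
    # Integrate the accelerations, still expressed in the old body frame.
    vx_old = vx + ax * dt
    vz_old = vz + az * dt
    theta_new = theta + q * dt
    q_new = q + alpha * dt
    # Re-express the velocity in the body frame at time t+1.
    d = theta_new - theta
    vx_new = math.cos(d) * vx_old + math.sin(d) * vz_old
    vz_new = -math.sin(d) * vx_old + math.cos(d) * vz_old
    return (x_new, z_new, theta_new, vx_new, vz_new, q_new)


def world_velocity(state):
    """Convert the body-frame velocity in `state` to world-frame (x, z) components."""
    _, _, theta, vx, vz, _ = state
    return (math.cos(theta) * vx - math.sin(theta) * vz,
            math.sin(theta) * vx + math.cos(theta) * vz)


# Free fall while pitching at 2 rad/s with zero inputs, simulated for 1 second.
free_fall = (0.0, 0.0, 0.3, 0.0, 0.0, 2.0)
for _ in range(100):
    free_fall = step(free_fall, (0.0, 0.0), 0.01)
```

With zero inputs the body-frame velocity components change non-linearly as the frame rotates, yet the world-frame velocity returned by `world_velocity(free_fall)` stays a pure downward fall. Nothing about gravity or inertia had to be fitted from data; only the placeholder f would be learned.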

