Learning Parameterized Maneuvers for Autonomous Helicopter Flight
Jie Tang, Arjun Singh, Nimbus Goehausen, Pieter Abbeel
UC Berkeley
Overview
[Diagram: Dynamics Model, Target Trajectory, Optimal Control, Controller]
Problem
• Robotics tasks involve complex trajectories
• Example: stall turn
• Challenging, nonlinear dynamics
Overview
[Diagram: Demonstrations, Dynamics Model, Target Trajectory, Optimal Control, Controller]
Learning Target Trajectory From Demonstration
Problem: Demonstrations are suboptimal
• Use multiple demonstrations
• Current state of the art in helicopter aerobatics (Coates, Abbeel, and Ng, ICML 2008)
Problem: Demonstrations will differ from the desired target trajectory
• Our work: learn parameterized maneuver classes
[Figure: axis label "Height"]
Learning Trajectory
• HMM-like generative model
• Dynamics model used as HMM transition model
• Synthetic observations enforce parameterization
• Demos are observations of hidden trajectory
• Problem: how do we align observations to hidden trajectory?
[Figure: hidden trajectory with Demo 1 and Demo 2; height label "50m"]
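The generative model on this slide can be written down as a sketch. All symbols and the Gaussian noise terms below are assumptions introduced for illustration; the slides do not give notation. Here $z_t$ is the hidden target trajectory, $f$ is the dynamics model acting as the HMM transition, $y^k_j$ is sample $j$ of demonstration $k$, $\tau^k_j$ is the unknown hidden-trajectory time index it observes (the alignment problem), and $\rho_t$ is a synthetic observation of some feature $g(z_t)$, such as a height of 50m.

```latex
\begin{align}
z_{t+1} &= f(z_t) + \omega_t, & \omega_t &\sim \mathcal{N}(0, \Sigma_z) \\
y^{k}_{j} &= z_{\tau^{k}_{j}} + \nu^{k}_{j}, & \nu^{k}_{j} &\sim \mathcal{N}(0, \Sigma_k) \\
\rho_t &= g(z_t) + \eta_t, & \eta_t &\sim \mathcal{N}(0, \Sigma_\rho)
\end{align}
```

Under this reading, a small $\Sigma_\rho$ forces the inferred trajectory to honor the requested parameter value, while the per-demonstration $\Sigma_k$ controls how strongly each demonstration pulls on the estimate.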
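Learning Trajectory
• Dynamic time warping: align each demonstration to the current hidden trajectory
• Extended Kalman filter / smoother: re-estimate the hidden trajectory from the aligned demonstrations
• Repeat
[Figure: hidden trajectory with Demo 1 and Demo 2; height label "50m"]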
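The alignment half of this loop can be illustrated with a textbook dynamic time warping routine. This is a minimal sketch, not the authors' implementation: it assumes scalar samples (for example, height only), a squared-error match cost, and the standard three DTW moves. The function name `dtw_align` is hypothetical, and the Kalman smoothing step is omitted.

```python
def dtw_align(demo, reference):
    """Align two scalar sequences with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) pairs
    meaning demo[i] is matched to reference[j].
    """
    n, m = len(demo), len(reference)
    inf = float("inf")
    # cost[i][j]: cheapest alignment of demo[:i+1] with reference[:j+1]
    cost = [[inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = (demo[i] - reference[j]) ** 2
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            candidates = []
            if i > 0:  # advance in the demo only
                candidates.append((cost[i - 1][j], (i - 1, j)))
            if j > 0:  # advance in the reference only
                candidates.append((cost[i][j - 1], (i, j - 1)))
            if i > 0 and j > 0:  # advance in both
                candidates.append((cost[i - 1][j - 1], (i - 1, j - 1)))
            best, prev = min(candidates)
            cost[i][j] = d + best
            back[i][j] = prev
    # Walk the back-pointers from the end to recover the warp path.
    path = [(n - 1, m - 1)]
    while back[path[-1][0]][path[-1][1]] is not None:
        i, j = path[-1]
        path.append(back[i][j])
    path.reverse()
    return cost[n - 1][m - 1], path


if __name__ == "__main__":
    total, path = dtw_align([0, 0, 1, 2, 3], [0, 1, 2, 3])
    print(total, path)
```

In the full procedure the `reference` argument would be the current estimate of the hidden trajectory, refreshed by the smoother after every alignment pass.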
Smoothed Dynamic Time Warping
• Potential outcome of dynamic time warping: [figure]
• More desirable outcome: [figure]
• Introduce smoothing penalty
• Extra dimension in dynamic program
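One way to realize a smoothing penalty with an extra dimension in the dynamic program is sketched below. The slides do not specify the penalty, so the form used here is an assumption: each demo sample advances the reference index by a step of 0, 1, or 2, and changes in that step between consecutive samples are penalized quadratically. The extra DP dimension stores the most recent step. The names `smoothed_dtw`, `smooth_weight`, and `steps` are hypothetical.

```python
def smoothed_dtw(demo, reference, smooth_weight=1.0, steps=(0, 1, 2)):
    """Map each demo index i to a reference index tau[i].

    Cost = squared match error + smooth_weight * (change in step size)^2.
    Assumes the end of the reference is reachable, i.e.
    len(reference) - 1 <= max(steps) * (len(demo) - 1).
    """
    n, m, s = len(demo), len(reference), len(steps)
    inf = float("inf")
    # cost[i][j][k]: best cost with demo[i] matched to reference[j],
    # having arrived there with step steps[k] (the extra dimension).
    cost = [[[inf] * s for _ in range(m)] for _ in range(n)]
    back = [[[None] * s for _ in range(m)] for _ in range(n)]
    for k in range(s):
        # The first sample has no previous step, so no penalty applies.
        cost[0][0][k] = (demo[0] - reference[0]) ** 2
    for i in range(1, n):
        for j in range(m):
            d = (demo[i] - reference[j]) ** 2
            for k, step in enumerate(steps):
                pj = j - step
                if pj < 0:
                    continue
                for pk, prev_step in enumerate(steps):
                    if cost[i - 1][pj][pk] == inf:
                        continue
                    penalty = smooth_weight * (step - prev_step) ** 2
                    c = cost[i - 1][pj][pk] + penalty + d
                    if c < cost[i][j][k]:
                        cost[i][j][k] = c
                        back[i][j][k] = pk
    # Require the warp to end on the last reference sample.
    k = min(range(s), key=lambda kk: cost[n - 1][m - 1][kk])
    total = cost[n - 1][m - 1][k]
    tau = [0] * n
    j = m - 1
    for i in range(n - 1, 0, -1):
        tau[i] = j
        pk = back[i][j][k]
        j -= steps[k]
        k = pk
    tau[0] = j
    return total, tau


if __name__ == "__main__":
    print(smoothed_dtw([0, 0, 1, 1, 2, 2, 3, 3], [0, 1, 2, 3]))
```

Raising `smooth_weight` trades match error for a more constant warp rate, which discourages the staircase-shaped alignments that plain dynamic time warping can return.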
Weighting Demonstrations
• Some demonstrations should contribute more to the target trajectory than others
• Each demonstration's contribution is set by its observation covariance, which is difficult to tune by hand
• Learn optimal observation covariances using EM
[Figure: label "Target Height"]
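The idea of learning per-demonstration observation covariances with EM can be shown on a deliberately stripped-down model. The sketch below assumes the demonstrations are already time-aligned, scalar, and equal to the hidden trajectory plus independent Gaussian noise with one variance per demonstration; it ignores the dynamics model entirely. The function name `em_demo_weights` and the `var_floor` safeguard are hypothetical.

```python
def em_demo_weights(demos, iterations=20, var_floor=1e-6):
    """Estimate a target trajectory and one noise variance per demo.

    demos: list of equal-length scalar sequences, already time-aligned.
    Returns (target, variances).
    """
    num_steps = len(demos[0])
    variances = [1.0] * len(demos)
    target = [0.0] * num_steps
    for _ in range(iterations):
        precisions = [1.0 / v for v in variances]
        total_precision = sum(precisions)
        # E-step: the posterior over each hidden sample is Gaussian with a
        # precision-weighted mean and variance 1 / total_precision.
        post_var = 1.0 / total_precision
        target = [
            sum(p * demo[t] for p, demo in zip(precisions, demos)) / total_precision
            for t in range(num_steps)
        ]
        # M-step: expected squared residual of each demo under the posterior.
        variances = [
            max(
                var_floor,
                sum((demo[t] - target[t]) ** 2 for t in range(num_steps)) / num_steps
                + post_var,
            )
            for demo in demos
        ]
    return target, variances


if __name__ == "__main__":
    demos = [
        [0.0, 1.0, 2.0, 3.0, 4.0],
        [0.1, 0.9, 2.1, 2.9, 4.1],
        [2.0, -1.0, 4.0, 1.0, 6.0],  # erratic demonstration
    ]
    target, variances = em_demo_weights(demos)
    print(variances)
```

A demonstration that disagrees with the others ends up with a large variance and therefore a small weight in the target estimate, without any manual tuning.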
Learned Trajectory
[Figure: learned trajectory; label "Target Height"]
Overview
[Diagram: Demonstrations, Frequency Sweeps and Step Responses, Dynamics Model, Target Trajectory, Optimal Control, Controller]
Learning dynamics
• Standard helicopter dynamics model estimated from data
• Has relatively large errors in aggressive flight regimes
• After learning target trajectory, we obtain aligned demonstrations
• Errors in model are consistent for executions of the same maneuver class
• Many hidden variables are not modeled explicitly
• Airflow, rotor speed, actuator latency
• Learn corrections to dynamics model along each target trajectory
[Figure: label "2G error"]
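Because the aligned demonstrations show consistent model errors at the same point of a maneuver, a simple form of correction is a bias indexed by time along the target trajectory. The sketch below assumes exactly that: an additive scalar correction per time step, estimated as the mean residual between observed and model-predicted values across aligned demonstrations. The slides do not state the form of the correction, and the names `learn_corrections` and `corrected_prediction` are hypothetical.

```python
def learn_corrections(observed, predicted):
    """Estimate a time-indexed additive correction to the dynamics model.

    observed[k][t]:  measured quantity (e.g. an acceleration) for aligned
                     demonstration k at trajectory time t.
    predicted[k][t]: what the standard dynamics model predicted there.
    Returns bias[t], the mean model error at each time step.
    """
    num_demos = len(observed)
    num_steps = len(observed[0])
    return [
        sum(observed[k][t] - predicted[k][t] for k in range(num_demos)) / num_demos
        for t in range(num_steps)
    ]


def corrected_prediction(model_prediction, bias, t):
    """Apply the learned correction at trajectory time t."""
    return model_prediction + bias[t]


if __name__ == "__main__":
    observed = [[1.0, 2.0], [3.0, 4.0]]
    predicted = [[0.0, 0.0], [0.0, 0.0]]
    bias = learn_corrections(observed, predicted)
    print(bias, corrected_prediction(0.5, bias, 1))
```

Indexing the correction by time rather than by state lets it absorb the effect of unmodeled variables such as airflow or rotor speed, as long as they evolve similarly on every execution of the maneuver.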
Overview
[Diagram: Demonstrations, Frequency Sweeps and Step Responses, Dynamics Model (Standard Dynamics Model + Trajectory-Specific Corrections), Target Trajectory, Optimal Control (Receding Horizon Differential Dynamic Programming), Controller]
Experimental Setup
• Offboard cameras, 1280x960 @ 20Hz ("Position")
• Onboard IMU @ 333Hz: 3-axis magnetometer, accelerometer, gyroscope ("Orientation")
• Extended Kalman Filter
• RHDDP controller
• Controls @ 20Hz
Results: Stall Turn
• Max speed: 57 mph
Quantitative Evaluation
• Flight conditions: wind up to 15 mph
• Similar accuracy is maintained for queries very different from our demonstrations
• e.g., can learn 60m stall turns from 40m, 80m demonstrations
• Four or five demonstrations sufficient to cover a wide range of stall turns, loops, and tic-tocs
• e.g., four stall turns at 20m, 40m, 60m, 80m sufficient to generate any stall turn between 20m and 80m
Conclusions
• Presented an algorithm for learning parameterized target trajectories and accurate dynamics models from demonstrations
• With few demonstrations, can generate a wide variety of novel trajectories
• Validated on a variety of parameterized aerobatic helicopter maneuvers