F.L. Lewis, Fellow IEEE, Fellow U.K. Inst. Meas. & Control
Moncrief-O'Donnell Endowed Chair
Head, Controls & Sensors Group
Automation & Robotics Research Institute (ARRI), The University of Texas at Arlington
http://ARRI.uta.edu/acs
Lewis@uta.edu
MEMS Modeling and Control
Bruno Borovic, with Ai Qun Liu, NTU Singapore
[Block diagram: thermal model, mechanical model, electrical model, and optical model, validated against FEA and experiment]
Micro Electro Mechanical Optical Systems (MEMS)
Electrostatic comb drive actuator for optical switch control: the size of a human hair, 150 microns wide, steering an optical fibre.
Electrostatic parallel plate actuator: use feedback linearization to solve the pull-in instability problem.
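As a sketch of the pull-in fix mentioned above: the standard 1-DOF parallel-plate model has electrostatic force εAV²/2(g−x)², which open-loop voltage control can only stabilize up to one third of the gap; feedback linearization inverts the force-voltage map so the closed loop is linear in the tracking error. All parameter values, gains, and function names below are illustrative assumptions, not taken from the slides.

```python
# Minimal sketch: feedback linearization of a 1-DOF electrostatic
# parallel-plate actuator. Parameters and gains are illustrative.
import numpy as np

m, b, k = 1e-9, 1e-6, 1.0          # mass, damping, spring (illustrative)
eps, A, g = 8.85e-12, 1e-8, 2e-6   # permittivity, plate area, gap

def control_voltage(x, xdot, x_des, kp=1e4, kd=2e2):
    """Invert F_e = eps*A*V^2 / (2*(g - x)^2) so the closed loop is
    linear in the tracking error, removing the pull-in limit."""
    # Commanded force: cancel spring and damper, add PD action
    f_cmd = k * x + b * xdot + m * (kp * (x_des - x) - kd * xdot)
    f_cmd = max(f_cmd, 0.0)        # electrostatic force is attractive only
    return np.sqrt(2.0 * (g - x) ** 2 * f_cmd / (eps * A))
```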
Intelligent Control Tools
Neural Network Learning Controllers for Vibratory Systems: gun barrel, HMMWV suspension
Objective: Develop new control algorithms for faster, more precise control of DoD and industrial systems. Confront complex systems with backlash, deadzone, flexible modes, vibration, and unknown disturbances and dynamics.
• Three patents on neural network learning control
• Internet remote-site control and monitoring
• Rigorous proofs and design algorithms
• Applications to tank gun barrels, vehicle active suspension, industrial machine control
• SBIR contracts for industry implementation
• Numerous books and papers
• Award-winning PhD students
Structure: nonlinear learning loops wrapped around a PID loop.
Relevance - Machine Feedback Control
High-speed precision motion control with unmodeled dynamics, vibration suppression, disturbance rejection, friction compensation, and deadzone/backlash control.
Applications: industrial machines, military land systems, vehicle suspension, aerospace.
The Perpetrators
• Aydin Yesildirek - neural networks for control
• S. Jagannathan - discrete-time NN control
• Sesh Commuri - CMAC, fuzzy
• Rafael Fierro - robotic nonholonomic systems, hybrid systems
• Rastko Selmic - actuator nonlinearities
• Javier Campos - fuzzy control, actuator nonlinearities
• Ognjen Kuljaca - implementation of NN control
Neural Network Robot Controller
[Block diagram: PD tracking loop with gain Kv and robust control term v(t); nonlinear inner loop with NN estimate f^(x) of the robot nonlinearities; feedforward loop driven by qd into the robot system]
Feedback linearization plus the NN universal approximation property.
• Problem: nonlinear in the NN weights, so standard proof techniques do not work
• Easy to implement with a few more lines of code
• Learning feature allows on-line updates to NN memory as dynamics change
• Handles unmodelled dynamics, disturbances, actuator problems such as friction
• NN universal basis property means no regression matrix is needed
• Nonlinear controller allows faster and more precise motion
NN Weight Tuning
Extension of adaptive control to nonlinear-in-the-parameters (NLIP) systems; no regression matrix needed.
• Backprop terms - Werbos
• Forward-prop term?
• Extra robustifying terms - Narendra's e-mod extended to NLIP systems
• Can also use simplified tuning - Hebbian
A hedged sketch of such a controller and tuning law follows.
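The sketch below shows a one-layer NN tracking controller with an e-mod robustifying term, in the spirit of these slides; the basis, gains, and function names are illustrative assumptions, not the patented design.

```python
# Hedged sketch: NN tracking controller with e-mod weight tuning.
import numpy as np

n_basis, n_joints = 20, 2
W_hat = np.zeros((n_basis, n_joints))   # NN weight estimate
F = 10.0 * np.eye(n_basis)              # learning-rate matrix
Kv = 20.0 * np.eye(n_joints)            # PD-loop gain
Lam = 5.0 * np.eye(n_joints)            # filtered-error slope
kappa = 0.1                             # e-mod damping coefficient

def basis(x):
    """Fixed random-feature layer standing in for the NN hidden layer."""
    rng = np.random.default_rng(0)
    V = rng.standard_normal((n_basis, x.size))
    return np.tanh(V @ x)

def controller_step(q, qd, q_des, qd_des, dt):
    global W_hat
    e, edot = q_des - q, qd_des - qd
    r = edot + Lam @ e                   # filtered tracking error
    phi = basis(np.concatenate([q, qd, e, edot]))
    tau = W_hat.T @ phi + Kv @ r         # NN term + outer PD tracking loop
    # e-mod tuning: gradient term plus ||r||-weighted weight decay,
    # which keeps weights bounded without a regression matrix
    W_hat += dt * (F @ np.outer(phi, r) - kappa * np.linalg.norm(r) * F @ W_hat)
    return tau
```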
Neural Network Backstepping Controller for Flexible-Joint Robot Arm
[Block diagram: nonlinear feedback-linearization loop with NN#1 estimating F1(x) around the tracking loop; backstepping loop with NN#2 estimating F2(x) closing the motor-current loop through 1/KB1; robust control term v(t)]
• Backstepping adds an extra feedback loop; two NNs are needed
• Use passivity to show stability
• Advantage over traditional backstepping: no regression functions needed
• DT systems - Jagannathan
• DT backstepping is noncausal - Javier Campos patent
Dynamic Inversion NN Compensator for Systems with Backlash
[Block diagram: backstepping loop with filter, estimate f^(x) of the nonlinear function, NN compensator y_nn, and dynamic inversion of the backlash nonlinearity]
U.S. patent - Selmic, Lewis, Calise, McFarland.
Force Control, Flexible Pointing Systems, Vehicle Active Suspension
SBIR contracts. Andy Lowe, Scott Ikenaga, Javier Campos.
ARRI Research Roadmap in Neural Networks
1. Neural Networks for Feedback Control, 1995-2002: extended adaptive control to NLIP systems; no regression matrix. Based on the feedback control approach; unknown system dynamics; on-line tuning. NN versions of feedback linearization, singular perturbation, backstepping, force control, dynamic inversion, etc.
2. Neural Network Solution of Optimal Design Equations, 2002-2006: nearly optimal solution of controls design equations; no canonical form needed. Nearly optimal control based on HJ optimal design equations; known system dynamics; preliminary off-line tuning.
3. Approximate Dynamic Programming, 2006-: nearly optimal control based on the recursive equation for the optimal value; usually known system dynamics (except Q-learning). The goal - unknown dynamics, on-line tuning. Optimal adaptive control: extend adaptive control to yield OPTIMAL controllers; no canonical form needed.
H-Infinity Control Using Neural Networks (Murad Abu Khalaf)
System with control u, disturbance d, performance output z, measured output y.
L2-gain problem: find a control u(t) so that, for all L2 disturbances, the L2 gain from d to z is below a prescribed gain γ². This is a zero-sum differential game.
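In standard notation (a reconstruction; the slide's equations did not survive extraction), with penalty output built from a state penalty h(x) and the control, the problem reads:

```latex
% L2-gain condition: for all d in L2, with x(0) = 0,
\int_0^\infty \big( h^T h + u^T R\,u \big)\,dt \;\le\; \gamma^2 \int_0^\infty d^T d\,dt
% Equivalent zero-sum differential game
V^*(x_0) = \min_u \max_d \int_0^\infty \big( h^T h + u^T R\,u - \gamma^2 d^T d \big)\,dt
```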
CT Policy Iteration for H-Infinity Control (Murad Abu Khalaf), c.f. Howard
Cannot solve the HJI equation directly!! Instead, iterate on the consistency equation for the value function.
Successive Solution, Algorithm 1: let γ be prescribed and fixed; pick a stabilizing initial control with a region of asymptotic stability (RAS).
1. Outer loop - update the control; start from an initial disturbance.
2. Inner loop - update the disturbance by solving the value equation; go to 2 and iterate i until convergence to the value for the current control, with its RAS. Then update the control action in the outer loop, go to 1, and iterate j until convergence to the saddle-point value, with its RAS.
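In the usual notation for dynamics ẋ = f(x) + g(x)u + k(x)d (a reconstruction of the slide's equations), the iteration is:

```latex
% Value (consistency) equation solved at step (i, j), with V(0) = 0
0 = (\nabla V^{(i,j)})^T \big( f + g\,u^{(j)} + k\,d^{(i)} \big)
    + h^T h + (u^{(j)})^T R\,u^{(j)} - \gamma^2 (d^{(i)})^T d^{(i)}
% Inner loop: disturbance update
d^{(i+1)} = \frac{1}{2\gamma^2}\, k^T \nabla V^{(i,j)}
% Outer loop: control update
u^{(j+1)} = -\tfrac{1}{2}\, R^{-1} g^T \nabla V^{(j)}
```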
Results for this Algorithm
• For every iteration on the disturbance d_i: the value function increases, the RAS decreases.
• For every iteration on the control u_j: the value function decreases, the RAS does not decrease.
The algorithm converges to the game value, u*, d*. Sometimes it converges to the optimal HJI solution V*; for this to occur, the optimal solution must lie within the RAS.
Neural Network Approximation for Computational Technique (Murad Abu Khalaf)
Problem - cannot solve the value equation analytically! Use a neural network to approximate V^(i)(x); form the value-function gradient approximation, substitute into the value equation, and solve for the NN weights at iteration (i, j).
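A sketch of the approximation step in standard notation (the basis functions φ_k are assumed):

```latex
% NN value approximation on a compact set
V^{(i)}(x) \approx (W^{(i)})^T \phi(x), \qquad
\nabla V^{(i)}(x) \approx \nabla\phi(x)^T W^{(i)}
% Substituting into the value equation gives an equation LINEAR in the
% weights W^{(i)}, solved by least squares over sample states x_1, ..., x_N.
```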
Neural Network Optimal Feedback Controller (Murad Abu Khalaf)
The optimal solution is realized as a NN feedback controller with nearly optimal weights.
Fixed-Final-Time HJB Optimal Control (Cheng Tao)
Finite-horizon control: writing the optimal cost and optimal control yields the time-varying Hamilton-Jacobi-Bellman (HJB) equation.
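The time-varying HJB equation in its standard form (a reconstruction; the symbols Q, R, φ are assumed):

```latex
% Fixed-final-time HJB for xdot = f(x) + g(x) u on the horizon [t0, T]
-\frac{\partial V}{\partial t} = \min_u \Big[ Q(x) + u^T R\,u
    + \Big(\frac{\partial V}{\partial x}\Big)^{\!T} \big( f(x) + g(x)u \big) \Big],
\qquad V(x, T) = \phi\big(x(T)\big)
% Minimizing control
u^*(x, t) = -\tfrac12 R^{-1} g(x)^T \frac{\partial V}{\partial x}
```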
HJB Solution by NN Value Function Approximation (Cheng Tao)
Use time-varying NN weights. Approximating V in the HJB equation gives an ODE in the NN weights, where the gradient of the basis functions enters through the Jacobian. Solve by least squares - simply integrate backwards from the final time to find the NN weights; the control then follows from the weight trajectory.
H-Infinity Output-Feedback (OPFB) Control (Jyotirmay Gadewadikar, with V. Kucera, Lihua Xie)
Theorem: necessary and sufficient conditions for bounded-L2-gain OPFB control.
H-Infinity Static Output-Feedback Control for Rotorcraft
J. Gadewadikar*, F.L. Lewis*, K. Subbarao$, K. Peng+, B. Chen+
*Automation & Robotics Research Institute (ARRI), University of Texas at Arlington
+Department of Electrical and Computer Engineering, National University of Singapore
Adaptive Dynamic Programming: Discrete-Time Optimal Control
With u_k the prescribed control input function, write the cost, the value function recursion, and the Hamiltonian; Bellman's principle gives the optimal cost and the optimal control. The system dynamics do not appear explicitly in the value recursion (only through the observed next state), which opens the door to solutions by the computational intelligence community.
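In standard notation (a reconstruction of the slide's equations):

```latex
% DT cost and value recursion
V(x_k) = \sum_{i=k}^{\infty} r(x_i, u_i), \qquad
V(x_k) = r(x_k, u_k) + V(x_{k+1})
% Bellman's principle: optimal cost and optimal control
V^*(x_k) = \min_{u_k} \big[ r(x_k,u_k) + V^*(x_{k+1}) \big], \qquad
u_k^* = \arg\min_{u_k} \big[ r(x_k,u_k) + V^*(x_{k+1}) \big]
% The dynamics enter only through the observed next state x_{k+1}
```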
ADP for H-Infinity Optimal Control (Asma Al-Tamimi)
System with control u, disturbance d, penalty output z, measured output y.
Find a control u(t) so that, for all L2 disturbances and a prescribed gain γ², the L2-gain bound holds when the system is at rest, x0 = 0.
Discrete-Time Game Dynamic Programming: Backward-in-Time Formulation
• Consider a discrete-time dynamical system with continuous state and action spaces
• Formulate the zero-sum game problem as a min-max of the infinite-horizon cost
• The goal is to find the optimal state-feedback strategies for this multi-agent problem
• Use the Bellman optimality principle, i.e., dynamic programming
HDP - Linear System Case (Asma Al-Tamimi)
Value function update: solve by batch least squares or RLS. Then update the control gain and the disturbance gain. The matrices A, B, E are needed.
Showed that this is equivalent to iteration on the underlying game Riccati equation, which is known to converge (Stoorvogel, Basar). A hedged sketch of that iteration follows.
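A minimal sketch of that Riccati iteration for x_{k+1} = A x_k + B u_k + E d_k; the slides instead fit the same value update from data by LS/RLS. The matrices, R = I, and γ here are illustrative assumptions, and γ must exceed the H∞ gain for the iteration to converge.

```python
# Sketch: HDP for the zero-sum game as iteration on the game Riccati
# equation. All numerical values are illustrative assumptions.
import numpy as np

A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
E = np.array([[1.0], [0.0]])
C = np.eye(2)                    # penalty output z = C x, with R = I
gamma2 = 10.0                    # prescribed gain squared

P = np.zeros((2, 2))             # value iteration from zero cost
for _ in range(500):
    # Joint player kernel: u minimizes, d maximizes
    G = np.block([[np.eye(1) + B.T @ P @ B, B.T @ P @ E],
                  [E.T @ P @ B, E.T @ P @ E - gamma2 * np.eye(1)]])
    L = np.linalg.solve(G, np.vstack([B.T @ P @ A, E.T @ P @ A]))
    P = C.T @ C + A.T @ P @ A - np.hstack([A.T @ P @ B, A.T @ P @ E]) @ L

K_u, K_d = L[:1, :], L[1:, :]    # u_k = -K_u x_k,  d_k = -K_d x_k
```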
Q-Learning for H-Infinity (Asma Al-Tamimi)
Linear quadratic case - V and Q are quadratic. Q-function update, then control action and disturbance updates. The matrices A, B, E are NOT needed.
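A reconstruction of the quadratic game Q-function and its model-free update; the partition names H_xx, H_uu, etc. are assumed notation:

```latex
% Quadratic game Q-function with kernel H
Q(x_k,u_k,d_k) = \begin{bmatrix} x_k \\ u_k \\ d_k \end{bmatrix}^{\!T}
\begin{bmatrix} H_{xx} & H_{xu} & H_{xd}\\ H_{ux} & H_{uu} & H_{ud}\\
                H_{dx} & H_{du} & H_{dd} \end{bmatrix}
\begin{bmatrix} x_k \\ u_k \\ d_k \end{bmatrix}
% Q update fitted by RLS from measured data; A, B, E never appear
Q_{i+1}(x_k,u_k,d_k) = x_k^T C^T C\,x_k + u_k^T R\,u_k - \gamma^2 d_k^T d_k
                       + Q_i(x_{k+1},u_{k+1},d_{k+1})
% Stationarity of the kernel gives the control update, e.g.
u_k = -\big(H_{uu} - H_{ud}H_{dd}^{-1}H_{du}\big)^{-1}
       \big(H_{ux} - H_{ud}H_{dd}^{-1}H_{dx}\big)\,x_k
```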
Continuous-Time Optimal Control
Write the cost, the Hamiltonian, the optimal cost, and the Bellman equation; the optimal control and the HJB equation follow. Contrast with the DT value recursion, where f(·), g(·) do not appear.
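In standard notation (a reconstruction):

```latex
% CT cost and Hamiltonian for xdot = f(x) + g(x) u
V(x(t)) = \int_t^{\infty} \big( Q(x) + u^T R\,u \big)\,d\tau, \qquad
H(x,u,\nabla V) = Q(x) + u^T R\,u + (\nabla V)^T\big( f(x) + g(x)u \big)
% HJB equation and optimal control; note f, g appear explicitly here
0 = \min_u H(x, u, \nabla V^*), \qquad
u^* = -\tfrac12 R^{-1} g(x)^T \nabla V^*
```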
CT Policy Iteration
Utility cost for any given u(t): iterate between policy evaluation (a Lyapunov equation) and policy improvement. Pick a stabilizing initial control; find the cost; update the control; repeat. Full system dynamics must be known.
• Convergence proved by Saridis 1979 if the Lyapunov equation is solved exactly
• Beard & Saridis used complicated Galerkin integrals to solve the Lyapunov equation
• Abu Khalaf & Lewis used a NN to approximate V for nonlinear systems and proved convergence
LQR Policy Iteration = Kleinman Algorithm (Kleinman 1968)
1. For a given control policy, solve the Lyapunov equation for the cost matrix.
2. Improve the policy.
• If started with a stabilizing control policy, the matrix monotonically converges to the unique positive definite solution of the Riccati equation
• Every iteration step returns a stabilizing controller
• The system has to be known
A hedged sketch follows.
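A minimal sketch of the Kleinman iteration; the system matrices are illustrative, and SciPy's continuous Lyapunov solver is assumed available.

```python
# Sketch: Kleinman's algorithm (policy iteration for LQR).
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [-1.0, -0.5]])   # open-loop stable, so K = 0 works
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

K = np.zeros((1, 2))            # initial gain; A - B K must be stable
for _ in range(20):
    Acl = A - B @ K
    # 1. Policy evaluation: Acl' P + P Acl = -(Q + K' R K)
    P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
    # 2. Policy improvement
    K = np.linalg.solve(R, B.T @ P)
# P converges monotonically to the stabilizing ARE solution
```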
Policy Iteration Solution
Policy iteration is in fact a Newton's method applied to the Riccati equation: the policy-evaluation Lyapunov equation is the Newton step obtained from the Frechet derivative of the Riccati operator.
Solving for the Cost, Our Approach: ADP Greedy Iteration (Draguna Vrabie)
Greedy ADP can now be defined for CT systems:
1. For a given control policy, update the cost over an interval.
2. Improve the policy, a discrete update of a continuous-time controller; expressing u(t+T) in terms of x(t+T) is OK.
• No initial stabilizing control needed; any initial cost works
• The cost converges to the optimal cost
• The controller stabilizes the plant after a number of iterations
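The cost update in standard notation (a reconstruction): the integral is computed from measured trajectory data over [t, t+T], so the drift dynamics f (or A) are never needed, while B enters only the policy-improvement step.

```latex
% Greedy ADP cost update over an interval of length T
V_{i+1}\big(x(t)\big) = \int_t^{t+T}\big( x^T Q\,x + u_i^T R\,u_i \big)\,d\tau
                        + V_i\big(x(t+T)\big)
% LQR case, V_i(x) = x^T P_i x: policy improvement
u_{i+1}(t) = -R^{-1} B^T P_{i+1}\,x(t)
```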
Analysis of the Algorithm (Draguna Vrabie)
For a given control policy, the greedy update is equivalent to a strange pseudo-discretized Riccati equation. When ADP converges, the resulting P satisfies the continuous-time ARE!! One can show that ADP solves the CT ARE without knowledge of the system dynamics f(x).
Analysis of the Algorithm, continued (Draguna Vrabie)
Lemma 2: CT HDP is equivalent to an underlying pseudo-discretized Riccati equation, where an extra term means the initial control action need not be stabilizing.
Model-Free ADP: Direct OPTIMAL ADAPTIVE CONTROL
Solve the Riccati equation WITHOUT knowing the plant dynamics. Works for nonlinear systems.
Open questions: Proofs? Robustness? Comparison with adaptive control methods?
DT ADP vs. Receding Horizon Optimal Control
ADP works forward in time; receding horizon control is a backward-in-time optimization (1-step RHC).
New Books. Recent Patent:
J. Campos and F.L. Lewis, "Method for Backlash Compensation Using Discrete-Time Neural Networks," SN 60/237,580, The Univ. of Texas at Arlington, awarded March 2006. Uses a filter to overcome the DT backstepping noncausality problem.
International Collaboration: Jie Huang, Sam Ge
SCUT / CUHK Lectures on Advances in Control, March 2005
Organized and invited by Professor Jie Huang, CUHK.
UTA / IEEE Singapore Short Course: Wireless Sensor Networks for Monitoring Machinery, Human Biofunctions, and Biochemical Agents
Sponsored by the IEEE Singapore SMC, R&A, and Control Chapters; organized and invited by Professor Sam Ge, NUS.
F.L. Lewis, Assoc. Director for Research, Moncrief-O'Donnell Endowed Chair, Head, Controls, Sensors, MEMS Group, Automation & Robotics Research Institute (ARRI), The University of Texas at Arlington.
Mediterranean Control Association, founded 1992
Founding members: M. Christodoulou, P. Ioannou, K. Valavanis, F.L. Lewis, P. Antsaklis, Ted Djaferis, P. Groumpos.
Conferences held in Crete, Cyprus, Rhodes, Israel, Italy, Dubrovnik (Croatia), and Kusadasi (Turkey, 2004).
Bringing the Mediterranean together for collaboration and research.