Non-linear System Identification: Possibilities and Problems Lennart Ljung Linköping University
Outline • The geometry of non-linear identification: Projections and visualization • Identification for control in a non-linear system world • Ongoing work with Matt Cooper, Martin Enquist, Torkel Glad, Anders Helmersson, Jimmy Johansson, David Lindgren, and Jacob Roll
Geometry of Nonlinear Identification • An elementary introduction
A Data Set • [Plots of the measured output and input signals]
A Simple Linear Model • Red: model, Black: measured • Try the simplest model y(t) = a u(t-1) + b u(t-2) • Fit by least squares: m1 = arx(z,[0 2 1]); compare(z,m1)
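A minimal sketch of this step with the System Identification Toolbox, using simulated data since the talk's data set z is not available (the input, system, and noise level below are illustrative):

u = randn(300,1);                                   % illustrative input
y = filter([0 0.8 -0.4], 1, u) + 0.05*randn(300,1); % illustrative "true" two-tap system plus noise
z = iddata(y, u, 1);                                % data object, sampling interval 1
m1 = arx(z, [0 2 1]);                               % na = 0, nb = 2, nk = 1: y(t) = a u(t-1) + b u(t-2)
compare(z, m1)                                      % model output (red) vs measured output (black)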
A Picture of the Model • Depict the model: y(t) as a function of u(t-1) and u(t-2). [Model surface plotted over the (u(t-1), u(t-2)) plane]
A Nonlinear Model • Try a nonlinear model y(t) = f(u(t-1), u(t-2)) • m2 = arxnl(z,[0 2 1],'sigm'); compare(z,m2)
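A sketch of the same step assuming a current System Identification Toolbox, where the nonlinear ARX estimator is called nlarx and the sigmoidal network nonlinearity is selected with 'sigmoidnet' (the arxnl/'sigm' call above is the talk's shorthand):

m2 = nlarx(z, [0 2 1], 'sigmoidnet');   % y(t) = f(u(t-1), u(t-2)), f a sigmoid network
compare(z, m2)                          % compare model output with measured data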
The Predictor Function • Identification is about finding a reliable predictor function that predicts the next output from previous measured data • General structure: ŷ(t) = f(Z^{t-1}), where Z^{t-1} denotes all previously measured inputs and outputs • Common/useful special case: ŷ(t) = f(φ(t)), where φ(t) is a vector of fixed dimension m ("state", "regressors") formed from past data • Think of the simple case φ(t) = [u(t-1), u(t-2)]
The Data and the Identification Process • The observed data Z^N = [y(1), φ(1), …, y(N), φ(N)] are N points in R^{m+1} • The predictor model is a surface in this space • Identification is to find the predictor surface from the data
Outline • The geometry of non-linear identification: Projections and visualization • Identification for control in a non-linear system world
Projections: Examine the Data Cloud • In the plot of the points {y(t), φ(t)}, the model surface can be seen as a "thin" projection of the data cloud. • Example: Drained tank, inflow u(t), level y(t). Look at the points {y(t), y(t-1), u(t-1)} in 3D: • What we saw: [3D view of the resulting point cloud] • How to recognize a "thin" projection?
Nonlinearities Confined to a Subspace • Predictor model: y(t) = f(φ(t)) + v(t), f: R^m -> R • Multi-index structure: f(φ) = b^T φ + g(Sφ), g: R^k -> R • S is a k-by-m matrix, k < m: the nonlinearity is confined to a k-dimensional subspace (S S^T = I) • If k = 1, the plot of y(t) - b^T φ(t) versus Sφ(t) will show the nonlinearity g. • How to find b, S and g?
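A minimal sketch of that k = 1 inspection plot; the regressor matrix Phi (N-by-m), output vector y, and estimates bhat (m-by-1) and shat (1-by-m unit row) are assumed available, and the names are illustrative:

plot(Phi*shat', y - Phi*bhat, '.')                 % delinearized output vs projected regressor
xlabel('S\phi(t)'), ylabel('y(t) - b^T\phi(t)')    % the scatter traces out the nonlinearity g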
How to Find b, S and g? • Predictor function f(φ, θ) = b^T φ + g(Sφ, θ) • θ contains b, the parameters of g (e.g. polynomial coefficients), and the parameters of S (e.g. angles in Givens rotations) • This is a useful parametrization of f if the nonlinearity is confined to a lower-dimensional subspace • Minimize the prediction-error criterion Σ_t (y(t) - f(φ(t), θ))^2 over θ
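A minimal sketch of that minimization for k = 1 with a cubic polynomial g, assuming Phi (N-by-m regressors) and y (N-by-1 outputs) are given; for simplicity the direction is kept as a free vector and normalized inside the criterion, rather than parameterized by Givens angles:

m    = size(Phi,2);
% theta = [b (m entries); direction s (m entries); cubic polynomial coefficients (4 entries)]
fhat = @(th) Phi*th(1:m) + polyval(th(2*m+1:end), Phi*(th(m+1:2*m)/norm(th(m+1:2*m))));
cost = @(th) sum((y - fhat(th)).^2);                   % prediction-error criterion
th0  = [zeros(m,1); ones(m,1)/sqrt(m); zeros(4,1)];    % initial guess
that = fminsearch(cost, th0, optimset('MaxFunEvals',1e5,'MaxIter',1e5));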
Example: Silver Box Data • Silver box data: …. (NOLCOS Special session) • Fit as above with 5 past y and 5 past u in φ, and use k = 1 (22 parameters; sparse data!): ŷ = b^T φ + g(Sφ), S: R^10 -> R • Simulation fit: 0.44; fit for an ANN (with 617 parameters): 0.46 • Confined nonlinearities could be a good way to deal with sparsity
More Serious Visualization • The interaction between a user and computational tools is essential in system identification. More should be done with serious visualization of data, estimation results, projections, etc. • We cooperate with NVIS, the Norrköping Visualization and Interaction Studio, which has a state-of-the-art visualization theater. For preliminary experiments we have hooked up the SITB with the visualization package AVS/Express.
Outline • The geometry of non-linear identification: Projections and visualization • Identification for control in a non-linear system world
Control Design • [Block diagram, built up over a sequence of slides: a regulator in feedback with the nominal model; the regulator with the true system; the true system represented as the nominal model plus a model error model; and finally the model error model in feedback with the nominal closed loop system.]
Robustness Analysis • All robustness analysis relies, in one way or another, on checking the model error model in feedback with the nominal closed loop system: some variant of the small gain theorem.
Model Error Models • The model error model describes the residual y - y_model, seen as driven by the input u. [Block diagram]
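A sketch of how such a model error model could be estimated, with illustrative names: a nominal model mnom is simulated on validation data zv, and a (possibly nonlinear) model is fitted from the input to the residual:

ysim = sim(mnom, zv.u);                      % nominal model output
e    = zv.y - ysim;                          % model error: y - y_model
zerr = iddata(e, zv.u, zv.Ts);               % residual regarded as driven by u
merr = nlarx(zerr, [2 2 1], 'sigmoidnet');   % one possible model error model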
Identification for Control • Identification for control is the art and technique of designing identification experiments and regulator design methods so that the model error model matches the nominal closed loop system in a suitable way
Linear Case • Linear model and linear system: this means that the model error model is also linear • Much work has been done on this problem (Michel Gevers, Brian Anderson, Graham Goodwin, Paul van den Hof, …) and several useful results and insights are available • Bottom line: design experiments so that the model is accurate in the frequency ranges where the stability margin is essential • Now for the case with a nonlinear system …
Non-linear System Approximation • Given an LTI output-error model structure y = G(q,θ)u + e, what will the resulting model be for a non-linear system? • Assume that the input u and output y are such that the spectra Φu(ω) and Φyu(ω) are well defined • Then the LTI second order equivalent (LTI-SOE) is G0(e^{iω}) = Φyu(ω)/Φu(ω). Note: G0 depends on u • The limit model, as the number of data grows, is the best approximation of G0 within the chosen model structure
Example • Consider the static system z(t) = u^3(t) • Let u(t) = v(t) - 2c v(t-1) + c^2 v(t-2), where v is white noise with uniform distribution • The LTI equivalent of this system is then a dynamic system • Note: (1) Dynamic! (2) Static gain (parameter values 0.01, c = 0.99): 233
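A sketch that reproduces the flavor of this example; the exact settings behind the slide's numbers are not all recoverable, so the choices below (c = 0.99, N = 10000, a spectral-analysis estimate of the LTI equivalent) are illustrative:

c = 0.99;  N = 10000;
v = rand(N,1) - 0.5;                    % white noise, uniform distribution
u = filter([1 -2*c c^2], 1, v);         % u(t) = v(t) - 2c v(t-1) + c^2 v(t-2)
zc = u.^3;                              % static cubic nonlinearity
G  = spa(iddata(zc, u, 1));             % nonparametric estimate of the LTI-SOE
bode(G)                                 % clearly dynamic, with a large low-frequency gain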
Additivity of LTI-SOE • Note that the LTI equivalent is additive (under mild conditions): the LTI-SOE of a sum of subsystems driven by the same input is the sum of the subsystems' LTI-SOEs.
Simulation • Blue: without NL term • Red: with NL term
Bode Plot • Blue: Estimated (LTI equivalent) model • Green: ”Linear part”
Lesson from the Example • So, the gain of the model error model for |u|<1 is 0.01 if the green linear model is chosen. • And the gain of the model error model is (at least) 230 if the blue linear model is chosen. • Unfortunately, System Identification will yield the blue model as the nominal (LTI-SOE) model! • Lesson #1: The LTI-SOE linear model may not be the nominal linear model you should go for!
Gain of Model Error Models • Idea #1: Traditional induced-gain definition; possible problems with a relay effect at the origin • Idea #2: Affine power gain, i.e. a bound of the form γ·||u|| + β in power norms
Model Error Model Gain • So go for the affine gain bound • For all u? Impossible to establish, and very conservative: typically a relative error of 1 at best • Lesson #2: For a nonlinear model error model it is necessary to restrict the class of inputs u over which the gain bound is required • Must consider a non-linear regulator!
Possible Result for Nonlinear System • Nominal model, linear or nonlinear • Design an H∞ non-linear regulator with the constraint … and gain … from output disturbance to controlled variable z • The model error model obeys … • Then …, where V(x(0)) is the "loss" for the nominal closed loop system
Conclusions • Geometry of non-linear identification: Projections and visualization • Identification for control with non-linear systems: • LTI-SOE may not be the best model • Non-linear control synthesis necessary even with linear nominal model
Epilogue • Four Challenges for the Control Community: • 1. A working theory for stability of black-box models (prediction/simulation) • 2. Fully integrated software for modeling and identification: object oriented modeling, differential algebraic equations, full support of disturbance models • 3. Robust parameter initialization techniques (algebraic/numeric) • 4. Dealing with LTI-equivalents for good control design
Global Patterns: Lower Dimensional Structures • In the linear case, experience shows that the "data cloud" often is concentrated to lower dimensional subspaces. This is the basis for PCA and PLS. • Corresponding structure in the nonlinear case: f(φ) = g(Pφ); P an m-by-n matrix, m << n • How to find P? ("the multi-index regression problem") • Note that sigmoidal neural networks use basis functions f_k(φ) = σ(β_k φ - γ_k), where β_k φ is a scalar product ("ridge expansion"). This is a similar idea (m = 1), which partly explains the success of these structures.
More Flexibility • A more flexible, nonlinear model y(t) = f(u(t-1), u(t-2)) • m3 = arxnl(z,[0 2 1],'sigm','numb',100); compare(z,m3); compare(zv,m3)
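A sketch of the same step with current toolbox naming (an assumption about the toolbox version; 'sigm'/'numb' above are the talk's shorthand), evaluated on separate validation data zv:

net = sigmoidnet('NumberOfUnits', 100);   % sigmoid network with 100 units (flexible f)
m3  = nlarx(z, [0 2 1], net);             % flexible nonlinear ARX model
compare(z,  m3)                           % fit on estimation data
compare(zv, m3)                           % fit on validation data (the honest test)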
Some Geometric Issues • Look at the Data Cloud and figure out what may be good surface candidates (model structures) • The cloud may be sparse.
How to Recognize a Thin Projection? • Idea #1: Measure the area of a collection of points by the area of its covariance ellipsoid: • SVD, Principal components, TLS etc: Linear models • Idea #2: Delaunay Triangulation (Zhang) • OK, but non-smooth criterion • Idea #3: ….
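A minimal sketch of Idea #1, assuming the cloud is collected in an N-by-(m+1) matrix X with rows [y(t) φ(t)'] (the names are illustrative):

Xc = X - mean(X,1);        % center the data cloud
sv = svd(Xc, 'econ');      % axes of the covariance ellipsoid (up to scaling)
thin = sv(end)/sv(1)       % a small ratio indicates a "thin", near-linear cloud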
How to Deal with Sparsity • Sparsity: Think of Johan Schoukens’s Silver box data: 120000 data points and 10 regressors • Need ways to interpolate and extrapolate in the data space. • Use Physical Insight: Allow for few parameters to parameterize the predictor surface, despite the high dimension. • Leap of Faith: Search for global patterns in observed data to allow for data-driven interpolation.