SYSTEMS Identification

SYSTEMS Identification. Ali Karimpour, Assistant Professor, Ferdowsi University of Mashhad. Reference: “System Identification: Theory for the User”, Lennart Ljung

Lecture 16 Model Structure Selection and Model Validation Topics to be covered include: • General Aspects of the Choice of the Model Structure • A Priori Consideration • Model Structure Selection Based on Preliminary Data Analysis • Comparing Model Structures • Model Validation • Residual Analysis

Introduction In Chapters 4 and 5 we provided a list of typical model structures to be used for identification. In this chapter we shall complement this list by discussing how to arrive at a suitable structure, guided by system knowledge and the collected data set. Once a model structure has been chosen, the identification procedure provides us with a particular model in this structure. This model may be the best available one, but the crucial question is whether it is good enough for the intended purpose. Testing whether a given model is appropriate is known as model validation.

General Aspects of the Choice of the Model Structure

General Aspects of the Choice of the Model Structure The route to a particular model structure involves at least three steps: 1. To choose the type of model set. This involves the selection between nonlinear and linear models, between input-output, black-box, and physically parameterized state-space models, and so on. 2. To choose the size of the model set. This involves issues like selecting the order of a state-space model, the degrees of the polynomials in a model, or the number of “neurons” in a neural network. It also contains the problem of which variables to include in the model description. We thus have to select M from a given increasing chain of structures $M_1 \subset M_2 \subset M_3 \subset \cdots$


General Aspects of the Choice of the Model Structure 3. To choose the model parameterization. When a model set M* has been decided on, it remains to parameterize it, that is, to find a suitable model structure M whose range equals M*.

General Aspects of the Choice of the Model Structure Quality and Price: The goal of the user is to obtain a good model at a low price. The choice of model structure certainly has a considerable effect on both the quality of the resulting model and the price for it.

General Aspects of the Choice of the Model Structure Quality of the Model (Section 12.1) The options that affect the quality of the resulting model: • How to perform the identification experiment • What model structures to choose • What identification algorithm to apply • How to validate the obtained model

General Aspects of the Choice of the Model Structure Quality of the Model • Flexibility: employing model structures that offer good capabilities of describing different possible systems. Flexibility can be obtained either by using many parameters or by placing them in “strategic positions”. • Parsimony: not to use unnecessarily many parameters, that is, to be “parsimonious” with the model parameterization.

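General Aspects of the Choice of the Model Structure Price of the Model The price of the model is associated with the effort to calculate it, that is, to perform the minimization of the criterion function $V_N(\theta, Z^N)$ with respect to $\theta$, or to solve the corresponding equation for the estimate $\hat\theta_N$.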

General Aspects of the Choice of the Model Structure This work is highly dependent on the model structure, which influences: • The algorithm complexity: We saw in Chapter 10 that solving for $\hat\theta_N$ involves evaluation of the prediction errors and their gradients for a number of $\theta$ values. The work associated with these evaluations depends critically on M. • The properties of the criterion function: The amount of work to solve for $\hat\theta_N$ also depends on how many evaluations of the criterion function and its gradient are necessary. This is determined by the “shape” of the criterion function. The shape in turn is a result of the choice of the norm $\ell(\cdot)$ and of how the prediction errors $\varepsilon(t,\theta)$ depend on $\theta$.

General Aspects of the Choice of the Model Structure A high-order complex model is more difficult to use for simulation and control design. If it is only marginally better than a simpler model, it may not be worth the higher price. Consequently, the intended use of the model will also affect the choice of the model structure. General Considerations The final model structure is a compromise between the aspects listed below: • Flexibility • Parsimony • The algorithm complexity • The properties of the criterion function • The intended use of the model

General Aspects of the Choice of the Model Structure The techniques and considerations that are used when evaluating the general considerations can be split into different categories. • A priori considerations: Certain aspects are independent of the data set $Z^N$ and can be evaluated a priori, before the data have been measured. (Section 16.2) • Techniques based on preliminary data analysis: With the data available, certain testing and evaluation of $Z^N$ can be carried out that give insights into possible and suitable model structures. These techniques do not necessarily require the computation of a complete model. (Section 16.3) • Comparing different model structures: Before a final model structure is chosen, it is advisable to shop around in different model structures and compare the quality and prices of the models offered there. This will require the computation and comparison of several models. (Section 16.4) • Validation of a given model: Regardless of how a given model is obtained, we can always use $Z^N$ to evaluate whether it seems likely that it will serve its purpose. If a certain model is accepted, we have also implicitly approved the choice of the underlying model structure. (Sections 16.5 and 16.6)

A Priori Considerations

A Priori Considerations Type of Model The choice of which type of model to use is quite subjective and involves several issues that are independent of the data set $Z^N$. The compromise between parsimony and flexibility is at the heart of the identification problem. How shall we obtain a good fit to data with few parameters? The answer usually is to use a priori knowledge about the system, intuition, and ingenuity. Whether it is feasible to build a well-founded physically parameterized model structure will depend on our insight into and understanding of the process. This is of course an application-dependent problem.

A Priori Considerations For a physical system, a priori information can typically best be incorporated into a continuous-time model, such as a physically parameterized state-space model. This means that the computation of $\varepsilon(t,\theta)$ and the minimization of $V_N$ become laborious tasks, both regarding the programming effort and the computation time required. Aspects of algorithmic complexity as well as the shape of the criterion function therefore favor black-box models. By this we mean a general model structure that adapts its parameters to the data, without any need for a physical interpretation of their values.

A Priori Considerations A general piece of advice is to “try simple things first”. Then one should go into sophisticated model structures if …. So a simple linear regression model, $\hat y(t|\theta) = \varphi^T(t)\theta$, is a good first choice for an identification problem. One should note that using physical a priori knowledge does not necessarily mean that fancy continuous-time model structures have to be constructed. Some thinking about the nature of the relationship between the measured signals can give good hints for model structures. In general, one should contemplate whether nonlinear transformations of the data will make it easier for the transformed data to fit a linear model.
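As a rough illustration of the “try simple things first” advice, the sketch below fits a first-order ARX model by ordinary least squares. The system coefficients, the noise level, and the helper names `simulate_arx` and `fit_arx` are assumptions made for this example and do not come from the lecture.

```python
import numpy as np


def simulate_arx(a, b, u, noise_std, rng):
    """Simulate y(t) = -sum_i a_i y(t-i) + sum_j b_j u(t-j) + e(t)."""
    y = np.zeros(len(u))
    e = noise_std * rng.standard_normal(len(u))
    for t in range(len(u)):
        acc = e[t]
        for i, a_i in enumerate(a, start=1):
            if t - i >= 0:
                acc -= a_i * y[t - i]
        for j, b_j in enumerate(b, start=1):
            if t - j >= 0:
                acc += b_j * u[t - j]
        y[t] = acc
    return y


def fit_arx(y, u, na, nb):
    """Least-squares estimate of theta = [a_1..a_na, b_1..b_nb]."""
    start = max(na, nb)
    # Each row is phi(t) = [-y(t-1) .. -y(t-na), u(t-1) .. u(t-nb)]
    phi = np.array([
        np.concatenate([-y[t - na:t][::-1], u[t - nb:t][::-1]])
        for t in range(start, len(y))
    ])
    theta, *_ = np.linalg.lstsq(phi, y[start:], rcond=None)
    return theta


rng = np.random.default_rng(0)
u = rng.standard_normal(2000)
# Hypothetical true system: y(t) = 0.7 y(t-1) + 1.0 u(t-1) + e(t)
y = simulate_arx(a=[-0.7], b=[1.0], u=u, noise_std=0.1, rng=rng)
theta_hat = fit_arx(y, u, na=1, nb=1)
print("estimated [a1, b1]:", theta_hat)
```

Because the predictor is linear in the parameters, the estimate is obtained in one linear-algebra step with no iterative search, which is what makes this structure a cheap first attempt.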

A Priori Considerations Model Order Choosing the size of the model set usually requires help from the data. However, physical insight and the intended model application will often tell which range of model orders should be considered. Also, even when the data have not been evaluated, knowing N and the data quality will indicate how many parameters it is feasible to estimate. With few data points, it is not reasonable to try to determine a model in a complex model structure. A related problem is how many different time scales it is feasible to let one and the same model handle. For numerical reasons it may be difficult to adequately describe more than two or three decades of the frequency range within one model. (Problem 13G.1) Considerations on sampling rates, proper excitation, and data record lengths strongly suggest that one should not aim at covering more than three decades of time constants in one experiment.

A Priori Considerations Model Order If the system is stiff, so that it contains widely separated time constants of interest, the conclusion thus is to build two or more models, each covering a proper part of the frequency range and each sampled with a correspondingly suitable sampling interval. For a high-frequency model, the low-frequency dynamics for all practical purposes look like integrators, the number being equal to the pole excess at low frequencies. Correspondingly, the high-frequency dynamics look like static relationships to the low-frequency model. Thus, introduce a no-delay term $b_0 u(t)$ in this model.

A Priori Considerations Model Parameterization The issue of model parameterization is basically numerical. We seek model parameterizations that are well conditioned, so that a round-off or other numerical error in one parameter has a small influence on the input-output behavior of the model. This is a problem that has been widely recognized in the digital filtering area, but less so in the identification literature. In fact, the standard input-output model structures could be quite sensitive to numerical errors.

Model Structure Selection Based on Preliminary Data Analysis

Model Structure Selection Based on Preliminary Data Analysis By preliminary data analysis, we mean calculations that do not involve the determination of a complete model of the system. Estimating the Type of Model: Generally, data-aided model structure selection appears to be an underdeveloped field. An exception is order determination in linear structures. Order Estimation: The order of a linear system can be estimated in many different ways. Methods that are based on preliminary data analysis fall into the following categories: 1. Examining the spectral analysis estimate of the transfer function 2. Testing ranks in sample covariance matrices 3. Correlating variables 4. Examining the information matrix

Model Structure Selection Based on Preliminary Data Analysis 1. Spectral Analysis Estimate A nonparametric estimate of the transfer function will give valuable information about resonance peaks, the high-frequency roll-off, and the phase shift. All this gives a hint as to what model orders will be required to give an adequate description of the dynamics. Note, though, that discrete-time Bode plots show some artifacts in their interpretation in terms of poles and zeros, compared to continuous-time Bode plots. Thus, use the observations with some care.

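Model Structure Selection Based on Preliminary Data Analysis 2. Testing ranks in covariance matrices Suppose that the true system is described by $$y(t) + a_1 y(t-1) + \dots + a_n y(t-n) = b_1 u(t-1) + \dots + b_n u(t-n) + v_0(t)$$ for some noise sequence $\{v_0(t)\}$. Suppose also that n is the smallest number for which this holds. As usual, let $$\varphi_s(t) = [-y(t-1)\ \dots\ -y(t-s)\ \ u(t-1)\ \dots\ u(t-s)]^T, \qquad R^s(N) = \frac{1}{N}\sum_{t=1}^{N}\varphi_s(t)\varphi_s^T(t).$$ Suppose first that $v_0(t) = 0$. Then, provided the input is persistently exciting, $R^s(N)$ is nonsingular for $s \le n$ and singular for $s \ge n+1$, so $\det R^s(N)$ can be used as a test quantity for the model order.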

Model Structure Selection Based on Preliminary Data Analysis Now suppose that $v_0(t) \neq 0$. Then the rank test on the sample covariance matrix can still be used with a suitable threshold, provided the signal-to-noise ratio is high. If this is not the case, Woodside (1971) suggested the use of an “enhanced” matrix, in which the estimated influence of the noise on the covariance matrix is subtracted. A better alternative is to use other correlation vectors. (See Wellstead (1978) and Wellstead and Rojas (1982).)
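The rank test can be sketched numerically as follows. This is only an illustration under assumed conditions: a hypothetical noise-free second-order system, a white-noise input, and the ratio of smallest to largest singular value of the sample covariance matrix of the regression vector $\varphi_s(t) = [-y(t-1) \dots -y(t-s)\ u(t-1) \dots u(t-s)]^T$ as the singularity measure. The helper name `regressor_matrix` is made up for this example.

```python
import numpy as np


def regressor_matrix(y, u, s, start):
    """Rows phi_s(t) = [-y(t-1)..-y(t-s), u(t-1)..u(t-s)] for t >= start."""
    return np.array([
        np.concatenate([-y[t - s:t][::-1], u[t - s:t][::-1]])
        for t in range(start, len(y))
    ])


rng = np.random.default_rng(1)
N = 500
u = rng.standard_normal(N)

# Hypothetical noise-free system of order n = 2 (v0(t) = 0)
y = np.zeros(N)
for t in range(2, N):
    y[t] = 1.5 * y[t - 1] - 0.7 * y[t - 2] + u[t - 1] + 0.5 * u[t - 2]

ratios = {}
for s in range(1, 5):
    phi = regressor_matrix(y, u, s, start=5)
    R = phi.T @ phi / len(phi)              # sample covariance matrix R^s(N)
    sv = np.linalg.svd(R, compute_uv=False)
    ratios[s] = sv[-1] / sv[0]              # near zero means numerically singular
    print(f"s = {s}: smallest/largest singular value = {ratios[s]:.2e}")
```

In this noise-free setting the ratio should collapse to rounding-error level as soon as s exceeds the true order. With noisy data the drop is less sharp, which is why a threshold is needed.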

Model Structure Selection Based on Preliminary Data Analysis 3. Correlating variables The order-determination problem is whether to include one more variable in a model structure or not. This variable could be y(t-n-1) or a measured possible disturbance variable ω(t). In any case, the question is whether this new variable has anything to contribute when explaining the output variable y(t). This is measured by the correlation between y(t) and ω(t). However, to discount the possible relationship between ω(t) and y(t) that is already accounted for by the smaller model structure, the correlation should be measured between ω(t) and what remains to be explained, that is, the residuals of the smaller model. This is known as canonical correlation or partial correlation in regression analysis. (See Draper and Smith, 1981.) We may also note that the determination of the state-space model order, i.e., determining how many of the singular values are significant, is a test of the same kind.
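A minimal sketch of this idea, assuming a hypothetical first-order system in which a measured disturbance `w` really enters the output while a second candidate `z` does not. The smaller model is fitted without either variable, and its residuals are then correlated with each candidate. All signal names and coefficients are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 1000
u = rng.standard_normal(N)
w = rng.standard_normal(N)   # measured disturbance that really affects y
z = rng.standard_normal(N)   # unrelated candidate variable
e = 0.1 * rng.standard_normal(N)

y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.7 * y[t - 1] + u[t - 1] + 0.5 * w[t] + e[t]

# Smaller model structure: first-order ARX using only y(t-1) and u(t-1)
phi = np.column_stack([-y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(phi, y[1:], rcond=None)
residual = y[1:] - phi @ theta   # what remains to be explained

# Correlate the residual (not y itself) with each candidate variable
corr_w = np.corrcoef(residual, w[1:])[0, 1]
corr_z = np.corrcoef(residual, z[1:])[0, 1]
print(f"corr(residual, w) = {corr_w:.3f}")
print(f"corr(residual, z) = {corr_z:.3f}")
```

A large residual correlation suggests that the candidate variable has something to contribute. A value near zero suggests that it can be left out.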

Model Structure Selection Based on Preliminary Data Analysis 4. The information matrix It follows from Theorem 4.1 that, if the model orders are overestimated in certain model structures, global and local identifiability will be lost. This means that $\psi(t,\theta)$ will not have full rank at $\theta = \theta^*$, and hence the information matrix will be singular. Since the Gauss-Newton search algorithm uses the inverse of the information matrix, a natural test quantity for whether the model order is too high will be the condition number of this matrix. A related situation occurs when the IV method is used. Then the matrix $\bar E\,\zeta(t)\varphi^T(t)$, formed from the instruments and the regressors, will be singular when the orders are overestimated. So testing the conditioning of …..

Comparing Model Structures

Comparing Model Structures A most natural approach to search for a suitable model structure is simply to test a number of different ones and to compare the resulting models. The model to be evaluated will generically be denoted by $m = M(\hat\theta_N)$. It is estimated within the model structure M, which has $d_M = \dim\theta$ free parameters. By estimation data we mean the data that were used to estimate m. Validation data will denote any data set available that has not been used to build any of the models we would like to evaluate.

Comparing Model Structures What to compare? There are of course a number of ways to evaluate a model. We shall here describe evaluations and comparisons that are based on data sets from the system. Suppose that the data sets have been collected under conditions that are close to the intended operating conditions. The model tests are then basically tests of: How well the model is capable of reproducing these data.

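Comparing Model Structures What to compare? We shall generally work with k-step-ahead model predictions $\hat y_k(t|m)$ as the basis of comparisons. The case $k = \infty$ corresponds to the use of past inputs only, that is, pure simulation. For a linear model $y(t) = G(q)u(t) + H(q)e(t)$ we thus have $$\hat y_1(t|m) = [1 - H^{-1}(q)]\,y(t) + H^{-1}(q)G(q)u(t), \qquad \hat y_\infty(t|m) = G(q)u(t).$$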

Comparing Model Structures What to compare? For an output error model, $H(q) = 1$, so there is no difference between the simulated output $\hat y_\infty(t|m)$ and the one-step-ahead prediction $\hat y_1(t|m)$. Otherwise, note the considerable conceptual difference between $\hat y_\infty(t|m)$ and $\hat y_1(t|m)$. The latter has y(t-1) and earlier y-values available and can therefore give fits that “look good” even though the model may be bad.
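The difference between one-step-ahead prediction and simulation can be made concrete with a deliberately poor model. In the sketch below, a hypothetical first-order system is driven by an input, while the assumed ARX model has the right pole but ignores the input entirely (`b1 = 0`). All numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 20000
u = rng.standard_normal(N)
e = 0.01 * rng.standard_normal(N)

# Hypothetical true system: y(t) = 0.98 y(t-1) + 0.1 u(t-1) + e(t)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.98 * y[t - 1] + 0.1 * u[t - 1] + e[t]

# Deliberately poor ARX model A(q) y = B(q) u + e: correct pole, no input term
a1, b1 = -0.98, 0.0

y_pred = np.zeros(N)   # one-step-ahead prediction, uses measured y(t-1)
y_sim = np.zeros(N)    # simulation, uses past inputs only
for t in range(1, N):
    y_pred[t] = -a1 * y[t - 1] + b1 * u[t - 1]
    y_sim[t] = -a1 * y_sim[t - 1] + b1 * u[t - 1]

J1 = np.mean((y[1:] - y_pred[1:]) ** 2)
Jinf = np.mean((y[1:] - y_sim[1:]) ** 2)
print(f"one-step-ahead mean-square error: {J1:.4f}")
print(f"simulation mean-square error:     {Jinf:.4f}")
```

The one-step-ahead error stays small because the measured y(t-1) carries most of the information about y(t), whereas the simulated output exposes that the model has learned nothing about the input-output dynamics.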

Comparing Model Structures What to compare? The models can be evaluated by: • Visual inspection of plots of y(t) and $\hat y_k(t|m)$ • The numerical value of the fit $J_k(m) = \frac{1}{N}\sum_{t=1}^{N} |y(t) - \hat y_k(t|m)|^2$

Comparing Model Structures Comparing Models on Fresh Data Sets: Cross-Validation It is not so surprising that a model will be able to reproduce the estimation data. A suggestive and attractive way of comparing two different models $m_1$ and $m_2$ is to evaluate their performance on validation data, e.g. by comparing their mean-square prediction errors, $J_k(m_1)$ and $J_k(m_2)$, on that data set. We would then favor the model that shows the better performance. Such procedures are known as cross-validation, and several variants have been developed. See, for example, Stone (1974) and Snee (1977). Advantage: An attractive feature of cross-validation procedures is their pragmatic character: the comparison makes sense without any probabilistic arguments and without any assumptions about the true system. Disadvantage: We have to save a fresh data set for the validation, and therefore cannot use all our information to build the models.
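A minimal cross-validation sketch, assuming a hypothetical second-order system and two candidate ARX models of orders 1 and 2. Both are estimated on the first half of the record and compared through their one-step-ahead mean-square error on the second half. The helper names `regressors`, `fit`, and `fit_measure` are made up for this example.

```python
import numpy as np


def regressors(y, u, n):
    """Rows [-y(t-1)..-y(t-n), u(t-1)..u(t-n)] and targets y(t)."""
    phi = np.array([
        np.concatenate([-y[t - n:t][::-1], u[t - n:t][::-1]])
        for t in range(n, len(y))
    ])
    return phi, y[n:]


def fit(y, u, n):
    """Least-squares ARX(n, n) estimate."""
    phi, target = regressors(y, u, n)
    theta, *_ = np.linalg.lstsq(phi, target, rcond=None)
    return theta


def fit_measure(theta, y, u, n):
    """Mean-square one-step-ahead prediction error on the given data."""
    phi, target = regressors(y, u, n)
    return np.mean((target - phi @ theta) ** 2)


rng = np.random.default_rng(4)
N = 2000
u = rng.standard_normal(N)
e = 0.1 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(2, N):  # hypothetical true system of order 2
    y[t] = 1.5 * y[t - 1] - 0.7 * y[t - 2] + u[t - 1] + 0.5 * u[t - 2] + e[t]

y_est, u_est = y[:N // 2], u[:N // 2]   # estimation data
y_val, u_val = y[N // 2:], u[N // 2:]   # validation data, never used for fitting

J = {}
for n in (1, 2):
    theta = fit(y_est, u_est, n)
    J[n] = fit_measure(theta, y_val, u_val, n)
    print(f"order {n}: validation mean-square error = {J[n]:.4f}")
```

The model with the smaller validation error would be favored. No assumption about the true system enters the comparison itself, only the decision to hold back part of the data.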

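Comparing Model Structures Comparing Models on Second-hand Data Sets: Evaluating the Expected Fit The proper quality measure for the model m is the expected criterion $\bar J_k(m) = E\,J_k(m)$, where the expectation is taken over the data for the given model m. If the comparison criterion coincides with the estimation criterion, we have $\bar J(m) = \bar V(\hat\theta_N)$, where $\bar V(\theta)$ is the expected value of the estimation criterion $V_N(\theta, Z^N)$.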

Comparing Model Structures A Pragmatic Preview. The model obtained in the larger model structure will automatically yield a smaller value of the criterion of fit, since it is the minimizing value obtained by minimization over a larger set. As the model structure increases, the minimal value of the criterion will thus behave as depicted in the figure: it is a monotonically decreasing function of the model structure flexibility.

Comparing Model Structures To begin with, the value $V_N$ decreases since the model picks up more of the relevant features of the data. But even after a model structure has been reached that allows a correct description of the system, the value $V_N$ continues to decrease, now because the additional (unnecessary) parameters adjust themselves to features of the particular realization of the noise. This is known as overfit, and this extra improvement of the fit is of course of no value to us, since we are going to apply the model to data with different noise realizations. It is reasonable that the decrease from overfit should be less significant than the decrease that results when more relevant features are included in the model. We will thus be looking for the “knee” in the curve of the figure. We shall now make this statement precise.
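The shape of this curve can be reproduced with a small experiment. The sketch below assumes a hypothetical second-order system and fits nested ARX models of orders 1 to 8 by least squares, using the same time range for every order so that the structures are truly nested. The estimation-data fit is then guaranteed not to increase with the order, while the fit on a second record with a different noise realization is not. The helper names are made up for this example.

```python
import numpy as np


def regressors(y, u, n, start):
    """Rows [-y(t-1)..-y(t-n), u(t-1)..u(t-n)] for t >= start."""
    phi = np.array([
        np.concatenate([-y[t - n:t][::-1], u[t - n:t][::-1]])
        for t in range(start, len(y))
    ])
    return phi, y[start:]


def simulate(N, rng):
    """Hypothetical second-order system with equation noise."""
    u = rng.standard_normal(N)
    e = 0.5 * rng.standard_normal(N)
    y = np.zeros(N)
    for t in range(2, N):
        y[t] = 1.5 * y[t - 1] - 0.7 * y[t - 2] + u[t - 1] + 0.5 * u[t - 2] + e[t]
    return y, u


rng = np.random.default_rng(5)
y_est, u_est = simulate(300, rng)
y_val, u_val = simulate(300, rng)   # same system, different noise realization

n_max = 8
V_est, V_val = {}, {}
for n in range(1, n_max + 1):
    phi, target = regressors(y_est, u_est, n, start=n_max)
    theta, *_ = np.linalg.lstsq(phi, target, rcond=None)
    V_est[n] = np.mean((target - phi @ theta) ** 2)
    phi_v, target_v = regressors(y_val, u_val, n, start=n_max)
    V_val[n] = np.mean((target_v - phi_v @ theta) ** 2)
    print(f"order {n}: V_est = {V_est[n]:.4f}, V_val = {V_val[n]:.4f}")
```

The large drop from order 1 to order 2 is the “knee”. The small further decreases in `V_est` beyond order 2 come from the extra parameters adapting to the noise.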

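Comparing Model Structures A Formal Result. For the case that the comparison criterion coincides with the estimation criterion, we have the following result. Theorem 16.1: Let $\hat\theta_N$ be the minimizing argument of $V_N(\theta, Z^N)$ and suppose that $\sqrt{N}(\hat\theta_N - \theta^*)$ is asymptotically normal with zero mean and covariance matrix $P_\theta$. Let $\bar V(\theta) = \lim_{N\to\infty} E\,V_N(\theta, Z^N)$, with $\theta^*$ its minimizing argument. Then, asymptotically as $N \to \infty$, $$E\,\bar V(\hat\theta_N) \approx E\,V_N(\hat\theta_N, Z^N) + \frac{1}{N}\operatorname{tr}\left[\bar V''(\theta^*)\,P_\theta\right].$$

Comparing Model Structures Note the important difference between $E\,\bar V(\hat\theta_N)$ and $E\,V_N(\hat\theta_N, Z^N)$. If we generate many estimation and validation data sets in a Monte-Carlo manner: $E\,V_N(\hat\theta_N, Z^N)$ is the average of the fits of the models as they are fitted to estimation data, and $E\,\bar V(\hat\theta_N)$ is the average as the estimated models are evaluated on validation data, since $\bar V(\theta)$ is the expected criterion value for a fixed $\theta$ on data that were not used to form the estimate.

Comparing Model Structures A Formal Result. Theorem 16.1 + Akaike's Final Prediction-Error Criterion (FPE) Let the criterion be quadratic, $V_N(\theta, Z^N) = \frac{1}{N}\sum_{t=1}^{N}\varepsilon^2(t,\theta)$, and let the true system belong to the model structure, so that $\varepsilon(t,\theta^*)$ is white noise with variance $\lambda_0$. Then $\operatorname{tr}[\bar V''(\theta^*)P_\theta] = 2\lambda_0 d_M$, where $d_M$ is the number of unknown parameters, and Theorem 16.1 gives $$E\,\bar V(\hat\theta_N) \approx E\,V_N(\hat\theta_N, Z^N) + 2\lambda_0\frac{d_M}{N}.$$

Comparing Model Structures This expression shows the fundamental cost of parameters. The more parameters are used by the model structure, the smaller the first term will be. However, each parameter carries a variance penalty that will contribute with $2\lambda_0/N$ to the expected mean-square error fit. Any parameter that improves the fit of $V_N$ by less than $2\lambda_0/N$ will thus be harmful in this respect.

Comparing Model Structures Now, $\lambda_0$ is not known, but it can easily be estimated. Since $E\,\bar V(\hat\theta_N) \approx \lambda_0(1 + d_M/N)$, the expression above gives $E\,V_N(\hat\theta_N, Z^N) \approx \lambda_0(1 - d_M/N)$. A suitable estimate of $\lambda_0$ is thus obtained as $$\hat\lambda_N = \frac{V_N(\hat\theta_N, Z^N)}{1 - d_M/N},$$

Comparing Model Structures which inserted into the expression above gives $$\bar V(\hat\theta_N) \approx V_N(\hat\theta_N, Z^N) + 2\hat\lambda_N\frac{d_M}{N} = \frac{1 + d_M/N}{1 - d_M/N}\,V_N(\hat\theta_N, Z^N) = \frac{1 + d_M/N}{1 - d_M/N}\,\frac{1}{N}\sum_{t=1}^{N}\varepsilon^2(t,\hat\theta_N).$$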
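Akaike's FPE is commonly written as $V_N\,(1 + d/N)/(1 - d/N)$, where $V_N$ is the mean-square prediction error on the estimation data, d the number of estimated parameters, and N the number of data points. The sketch below evaluates it for a few candidate structures. The loss values are invented purely for illustration, and the function name `fpe` is an assumption of this example.

```python
def fpe(loss, n_params, n_samples):
    """Akaike's final prediction-error criterion: loss * (1 + d/N) / (1 - d/N)."""
    if not 0 <= n_params < n_samples:
        raise ValueError("need 0 <= n_params < n_samples")
    ratio = n_params / n_samples
    return loss * (1 + ratio) / (1 - ratio)


N = 100
# Hypothetical estimation-data losses V_N for structures with d = 2, 4, 6, 8 parameters
losses = {2: 1.30, 4: 1.02, 6: 1.00, 8: 0.99}

scores = {d: fpe(v, d, N) for d, v in losses.items()}
for d, score in scores.items():
    print(f"d = {d}: V_N = {losses[d]:.2f}, FPE = {score:.4f}")

best_d = min(scores, key=scores.get)
print("structure with smallest FPE: d =", best_d)
```

Although the raw loss keeps decreasing with d, the FPE penalizes the small improvements of the larger structures, so in this made-up example the structure with four parameters is preferred.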

Model Validation

Model Validation The parameter estimation procedure picks out the “best” model within the chosen model structure. The crucial question then is whether this “best” model is “good enough”. This is the problem of model validation. The question has several aspects: 1. Does the model agree sufficiently well with the observed data? 2. Is the model good enough for my purpose? There is always a certain purpose with the modeling. 3. Does the model describe the “true system”? Philosophically, this is impossible to answer. Model validation techniques thus tend to focus on question 1.

Model Validation 1. Does the model agree sufficiently well with the observed data? • Validation with Respect to the Purpose of the Modeling • Feasibility of Physical Parameters • Consistency of Model input-Output Behavior • Model Reduction • Parameter Confidence Intervals • Simulation and Prediction • A particularly useful technique, residual analysis (Section 16.5)

Model Validation Validation with Respect to the Purpose of the Modeling It might be that the model is required for regulator design, prediction, or simulation. The ultimate validation then is to test whether the problem that motivated the modeling exercise can be solved using the obtained model. If a regulator based on the model gives satisfactory control, then the model was a valid one. Feasibility of Physical Parameters For a model structure that is parameterized in terms of physical parameters, compare the estimated values and their estimated variances with what is reasonable from prior knowledge. One can also evaluate the sensitivity of the input-output behavior with respect to the parameters to check their practical identifiability.

Model Validation Consistency of Model Input-Output Behavior • For black-box models: consider the input-output properties. • For linear models: use Bode diagrams. • For nonlinear models: inspect by simulation. It is always good practice to evaluate and compare different linear models in Bode plots, possibly with the estimated variance translated to confidence intervals of the amplitude and phase. Model Reduction One procedure that tests whether the model is a simple and appropriate system description is to apply some model-reduction technique to it. If the model order can be reduced without affecting the input-output properties very much, then the original model was “unnecessarily complex”.
