Computacion Inteligente


Presentation Transcript


  1. Computacion Inteligente Least-Square Methods for System Identification

  2. Contents • System Identification: an Introduction • Least-Squares Estimators • Statistical Properties of least-squares estimators • Maximum likelihood (ML) estimator • Maximum likelihood estimator for linear model • LSE for Nonlinear Models • Developing Dynamic Models from Data • Example: Tank level modeling

  3. System Identification: Introduction • Goal • Determine a mathematical model for an unknown system (or target system) by observing its input-output data pairs

  4. System Identification: Introduction • Purposes • To predict a system’s behavior, • As in time series prediction & weather forecasting • To explain the interactions & relationships between inputs & outputs of a system

  5. System Identification: Introduction • Context example • To design a controller based on the model of a system, • as in aircraft or ship control • To simulate the system under control once the model is known

  6. Why cover System Identification • System Identification • It is a well-established and easy-to-use technique for modeling a real-life system. • It will be needed for the section on fuzzy-neural networks.

  7. Spring Example Experimental data What will the length be when the force is 5.0 newtons?

  8. Components of System Identification • There are 2 main steps that are involved • Structure identification • Parameter identification

  9. Structure identification • Structure identification • Apply a-priori knowledge about the target system to determine a class of models within which the search for the most suitable model is to be conducted. This class of models is denoted by a function y = f(u, θ) where: • y is the model output • u is the input vector • θ is the parameter vector

  10. Structure identification • Structure identification • f(u, θ) depends on • the problem at hand • the designer’s experience • the laws of nature governing the target system

  11. Parameter identification • Training data is used for both system and model. • The difference between the target system output, yi, and the mathematical model output, ŷi, is used to update the parameter vector, θ.

  12. Parameter identification • Parameter identification • The structure of the model is known; however, we need to apply optimization techniques • in order to determine the parameter vector θ = θ̂ such that the resulting model describes the system appropriately: $\hat{y} = f(u, \hat{\theta})$

  13. System Identification Process • The data set composed of m desired input-output pairs • (ui, yi) (i = 1,…,m) is called the training data • System identification needs to do both structure & parameter identification repeatedly until a satisfactory model is found

  14. System Identification: Steps • (1) Specify & parameterize a class of mathematical models representing the system to be identified • (2) Perform parameter identification to choose the parameters that best fit the training data set • (3) Conduct a validation test to see if the model identified responds correctly to an unseen data set • (4) Terminate the procedure once the results of the validation test are satisfactory. Otherwise, select another class of models & repeat steps 2 to 4
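The four steps can be sketched as a loop over candidate model classes. This is only an illustrative sketch: the "unknown" system, the choice of polynomial degrees as model classes, the data points, and the tolerance are all assumptions made up for the example, not taken from the slides.

```python
import numpy as np

def target_system(u):
    # Hypothetical "unknown" system, used only to generate illustrative data.
    return 1.0 + 2.0 * u + 0.5 * u**2

# Training data and a separate, unseen validation set (both made up).
u_train = np.linspace(0.0, 4.0, 9)
y_train = target_system(u_train)
u_val = np.linspace(0.25, 3.75, 8)
y_val = target_system(u_val)

tolerance = 1e-6          # assumed threshold for a "satisfactory" validation error
chosen_degree = None
for degree in (1, 2, 3):                            # step 1: pick a model class
    theta = np.polyfit(u_train, y_train, degree)    # step 2: parameter identification
    y_hat = np.polyval(theta, u_val)                # step 3: validation on unseen data
    rmse = float(np.sqrt(np.mean((y_val - y_hat) ** 2)))
    if rmse < tolerance:                            # step 4: stop when satisfactory
        chosen_degree = degree
        break
```

With this noise-free quadratic data the straight line fails validation and the loop stops at the quadratic model class.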

  15. System Identification Process Structure and parameter identification may need to be done repeatedly

  16. Least-Squares Estimators

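  17. Objective of Linear Least Squares fitting • Given a training data set {(ui, yi), i = 1, …, m} and the general form function: $y = \theta_1 f_1(u) + \theta_2 f_2(u) + \dots + \theta_n f_n(u)$ • Find the parameters θ1, …, θn such that the sum of squared errors between the model outputs and the observed outputs yi is minimized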

  18. The linear model • The linear model: $y = \theta_1 f_1(u) + \theta_2 f_2(u) + \dots + \theta_n f_n(u) = f^T(u)\,\theta$ where: • u = (u1, …, up)T is the model input vector • f1, …, fn are known functions of u, collected in f(u) = (f1(u), …, fn(u))T • θ1, …, θn are unknown parameters to be estimated

  19. Least-Squares Estimators • The task of fitting data using a linear model is referred to as linear regression where: • u = (u1, …, up)T is the input vector • f1(u), …, fn(u) are the regressors • θ = (θ1, …, θn)T is the parameter vector

  20. Least-Squares Estimators • We collect a training data set {(ui, yi), i = 1, …, m}. The system's equations become: $f_1(u_i)\,\theta_1 + f_2(u_i)\,\theta_2 + \dots + f_n(u_i)\,\theta_n = y_i$, i = 1, …, m. Which is equivalent to: Aθ = y

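  21. Least-Squares Estimators • Which is equivalent to: Aθ = y • where A is the m×n matrix with entries $A_{ij} = f_j(u_i)$, θ is the unknown n×1 parameter vector, and y is the m×1 output vector. If m = n and A is nonsingular: Aθ = y ⇔ θ = A⁻¹y (solution)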

  22. Least-Squares Estimators • We have • m outputs, and • n fitting parameters to find • Or • m equations, and • n unknown variables Usually m is greater than n

  23. Least-Squares Estimators • Since • the model is just an approximation of the target system & • the data observed might be corrupted, • an exact solution is not always possible! • To overcome this inherent conceptual problem, an error vector e is added to compensate: Aθ + e = y

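  24. Least-Squares Estimators • Our goal now consists of finding the estimate θ̂ that reduces the errors between the observed outputs yi and the model outputs ŷi • The problem: Find $\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2$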

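  25. Least-Squares Estimators • If e = y − Aθ then the sum of squared errors is: $E(\theta) = e^T e = (y - A\theta)^T (y - A\theta)$. We need to compute: $\min_{\theta}\,(y - A\theta)^T (y - A\theta)$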

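  26. Least-Squares Estimators • Theorem [least-squares estimator] The squared error $(y - A\theta)^T (y - A\theta)$ is minimized when θ = θ̂ satisfies the normal equation $A^T A\,\hat{\theta} = A^T y$. If $A^T A$ is nonsingular, θ̂ is unique & is given by $\hat{\theta} = (A^T A)^{-1} A^T y$. θ̂ is called the least-squares estimator, LSE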
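A minimal numerical sketch of the theorem, assuming NumPy is available. The regressors, the data, and the helper name `design_matrix` are hypothetical choices for illustration only. The normal equation is solved directly, and `np.linalg.lstsq` is shown alongside it because it is usually the numerically safer route when AᵀA is close to singular.

```python
import numpy as np

def design_matrix(u, regressors):
    """Build the m-by-n matrix A with A[i, j] = f_j(u_i)."""
    return np.column_stack([f(u) for f in regressors])

# Hypothetical regressors f1(u) = 1, f2(u) = u, f3(u) = u^2 and made-up data.
regressors = [lambda u: np.ones_like(u), lambda u: u, lambda u: u**2]
u = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
true_theta = np.array([1.0, -2.0, 0.5])

A = design_matrix(u, regressors)   # m = 5 equations, n = 3 unknowns
y = A @ true_theta                 # noise-free outputs, so e = 0 here

# Solve the normal equation (A^T A) theta = A^T y.
theta_normal = np.linalg.solve(A.T @ A, A.T @ y)

# Same estimate through a dedicated least-squares solver.
theta_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
```

Because the data are noise-free, both estimates recover the assumed parameter vector up to rounding.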

  27. Spring Example • Structure Identification can be done using domain knowledge. • The change in length of a spring is proportional to the force applied. • Hooke’s law length = k0 + k1*force
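For the two-parameter model length = k0 + k1*force, the least-squares estimate has a simple closed form. The sketch below uses made-up measurements (the experimental data from the slide is not reproduced here) that lie exactly on the line length = 1.0 + 0.5*force; the function name `fit_line` is likewise an assumption.

```python
def fit_line(forces, lengths):
    """Least-squares fit of length = k0 + k1 * force (closed form for n = 2)."""
    m = len(forces)
    mean_f = sum(forces) / m
    mean_l = sum(lengths) / m
    sxx = sum((f - mean_f) ** 2 for f in forces)
    sxy = sum((f - mean_f) * (l - mean_l) for f, l in zip(forces, lengths))
    k1 = sxy / sxx
    k0 = mean_l - k1 * mean_f
    return k0, k1

# Made-up measurements (force in newtons, length in arbitrary units).
forces = [1.0, 2.0, 3.0, 4.0]
lengths = [1.5, 2.0, 2.5, 3.0]

k0, k1 = fit_line(forces, lengths)
predicted_length = k0 + k1 * 5.0   # model prediction at a force of 5.0 newtons
```

With real, noisy measurements the fitted line would not pass through every point, and the prediction at 5.0 newtons is an extrapolation beyond the measured range.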

  28. Spring Example

  29. Statistical Properties of least-squares estimators

  30. Statistical qualities of LSE • Definition [unbiased estimator] An estimator θ̂ of the parameter θ is unbiased if $E[\hat{\theta}] = \theta$, where E[.] is the statistical expectation

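  31. Statistical qualities of LSE • Definition [minimal variance] • An estimator θ̂ is a minimum variance estimator if for any other estimator θ*: $\mathrm{Cov}(\hat{\theta}) \le \mathrm{Cov}(\theta^*)$, meaning that $\mathrm{Cov}(\theta^*) - \mathrm{Cov}(\hat{\theta})$ is positive semidefinite, where Cov(θ) is the covariance matrix of the random vector θ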

  32. Statistical qualities of LSE • Theorem [Gauss-Markov]: • Gauss-Markov conditions: • The error vector e is a vector of m uncorrelated random variables, each with zero mean & the same variance σ². • This means that: $E[e] = 0$ and $\mathrm{Cov}(e) = E[e\,e^T] = \sigma^2 I$

  33. Statistical qualities of LSE • Theorem [Gauss-Markov] Under the Gauss-Markov conditions, the LSE is unbiased & has minimum variance among linear unbiased estimators. Proof:
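A sketch of the unbiasedness part only, assuming the model y = Aθ + e with a deterministic matrix A, a nonsingular AᵀA, and the Gauss-Markov conditions E[e] = 0, E[e eᵀ] = σ²I. The minimum-variance part needs a separate argument that is not covered by this sketch.

```latex
\begin{align*}
\hat{\theta} &= (A^T A)^{-1} A^T y
              = (A^T A)^{-1} A^T (A\theta + e)
              = \theta + (A^T A)^{-1} A^T e, \\
E[\hat{\theta}] &= \theta + (A^T A)^{-1} A^T E[e] = \theta, \\
\mathrm{Cov}(\hat{\theta}) &= E\big[(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\big]
              = (A^T A)^{-1} A^T \, E[e\,e^T] \, A (A^T A)^{-1}
              = \sigma^2 (A^T A)^{-1}.
\end{align*}
```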

  34. Maximum likelihood (ML) estimator

  35. Maximum likelihood (ML) estimator • The problem • Suppose we observe m independent samples x1, x2, …, xm, • coming from a probability density function with parameters θ1, …, θr

  36. Maximum likelihood (ML) estimator • The criterion for choosing θ is: • Choose the parameters θ that maximize the probability of the observed data. Which one do you prefer? Why?

  37. Maximum likelihood (ML) estimator • Likelihood function definition: • For a sample of m independent observations x1, x2, …, xm • with probability density function f, • the likelihood function L is defined by $L(\theta) = f(x_1; \theta)\, f(x_2; \theta) \cdots f(x_m; \theta) = \prod_{i=1}^{m} f(x_i; \theta)$. L is the joint probability density

  38. Maximum likelihood (ML) estimator • The ML estimator is defined as the value of θ which maximizes L: $\hat{\theta} = \arg\max_{\theta} L(\theta)$ or equivalently: $\hat{\theta} = \arg\max_{\theta} \ln L(\theta)$

  39. Maximum likelihood (ML) estimator • Example: ML estimation for normal distribution • Suppose we have m independent samples x1, x2, …, xm, coming from a Gaussian distribution with parameters μ and σ2. What is the MLE for μ and σ2?

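  40. Maximum likelihood (ML) estimator • Example: ML estimation for normal distribution • For m observations x1, x2, …, xm, we have: $L(\mu, \sigma^2) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$ and $\ln L(\mu, \sigma^2) = -\frac{m}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{m}(x_i - \mu)^2$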

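  41. Maximum likelihood (ML) estimator • Example: ML estimation for normal distribution • For m observations x1, x2, …, xm, setting the derivatives of the log-likelihood with respect to μ and σ2 to zero, we have: $\hat{\mu} = \frac{1}{m}\sum_{i=1}^{m} x_i$ and $\hat{\sigma}^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i - \hat{\mu})^2$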
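As a small numerical illustration, the Gaussian ML estimates are the sample mean and the variance with divisor m (not m − 1). The sample values and the function name `gaussian_mle` below are made up for the example; `statistics.pvariance` from the Python standard library computes the same divisor-m variance and serves as a cross-check.

```python
import statistics

def gaussian_mle(samples):
    """Return the ML estimates (mu_hat, sigma2_hat) for a Gaussian sample."""
    m = len(samples)
    mu_hat = sum(samples) / m
    # ML variance divides by m, so it is biased for finite m.
    sigma2_hat = sum((x - mu_hat) ** 2 for x in samples) / m
    return mu_hat, sigma2_hat

# Made-up sample of m = 8 observations.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, sigma2_hat = gaussian_mle(samples)

# Cross-check against the standard library's population variance.
reference_variance = statistics.pvariance(samples)
```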

  42. Maximum likelihood estimator for linear model

  43. Maximum likelihood estimator for linear model • Let a linear model be given as $y = A\theta + e$ • Then $e = y - A\theta$ • Here e has PDF pe(u, θ), with independent components. The likelihood function is given by $L(\theta) = \prod_{i=1}^{m} p_e(u_i, \theta)$

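  44. Maximum likelihood estimator for linear model • Assume a regression model where the errors are independent and normally distributed with zero mean and variance σ². • The likelihood function is given by $L(\theta) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{e_i^2}{2\sigma^2}\right)$, where $e_i$ is the i-th component of $e = y - A\theta$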

  45. Maximum likelihood estimator for linear model • The maximum likelihood model • Any algorithm that maximizes the likelihood function $L(\theta)$ • gives the maximum likelihood model with respect to a given family of possible models

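  46. Maximum likelihood estimator for linear model • Maximizing $L(\theta)$ is the same as maximizing $\ln L(\theta) = -\frac{m}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}(y - A\theta)^T (y - A\theta)$, where σ² is the error variance • which is the same as minimizing $(y - A\theta)^T (y - A\theta)$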
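The equivalence can be checked numerically on a toy problem. The sketch below assumes a one-parameter model y = θ·u, a known error variance, made-up data, and a coarse grid of candidate values for θ; none of these come from the slides. The candidate with the highest Gaussian log-likelihood is the same one with the smallest sum of squared errors.

```python
import math

def sse(theta, u, y):
    """Sum of squared errors for the one-parameter model y = theta * u."""
    return sum((yi - theta * ui) ** 2 for ui, yi in zip(u, y))

def log_likelihood(theta, u, y, sigma2):
    """Gaussian log-likelihood with zero-mean errors of variance sigma2."""
    m = len(y)
    return -0.5 * m * math.log(2 * math.pi * sigma2) - sse(theta, u, y) / (2 * sigma2)

# Made-up data and an assumed, known error variance.
u = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
sigma2 = 0.25

# Candidate parameter values 1.50, 1.51, ..., 2.50.
candidates = [1.5 + 0.01 * k for k in range(101)]

theta_ml = max(candidates, key=lambda t: log_likelihood(t, u, y, sigma2))
theta_ls = min(candidates, key=lambda t: sse(t, u, y))
```

The grid search is only for illustration; for a linear model the minimizer is available in closed form through the normal equation.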

  47. Connection to Least Squares • Conclusion • The least-squares fitting criterion can be understood as emerging from the use of the maximum likelihood principle for estimating a regression model where errors are distributed normally. • The applicability of the least-squares method is, however, not limited to the normality assumption.

  48. LSE for Nonlinear Models

  49. LSE for Nonlinear Models • Nonlinear models are divided into 2 families • Intrinsically linear • Intrinsically nonlinear • Through appropriate transformations of the input-output variables & fitting parameters, an intrinsically linear model can become a linear model • By this transformation into linear models, LSE can be used to optimize the unknown parameters

  50. LSE for Nonlinear Models • Examples of intrinsically linear systems
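One standard example of an intrinsically linear model, given here as an assumption rather than as the slide's own example, is y = a·exp(b·u): taking logarithms gives ln y = ln a + b·u, which is linear in the parameters ln a and b. The sketch below fits that transformed model by least squares; the function name and data are hypothetical.

```python
import math

def fit_exponential(u, y):
    """Fit y = a * exp(b * u) by least squares on ln y = ln a + b * u."""
    z = [math.log(yi) for yi in y]          # requires every y_i > 0
    m = len(u)
    mean_u = sum(u) / m
    mean_z = sum(z) / m
    suu = sum((ui - mean_u) ** 2 for ui in u)
    suz = sum((ui - mean_u) * (zi - mean_z) for ui, zi in zip(u, z))
    b = suz / suu
    ln_a = mean_z - b * mean_u
    return math.exp(ln_a), b

# Made-up, noise-free data generated from a = 2.0, b = 0.5.
u = [0.0, 1.0, 2.0, 3.0]
y = [2.0 * math.exp(0.5 * ui) for ui in u]

a_hat, b_hat = fit_exponential(u, y)
```

Note that this minimizes the squared error in ln y, not in y itself, so with noisy data the result generally differs from a direct nonlinear least-squares fit of the original model.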