The Application of Partial Least Squares to Non-linear Systems in the Process Industries. Elaine Martin and Julian Morris Centre for Process Analytics and Control Technology CPACT School of Chemical Engineering and Advanced Materials University of Newcastle, England.
Overview of the Presentation • Motivation for the Application of “Data Mining” in Non-linear Process Systems • Process Modelling and Analysis of Non-linear Systems • Constrained Partial Least Squares • Local Linear Modelling • Prediction Intervals for Non-linear Partial Least Squares • Conclusions
Data Rich Information Poor Enhanced Profitability and Improved Customer Satisfaction Modern Process Control Systems Process Monitoring for Early Warning and Fault Detection Process Optimisation
Process Modelling • Mechanistic models developed from process mass and energy balances and kinetics provide the ideal form given: • process understanding exists • time is available to construct the model. • Data based models are useful alternatives when there is: • limited process understanding • process data available from a range of operating conditions. • Hybrid models combine the two approaches.
Process Modelling • Traditionally two types of variables have been used in the development of a process model/process performance monitoring scheme: • Process variables (X) • Quality variables (Y) • In practice, a third class exists: • Confounding variables (Z). • A confounding variable is any extraneous factor that is related to, and affects, the two sets of variables under study, (X) and (Y). • It can distort the true relationship between the two sets of variables, which is the relationship of primary interest.
Global Process Variation [Figure: confidence ellipse including confounding variation, confidence ellipse excluding confounding variation, trajectory of the confounding variable, and mal-operation points marked X]
Partial Least Squares • X-block outer relationship (monitoring) • Inner relationship (prediction) • Y-block outer relationship (monitoring) • X- and Y-block scores are calculated recursively
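The outer and inner relationships can be illustrated with a minimal NIPALS sketch. This is a textbook-style linear PLS iteration, not the presenters' code; the function name `nipals_pls`, the tolerance, the iteration cap and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """Linear PLS by NIPALS; returns X-scores T, Y-scores U and the Y residual."""
    E = X - X.mean(axis=0)              # mean-centred X-block
    F = Y - Y.mean(axis=0)              # mean-centred Y-block
    T, U = [], []
    for a in range(n_components):
        u = F[:, 0].copy()              # starting guess for the Y-block score
        for _ in range(max_iter):
            w = E.T @ u
            w /= np.linalg.norm(w)      # X-block weights
            t = E @ w                   # X-block scores
            q = F.T @ t
            q /= np.linalg.norm(q)      # Y-block loadings
            u_new = F @ q               # Y-block scores
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = E.T @ t / (t @ t)           # X-block loadings (outer relationship)
        b = (u @ t) / (t @ t)           # inner relationship: u ≈ b * t
        E = E - np.outer(t, p)          # deflate the X-block
        F = F - b * np.outer(t, q)      # deflate the Y-block
        T.append(t)
        U.append(u)
    return np.column_stack(T), np.column_stack(U), F

# Synthetic, noise-free data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4))
Y = X @ rng.normal(size=(4, 2))
T, U, F_resid = nipals_pls(X, Y, n_components=4)
```

With as many latent variables as X has columns, the Y residual of this noise-free example vanishes and the X-block scores are mutually orthogonal, which is what makes them convenient for monitoring charts.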
Constrained PLS • To exclude the nuisance source of variability, a necessary condition is that the derived latent variables, $t_i$ and $u_i$, are not correlated with the confounding variables: $Z^{T} t_i = 0$ and $Z^{T} u_i = 0$ for $i = 1, \dots, A$, where $A$ is the number of latent variables. • The idea of constrained PLS is to apply the constraints to ordinary PLS.
Constrained PLS • Standard constrained optimisation techniques can be used to solve the equations in each iteration. • An algorithm has been developed that enhances the efficiency of the constrained PLS algorithm. • The other steps of constrained PLS are as for ordinary PLS. • The resulting latent variables can then be used for process monitoring with the knowledge that they are not confounded with the nuisance source of variability. • Any unusual variation detected from these latent variables can then be assumed to be related to abnormal process behaviour.
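The requirement that latent variables be uncorrelated with the confounding variables can be illustrated with a simpler substitute technique: projecting X and Y onto the orthogonal complement of Z before extracting a component. This is not the constrained optimisation algorithm described above; it is only a sketch showing what the constraint means. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def remove_confounding(A, Z):
    """Return the part of mean-centred A that is orthogonal to mean-centred Z."""
    Zc = Z - Z.mean(axis=0)
    Ac = A - A.mean(axis=0)
    coef, *_ = np.linalg.lstsq(Zc, Ac, rcond=None)   # regress A on Z
    return Ac - Zc @ coef                             # keep the residual

def first_constrained_scores(X, Y, Z):
    """First pair of PLS scores computed after projecting out Z (substitute method)."""
    Xp = remove_confounding(X, Z)
    Yp = remove_confounding(Y, Z)
    # The first PLS weight vectors are the dominant singular vectors of Xp^T Yp.
    Uw, s, Vt = np.linalg.svd(Xp.T @ Yp, full_matrices=False)
    w, q = Uw[:, 0], Vt[0]
    return Xp @ w, Yp @ q

# Synthetic data in which Z drives part of the variation in both X and Y.
rng = np.random.default_rng(1)
Z = rng.normal(size=(50, 1))
X = Z @ rng.normal(size=(1, 5)) + rng.normal(size=(50, 5))
Y = Z @ rng.normal(size=(1, 2)) + X @ rng.normal(size=(5, 2))
t, u = first_constrained_scores(X, Y, Z)
Zc = Z - Z.mean(axis=0)
```

Because both scores are built from residuals of a regression on Z, the products `Zc.T @ t` and `Zc.T @ u` are zero to numerical precision, so the scores carry no variation that is linearly related to the confounding variable.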
Industrial Application • An industrial semi-discrete batch manufacturing operation is used to illustrate the advantages of the constrained PLS algorithm over ordinary PLS. • The process involves the production of a variety of products (recipes), some of which are only manufactured in small quantities to meet the requirements of specialised markets. • The objective of the analysis was to build a monitoring scheme to detect the onset of subtle changes in production and final product quality.
An Industrial Application • For simplicity, three recipes are selected to demonstrate the methodology. • A total of thirty-six process variables, including flow rates, pressures and temperatures, are recorded every minute, whilst five quality variables are measured off-line in the quality laboratory every two hours. • A nominal process monitoring scheme was developed using both ordinary PLS and constrained PLS from 41 ‘ideal’ batches. • A further 6 batches, A4, A10, A29, A35, A38 and B32 were used for model validation. These batches were known to lie outside the desirable specification limits.
Industrial Application Ordinary Partial Least Squares [Figure: latent variable 1 versus latent variable 2; latent variable 3 versus latent variable 4]
Industrial Application Ordinary Partial Least Squares Bivariate Scores Plot Hotelling’s T2 and SPE
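For reference, the two monitoring statistics named here have standard definitions: Hotelling's T2 sums the squared scores scaled by their variances, and the squared prediction error (SPE) sums the squared residuals of a sample. The sketch below assumes the score variances were estimated from the nominal batches; the function names and numbers are illustrative, and control limits are not covered.

```python
def hotelling_t2(scores, score_variances):
    """Hotelling's T2 for one sample: sum of t_a^2 / var(t_a) over latent variables."""
    return sum(t * t / v for t, v in zip(scores, score_variances))

def squared_prediction_error(observed, reconstructed):
    """SPE for one sample: sum of squared differences between data and model."""
    return sum((x - x_hat) ** 2 for x, x_hat in zip(observed, reconstructed))

# Illustrative values only.
t2_value = hotelling_t2([2.0, 1.0], [4.0, 0.5])
spe_value = squared_prediction_error([1.0, 2.0, 3.0], [1.0, 1.0, 5.0])
```

T2 measures variation inside the latent variable model, whereas SPE measures variation the model does not explain, which is why the two charts are usually read together.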
Industrial Application Constrained Partial Least Squares LV 1 versus LV 2 LV 3 versus LV 4
Industrial Application Constrained Partial Least Squares Hotelling’s T2 Squared Prediction Error
Constrained PLS - Conclusions • Constrained PLS possesses the following important characteristics: • It removes that information correlated with the confounding variables. • The information excluded by constrained PLS contains only variation associated with the confounding variables. • The derived constrained PLS latent variables achieve optimality in terms of extracting as much of the available information as possible contained in the process and quality data.
Local Linear and Non-linear Multi-way Partial Least Squares Batch Monitoring
Batch Process Modelling and Monitoring • Batch processes exhibit non-linear, time variant and dynamic behaviour. • These characteristics challenge the linear multivariate statistical technique of multi-way Partial Least Squares (PLS) that has traditionally been applied in batch process performance monitoring. • A local model based approach has been developed to overcome these limitations.
Local Model Approach • Batch processes often exhibit distinct phases of process operation. Instead of modelling a non-linear, time-variant batch process with a single global model, batch trajectories are sub-divided into individual operating regions. • A local linear PLS model is then developed for each operating region. • Each model can comprise a different number of latent variables. • A validity function then creates a smooth transition between the local models to build a global non-linear model.
Validity Function • The validity function determines which operating region the process lies within at each time point: • Identification of the most appropriate local model • Weighting of local models if two or more are applicable • The validity function is based on a fuzzy logic rule based function: • Rules based on process variable behaviour • IF x1 is LOW AND x2 is HIGH THEN model 1 is valid
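A fuzzy validity function of this kind might be sketched as follows. The membership shapes, the thresholds (0.25 and 0.75), the second rule and the equal-weight fallback are all hypothetical choices made for illustration; only the first rule mirrors the example on the slide.

```python
def low(x, lo, hi):
    """Membership in LOW: 1 below lo, 0 above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)

def high(x, lo, hi):
    """Membership in HIGH, taken here as the complement of LOW."""
    return 1.0 - low(x, lo, hi)

def validity_weights(x1, x2):
    """Normalised weights for two local models from two fuzzy rules."""
    # Rule 1 (from the slide): IF x1 is LOW AND x2 is HIGH THEN model 1 is valid.
    r1 = min(low(x1, 0.25, 0.75), high(x2, 0.25, 0.75))
    # Rule 2 (hypothetical): IF x1 is HIGH THEN model 2 is valid.
    r2 = high(x1, 0.25, 0.75)
    total = r1 + r2
    if total == 0.0:
        return [0.5, 0.5]           # no rule fires: fall back to equal weights
    return [r1 / total, r2 / total]

def blended_prediction(x1, x2, local_predictions):
    """Weight the local model predictions by their validity."""
    weights = validity_weights(x1, x2)
    return sum(w * y for w, y in zip(weights, local_predictions))
```

For example, `blended_prediction(0.5, 1.0, [10.0, 20.0])` weights both local models equally, because x1 sits halfway between LOW and HIGH; this is the smooth hand-over between operating regions that the validity function is meant to provide.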
Dynamic Feature Addition • Batch process variables also exhibit serial and cross correlation. • The Auto Regressive with eXogenous inputs (ARX) structure is a time series structure used to model such data. • Including past input and output process variables in the X data matrix of a PLS model encapsulates some of the dynamic features within the model.
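Building the lagged X data matrix can be sketched as below for a single input u and a single output y. The function name `lagged_matrix` and the lag orders `n_u` and `n_y` are illustrative assumptions; a real application would stack lagged columns for every process variable.

```python
def lagged_matrix(u, y, n_u, n_y):
    """Build ARX-style regressors: each row holds past outputs and past inputs.

    Row k is [y(k-1), ..., y(k-n_y), u(k-1), ..., u(k-n_u)] with target y(k).
    """
    start = max(n_u, n_y)               # first index with a full history
    rows, targets = [], []
    for k in range(start, len(y)):
        past_y = [y[k - i] for i in range(1, n_y + 1)]
        past_u = [u[k - i] for i in range(1, n_u + 1)]
        rows.append(past_y + past_u)
        targets.append(y[k])
    return rows, targets

# Illustrative series only.
rows, targets = lagged_matrix(u=[1, 2, 3, 4, 5], y=[10, 20, 30, 40, 50], n_u=1, n_y=2)
```

The resulting rows replace the original X-block in an otherwise unchanged PLS model, at the cost of losing the first `max(n_u, n_y)` observations of each batch.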
Application to an Industrial Process • A fed-batch fermentation process is used to demonstrate local model performance monitoring. • 17 batches with good operating conditions and high yield were selected for the nominal model. • 30 batches with standard operating conditions but mid to low yield were used to assess the monitoring charts. • A model was developed using local dynamic PLS and global dynamic PLS.
Operating Region Specification • Operating regions specified using process knowledge • 4 operating regions identified • Regions based on conditions within the fermenter • Operating region 1: initial start up of the fermenter before optimum conditions are reached • Operating region 2: initialisation of product growth • Operating region 3: maximum growth rate of product • Operating region 4: reactions are complete
Operating Region Specification pH Addition rate of chemical A Potency
Validity Function • Fuzzy logic rules used to determine movement between operating regions • Rules applied to • Power, Substrate Addition Rate, Respiration Rate
Global Dynamic PLS Predicted and Actual Values of Potency Residuals of Global Dynamic PLS Model
Prediction using Local Dynamic PLS Model [Figure: predicted and actual potency (0 to 80) against observation number (0 to 400) for each model; residuals of the local dynamic PLS models]
Performance Monitoring and Fault Detection Local SPE chart - varying control limit Global SPE chart - constant control limit
Fault Detection Process fault detected Local SPE chart False alarm Global SPE chart
Conclusions • Inclusion of dynamic behaviour improves model performance through the removal of process structure within the model. • The fuzzy rule based validity function approach allows batch specific movement between models. • The local model approach to performance monitoring leads to control charts with improved model limits. • Local model monitoring charts detect faults and process deviations earlier than the global model equivalent.
Non-linear Partial Least Squares Prediction Intervals and Leverage
Non-linear Partial Least Squares • A simple approach to non-linear PLS has been to extend the input matrix (X) by including non-linear combinations of the original variables (such as logarithms, square values, cross-products, etc.) and then performing linear PLS. • If there is no a priori knowledge, then there is no limitation as to the number (and kind) of transformations that might be applied. • Thus by pre-treating data sets in this way, the number of non-linear terms can increase excessively, resulting in large input and output matrices and results that are difficult to interpret.
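The growth in the number of terms is easy to see in a small sketch that appends only squares and pairwise cross-products to each sample. The function name `expand_quadratic` is hypothetical, and other transformations (logarithms and so on) would add further columns.

```python
from itertools import combinations

def expand_quadratic(row):
    """Append squared terms and pairwise cross-products to one sample."""
    squares = [x * x for x in row]
    cross_products = [a * b for a, b in combinations(row, 2)]
    return list(row) + squares + cross_products

expanded = expand_quadratic([1.0, 2.0, 3.0])

# Column count for m original variables: m + m + m * (m - 1) / 2.
n_columns_for_36 = len(expand_quadratic([0.0] * 36))
```

Three variables become nine columns, and thirty-six variables become 702 columns before any other transformation is considered, which is the interpretability problem described above.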
Non-linear Partial Least Squares • A more structured approach to the development of a non-linear PLS model is to modify the NIPALS algorithm by introducing a non-linear function that relates the output scores u to the input scores t, without modifying the input and output variables. • Wold et al. (1989) proposed a non-linear PLS algorithm which retained the framework of linear PLS but used second order polynomial (quadratic) regression: $u_j = c_{0j} + c_{1j} t_j + c_{2j} t_j^{2} + e_j$
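Fitting the quadratic inner relationship for one latent variable, given its scores, might look like the sketch below. It covers only the least squares fit of the coefficients; the weight-updating steps of the full non-linear algorithm are not shown, and the function names and synthetic scores are assumptions.

```python
import numpy as np

def fit_quadratic_inner(t, u):
    """Least squares fit of u = c0 + c1*t + c2*t^2 for one latent variable."""
    c2, c1, c0 = np.polyfit(t, u, deg=2)   # polyfit returns highest power first
    return c0, c1, c2

def predict_u(t, coeffs):
    """Evaluate the fitted inner relationship at the input scores t."""
    c0, c1, c2 = coeffs
    return c0 + c1 * t + c2 * t ** 2

# Synthetic scores following an exact quadratic (illustrative only).
t_scores = np.linspace(-1.0, 1.0, 21)
u_scores = 0.5 + 2.0 * t_scores - 3.0 * t_scores ** 2
coeffs = fit_quadratic_inner(t_scores, u_scores)
residual = u_scores - predict_u(t_scores, coeffs)
```

The residual plays the role of the error term in the inner relationship; a linear inner model would leave the curvature of this example unexplained.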
Prediction Intervals for Non-linear PLS • As for every regression technique, a measure for assessing the reliability of the predicted values is required. • A common approach is through the use of prediction intervals. These are the upper and lower confidence limits of the predicted values. • The larger the magnitude of these intervals, the less precise is the prediction. • A methodology used to evaluate prediction intervals for neural network models has been extended to linear and non-linear partial least squares algorithms.
Calculation of Prediction Intervals • The prediction intervals are computed using a first order Taylor series expansion and the Jacobian matrix of the functional mapping provided by the PLS algorithms. • Given a set of input and output training data, X and Y, respectively, a PLS regression model is built and the Jacobian matrix F is computed for the same set of training data. • When the PLS regression model is used to predict a new output value, corresponding to a new sample of input variables, the vector of partial derivatives is computed and the prediction interval is evaluated.
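As a point of orientation, first order (Taylor series) prediction intervals in non-linear regression are commonly written in the form below. This is offered as a sketch on the assumption that the methodology follows that standard result. F is the Jacobian matrix computed on the training data as described above; the symbols $f_0$ (vector of partial derivatives for the new sample), $s$ (estimated residual standard deviation), $n$ (number of training samples), $p$ (number of model parameters) and the Student's t quantile are assumptions introduced for this illustration.

```latex
\hat{y}_0 \;\pm\; t_{n-p}^{\alpha/2}\, s\, \sqrt{\,1 + f_0^{T}\left(F^{T}F\right)^{-1} f_0\,}
```

The term $f_0^{T}(F^{T}F)^{-1} f_0$ grows as the new sample moves away from the region covered by the training data, which widens the interval.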
Case Study • The data were generated from the simulation of a pH neutralisation system. • Samples were collected under steady state operating conditions, thus no time correlation existed between any two consecutive samples. • The data included four input variables (flowrates of the inlet and outlet streams of the neutralisation tank) and one output variable (pH value measured in the outlet stream) and were noise free.
Radial Basis Function PLS • An error-based updating radial basis function partial least squares (PLS) model was built using 350 data samples. • It was constructed from one latent variable with twenty-one nodes included in the inner radial basis function model. • In excess of 99% of the total variance of the output variable was captured by this representation.
Radial Basis Function PLS Time Series Plot for the Test Data with Predictions
Leverage • The prediction interval calculation yields a quantity that is similar in form to leverage. • It can be used to provide an additional metric for assessing the quality of the regression model. • This is achieved by computing the critical value of the chi-square distribution with the relevant degrees of freedom, for predefined confidence levels, e.g. 95% and 99%, and plotting the value of this quantity for each sample together with the critical value of the distribution divided by (n-p).
Leverage • When the ‘leverage’ is smaller than the critical value, the corresponding predicted value is considered to be reliable with the predefined confidence level and vice versa, when the ‘leverage’ is larger than the limit, the predicted value is considered to be unreliable.
Radial Basis Function PLS Leverage for the Test Data Prediction Intervals
Conclusions - PLS Prediction Intervals • A methodology proposed for prediction intervals in neural network modelling was extended to non-linear PLS algorithms. • This approach was known to give approximate, but generally reliable, results whilst being less computationally expensive than other more mathematically precise approaches such as the likelihood, lack-of-fit, jackknife and bootstrap. • The development of the algorithm led to the definition of a metric, the leverage, which can be used in conjunction with, or as an alternative to, prediction intervals.
Conclusions • Data rich, information poor • Data → Information → Knowledge
Acknowledgements • EBM acknowledges Dr Pino Baffi, Dr Baibing Li, Miss Nicola Fletcher and colleagues in CPACT for the many stimulating discussions. • EBM acknowledges colleagues at BASF Ag. for stimulating the research, in particular Gerhard Krennrich and Pekka Teppola. • EBM acknowledges Pfizer for providing the data.