5. Multiway calibration

5. Multiway calibration. Quimiometria Teórica e Aplicada, Instituto de Química, UNICAMP. Multiway regression problems, e.g. batch reaction monitoring: process measurements (X; modes batch, time and process variable) and product quality (Y; modes batch and product quality).

Presentation Transcript


  1. 5. Multiway calibration. Quimiometria Teórica e Aplicada, Instituto de Química, UNICAMP

  2. Multiway regression problems, e.g. batch reaction monitoring
     • X: process measurements (modes: batch, time, process variable)
     • Y: product quality (modes: batch, product quality)

  3. Multiway regression problems, e.g. tandem mass spectrometry (MS-MS)
     • X: MS-MS spectra, one slab per sample (X1 ... X5) (modes: sample, parent ion m/z, daughter ion m/z)
     • Y: compound concentrations (modes: samples, compound)

  4. Some terminology
     • zero-order: univariate calibration (OLS, ordinary least squares). Cannot handle interferents.
     • first-order: multivariate calibration (ridge regression, PCR, PLS etc.). Can handle interferents if they are present in the training set.
     • N-PLS(?)
     • second-order: second-order advantage (PARAFAC, restricted Tucker, GRAM, RBL etc.). Can handle unknown interferents (although see work of K. Faber).

  5. Multiway calibration methods
     • PARAFAC (already discussed on first day)
     • (Unfold-PLS)
     • Multiway PCR
     • N-PLS
     • MCovR (multiway covariates regression) (see work of Smilde & Gurden)
     • GRAM, NBRA, RBL (see work of Kowalski et al.)

  6. Unfold-PLS
     • Matricize (or ‘unfold’) the data and use standard two-way PLS: the three-way array X (I × J × K) is rearranged into a matrix X (I × JK) and regressed against Y (I × M).
     • But if a multiway structure exists in the data, multiway methods have some important advantages!
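
The unfolding step can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `unfold` and `pls1_fit` are hypothetical, the data are random, a single response (PLS1) is used instead of a full Y (I × M), and the data are assumed to be already centered. The PLS part is a textbook-style PLS1 with deflation of X, not necessarily the variant used in the course.

```python
import numpy as np

def unfold(X3):
    """Matricize an (I, J, K) array to (I, J*K), keeping samples as rows.

    Column k*J + j of the result holds element (j, k) of each sample.
    """
    I, J, K = X3.shape
    return X3.transpose(0, 2, 1).reshape(I, K * J)

def pls1_fit(X, y, n_components):
    """Two-way PLS1 (X deflation); returns b such that y is approximated by X @ b."""
    X = X.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)      # weight vector
        t = X @ w                   # scores
        tt = t @ t
        p = X.T @ t / tt            # X loadings
        q.append(y @ t / tt)        # inner regression coefficient
        X -= np.outer(t, p)         # deflate X
        W.append(w)
        P.append(p)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

# Hypothetical data: 20 samples, J = 3, K = 2
rng = np.random.default_rng(0)
X3 = rng.normal(size=(20, 3, 2))
Xu = unfold(X3)                     # shape (20, 6)
beta = rng.normal(size=6)
y = Xu @ beta                       # noise-free response for the demo
b = pls1_fit(Xu, y, n_components=6)
y_hat = Xu @ b
```

Note that after unfolding, the model no longer knows which columns belong to the same j or k; that loss of structure is what the multiway methods avoid.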

  7. Two-way PCR
     Standard PCR for X (I × J) and y (I × 1).
     • Calculate PCA model of X: $\mathbf{X} = \mathbf{T}\mathbf{P}^\mathsf{T} + \mathbf{E}$
     • Use PCA scores for ordinary regression: $\mathbf{y} = \mathbf{T}\mathbf{b} + \mathbf{e}$, with $\mathbf{b} = (\mathbf{T}^\mathsf{T}\mathbf{T})^{-1}\mathbf{T}^\mathsf{T}\mathbf{y}$
     • Make predictions for new samples: $\mathbf{T}_\mathrm{new} = \mathbf{X}_\mathrm{new}\mathbf{P}$ and $\mathbf{y}_\mathrm{new} = \mathbf{T}_\mathrm{new}\mathbf{b}$
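
The three PCR steps map directly onto a few NumPy calls. The following is a minimal sketch, not the course's own code: the function names are hypothetical, the data are synthetic, and mean-centering of X and y is an added assumption (the slide does not mention preprocessing).

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Fit PCR: PCA of centered X, then least squares of y on the scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # PCA via SVD: rows of Vt are the loading vectors
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                   # loadings, J x R
    T = Xc @ P                                # scores, I x R
    b = np.linalg.solve(T.T @ T, T.T @ yc)    # b = (T'T)^-1 T'y
    return x_mean, y_mean, P, b

def pcr_predict(model, X_new):
    """T_new = X_new P, y_new = T_new b (plus the means removed during fitting)."""
    x_mean, y_mean, P, b = model
    return (X_new - x_mean) @ P @ b + y_mean

# Synthetic rank-2 data, so two components describe X exactly
rng = np.random.default_rng(0)
T_true = rng.normal(size=(30, 2))
P_true = rng.normal(size=(8, 2))
X = T_true @ P_true.T
y = T_true @ np.array([1.5, -0.7])

model = pcr_fit(X, y, n_components=2)
y_hat = pcr_predict(model, X)
```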

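  8. Multiway PCR
     Multiway PCR for X (I × J × K) and y (I × 1).
     • Calculate multiway model (with X unfolded to I × JK): $\mathbf{X} = \mathbf{A}(\mathbf{C} \odot \mathbf{B})^\mathsf{T} + \mathbf{E}$, where $\odot$ (written || in the slide text) is the Khatri-Rao, i.e. column-wise Kronecker, product
     • Use scores for regression: $\mathbf{y} = \mathbf{A}\mathbf{b}_\mathrm{PCR} + \mathbf{e}$, with $\mathbf{b}_\mathrm{PCR} = (\mathbf{A}^\mathsf{T}\mathbf{A})^{-1}\mathbf{A}^\mathsf{T}\mathbf{y}$
     • Make predictions for new samples: $\mathbf{A}_\mathrm{new} = \mathbf{X}_\mathrm{new}\mathbf{P}(\mathbf{P}^\mathsf{T}\mathbf{P})^{-1}$, where $\mathbf{P} = (\mathbf{C} \odot \mathbf{B})$, and $\mathbf{y}_\mathrm{new} = \mathbf{A}_\mathrm{new}\mathbf{b}_\mathrm{PCR}$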
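
The regression and prediction steps of multiway PCR can be illustrated as below. This sketch makes a strong simplifying assumption: the PARAFAC loadings B and C are taken as already known (here, the ones used to simulate the data), so the PARAFAC fitting itself is not shown. The helper names `khatri_rao` and `scores` are hypothetical and the data are noise-free and synthetic.

```python
import numpy as np

def khatri_rao(C, B):
    """Column-wise Kronecker product: column r equals kron(C[:, r], B[:, r])."""
    K, R = C.shape
    J = B.shape[0]
    return (C[:, None, :] * B[None, :, :]).reshape(K * J, R)

def scores(X, P):
    """A = X P (P'P)^-1 for an unfolded X (I x JK) and P = khatri_rao(C, B)."""
    return np.linalg.solve(P.T @ P, P.T @ X.T).T

rng = np.random.default_rng(0)
I, J, K, R = 12, 4, 3, 2
A_true = rng.normal(size=(I, R))
B = rng.normal(size=(J, R))
C = rng.normal(size=(K, R))

# Build the three-way array and unfold it so that column k*J + j holds x_ijk
X3 = np.einsum('ir,jr,kr->ijk', A_true, B, C)
X = X3.transpose(0, 2, 1).reshape(I, K * J)
P = khatri_rao(C, B)

b_true = np.array([2.0, -1.0])
y = A_true @ b_true

# Regression on the sample-mode scores
A = scores(X, P)
b_pcr = np.linalg.solve(A.T @ A, A.T @ y)

# Prediction for new samples that follow the same trilinear model
A_new_true = rng.normal(size=(5, R))
X_new = A_new_true @ P.T
y_new = scores(X_new, P) @ b_pcr
```

The unfolding order matters: the column index k*J + j used here is the one that matches `khatri_rao(C, B)`; a different reshape order would need the factors swapped.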

  9. N-PLS
     • N-PLS is a direct extension of standard two-way PLS for N-way arrays.
     • The advantages of N-PLS are the same as for any multiway analysis:
       • a more parsimonious model
       • loadings which are easier to plot and interpret

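  10. N-PLS
     • The standard two-way PLS algorithm (see ‘Multivariate Calibration’ by Martens and Næs) and the N-PLS algorithm (R. Bro) are shown side by side, each in four numbered steps.
     • The N-PLS algorithm uses PARAFAC-type loadings, but is otherwise very similar.
     [The equations for steps 1 to 4 of both algorithms are not preserved in this transcript.]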
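
Since the slide's equations are not available, the following is only a rough sketch of how a trilinear N-PLS for a single response (often called tri-PLS1) is commonly described, not a reconstruction of the slide. The assumptions are: one y variable, pre-centered data, weights taken from the SVD of the J × K matrix of covariances between y and X, deflation of X by t wᵀ, and regression of the original y on all scores found so far. Published variants differ in these details, and all function names are hypothetical.

```python
import numpy as np

def unfold(X3):
    """(I, J, K) -> (I, K*J); column k*J + j holds x_ijk."""
    I, J, K = X3.shape
    return X3.transpose(0, 2, 1).reshape(I, K * J)

def npls1_fit(X3, y, n_components):
    _, J, K = X3.shape
    X = unfold(X3).copy()
    y_res = y.copy()
    W, T = [], []
    for _ in range(n_components):
        # Step 1: covariance of each x_jk with the current y residual, as J x K
        Z = (X.T @ y_res).reshape(K, J).T
        # Step 2: PARAFAC-type (rank-one) weights from the first singular vectors
        U, _, Vt = np.linalg.svd(Z)
        w = np.kron(Vt[0], U[:, 0])           # w = wK (x) wJ, matches unfold()
        # Step 3: scores
        t = X @ w
        W.append(w)
        T.append(t)
        # Step 4: regress original y on all scores so far, then deflate
        Tmat = np.column_stack(T)
        b = np.linalg.lstsq(Tmat, y, rcond=None)[0]
        X = X - np.outer(t, w)
        y_res = y - Tmat @ b
    return np.column_stack(W), b

def npls1_predict(W, b, X3_new):
    X = unfold(X3_new).copy()
    T = []
    for w in W.T:                             # same sequential deflation as in fitting
        t = X @ w
        T.append(t)
        X = X - np.outer(t, w)
    return np.column_stack(T) @ b

# Synthetic two-component trilinear data with a little noise
rng = np.random.default_rng(0)
A = rng.normal(size=(25, 2))
B = rng.normal(size=(4, 2))
C = rng.normal(size=(3, 2))
X3 = np.einsum('ir,jr,kr->ijk', A, B, C) + 0.05 * rng.normal(size=(25, 4, 3))
y = A @ np.array([1.0, -2.0]) + 0.05 * rng.normal(size=25)

rss = []
for f in (1, 2, 3):
    W, b = npls1_fit(X3, y, f)
    rss.append(np.sum((y - npls1_predict(W, b, X3)) ** 2))
```

Each weight vector is a Kronecker product of a J-vector and a K-vector, so a component costs J + K parameters rather than JK; this is the parsimony that the N-PLS slides refer to.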

  11. N-PLS graphic (taken from R. Bro)

  12. Other methods
     • Restricted Tucker, GRAM, RBL, NBRA etc.
       • for more specialized use
       • second-order advantage, i.e. able to handle unknown interferents
       [Figure: restricted loadings, A, with labels ‘standard, N’ and ‘mixture, N + M’]
     • Multiway covariates regression (MCovR)
       • different to PLS-type models
       • choice of structure on X (PARAFAC, Tucker, unfold etc.)
       • sometimes loadings are easier to interpret

  13. Conclusions
     • There are a number of different calibration methods for multiway data.
     • N-PLS is an extension of two-way PLS for multiway data.
     • All the normal guidelines for multivariate regression still apply!
       • watch out for outliers
       • don’t apply the model outside of the calibration range

  14. Outliers (1)
     • Outliers are objects which are very different from the rest of the data. These can have a large effect on the regression model and should be removed.
     [Figure: regression example with one point labelled ‘bad experiment’; annotation ‘Remove outlier’]

  15. Outliers (2)
     • Outliers can also be found in the model space or in the residuals.
     [Figure: scores plot, Scores PC 2 against Scores PC 1]
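
One common way to look for both kinds of outlier is to compute, for each sample, a distance within the PCA model (a Hotelling-type T² on the scores) and a squared residual outside the model (often called Q). The sketch below assumes a two-way PCA model, mean-centering and synthetic data with two planted outliers; the function name and the choice of these two statistics are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def pca_diagnostics(X, n_components):
    """Return per-sample T2 (distance in model space) and Q (squared residual)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T
    T = Xc @ P                                   # scores
    E = Xc - T @ P.T                             # residuals
    t2 = np.sum(T ** 2 / T.var(axis=0, ddof=1), axis=1)
    q = np.sum(E ** 2, axis=1)
    return t2, q

# Synthetic one-component data, 20 samples x 5 variables
rng = np.random.default_rng(1)
t = rng.normal(size=20)
p = np.ones(5)
X = np.outer(t, p) + 0.01 * rng.normal(size=(20, 5))

X[5] += 1.5 * np.array([1.0, -1.0, 0.0, 0.0, 0.0])   # does not fit the model: residual outlier
X[12] = 8.0 * p                                       # fits the model but is extreme: model-space outlier

t2, q = pca_diagnostics(X, n_components=1)
extreme_in_model = int(np.argmax(t2))
extreme_in_residual = int(np.argmax(q))
```

In this constructed example the two statistics flag different samples, which is why both the scores plot and the residuals are worth inspecting.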

  16. Model extrapolation...
     • Univariate example: mean height vs age of a group of young children
     • A strong linear relationship between height and age is seen.
     • For young children, height and age are correlated.
     Moore, D.S. and McCabe, G.P., Introduction to the Practice of Statistics (1989).

  17. ... can be dangerous!
     • Linear model was valid for this age range...
     • ...but is not valid for 30-year-olds!
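
The extrapolation problem is easy to reproduce numerically. The numbers below are made up for illustration (they are not the data from Moore and McCabe): mean heights in cm for ages 18 to 29 months that rise roughly linearly. A straight line fits the calibration range well, but applying it at 30 years (360 months) gives an impossible height.

```python
# Hypothetical calibration data: age in months, mean height in cm
ages = list(range(18, 30))
heights = [76.0, 77.0, 77.5, 78.5, 79.0, 79.5,
           80.5, 81.0, 81.5, 82.5, 83.0, 83.5]

# Ordinary least squares for a straight line: height = intercept + slope * age
n = len(ages)
age_mean = sum(ages) / n
height_mean = sum(heights) / n
sxx = sum((a - age_mean) ** 2 for a in ages)
sxy = sum((a - age_mean) * (h - height_mean) for a, h in zip(ages, heights))
slope = sxy / sxx
intercept = height_mean - slope * age_mean

def predict(age_months):
    return intercept + slope * age_months

inside = predict(24)        # within the calibration range
outside = predict(30 * 12)  # 30 years old: far outside the calibration range

print(f"predicted height at 24 months: {inside:.1f} cm")
print(f"predicted height at 30 years:  {outside:.1f} cm")
```

The model is not wrong inside its calibration range; it simply says nothing reliable about ages it has never seen, and the same holds for multivariate and multiway calibration models.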
