
The potential of Functional Data Analysis for Chemometrics

Dirk De Becker, Wouter Saeys, Bart De Ketelaere and Paul Darius

Presentation Transcript


  1. The potential of Functional Data Analysis for Chemometrics • Dirk De Becker, Wouter Saeys, Bart De Ketelaere and Paul Darius

  2. The Potential of FDA for Chemometrics • Introduction to FDA • Introduction to Chemometrics • Using FDA in chemometrics • For prediction • For Analysis Of Variance • Conclusions

  3. What is Functional Data Analysis? • Developed by Ramsay & Silverman (1997) • Analyse Data • By approximating it • Using some kind of functional basis • Mainly for longitudinal data • High correlation between neighbouring datapoints

  4. Why use FDA? • Data as single entity <-> individual observations • Make a function of your data • Derivatives • Reduce the amount of data • Noise -> smoothing • Impose some known properties on the data • Monotonicity, non-negativity, smoothness, ...

  5. Basis Functions? • Polynomials: 1, t, t², t³, ... • Fourier: 1, sin(ωt), cos(ωt), sin(2ωt), cos(2ωt) • Splines • Wavelets • Depends on your data
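
As a rough illustration of the bases listed on slide 5, the sketch below evaluates a polynomial and a Fourier basis at a point t and forms a function value as a weighted sum of basis functions. The function names and the coefficient values are hypothetical, chosen only for this example.

```python
import math


def polynomial_basis(t, n_terms):
    """Return [1, t, t**2, ...] with n_terms entries."""
    return [t ** k for k in range(n_terms)]


def fourier_basis(t, n_harmonics, omega):
    """Return [1, sin(wt), cos(wt), sin(2wt), cos(2wt), ...]."""
    row = [1.0]
    for k in range(1, n_harmonics + 1):
        row.append(math.sin(k * omega * t))
        row.append(math.cos(k * omega * t))
    return row


def evaluate(coefficients, basis_row):
    """Value of x(t) = sum_k c_k * phi_k(t) for one basis row."""
    return sum(c * phi for c, phi in zip(coefficients, basis_row))


# Hypothetical example values.
poly_row = polynomial_basis(2.0, 4)
fourier_row = fourier_basis(0.0, 2, 2 * math.pi)
value = evaluate([1.0, 0.5, 0.0, -0.25], poly_row)
```

Once the basis is fixed, a whole curve is described by its coefficient vector alone, which is what makes the reduction in the amount of data possible.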

  6. Chemometrics • Measure optical properties of material • Transmission or reflection of light • At a large number of wavelengths • Use these properties to predict something else

  7. Why Chemometrics? • Fast • Cheap • Non-destructive • Environment-friendly

  8. Classical methods • Ignore correlation between neighbouring wavelengths:

  9. FDA in chemometrics • NIR spectra • Absorption peaks • Width and height • Basis: B-splines • ~ shape of absorption peaks • Preserve the vicinity constraint

  10. Spline Functions • Piecewise joining polynomials of order m • Fast evaluation • Continuity of derivatives • Up to order m-2 • In L interior knots • Degrees of freedom: L + m • Flexible
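
The degrees-of-freedom count on slide 10 can be checked numerically. The sketch below evaluates a B-spline basis with the Cox-de Boor recursion, assuming a clamped knot vector (each boundary knot repeated m times, which is an assumption and not stated on the slide). The helper names and the knot positions are hypothetical.

```python
def clamped_knots(a, b, interior, order):
    """Knot vector on [a, b] with each boundary knot repeated `order` times."""
    return [a] * order + list(interior) + [b] * order


def bspline_basis(t, knots, order):
    """Evaluate all B-spline basis functions of the given order at t."""
    # Order 1 (degree 0): indicator functions of the knot intervals.
    values = []
    for i in range(len(knots) - 1):
        inside = knots[i] <= t < knots[i + 1]
        # Close the last non-empty interval on the right.
        at_right_end = t == knots[-1] and knots[i] < knots[i + 1] == knots[-1]
        values.append(1.0 if inside or at_right_end else 0.0)
    # Raise the order one step at a time (Cox-de Boor recursion).
    for k in range(2, order + 1):
        new_values = []
        for i in range(len(knots) - k):
            left_den = knots[i + k - 1] - knots[i]
            right_den = knots[i + k] - knots[i + 1]
            left = (t - knots[i]) / left_den * values[i] if left_den > 0 else 0.0
            right = (knots[i + k] - t) / right_den * values[i + 1] if right_den > 0 else 0.0
            new_values.append(left + right)
        values = new_values
    return values


# Cubic splines (order m = 4) with L = 3 interior knots.
order = 4
interior = [0.25, 0.5, 0.75]
knots = clamped_knots(0.0, 1.0, interior, order)
values = bspline_basis(0.3, knots, order)
n_basis = len(values)  # equals L + m
```

With this construction the number of basis functions is len(knots) - m = L + m, and the basis functions are non-negative and sum to one at every point of the interval.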

  11. Constructing a spline basis • Order • What to use the model for • Mostly cubic splines (order 4) • Number and position of knots • Use enough • Look at the data • Beware of overfitting

  12. Position of knots • More variation -> more knots

  13. B-spline approximation

  14. FDA for prediction • Functional regression models • P-Spline Regression (Marx and Eilers) • Non-Parametric Functional Data Analysis (Ferraty and Vieu)

  15. Functional Regression Models • Project spectra to spline basis • Apply Multivariate Linear Regression to the spline coefficients • Great reduction in system complexity • Natural shape of absorption peaks is used
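
The two steps on slide 15 can be sketched as follows. This is a minimal illustration on synthetic, noise-free data, not the authors' implementation: the spectra, the weights, the knot positions and all helper names are made up, and ordinary least squares through the normal equations stands in for whatever solver was actually used.

```python
import random


def bspline_basis(t, knots, order):
    """All B-spline basis functions of the given order at t (Cox-de Boor)."""
    values = []
    for i in range(len(knots) - 1):
        inside = knots[i] <= t < knots[i + 1]
        at_right_end = t == knots[-1] and knots[i] < knots[i + 1] == knots[-1]
        values.append(1.0 if inside or at_right_end else 0.0)
    for k in range(2, order + 1):
        new_values = []
        for i in range(len(knots) - k):
            left_den = knots[i + k - 1] - knots[i]
            right_den = knots[i + k] - knots[i + 1]
            left = (t - knots[i]) / left_den * values[i] if left_den > 0 else 0.0
            right = (knots[i + k] - t) / right_den * values[i + 1] if right_den > 0 else 0.0
            new_values.append(left + right)
        values = new_values
    return values


def solve_least_squares(A, b):
    """Minimise ||A x - b||^2 via the normal equations and Gauss-Jordan elimination."""
    n = len(A[0])
    # Augmented system [A^T A | A^T b].
    M = [[sum(row[i] * row[j] for row in A) for j in range(n)]
         + [sum(row[i] * bi for row, bi in zip(A, b))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Hypothetical set-up: 50 "wavelengths" on [0, 1], cubic basis with 7 functions.
rng = random.Random(0)
order = 4
knots = [0.0] * order + [0.25, 0.5, 0.75] + [1.0] * order
grid = [j / 49 for j in range(50)]
B = [bspline_basis(t, knots, order) for t in grid]  # 50 x 7 basis matrix

# Synthetic spectra and responses (20 samples, made-up weights and intercept).
true_coefs = [[rng.uniform(0, 1) for _ in range(7)] for _ in range(20)]
spectra = [[sum(c * phi for c, phi in zip(coefs, row)) for row in B]
           for coefs in true_coefs]
true_weights = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 1.0]
y = [3.0 + sum(w * c for w, c in zip(true_weights, coefs)) for coefs in true_coefs]

# Step 1: project every spectrum onto the spline basis (50 values -> 7 coefficients).
coefs_hat = [solve_least_squares(B, s) for s in spectra]

# Step 2: linear regression of the response on the spline coefficients.
X = [[1.0] + c for c in coefs_hat]
beta = solve_least_squares(X, y)
predictions = [sum(b * x for b, x in zip(beta, row)) for row in X]
```

The regression in step 2 works with 7 coefficients per sample instead of 50 spectral values, which is the reduction in system complexity the slide refers to.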

  16. Functional Regression Models: case study • 420 samples of hog manure • Reflectance spectra • Total nitrogen (TN) and dry matter (DM) content • PLS and Functional Regression applied

  17. Functional Regression: case study (cont'd)

  18. Functional Regression: case study: results

  19. P-Spline Regression (PSR) • By Marx and Eilers • Construct with B-splines: • Use roughness parameter on • Minimize • Full spectra are used for regression
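
As a sketch of what slide 19 describes, and assuming the usual Marx and Eilers formulation of P-spline signal regression, the regression coefficient vector is built from B-splines and a difference penalty controls its roughness. All symbols below are assumptions and are not taken from the slide: X is the matrix of spectra, B the B-spline basis matrix, α the vector of B-spline coefficients, D_d the d-th order difference matrix and λ the roughness parameter.

```latex
\[
\begin{aligned}
\hat{y} &= \alpha_0 + X \beta, \qquad \beta = B \alpha \\
(\hat{\alpha}_0, \hat{\alpha}) &= \arg\min_{\alpha_0,\,\alpha}\;
  \lVert y - \alpha_0 - X B \alpha \rVert^2 + \lambda \lVert D_d\, \alpha \rVert^2
\end{aligned}
\]
```

Under this reading, λ = 0 gives an unpenalised regression on the B-spline coefficients, while larger values of λ force neighbouring coefficients to be similar and therefore give a smoother β.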

  20. P-Spline Regression: case study • 121 samples of seed pills • y is % humidity • PLS: RMSEP = 1.19 • PSR: RMSEP = 1.115 • Number of B-spline coefficients = 7 • λ = 0.001
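
The two models on slide 20 are compared by their root mean squared error of prediction (RMSEP). A minimal sketch of that metric follows; the function name and the toy validation values are hypothetical, and only the two RMSEP figures come from the slide.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction on a validation set."""
    squared_errors = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy validation set (hypothetical values).
toy_error = rmsep([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])

# Relative difference between the two RMSEP values reported on the slide.
pls_rmsep, psr_rmsep = 1.19, 1.115
relative_gain = (pls_rmsep - psr_rmsep) / pls_rmsep
```

With the reported values, PSR lowers the RMSEP by roughly 6% relative to PLS.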

  21. Non-Parametric Functional Data Analysis • By F. Ferraty and P. Vieu • No regression model is involved • Prediction by applying local kernel functions in function space • No good results so far

  22. FDA in ANOVA setting: FANOVA • ANOVA: • “Study the relation between a response variable and one or more explanatory variables” • $y_{ig} = \mu + \alpha_g + \varepsilon_{ig}$ • $\mu$ is overall mean • $\alpha_g$ are the effects of belonging to a group g • $\varepsilon_{ig}$ are residuals

  23. FANOVA: theory • Constraint: • Introduce so that • Introduce functional aspect: • Constraint: introduce
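
For orientation, the step from ANOVA to FANOVA can be written out as below. This assumes the standard one-way notation and is not taken from the slides: y is the response of replicate i in group g, μ the overall mean, α_g the group effects and ε the residuals. In the functional version every term becomes a function of t (for spectra, the wavelength).

```latex
\[
\begin{aligned}
y_{ig} &= \mu + \alpha_g + \varepsilon_{ig},
  & \sum_{g} \alpha_g &= 0 \\
y_{ig}(t) &= \mu(t) + \alpha_g(t) + \varepsilon_{ig}(t),
  & \sum_{g} \alpha_g(t) &= 0 \ \text{for all } t
\end{aligned}
\]
```

The sum-to-zero constraint is what makes the overall mean and the group effects uniquely identifiable.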

  24. FANOVA: goal and solution • Goal: estimate from • Solution:
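
One simple way to obtain such estimates, assuming a one-way layout with a sum-to-zero constraint on the group effects, is to work pointwise: at every t the overall mean is the mean of the group mean curves, and each effect is the group mean curve minus that overall mean. The sketch below is a hypothetical illustration on made-up curves and is not the estimator from the slide, whose formula is not given in the text.

```python
def fanova_one_way(curves_by_group):
    """Pointwise estimates of mu(t) and alpha_g(t) for a one-way layout.

    curves_by_group maps a group label to a list of curves, each curve
    being a list of values on a common grid.
    """
    n_points = len(next(iter(curves_by_group.values()))[0])
    # Mean curve of every group.
    group_means = {
        g: [sum(curve[j] for curve in curves) / len(curves) for j in range(n_points)]
        for g, curves in curves_by_group.items()
    }
    # Overall mean: average of the group means, so the effects sum to zero at each t.
    mu = [sum(m[j] for m in group_means.values()) / len(group_means)
          for j in range(n_points)]
    alpha = {g: [m[j] - mu[j] for j in range(n_points)]
             for g, m in group_means.items()}
    return mu, alpha


# Made-up curves on a 3-point grid; the labels are only illustrative.
curves = {
    "hog": [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]],
    "dairy": [[5.0, 6.0, 7.0], [7.0, 8.0, 9.0]],
}
mu, alpha = fanova_one_way(curves)
```

The resulting effect curves show at which positions along t the groups differ and by how much.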

  25. FANOVA: significance testing • Locally: • Globally:

  26. FANOVA: case study • Spectra of manure • 4 types of animals: dairy, beef, calf, hog • 3 ambient temperatures: 4°C, 12°C, 20°C • 3 sample temperatures: 4°C, 12°C, 20°C • 9 replicates • => 324 samples • Model:

  27. FANOVA: case study (cont'd)

  28. FANOVA: case study (cont'd)

  29. Conclusions • Splines are a good basis for fitting spectral data • Using FDA, it is possible to include the vicinity constraint in prediction models in chemometrics • FANOVA is a good tool to explore the variance in spectral data
