INTRODUCTION TO CHEMOMETRICS Richard G Brereton Centre for Chemometrics University of Bristol United Kingdom r.g.brereton@bris.ac.uk Phone +44-117-9287658
History • Applications • How chemometrics relates to other disciplines • Hierarchy of users • Four building blocks • Methods • Experimental design • Pattern Recognition • Calibration • Software • Instrumentation • Applications • Finding out about Chemometrics
HISTORICAL • Name first proposed in the early 1970s by the Swedish organic chemist S. Wold. • International Chemometrics Society: 1970s. • International meeting: Cosenza, 1983. • Journals: 1986 (Chemometrics and Intelligent Laboratory Systems) and 1987 (J Chemometrics). • Books: mid 1980s. • Courses: mainly at continuing professional development level in the late 1980s.
DEFINITION • No universal definition • Catalysed by different groups of people • Modern day – huge levels of data from instruments • Data analysis in the laboratory, also chemical plant
Many groups and philosophies. • As an aid to NIR. • As a very broadly based subject relevant in biology, chemistry, geology, engineering etc. • As an aid to quantitative analytical chemistry including chromatography and many forms of spectroscopy. • As an aid throughout chemistry e.g. QSAR, molecular mechanics, spectroscopy. • Analytical chemists. • Statisticians. • Chemical engineers.
HISTORICAL CHEMOMETRICS • 1980s and early 1990s • NIR calibration • UV/vis calibration • Simple PCA e.g. in chromatography
APPLICATIONS • Biggest growth area in past 10 years • Classical chemometrics • A few classical applications, mainly related to the interests and funding of pioneers • Process control and analysis • Chromatographic optimisation • Food analysis
Big growth • Forensic science • Biology – metabolomics etc. • Clinical science – disease diagnosis • Materials characterisation • Environmental monitoring • Fermentation technology • Reaction monitoring • Synthesis optimisation • Analytical chemistry
HOW CHEMOMETRICS RELATES TO OTHER DISCIPLINES • The common theme: the computerised instrument. • Huge quantities of data available, millions of pieces of information every day. • How to cope? Use statistical and computational methods.
[Diagram: CHEMOMETRICS at the centre, linked to Mathematics, Statistics, Computing, Engineering, Organic Chemistry, Analytical Chemistry, Theoretical and Physical Chemistry, Biology, Pharmaceuticals, and Industrial Applications, among others.]
CHEMOMETRICS IS NOT A UNITARY SUBJECT LIKE ORGANIC CHEMISTRY In organic chemistry, a solid skill base that all organic chemists share is built up over the years. All organic chemists have roughly the same skill base. More experienced ones have a bigger knowledge base. Good organic chemists read the literature a lot and know many reactions well.
ORGANIC CHEMISTRY IS BASICALLY A KNOWLEDGE-BASED SUBJECT – acquire certain basic skills and then increase the knowledge. CHEMOMETRICS IS MORE A SKILL-BASED SUBJECT – it is not necessary to have a huge knowledge of named methods, only a very few basic principles, but one must have hands-on experience to expand one’s problem solving ability.
DIFFERENT GROUPS HAVE DIFFERENT BACKGROUNDS AND EXPECTATIONS AS TO HOW CHEMOMETRICS SHOULD BE INTRODUCED Statisticians want to start with distributions, hypothesis tests etc. and build up from there. They are dissatisfied if the maths is not explained. Chemical engineers like to start with linear algebra such as matrices, and expect a mathematical approach but are not always so interested in distributions etc.
Computer scientists are often most interested in algorithms. Analytical chemists often know a little statistics but are not necessarily very confident in maths and algorithms, so like to approach this via statistical analytical chemistry. A difficult group, because the ability to run instruments does not necessarily imply ability in maths and computing. Organic chemists do not like maths and want automated packages they can use. They often require elaborate courses that avoid matrices. The course an organic chemist would regard as good is one a statistician would regard as bad.
HIERARCHY OF USERS [Triangle diagram; axes: mathematical sophistication and applications] • Theoretical statisticians. • First applications to chemical systems. • Applying and modifying methods, developing software. • Environmental, clinical, food, industrial, biological, physical, organic chemistry etc.
Statisticians and computer scientists • Chemical and process engineers • Analytical, organic, environmental, biological, physical, industrial, archaeological, process etc. chemists
THE ULTIMATE DREAM • A lower level where no knowledge of chemometrics is required – good software. • E.g. technician in warehouse looking at quality of drug • Nurse in hospital looking at diagnosis • Operator in manufacturing plant looking at whether product is OK.
How you introduce chemometrics methods to your organisation depends on your size and needs. • A large organisation. • Employ one or more chemometrics / data analysis experts, probably Matlab proficient, probably with a Ph.D. from one of a small number of groups that graduate specialists. • Train a wider range of people in chemometrics but not to expert level, able to use commercial packages and to consult the expert when needed. • Contract out method development, research and any customised software.
Smaller organisations have different choices. • Beware: “a little bit of knowledge” can be dangerous. • Can you afford a full time employee? Is there enough work for a full time employee? • What level of expertise is required? Often the problem cannot be solved by someone doing an afternoon a week. Don’t get obsessed by software. • Training courses and software will solve the problems of some users and are important for getting people to recognise where chemometrics can be applied, but are not in themselves sufficient. • Need higher level of expertise occasionally. • Consultancy. • Externally funded projects, with experts. • Part-time expert, who has other jobs.
Getting started To get started you do need to tap into the “middle” of the triangle. SOME EXPERTISE NEEDED AT THIS LEVEL
FOUR BUILDING BLOCKS • Methods. • Software. • Instrumental techniques. • Applications.
METHODS • Experimental design • Pattern recognition • Calibration
MOTIVATIONS FOR DESIGN • Screening • Saving time • Quantitative modelling • Optimisation
WHY DESIGN EXPERIMENTS? Example : Optimisation of a reaction with pH and temperature. Can we find the combination of pH and temperature that produces the best yield?
IMAGINARY RESPONSE SURFACE • We want to find optimum • Response surface unknown • Mathematical model may not be of interest in its own right • Not necessarily interested in underlying molecular mechanism • Reproducibility and flat optimum
DIFFICULTY Interactions – the response for each factor is not independent. The optimum temperature at pH 5 differs from that at pH 6.
How to be on the safe side? • Grid search. 10 pHs, 10 temperatures, 100 experiments. • Big grid. Then smaller grid.
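As a rough sketch of the "big grid, then smaller grid" idea: the response surface below is entirely hypothetical (the function `yield_percent`, its optimum and the factor ranges are assumptions chosen for illustration, with the optimum temperature shifting with pH to mimic an interaction). In practice each call would be a real experiment.

```python
def yield_percent(ph, temp):
    """Hypothetical response surface with a pH-temperature interaction."""
    # The best temperature shifts with pH, so the factors are not independent:
    # 40 C at pH 5, 45 C at pH 6.
    best_temp = 40.0 + 5.0 * (ph - 5.0)
    return 90.0 - 8.0 * (ph - 6.2) ** 2 - 0.05 * (temp - best_temp) ** 2


def grid(lo, hi, n):
    """Return n evenly spaced levels from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def grid_search(ph_range, temp_range, n=10):
    """Run every combination on an n x n grid; return the best run and the run count."""
    runs = [(yield_percent(p, t), p, t)
            for p in grid(*ph_range, n)
            for t in grid(*temp_range, n)]
    return max(runs), len(runs)


# Big grid first: 10 pHs x 10 temperatures = 100 experiments.
(coarse_best, ph0, t0), n_coarse = grid_search((3.0, 9.0), (20.0, 80.0))

# Then a smaller grid centred on the best point of the big grid.
(fine_best, ph1, t1), n_fine = grid_search((ph0 - 0.5, ph0 + 0.5),
                                           (t0 - 5.0, t0 + 5.0))

print(n_coarse + n_fine, "experiments in total")
print(f"coarse best: pH {ph0:.2f}, {t0:.1f} C, yield {coarse_best:.1f}")
print(f"fine best:   pH {ph1:.2f}, {t1:.1f} C, yield {fine_best:.1f}")
```

Even with only two factors this costs 200 runs, most of them far from the optimum.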
PROBLEMS • Time consuming and expensive. • Many experiments are almost certainly nowhere near the optimum, so are obviously a waste of time. • Reproducibility and experimental error.
WHAT DO WE DO? We need rules! Formal experimental design
Screening • Factorial designs • Partial factorials and Plackett-Burman designs • Modelling and optimisation • Response surface designs • Mixture designs
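A minimal sketch of how a two-level factorial design and a half fraction of it can be written down in coded units (-1 = low, +1 = high). The three factor names are hypothetical, and the half fraction assumes the usual defining relation I = ABC; Plackett-Burman, response surface and mixture designs are not covered here.

```python
from itertools import product

# Hypothetical factors for a screening study.
factors = ["pH", "temperature", "catalyst"]

# Full two-level factorial: every combination of low (-1) and high (+1).
full = list(product([-1, 1], repeat=len(factors)))

# Half fraction 2^(3-1): keep runs whose three coded levels multiply to +1
# (defining relation I = ABC). Main effects are then confounded with
# two-factor interactions, which is the price of halving the work.
half = [run for run in full if run[0] * run[1] * run[2] == 1]

print(f"full factorial: {len(full)} runs")
for run in full:
    print(dict(zip(factors, run)))

print(f"half fraction: {len(half)} runs")
for run in half:
    print(dict(zip(factors, run)))
```

In both designs every factor is run equally often at its low and high level, which is what allows each effect to be estimated from all of the runs.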
PATTERN RECOGNITION • Grouping of objects e.g. how similar is the behaviour of compounds, how similar are products. Also PCA used a great deal to visualise changes e.g. in a reaction or product. • “PCA” • Can we classify products into acceptable or unacceptable? • “Discriminant analysis” and “Cluster analysis” • Exploratory Data Analysis • Unsupervised Pattern Recognition • Supervised Pattern Recognition
Exploratory data analysis • e.g. PCA – Principal Components and Factor Analysis • Looking at relationships • between samples • patients, food samples, organisms, chromatographic columns, wood, spectra, people • between variables • compound concentrations, spectral peaks, expenditure, chromatographic tests, elemental compositions
Families : MA (manual), EM (Employee), CA (Manager) together with number of children and monthly expenditure
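To make the idea of PCA scores and loadings concrete, here is a small sketch using only the standard library. The data matrix is made up (rows are samples, columns are variables) and the first principal component is found by power iteration on the covariance matrix, which is one simple route among several; a real analysis would normally use a numerical library.

```python
import math

# Made-up data for illustration: 6 samples (rows) x 3 variables (columns).
data = [
    [1.0, 1.1, 0.3],
    [2.0, 1.9, 0.1],
    [3.0, 3.2, 0.4],
    [4.0, 3.8, 0.2],
    [5.0, 5.1, 0.3],
    [6.0, 5.9, 0.2],
]


def centre(rows):
    """Subtract the column mean from every column."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centred):
    """Covariance matrix between variables of mean-centred data."""
    n = len(centred)
    cols = list(zip(*centred))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]


def first_component(cov, iterations=200):
    """Loadings of the first principal component, by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


centred = centre(data)
loadings = first_component(covariance(centred))  # relationships between variables
scores = [sum(x * l for x, l in zip(row, loadings)) for row in centred]  # between samples

print("loadings:", [round(l, 3) for l in loadings])
print("scores:  ", [round(s, 3) for s in scores])
```

Plotting the scores shows how the samples relate to one another; the loadings show which variables drive that pattern.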
Unsupervised pattern recognition • Dendrograms • Example : Toxicology • Urine samples • Discriminating between acute and chronic intoxication helps to elucidate the source of contamination and may suggest the best treatment. • Use capillary electrophoresis and then compare the resulting traces.
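A dendrogram is a picture of a sequence of merges. The sketch below produces that sequence by single-linkage agglomerative clustering on made-up two-variable profiles; the sample names and values are assumptions for illustration and have nothing to do with the toxicology data.

```python
import math
from itertools import combinations

# Made-up two-variable profiles for six samples (illustrative only).
samples = {
    "A": (1.0, 1.0), "B": (1.2, 0.9), "C": (0.9, 1.3),
    "D": (5.0, 5.1), "E": (5.2, 4.8), "F": (5.1, 5.5),
}


def single_linkage(c1, c2):
    """Distance between two clusters: that of their closest pair of members."""
    return min(math.dist(samples[a], samples[b]) for a in c1 for b in c2)


clusters = [(name,) for name in samples]
merges = []  # (distance, cluster, cluster): the information a dendrogram draws

while len(clusters) > 1:
    # Find the two closest clusters and merge them.
    d, c1, c2 = min((single_linkage(a, b), a, b)
                    for a, b in combinations(clusters, 2))
    clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    merges.append((d, c1, c2))

for d, c1, c2 in merges:
    print(f"{d:5.2f}  {'+'.join(c1)}  joins  {'+'.join(c2)}")
```

The large jump in merge distance at the final step is what suggests two natural groups, without any class labels having been supplied.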
Supervised pattern recognition : classification Examples Tablets: can we classify them by origin, and can we detect adulteration, from NIR spectra? Class modelling of mussels: can we find which come from a polluted site using GC? Detailed mathematical model
Multivariate data : several measurements per sample Example – Fisher Iris data – four measurements per iris: petal width, petal length, sepal width, sepal length. 150 irises, divided into 50 of each species: I. setosa, I. versicolor, I. virginica
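As a sketch of supervised classification on four measurements per sample, the code below assigns an unknown flower to the species with the nearest class centroid. This is about the simplest classifier possible and is a stand-in, not discriminant analysis proper. The handful of training rows are in the style of the Iris data (sepal length, sepal width, petal length, petal width, in cm) and should be treated as illustrative values only.

```python
import math

# Illustrative training rows: (sepal length, sepal width, petal length, petal width).
training = {
    "I. setosa":     [(5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2), (4.7, 3.2, 1.3, 0.2)],
    "I. versicolor": [(7.0, 3.2, 4.7, 1.4), (6.4, 3.2, 4.5, 1.5), (6.9, 3.1, 4.9, 1.5)],
    "I. virginica":  [(6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9), (7.1, 3.0, 5.9, 2.1)],
}


def centroid(rows):
    """Mean of each measurement over the samples of one class."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


centroids = {species: centroid(rows) for species, rows in training.items()}


def classify(flower):
    """Return the species whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda species: math.dist(flower, centroids[species]))


# Two hypothetical unknowns.
print(classify((5.0, 3.4, 1.5, 0.2)))
print(classify((6.5, 3.0, 5.5, 2.0)))
```

Unlike the unsupervised case, the class labels are used to build the model, so its quality has to be checked on samples that were not used for training.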
Calibration Quantitative estimation. Especially mixtures. Estimation of bulk parameters. Multivariate process control. “PLS”
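A minimal sketch of quantitative estimation in a mixture. Note that this uses classical least squares, not PLS: it assumes the pure-component spectra are known and that absorbances add linearly. The two "spectra" (five wavelengths each) and the concentrations are invented for illustration, and the mixture is simulated without noise.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical pure-component spectra at five wavelengths (unit concentration).
pure_a = [0.10, 0.50, 0.90, 0.40, 0.05]
pure_b = [0.60, 0.30, 0.20, 0.70, 0.80]


def estimate(mixture):
    """Least-squares concentrations (c_a, c_b) with mixture ~ c_a*pure_a + c_b*pure_b."""
    aa, ab, bb = dot(pure_a, pure_a), dot(pure_a, pure_b), dot(pure_b, pure_b)
    ax, bx = dot(pure_a, mixture), dot(pure_b, mixture)
    det = aa * bb - ab * ab  # zero only if the two spectra are proportional
    # Solve the 2 x 2 normal equations explicitly.
    return (bb * ax - ab * bx) / det, (aa * bx - ab * ax) / det


# Simulate a noise-free mixture spectrum and recover the concentrations.
true_a, true_b = 2.0, 0.5
mixture = [true_a * a + true_b * b for a, b in zip(pure_a, pure_b)]

est_a, est_b = estimate(mixture)
print(f"estimated: a = {est_a:.3f}, b = {est_b:.3f}")
```

Using all five wavelengths at once, rather than one peak per compound, is what makes the calibration multivariate; methods such as PLS extend the idea to cases where the pure spectra are not known.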