Structure and Uncertainty

Explore the relationship between structure and uncertainty in statistics and science through graphical models, mathematical modeling, algorithms, and inference. Understand complex systems through global models built from small pieces. Discover the power of Bayesian structured modeling in dealing with uncertainty. Learn about algorithms for probability and likelihood calculations, as well as the use of Markov chain Monte Carlo, probability propagation, and message passing. Discover the success stories of structured systems in genomics, spatial statistics, and temporal problems.


Presentation Transcript


  1. Structure and Uncertainty Peter Green, University of Bristol, 10 July 2003

  2. Statistics and science “If your experiment needs statistics, you ought to have done a better experiment” Ernest Rutherford (1871-1937)

  3. Graphical models Mathematics Modelling Algorithms Inference

  4. Graphical models: Markov chains, Spatial statistics, Genetics, Regression, AI, Statistical physics, Sufficiency, Covariance selection, Contingency tables

  5. 1. Modelling Mathematics Modelling Algorithms Inference

  6. Structured systems A framework for building models, especially probabilistic models, for empirical data Key idea - • understand complex system • through global model • built from small pieces • comprehensible • each with only a few variables • modular

  7. Mendelian inheritance - a natural structured model [Figure: pedigree labelled with genotypes (AB, AO, AO, OO, OO) and blood groups (A, AB, A, O, O); portrait of Mendel]
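The pedigree on this slide can be read as a small probabilistic model: each child's genotype depends only on the genotypes of its two parents. The following is a minimal sketch of that local piece for the ABO blood group, assuming each parent passes on either allele with probability 1/2; the function names `child_genotypes` and `phenotype` are illustrative and do not come from the talk.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def child_genotypes(mother, father):
    """Distribution of a child's ABO genotype given the parents' genotypes.

    Assumes each parent transmits one of its two alleles with probability 1/2.
    """
    counts = Counter(
        # Write genotypes in a canonical order (A before B before O).
        "".join(sorted(m + f, key="ABO".index))
        for m, f in product(mother, father)
    )
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}


def phenotype(genotype):
    """Observed blood group: A and B are codominant, O is recessive."""
    expressed = set(genotype) - {"O"}
    return "".join(sorted(expressed)) or "O"


# Example: an AO parent and a BO parent.
for genotype, p in child_genotypes("AO", "BO").items():
    print(genotype, phenotype(genotype), p)
```

A full pedigree model would multiply one such conditional distribution per child, which is the "global model built from small pieces" idea of the previous slide.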

  8. Ion channel model [Figure labels: model indicator; transition rates; hidden state; binary signal; levels & variances; data] Hodgson and Green, Proc Roy Soc Lond A, 1999

  9. Ion channel model (continued) [Figure labels: model indicator; states C1, C2, C3, O1, O2; transition rates; hidden state; binary signal; levels & variances; data]

  10. Gene expression using Affymetrix chips [Figure: image of hybridised array, zoomed to a single hybridised spot. Labels: single stranded, labeled RNA sample; oligonucleotide element (20µm) holding millions of copies of a specific oligonucleotide sequence; approx. ½ million different complementary oligonucleotides on a 1.28cm array; expressed genes; non-expressed genes] Slide courtesy of Affymetrix

  11. Gene expression is a hierarchical process • Substantive question • Experimental design • Sample preparation • Array design & manufacture • Gene expression matrix • Probe level data • Image level data

  12. Mapping of rare diseases using Hidden Markov model: larynx cancer in females in France, 1986-1993 (standardised ratios); posterior probability of excess risk. Green & Richardson, 2002

  13. Probabilistic expert systems

  14. 2. Mathematics Mathematics Modelling Algorithms Inference

  15. Graphical models Use ideas from graph theory to • represent structure of a joint probability distribution • by encoding conditional independencies [Figure: graph on nodes A, B, C, D, E, F]

  16. Where does the graph come from? • Genetics • pedigree (family connections) • Lattice systems • interaction graph (e.g. nearest neighbours) • Gaussian case • graph determined by non-zeroes in inverse variance matrix

  17. Inverse of (co)variance matrix: independent case [Figure: matrix with rows and columns A, B, C, D, and the corresponding graph on A, B, C, D]

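  18. Inverse of (co)variance matrix: dependent case [Figure: matrix with rows and columns A, B, C, D, and the corresponding graph on A, B, C, D; non-zero entries correspond to links in the graph] Few links implies few parameters - Occam’s razor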
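A small numerical illustration of the Gaussian case may help here. The sketch below assumes a hypothetical chain graph A - B - C - D, so the inverse covariance (precision) matrix is non-zero only on the diagonal and between neighbours; the numbers are made up for illustration. Inverting it shows that the covariance matrix itself has no zeros: the sparsity that encodes the graph lives in the inverse.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(matrix)
    # Augment the matrix with the identity on the right.
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Swap a row with a non-zero pivot into position, then normalise it.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical precision matrix for the chain A - B - C - D:
# zeros wherever two variables are not linked in the graph.
precision = [
    [2, -1, 0, 0],
    [-1, 2, -1, 0],
    [0, -1, 2, -1],
    [0, 0, -1, 2],
]
covariance = invert(precision)
for row in covariance:
    print([str(x) for x in row])
```

Here A and D are correlated, yet the zero in the (A, D) position of the precision matrix says they are conditionally independent given the other variables.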

  19. Conditional independence • X and Z are conditionally independent given Y if, knowing Y, discovering Z tells you nothing more about X: p(X|Y,Z) = p(X|Y) • X ⊥⊥ Z | Y [Graph: X - Y - Z]
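The definition can be checked numerically on a toy example. The sketch below assumes three binary variables whose joint distribution factorises along the chain X - Y - Z as p(x) p(y|x) p(z|y); all probabilities are hypothetical. It confirms that p(X|Y,Z) = p(X|Y) holds exactly, while X and Z remain dependent when Y is ignored.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical conditional probability tables for the chain X - Y - Z.
p_x = {0: F(1, 2), 1: F(1, 2)}
p_y_given_x = {0: {0: F(9, 10), 1: F(1, 10)}, 1: {0: F(2, 10), 1: F(8, 10)}}
p_z_given_y = {0: {0: F(7, 10), 1: F(3, 10)}, 1: {0: F(1, 10), 1: F(9, 10)}}

# Joint distribution built from the three small pieces.
joint = {
    (x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}


def prob(**fixed):
    """Probability of the event that fixes some of x, y, z (e.g. prob(x=1, z=0))."""
    return sum(
        p
        for (x, y, z), p in joint.items()
        if all(dict(x=x, y=y, z=z)[name] == value for name, value in fixed.items())
    )


def conditionally_independent():
    """Check p(x | y, z) == p(x | y) for every combination of values."""
    return all(
        prob(x=x, y=y, z=z) / prob(y=y, z=z) == prob(x=x, y=y) / prob(y=y)
        for x, y, z in product((0, 1), repeat=3)
    )


def marginally_independent():
    """Check p(x, z) == p(x) p(z) for every combination of values."""
    return all(
        prob(x=x, z=z) == prob(x=x) * prob(z=z)
        for x, z in product((0, 1), repeat=2)
    )


print("X and Z independent given Y:", conditionally_independent())
print("X and Z independent marginally:", marginally_independent())
```

Exact fractions are used so that the equality tests are not affected by floating-point rounding.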

  20. Conditional independence as seen in data on perinatal mortality vs. ante-natal care…. Does survival depend on ante-natal care? .... what if you know the clinic?

  21. Conditional independence [Graph nodes: survival, ante, clinic] survival and clinic are dependent, and ante and clinic are dependent, but survival and ante are CI given clinic

  22. Conditional independence provides a mathematical basis for splitting up a large system into smaller components [Figure: graph on nodes A, B, C, D, E, F]

  23. [Figure: the graph on nodes A, B, C, D, E, F split into smaller overlapping components]

  24. 3. Inference Mathematics Modelling Algorithms Inference

  25. Bayesian paradigm in structured modelling • ‘borrowing strength’ • automatically integrates out all sources of uncertainty • properly accounting for variability at all levels • including, in principle, uncertainty in model itself • avoids over-optimistic claims of certainty

  26. Bayesian structured modelling • ‘borrowing strength’ • automatically integrates out all sources of uncertainty • … for example in forensic statistics with DNA probe data…..

  27. (thanks to J Mortera)

  28. 4. Algorithms Mathematics Modelling Algorithms Inference

  29. Algorithms for probability and likelihood calculations Exploiting graphical structure: • Markov chain Monte Carlo • Probability propagation (Bayes nets) • Expectation-Maximisation • Variational methods

  30. Markov chain Monte Carlo • Subgroups of one or more variables updated randomly, • maintaining detailed balance with respect to target distribution • Ensemble converges to equilibrium = target distribution ( = Bayesian posterior, e.g.)

  31. Markov chain Monte Carlo: Updating - need only look at neighbours
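The point that an update only needs the neighbours can be sketched with a single-site Gibbs sampler. The example below is an assumption for illustration, not the model from the talk: an Ising-type distribution on a chain of ±1 variables, p(x) ∝ exp(beta · Σ x_i x_j) over neighbouring pairs, with a hypothetical coupling `beta`. Each single-site update draws from the full conditional, which depends only on the two adjacent sites and leaves the target distribution invariant.

```python
import math
import random


def conditional_prob_up(state, i, beta):
    """P(x_i = +1 | all other sites) for an Ising-type model on a chain.

    Only the neighbours of site i enter the calculation.
    """
    neighbours = [state[j] for j in (i - 1, i + 1) if 0 <= j < len(state)]
    field = beta * sum(neighbours)
    return 1.0 / (1.0 + math.exp(-2.0 * field))


def gibbs_sweep(state, beta, rng):
    """Update each site in turn by drawing from its full conditional."""
    for i in range(len(state)):
        state[i] = 1 if rng.random() < conditional_prob_up(state, i, beta) else -1
    return state


rng = random.Random(0)  # fixed seed so the run is reproducible
state = [rng.choice([-1, 1]) for _ in range(20)]
for _ in range(100):
    gibbs_sweep(state, beta=0.5, rng=rng)
print(state)
```

On a general graph the only change would be the definition of `neighbours`; the cost of one update stays proportional to the number of neighbours rather than the size of the whole system.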

  32. Probability propagation: form junction tree [Figure: graph on nodes 1-7, and its junction tree with cliques 12, 267, 236, 3456 and separators 2, 26, 36]

  33. Message passing in junction tree [Figure: junction tree with the root marked]

  34. Message passing in junction tree [Figure: junction tree with the root marked]
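As a rough sketch of message passing, consider the smallest non-trivial junction tree: two cliques {1,2} and {2,3} joined by the separator {2}. The potentials below are hypothetical and the variables are binary. Messages are sums over the variables that are not in the separator; passing them towards a root and then back out gives marginals that match brute-force summation over the full joint table.

```python
from itertools import product

# Hypothetical clique potentials (unnormalised), indexed as psi_12[x1][x2] and psi_23[x2][x3].
psi_12 = [[4.0, 1.0], [1.0, 3.0]]
psi_23 = [[2.0, 1.0], [1.0, 5.0]]
VALUES = (0, 1)


def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]


def message_from_clique_12():
    """Message over separator {2}: sum x1 out of the clique {1,2} potential."""
    return [sum(psi_12[x1][x2] for x1 in VALUES) for x2 in VALUES]


def message_from_clique_23():
    """Message over separator {2}: sum x3 out of the clique {2,3} potential."""
    return [sum(psi_23[x2][x3] for x3 in VALUES) for x2 in VALUES]


def marginal_x2():
    """Inward pass: combine both incoming messages at the separator."""
    m12, m23 = message_from_clique_12(), message_from_clique_23()
    return normalise([m12[x2] * m23[x2] for x2 in VALUES])


def marginal_x3():
    """Outward pass: absorb the message into clique {2,3}, then sum out x2."""
    m12 = message_from_clique_12()
    return normalise([sum(m12[x2] * psi_23[x2][x3] for x2 in VALUES) for x3 in VALUES])


def brute_force_marginal(index):
    """Marginal of variable `index` (0, 1 or 2) by summing the full joint table."""
    weights = [0.0, 0.0]
    for xs in product(VALUES, repeat=3):
        weights[xs[index]] += psi_12[xs[0]][xs[1]] * psi_23[xs[1]][xs[2]]
    return normalise(weights)


print(marginal_x2(), brute_force_marginal(1))
print(marginal_x3(), brute_force_marginal(2))
```

With three binary variables the saving is negligible, but the message-passing cost grows with the size of the largest clique rather than with the size of the full joint table.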

  35. Structured systems’ success stories include... • Genomics & bioinformatics • DNA & protein sequencing, gene mapping, evolutionary genetics • Spatial statistics • image analysis, environmetrics, geographical epidemiology, ecology • Temporal problems • longitudinal data, financial time series, signal processing

  36. http://www.stats.bris.ac.uk/~peter P.J.Green@bristol.ac.uk …thanks to many
