650 likes | 789 Views
‘what I am after’ from gR2002. Peter Green, University of Bristol, UK. Why graphical models in R?. Statistical modelling and analysis do not respect boundaries of model classes Software should encourage and support good practice - and graphical models are good practice!
E N D
‘what I am after’ from gR2002 Peter Green, University of Bristol, UK
Why graphical models in R? • Statistical modelling and analysis do not respect boundaries of model classes • Software should encourage and support good practice - and graphical models are good practice! • Data analysis - model-based • R for ‘reference implementation’ of new methodology • Open software
Questions • Scope? • Digram, MIM, CoCo, TETRAD, Hugin, BUGS? • Determined by classes of model, or classes of algorithm? • Market? • Statistics researcher, statistics MSc, arbitrary Excel user? • Delivery? • R package(s), with C code?
Contingency tables Markov chains Spatial statistics Genetics Graphical models Regression AI Statistical physics Sufficiency Covariance selection
Contents • Hierarchical models • Variable-length parameters • Models with undirected edges • Hidden Markov models • Inference on structure • Discrete graphical models/PES • Grappa
Bayesian Hierarchical models properly integrating out all sources of variation
Repeated measures on children's weights • Children i=1,2,…,k have their weights measured on ni occasions, tij,j=1,2,…ni obtaining weights yij. • Suppose that, for each child, we have a linear growth equation, with independent normal errors
Repeated measures on children's weights, continued • Suppose that vary across the population according to • A Bayesian completes the model by specifying priors on
Measurement error Explanatory variables X subject to error - we only observe U on most cases
Contents • Hierarchical models • Variable-length parameters • Models with undirected edges • Hidden Markov models • Inference on structure • Discrete graphical models/PES • Grappa
Mixture modelling k DAG for a mixture model w y
Mixture modelling k DAG for a mixture model length=k w z y value set ={1,2,…,k}
Contents • Hierarchical models • Variable-length parameters • Models with undirected edges • Hidden Markov models • Inference on structure • Discrete graphical models/PES • Grappa
Modelling with undirected graphs Directed acyclic graphs are a natural representation of the way we usually specify a statistical model - directionally: • disease symptom • past future • parameters data ….. However, sometimes (e.g. spatial models) there is no natural direction
Scottish lip cancer data The rates of lip cancer in 56 counties in Scotland have been analysed by Clayton and Kaldor (1987) and Breslow and Clayton (1993) (the analysis here is based on the example in the WinBugs manual)
Scottish lip cancer data (2) The data include • the observed and expected cases (expected numbers based on the population and its age and sex distribution in the county), • a covariate measuring the percentage of the population engaged in agriculture, fishing, or forestry, and • the "position'' of each county expressed as a list of adjacent counties.
Scottish lip cancer data (3) County Obs Exp x SMR Adjacent cases cases (% in counties agric.) 1 9 1.4 16 652.2 5,9,11,19 2 39 8.7 16 450.3 7,10 ... ... ... ... ... ... 56 0 1.8 10 0.0 18,24,30,33,45,55
Model for lip cancer data (1) Graph regression coefficient covariate random spatial effects expected counts observed counts
Model for lip cancer data (2) Distributions • Data: • Link function: • Random spatial effects: • Priors:
Bugs code for lip cancer data model { b[1:regions] ~ car.normal(adj[], weights[], num[], tau) b.mean <- mean(b[]) for (i in 1 : regions) { O[i] ~ dpois(mu[i]) log(mu[i]) <- log(E[i]) + alpha0 + alpha1 * x[i] / 10 + b[i] SMRhat[i] <- 100 * mu[i] / E[i] } alpha1 ~ dnorm(0.0, 1.0E-5) alpha0 ~ dflat() tau ~ dgamma(r, d) sigma <- 1 / sqrt(tau) } Note: declarative, rather than procedural language
Bugs code for lip cancer data model { b[1:regions] ~ car.normal(adj[], weights[], num[], tau) b.mean <- mean(b[]) for (i in 1 : regions) { O[i] ~ dpois(mu[i]) log(mu[i]) <- log(E[i]) + alpha0 + alpha1 * x[i] / 10 + b[i] SMRhat[i] <- 100 * mu[i] / E[i] } alpha1 ~ dnorm(0.0, 1.0E-5) alpha0 ~ dflat() tau ~ dgamma(r, d) sigma <- 1 / sqrt(tau) }
Bugs code for lip cancer data model { b[1:regions] ~ car.normal(adj[], weights[], num[], tau) b.mean <- mean(b[]) for (i in 1 : regions) { O[i] ~ dpois(mu[i]) log(mu[i]) <- log(E[i]) + alpha0 + alpha1 * x[i] / 10 + b[i] SMRhat[i] <- 100 * mu[i] / E[i] } alpha1 ~ dnorm(0.0, 1.0E-5) alpha0 ~ dflat() tau ~ dgamma(r, d) sigma <- 1 / sqrt(tau) }
Bugs code for lip cancer data model { b[1:regions] ~ car.normal(adj[], weights[], num[], tau) b.mean <- mean(b[]) for (i in 1 : regions) { O[i] ~ dpois(mu[i]) log(mu[i]) <- log(E[i]) + alpha0 + alpha1 * x[i] / 10 + b[i] SMRhat[i] <- 100 * mu[i] / E[i] } alpha1 ~ dnorm(0.0, 1.0E-5) alpha0 ~ dflat() tau ~ dgamma(r, d) sigma <- 1 / sqrt(tau) }
Bugs code for lip cancer data model { b[1:regions] ~ car.normal(adj[], weights[], num[], tau) b.mean <- mean(b[]) for (i in 1 : regions) { O[i] ~ dpois(mu[i]) log(mu[i]) <- log(E[i]) + alpha0 + alpha1 * x[i] / 10 + b[i] SMRhat[i] <- 100 * mu[i] / E[i] } alpha1 ~ dnorm(0.0, 1.0E-5) alpha0 ~ dflat() tau ~ dgamma(r, d) sigma <- 1 / sqrt(tau) }
WinBugs for lip cancer data Dynamic traces for some parameters:
WinBugs for lip cancer data Posterior densities for some parameters:
Contents • Hierarchical models • Variable-length parameters • Models with undirected edges • Hidden Markov models • Inference on structure • Discrete graphical models/PES • Grappa
Hidden Markov models e.g. Hidden Markov chain (DLM, state space model) z0 z1 z2 z3 z4 hidden y1 y2 y3 y4 observed
Hidden Markov models • Richardson & Green (2000) used a hidden Markov random field model for disease mapping observed incidence relative risk parameters expected incidence hidden MRF
DAG for Potts-based Hidden Markov random field length=k spatial fields
Larynx cancer in females in France SMRs
Ion channel signal restoration Hodgson, JRSS(B), 1999
DAG for alternating renewal process model for ion channel data Sojourn time parameters Binary signal Data
Contents • Hierarchical models • Variable-length parameters • Models with undirected edges • Hidden Markov models • Inference on structure • Discrete graphical models/PES • Grappa
Ion channel model choice Hodgson and Green, Proc Roy Soc Lond A, 1999
Example: hidden continuous time models O2 O1 C1 C2 C1 C2 C3 O1 O2
DAG for hidden CTMC model for ion channel data Model indicator Transition rates Binary signal Data
Ion channelmodel DAG model indicator transition rates hidden state binary signal levels & variances data
model indicator C1 C2 C3 O1 O2 transition rates hidden state binary signal levels & variances data * * * * * * * * * * *
Posterior model probabilities .41 O1 C1 .12 O2 O1 C1 .36 O1 C1 C2 O2 O1 C1 C2 .10
Simultaneous inference on parameters and structure of CI graph : Bayesian approach: Place prior on all graphs, and conjugate prior on parameters (hyper-Markov laws, Dawid & Lauritzen), then use MCMC to update both graphs and parameters to simulate posterior distribution
Graph moves Giudici & Green (Biometrika, 1999) develop a Bayesian methodology for model selection in Gaussian models, assuming decomposability (= graph triangulated = no chordless -cycles) 5 7 6 4 1 2 3
Graph moves We can traverse graph space by adding and deleting single edges Some are OK, but others make graph non-decomposable 5 7 6 4 1 2 3
Graph moves Frydenberg & Lauritzen (1989) showed that all decomposable graphs are connected by single-edge moves Can we test for maintaining decomposability before committing to making the change? 5 7 6 4 1 2 3
Deleting edges? Deleting an edge maintains decomposability if and only if it is contained in exactly one clique of the current graph (Frydenberg & Lauritzen) 5 7 6 4 1 2 3
Adding edges? (Giudici & Green) Adding an edge (a,b) maintains decomposability if and only if either: • a and b are in different connected components, or • there exist sets R and T such that aR and bT are cliques and RT is a separator on the path in the junction tree between them 5 7 6 4 1 2 3
5 7 6 Once the test is complete, actually committing to adding or deleting the edge is little work 4 1 2 3 267 236 3456 26 36 2 12