‘what I am after’ from gR2002

‘what I am after’ from gR2002 Peter Green, University of Bristol, UK

Why graphical models in R? • Statistical modelling and analysis do not respect boundaries of model classes • Software should encourage and support good practice - and graphical models are good practice! • Data analysis - model-based • R for ‘reference implementation’ of new methodology • Open software

Questions • Scope? • Digram, MIM, CoCo, TETRAD, Hugin, BUGS? • Determined by classes of model, or classes of algorithm? • Market? • Statistics researcher, statistics MSc, arbitrary Excel user? • Delivery? • R package(s), with C code?

Contingency tables Markov chains Spatial statistics Genetics Graphical models Regression AI Statistical physics Sufficiency Covariance selection

Contents • Hierarchical models • Variable-length parameters • Models with undirected edges • Hidden Markov models • Inference on structure • Discrete graphical models/PES • Grappa

Bayesian Hierarchical models properly integrating out all sources of variation

Repeated measures on children's weights • Children i=1,2,…,k have their weights measured on ni occasions, tij,j=1,2,…ni obtaining weights yij. • Suppose that, for each child, we have a linear growth equation, with independent normal errors

Repeated measures on children's weights, continued • Suppose that vary across the population according to • A Bayesian completes the model by specifying priors on

Graph for children’s weights

Measurement error Explanatory variables X subject to error - we only observe U on most cases

Mixture modelling k DAG for a mixture model w  y

Mixture modelling k DAG for a mixture model length=k w  z y value set ={1,2,…,k}

Measurement error using mixture model for population

Modelling with undirected graphs Directed acyclic graphs are a natural representation of the way we usually specify a statistical model - directionally: • disease  symptom • past  future • parameters  data ….. However, sometimes (e.g. spatial models) there is no natural direction

Scottish lip cancer data The rates of lip cancer in 56 counties in Scotland have been analysed by Clayton and Kaldor (1987) and Breslow and Clayton (1993) (the analysis here is based on the example in the WinBugs manual)

Scottish lip cancer data (2) The data include • the observed and expected cases (expected numbers based on the population and its age and sex distribution in the county), • a covariate measuring the percentage of the population engaged in agriculture, fishing, or forestry, and • the "position'' of each county expressed as a list of adjacent counties.

Scottish lip cancer data (3) County Obs Exp x SMR Adjacent cases cases (% in counties agric.) 1 9 1.4 16 652.2 5,9,11,19 2 39 8.7 16 450.3 7,10 ... ... ... ... ... ... 56 0 1.8 10 0.0 18,24,30,33,45,55

Model for lip cancer data (1) Graph regression coefficient covariate random spatial effects expected counts observed counts

Model for lip cancer data (2) Distributions • Data: • Link function: • Random spatial effects: • Priors:

Bugs code for lip cancer data model { b[1:regions] ~ car.normal(adj[], weights[], num[], tau) b.mean <- mean(b[]) for (i in 1 : regions) { O[i] ~ dpois(mu[i]) log(mu[i]) <- log(E[i]) + alpha0 + alpha1 * x[i] / 10 + b[i] SMRhat[i] <- 100 * mu[i] / E[i] } alpha1 ~ dnorm(0.0, 1.0E-5) alpha0 ~ dflat() tau ~ dgamma(r, d) sigma <- 1 / sqrt(tau) } Note: declarative, rather than procedural language

Bugs code for lip cancer data model { b[1:regions] ~ car.normal(adj[], weights[], num[], tau) b.mean <- mean(b[]) for (i in 1 : regions) { O[i] ~ dpois(mu[i]) log(mu[i]) <- log(E[i]) + alpha0 + alpha1 * x[i] / 10 + b[i] SMRhat[i] <- 100 * mu[i] / E[i] } alpha1 ~ dnorm(0.0, 1.0E-5) alpha0 ~ dflat() tau ~ dgamma(r, d) sigma <- 1 / sqrt(tau) }

WinBugs for lip cancer data Dynamic traces for some parameters:

WinBugs for lip cancer data Posterior densities for some parameters:

Hidden Markov models e.g. Hidden Markov chain (DLM, state space model) z0 z1 z2 z3 z4 hidden y1 y2 y3 y4 observed

Hidden Markov models • Richardson & Green (2000) used a hidden Markov random field model for disease mapping observed incidence relative risk parameters expected incidence hidden MRF

DAG for Potts-based Hidden Markov random field length=k spatial fields

Distributions for Potts-based Hidden Markov random field

Larynx cancer in females in France SMRs

Ion channel signal restoration Hodgson, JRSS(B), 1999

DAG for alternating renewal process model for ion channel data Sojourn time parameters Binary signal Data

Ion channel model choice Hodgson and Green, Proc Roy Soc Lond A, 1999

Example: hidden continuous time models O2 O1 C1 C2 C1 C2 C3 O1 O2

DAG for hidden CTMC model for ion channel data Model indicator Transition rates Binary signal Data

Ion channelmodel DAG model indicator transition rates hidden state binary signal levels & variances data

model indicator C1 C2 C3 O1 O2 transition rates hidden state binary signal levels & variances data * * * * * * * * * * *

Posterior model probabilities .41 O1 C1 .12 O2 O1 C1 .36 O1 C1 C2 O2 O1 C1 C2 .10

Simultaneous inference on parameters and structure of CI graph : Bayesian approach: Place prior on all graphs, and conjugate prior on parameters (hyper-Markov laws, Dawid & Lauritzen), then use MCMC to update both graphs and parameters to simulate posterior distribution

Graph moves Giudici & Green (Biometrika, 1999) develop a Bayesian methodology for model selection in Gaussian models, assuming decomposability (= graph triangulated = no chordless -cycles) 5 7 6 4 1 2 3

Graph moves We can traverse graph space by adding and deleting single edges Some are OK, but others make graph non-decomposable 5 7 6 4 1 2 3

Graph moves Frydenberg & Lauritzen (1989) showed that all decomposable graphs are connected by single-edge moves Can we test for maintaining decomposability before committing to making the change? 5 7 6 4 1 2 3

Deleting edges? Deleting an edge maintains decomposability if and only if it is contained in exactly one clique of the current graph (Frydenberg & Lauritzen) 5 7 6 4 1 2 3

Adding edges? (Giudici & Green) Adding an edge (a,b) maintains decomposability if and only if either: • a and b are in different connected components, or • there exist sets R and T such that aR and bT are cliques and RT is a separator on the path in the junction tree between them 5 7 6 4 1 2 3

5 7 6 Once the test is complete, actually committing to adding or deleting the edge is little work 4 1 2 3 267 236 3456 26 36 2 12

‘what I am after’ from gR2002

‘what I am after’ from gR2002

Presentation Transcript

gR2002