The Nested Dirichlet Process • Paper by Abel Rodriguez, David B. Dunson, and Alan E. Gelfand, submitted to JASA, 2006 • Duke University Machine Learning Group • Presented by Kai Ni, Nov. 10, 2006
Outline • Introduction • Nested Dirichlet process • Application on haplotype inference
Motivation • General problem: extending the Dirichlet process to accommodate multiple dependent distributions. • Methods: • Inducing dependence through a shared source, as in the dependent Dirichlet process (DDP) and the hierarchical Dirichlet process (HDP). • Inducing dependence through linear combinations of realizations of independent Dirichlet processes. For example, Müller et al. (2004) define the distribution of each group as a mixture of a global component and a local component.
Background • The paper is motivated by two related problems: clustering probability distributions, and simultaneous multilevel clustering in a nested setting. • Consider the example of hospital analysis: • In assessing quality of care, we need to cluster centers according to the distribution of patient outcomes and to identify outlying centers. • We also want to cluster patients within the centers at the same time, and to borrow information across centers that have similar clusters.
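The Dirichlet process • A single clustering problem can be analyzed with a Dirichlet process (DP). The stick-breaking construction is usually the starting point of the analysis: $G(\cdot) = \sum_{k=1}^{\infty} \pi_k \delta_{\theta_k}(\cdot)$, with $\theta_k \sim H$, $\pi_k = u_k \prod_{s<k} (1 - u_s)$, and $u_k \sim \mathrm{Beta}(a_k, b_k)$. • Setting $a_k = 1 - a$ and $b_k = b + ka$ yields the Pitman-Yor process; the choice $a = 0$ and $b = \alpha$ gives the standard DP.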
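The stick-breaking construction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the paper: the sum is truncated at `K` atoms (the last stick is closed so the weights sum to one), the sticks follow the Pitman-Yor form $u_k \sim \mathrm{Beta}(1 - a, b + ka)$, and the base measure $H$ is taken to be a standard normal purely as a placeholder. The function name `stick_breaking_weights` is hypothetical.

```python
import numpy as np

def stick_breaking_weights(a, b, K, rng):
    """Truncated stick-breaking weights pi_1..pi_K with u_k ~ Beta(1 - a, b + k*a)."""
    k = np.arange(1, K + 1)
    u = rng.beta(1.0 - a, b + k * a)   # one stick proportion per atom
    u[-1] = 1.0                        # close the last stick (truncation assumption)
    # Mass still unbroken before stick k: prod_{s<k} (1 - u_s)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - u[:-1])))
    return u * remaining

rng = np.random.default_rng(0)
# a = 0, b = alpha recovers the standard DP; alpha = 3 is an arbitrary choice.
weights = stick_breaking_weights(a=0.0, b=3.0, K=50, rng=rng)
atoms = rng.normal(size=weights.size)  # theta_k ~ H, with H = N(0, 1) assumed
```

The pair (`weights`, `atoms`) is one approximate draw of $G$; a nonzero `a` in [0, 1) gives the heavier-tailed Pitman-Yor weights.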
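The nested Dirichlet process mixture • Suppose $y_{ij}$, for $i = 1, \ldots, n_j$, are observations within center $j$, for $j = 1, \ldots, J$. We assume exchangeability within centers, with $y_{ij} \sim F_j$. • A collection of distributions $\{F_1, \ldots, F_J\}$ is said to follow a nested Dirichlet process mixture if $F_j(\cdot) = \int p(\cdot \mid \theta)\, G_j(d\theta)$, where $G_j \sim Q = \sum_{k=1}^{\infty} \pi^*_k \delta_{G^*_k}$ and $G^*_k = \sum_{l=1}^{\infty} w^*_{lk} \delta_{\theta^*_{lk}}$, with $\theta^*_{lk} \sim H$, $w^*_{lk} = u^*_{lk} \prod_{s<l} (1 - u^*_{sk})$, $u^*_{lk} \sim \mathrm{Beta}(1, \beta)$, $\pi^*_k = v^*_k \prod_{s<k} (1 - v^*_s)$, and $v^*_k \sim \mathrm{Beta}(1, \alpha)$.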
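The nested Dirichlet process • The collection $\{G_1, \ldots, G_J\}$, used as the mixing distribution, is said to follow a nested Dirichlet process with parameters $\alpha$, $\beta$, and $H$, written $\mathrm{nDP}(\alpha, \beta, H)$. • From the construction, we have $G_j \sim Q$ with $Q \sim \mathrm{DP}(\alpha\, \mathrm{DP}(\beta H))$, and marginally $G_j \sim \mathrm{DP}(\beta H)$ for every $j$. • For each $G_j$ and any Borel set $B$, we have the properties $E[G_j(B)] = H(B)$ and $\mathrm{Var}[G_j(B)] = H(B)(1 - H(B)) / (\beta + 1)$.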
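A truncated draw from the nDP can be sketched as follows. This is an illustrative approximation under assumptions, not the authors' implementation: the top level is truncated at `K` candidate distributions and each candidate at `L` atoms, $H$ is taken to be a standard normal as a placeholder, and the helper names `stick_weights` and `sample_truncated_ndp` are hypothetical.

```python
import numpy as np

def stick_weights(concentration, n_atoms, rng):
    """Truncated DP stick-breaking weights; the last stick is closed so they sum to one."""
    u = rng.beta(1.0, concentration, size=n_atoms)
    u[-1] = 1.0
    return u * np.concatenate(([1.0], np.cumprod(1.0 - u[:-1])))

def sample_truncated_ndp(alpha, beta, J, K, L, rng):
    """Draw center labels and the K candidate distributions of a truncated nDP."""
    top_weights = stick_weights(alpha, K, rng)        # weights over candidate distributions
    # One row of within-distribution weights per candidate, shape (K, L)
    atom_weights = np.stack([stick_weights(beta, L, rng) for _ in range(K)])
    atoms = rng.normal(size=(K, L))                   # atoms drawn from H, assumed N(0, 1)
    labels = rng.choice(K, size=J, p=top_weights)     # center j uses candidate labels[j]
    return labels, atom_weights, atoms

rng = np.random.default_rng(1)
labels, atom_weights, atoms = sample_truncated_ndp(
    alpha=3.0, beta=3.0, J=10, K=35, L=55, rng=rng
)
```

Centers that receive the same entry in `labels` share an entire distribution (weights and atoms), which is what clusters the centers; atoms are never shared between two different candidate distributions.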
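Prior correlation • The prior correlation between two distributions $G_j$ and $G_{j'}$ is $\mathrm{Cor}(G_j(B), G_{j'}(B)) = 1 / (1 + \alpha)$. • The prior correlation between draws from the process is $\mathrm{Cor}(\theta_{ij}, \theta_{i'j'}) = 1 / (1 + \beta)$ if $j = j'$, and $1 / ((1 + \alpha)(1 + \beta))$ if $j \neq j'$. • The correlation within a center is larger than the correlation between centers. • The nDP generalizes three standard cases, which arise at limiting values of the parameters.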
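The prior correlations can be evaluated directly. The sketch below assumes the closed forms implied by the construction: two centers share a distribution with prior probability $1/(1+\alpha)$, and two draws from the same distribution share an atom with prior probability $1/(1+\beta)$. The function names are hypothetical.

```python
def cor_between_distributions(alpha):
    """Prior correlation between G_j(B) and G_j'(B) for two different centers."""
    return 1.0 / (1.0 + alpha)

def cor_between_draws(alpha, beta, same_center):
    """Prior correlation between two draws, within one center or across two centers."""
    within = 1.0 / (1.0 + beta)
    # Across centers the two draws must first land in the same distribution.
    return within if same_center else within / (1.0 + alpha)

# Example with alpha = beta = 3
within_center = cor_between_draws(3.0, 3.0, same_center=True)
between_centers = cor_between_draws(3.0, 3.0, same_center=False)
```

For any finite $\alpha > 0$ the between-center value is the within-center value divided by $1 + \alpha$, so it is always the smaller of the two.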
Truncation error example for nDP(3, 3, H) • As the number of groups J increases, K needs to be increased. A typical choice is K = 35 and L = 55, where K and L are the truncation levels of the top-level and bottom-level stick-breaking sums.
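One rough way to see why truncation levels of this size are reasonable is to compute the expected stick mass left over after a given number of atoms. This is only a back-of-the-envelope sketch, not the truncation error bound from the paper: it uses the fact that for $u \sim \mathrm{Beta}(1, c)$ we have $E[1 - u] = c / (1 + c)$, so with independent sticks the expected leftover mass after $n$ atoms is $(c / (1 + c))^n$. The function name is hypothetical.

```python
def expected_leftover_mass(concentration, n_atoms):
    """Expected stick mass not covered by the first n_atoms of a DP stick-breaking sum."""
    return (concentration / (1.0 + concentration)) ** n_atoms

# nDP(3, 3, H) with the truncation levels quoted on the slide
top_level_leftover = expected_leftover_mass(3.0, 35)     # K = 35
bottom_level_leftover = expected_leftover_mass(3.0, 55)  # L = 55
```

This quantity ignores the dependence on the number of groups J and on the sample sizes, both of which enter the actual truncation error.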
Simulated data • The simulation shows the discriminating capability of the nDP and its ability to provide more accurate density estimates.
Density estimation result • Case (a) uses the nDP and case (b) uses the DPM. • The nDP captures the small mode better and also emphasizes the importance of the main mode. • The entropy of the estimate (red) relative to the true distribution (black) is 0.011 under the nDP and 0.017 under the DPM.
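A comparison of this kind can be sketched numerically. The example below assumes that the reported entropy is a Kullback-Leibler divergence from the true density to the estimate, which the slide does not state explicitly, and approximates the integral with a Riemann sum on a grid. The two normal densities are stand-ins chosen for illustration; they are not the simulated data from the paper, and the function names are hypothetical.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated on an array."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def kl_divergence_on_grid(p_true, q_est, dx):
    """Riemann-sum approximation of KL(p_true || q_est) on an evenly spaced grid."""
    p = np.asarray(p_true, dtype=float)
    q = np.asarray(q_est, dtype=float)
    mask = p > 0                      # points with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])) * dx)

x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]
true_density = normal_pdf(x, 0.0, 1.0)   # stand-in for the true distribution
estimate = normal_pdf(x, 0.1, 1.0)       # stand-in for a density estimate
kl = kl_divergence_on_grid(true_density, estimate, dx)
```

A smaller value means the estimate is closer to the true density, which is the sense in which 0.011 beats 0.017.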
Health care quality in the United States • Data: 3077 hospitals in 51 territories (50 states + DC). Both the number of hospitals per state and the number of patients per hospital vary. • Four covariates are available for each center: type of hospital, ownership, whether the hospital provides emergency services, and whether it has an accreditation. • We are interested in clustering states according to their quality. To adjust for the effect of the available covariates, we fit a main-effects ANOVA and use the nDP to model the state-specific error distributions.
Conclusion • The authors proposed the nested Dirichlet process to simultaneously cluster groups and observations within groups. • The groups are clustered by their entire distribution rather than by particular features of it. • While being non-parametric, the nDP encompasses a number of typical parametric and non-parametric models as limiting cases.