
Learning with Tree-averaged Densities and Distributions

Sergey Kirshner, Alberta Ingenuity Centre for Machine Learning, Department of Computing Science, University of Alberta, Canada. NIPS 2007 Poster W12. December 5, 2007.



Presentation Transcript


  1. Learning with Tree-averaged Densities and Distributions • Sergey Kirshner • Alberta Ingenuity Centre for Machine Learning, Department of Computing Science, University of Alberta, Canada • NIPS 2007 Poster W12 • December 5, 2007

  2. Overview • Want to fit density to complete multivariate data • New density estimation model based on averaging over tree-dependence structures • Distribution = Univariate Marginals + Copula • Bayesian averaging over tree-structured copulas • Efficient parameter estimation for tree-averaged copulas • Can solve problems with 10-30 dimensions

  3. Most Popular Distribution… • Interpretable • Closed under taking marginals • Generalizes to multiple dimensions • Models pairwise dependence • Tractable • 245 pages out of 691 from Continuous Multivariate Distributions by Kotz, Balakrishnan, and Johnson

  4. What If the Data Is NOT Gaussian?

  5. Curse of Dimensionality [Bellman 57] • A grid with cells of width 1/n along each axis has n^d cells • V[-2,2]^d ≈ 0.9545^d (mass of a standard Gaussian inside the cube [-2,2]^d)
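
The two quantities on this slide can be checked numerically. The sketch below reads 0.9545 as P(|Z| ≤ 2) for a standard normal Z, which is an interpretation of the slide rather than something it states; the function names `mass_in_cube` and `grid_cells` are illustrative only.

```python
from statistics import NormalDist


def mass_in_cube(d, half_width=2.0):
    """Probability that d independent N(0, 1) variables all fall in [-h, h]."""
    z = NormalDist()
    one_dim = z.cdf(half_width) - z.cdf(-half_width)  # about 0.9545 for h = 2
    return one_dim ** d


def grid_cells(n, d):
    """Number of cells when each of d axes is split into n bins."""
    return n ** d


# Assumed illustration: 10 bins per axis, dimensions in the range the talk targets.
for d in (1, 2, 10, 30):
    print(d, grid_cells(10, d), round(mass_in_cube(d), 4))
```

The cell count grows exponentially in d while the Gaussian mass inside the fixed cube shrinks geometrically, which is the difficulty the slide points at.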

  6. Avoiding the Curse: Step 1: Separating Univariate Marginals • Formula annotations on the slide: univariate marginals (independent variables); multivariate dependence term (copula)

  7. Monotonic Transformation of the Variables

  8. Copula • A copula C is a multivariate distribution (cdf) defined on the unit hypercube with uniform univariate marginals.

  9. Sklar’s Theorem [Sklar 59] • Schematic: multivariate distribution = univariate marginals + copula

  10. Example: Bivariate Gaussian Copula
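
The slide's formula did not survive in the transcript. As a minimal sketch, the standard bivariate Gaussian copula density with correlation parameter `rho` can be evaluated as follows; the function name and the parameterization by `rho` are assumptions, not taken from the slides.

```python
import math
from statistics import NormalDist


def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density at (u, v), for 0 < u, v < 1 and |rho| < 1."""
    std = NormalDist()
    # Map the uniform coordinates back to standard normal scores.
    x, y = std.inv_cdf(u), std.inv_cdf(v)
    one_minus = 1.0 - rho * rho
    # Ratio of the bivariate normal density to the product of its marginals.
    exponent = (2.0 * rho * x * y - rho * rho * (x * x + y * y)) / (2.0 * one_minus)
    return math.exp(exponent) / math.sqrt(one_minus)
```

With `rho = 0` the density is 1 everywhere on the unit square, which corresponds to independent variables.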

  11. Useful Properties of Copulas • Preserves concordance between the variables • Rank-based measure of dependence • Preserves mutual information • Can be viewed as a canonical form of a multivariate distribution for the purpose of the estimation of multivariate dependence

  12. Copula Density
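
The formulas for this slide are missing from the transcript. For reference, the standard statements are given below in notation chosen here (F for the joint cdf, F_i and p_i for the marginal cdfs and densities, C and c for the copula and its density); the slide's own symbols may differ.

```latex
% Sklar's theorem: any joint cdf factors through its marginals and a copula.
F(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr)

% Copula density (when it exists).
c(u_1, \dots, u_d) = \frac{\partial^d C(u_1, \dots, u_d)}{\partial u_1 \cdots \partial u_d}

% Joint density = product of univariate marginals times the copula density.
p(x_1, \dots, x_d) = c\bigl(F_1(x_1), \dots, F_d(x_d)\bigr) \prod_{i=1}^{d} p_i(x_i)
```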

  13. Separating Univariate Marginals • Fit univariate marginals (parametric or non-parametric) • Replace data points with the cdf values of the marginals • Estimate copula density • Related approaches: inference for the margins [Joe and Xu 96]; canonical maximum likelihood [Genest et al 95]
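
The first two steps can be sketched with a non-parametric marginal fit. This example uses the rescaled empirical cdf rank / (N + 1), a common choice that keeps values strictly inside (0, 1); the slides do not say which estimator was used, and the function names are hypothetical. It assumes no tied values within a column.

```python
def pseudo_observations(column):
    """Map one variable's samples into (0, 1) via rank / (N + 1)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def to_copula_scale(rows):
    """Apply the rank transform to every column of a list of data rows."""
    columns = list(zip(*rows))
    transformed = [pseudo_observations(list(col)) for col in columns]
    return [list(row) for row in zip(*transformed)]


data = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
print(to_copula_scale(data))
```

The transformed rows have approximately uniform marginals, so only the dependence structure is left for the copula density to model.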

  14. What Next? • Aren’t we back to square one? • Still estimating multivariate density from data • Not quite • All marginals are fixed • Lots of approaches for copulas • Vast majority focus on bivariate case • Design models that use only pairs of variables

  15. Tree-Structured Densities • Figure: tree over the variables x1, ..., x6

  16. Tree-Structured Copulas
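
The slide's formula is not preserved. A worked version of the standard factorization follows, in notation assumed here: E_T is the edge set of a tree T, p_i and p_ij are univariate and bivariate marginal densities, and c_ij is the bivariate copula density for the pair (i, j).

```latex
% Tree-structured density: univariate marginals times pairwise dependence terms on the edges.
p_T(x) = \prod_{i=1}^{d} p_i(x_i) \prod_{(i,j) \in E_T} \frac{p_{ij}(x_i, x_j)}{p_i(x_i)\, p_j(x_j)}

% With uniform marginals (u_i = F_i(x_i)), each ratio is a bivariate copula density,
% so the tree-structured copula density is a product over the tree's edges.
c_T(u) = \prod_{(i,j) \in E_T} c_{ij}(u_i, u_j)
```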

  17. Chow-Liu Algorithm (for Copulas) • Figure: graph over a1, ..., a4 with pairwise copula densities c(a1,a2), c(a1,a3), c(a1,a4), c(a2,a3), c(a2,a4), c(a3,a4) • Values shown for the pairs: A1A2 0.3126, A1A3 0.0229, A1A4 0.0172, A2A3 0.0230, A2A4 0.0183, A3A4 0.2603
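
The Chow-Liu step selects a maximum-weight spanning tree over the variables. The sketch below assumes the six numbers on the slide are the pairwise edge weights, in the order the pairs are listed (the transcript does not label them), and uses Kruskal's algorithm; the function name is illustrative.

```python
def max_spanning_tree(num_nodes, weights):
    """Kruskal's algorithm on descending weights; weights maps (i, j) -> weight."""
    parent = list(range(num_nodes + 1))  # nodes are labelled 1..num_nodes

    def find(i):
        # Follow parent links to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    for (i, j), _ in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it creates no cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree


# Assumed reading of the slide: one weight per variable pair.
weights = {
    (1, 2): 0.3126, (1, 3): 0.0229, (1, 4): 0.0172,
    (2, 3): 0.0230, (2, 4): 0.0183, (3, 4): 0.2603,
}
tree = max_spanning_tree(4, weights)
print(tree)  # [(1, 2), (3, 4), (2, 3)]
```

Under that reading, the two strongly dependent pairs (a1, a2) and (a3, a4) are kept, and the weakly weighted edge (a2, a3) connects them.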

  18. Distribution over Spanning Trees [Meilă and Jaakkola 00, 06] • Figure: complete graph over a1, ..., a4 with edge weights b12, b13, b14, b23, b24, b34 • O(d^3) !!!
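
The O(d^3) cost is consistent with the matrix-tree theorem, which gives the sum over all spanning trees of the product of edge weights as a single determinant. The sketch below assumes that is the computation behind the slide, and checks the determinant against brute-force enumeration on a small graph; nodes are numbered from 0 and all names are illustrative.

```python
from itertools import combinations


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def tree_sum_by_determinant(d, b):
    """Sum over spanning trees of the product of edge weights (matrix-tree theorem)."""
    lap = [[0.0] * d for _ in range(d)]
    for (i, j), w in b.items():
        lap[i][i] += w
        lap[j][j] += w
        lap[i][j] -= w
        lap[j][i] -= w
    minor = [row[1:] for row in lap[1:]]  # drop the first row and column
    return determinant(minor)


def tree_sum_by_enumeration(d, b):
    """Brute force: every acyclic set of d - 1 edges is a spanning tree."""
    total = 0.0
    for edges in combinations(b, d - 1):
        parent = list(range(d))

        def find(i):
            while parent[i] != i:
                i = parent[i]
            return i

        product, acyclic = 1.0, True
        for i, j in edges:
            ri, rj = find(i), find(j)
            if ri == rj:
                acyclic = False
                break
            parent[ri] = rj
            product *= b[(i, j)]
        if acyclic:
            total += product
    return total


# Hypothetical weights for a 4-node complete graph.
b = {(0, 1): 0.5, (0, 2): 1.5, (0, 3): 2.0, (1, 2): 0.7, (1, 3): 1.1, (2, 3): 0.3}
print(tree_sum_by_determinant(4, b), tree_sum_by_enumeration(4, b))
```

The enumeration visits all candidate edge subsets and is only feasible for tiny d, whereas the determinant needs cubic time in d.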

  19. Tree-Averaged Copula • Can compute sum over all d^(d-2) spanning trees • Can be viewed as a mixture over many, many spanning trees • Can use EM to estimate the parameters • Even though there are d^(d-2) mixture components!
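
A worked form of the sum, under the assumption that the distribution over trees factors over edges with weights b_ij as in [Meilă and Jaakkola 00, 06]. Here L*(w) denotes the graph Laplacian of a symmetric weight matrix w with its first row and column removed; this notation is chosen for the sketch and is not taken from the slides.

```latex
% Decomposable distribution over spanning trees; Z follows from the matrix-tree theorem.
P(T) = \frac{1}{Z} \prod_{(i,j) \in E_T} b_{ij},
\qquad
Z = \sum_{T} \prod_{(i,j) \in E_T} b_{ij} = \det L^{*}(b)

% Tree-averaged copula density: a mixture over all d^{d-2} trees,
% collapsed into a ratio of two determinants.
c(u) = \sum_{T} P(T) \prod_{(i,j) \in E_T} c_{ij}(u_i, u_j)
     = \frac{\det L^{*}\bigl(b \circ c(u)\bigr)}{\det L^{*}(b)},
\qquad
\bigl(b \circ c(u)\bigr)_{ij} = b_{ij}\, c_{ij}(u_i, u_j)
```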

  20. EM for Tree-Averaged Copulas • E-step: compute [formula not recoverable] • Can be done in O(d^3) per data point • M-step: update b and Q • Update of Q is often linear in the number of points • Gaussian copula: solving cubic equation • Update of b is essentially iterative scaling • Can be done in O(d^3) per iteration • Slide callout: Intractable!!!

  21. Experiments: Log-Likelihood on Test Data • UCI ML Repository MAGIC data set • 12000 10-dimensional vectors • 2000 examples in test sets • Average over 10 partitions

  22. Binary-Continuous Data

  23. Summary • Multivariate distribution = univariate marginals + copula • Copula density estimation via tree-averaging • Closed form • Tractable parameter estimation algorithm in ML framework (EM) • O(N d^3) per iteration • Only bivariate distributions at each estimation • Potentially avoiding the curse of dimensionality • New model for multi-site rainfall amounts (POSTER W12)
