
ICA and ISA Using Schweizer-Wolff Measure of Dependence

Sergey Kirshner and Barnabás Póczos, Alberta Ingenuity Centre for Machine Learning, Department of Computing Science, University of Alberta, Canada. ICML 2008 Poster Session II (Paper #551), July 8, 2008.


Presentation Transcript


  1. ICA and ISA Using Schweizer-Wolff Measure of Dependence. Sergey Kirshner and Barnabás Póczos, Alberta Ingenuity Centre for Machine Learning, Department of Computing Science, University of Alberta, Canada. ICML 2008 Poster Session II (Paper #551), July 8, 2008.

  2. Independent Component Analysis (ICA, the Cocktail Party Problem). Independent sources s are mixed into the observation x = As; ICA estimates a demixing matrix W so that y = Wx recovers the sources. [Figure: sources 1, 2, 3, …, d → mixing → observation → estimation]

  3. Independent Subspace Analysis (ISA, the Woodstock Problem). As in ICA, find W and recover the estimated sources Wx from the observation, but here the sources are independent only across groups (subspaces). [Figure: sources → observation → estimation]

  4. Solving ICA
  • Sources can be recovered only up to scale and permutation
  • ICA solution:
    • Remove the mean, E[y] = 0
    • Whiten, E[yy^T] = I
    • Find a multidimensional rotation optimizing an objective function, via a sequence of 2-d Jacobi (Givens) rotations (a whitening sketch follows below)
  [Figure: original, mixed, whitened, and rotated scatter plots]
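
A minimal Python/NumPy sketch of the centering and whitening steps (not the authors' code; the eigen-decomposition whitener is just one standard choice):

```python
import numpy as np

def whiten(x):
    """Center and whiten a d x N signal matrix so that E[y] = 0 and E[yy^T] = I."""
    x = x - x.mean(axis=1, keepdims=True)             # remove the mean
    cov = np.cov(x)                                    # d x d sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)             # assumes non-degenerate data
    whitener = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return whitener @ x                                # whitened signals
```

After whitening, only a rotation remains to be found; the search over Givens angles with the Schweizer-Wolff contrast is sketched after slide 18.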

  5. Objective Function / Contrast
  • Need to measure independence (ICA):
    • Mutual information (MI) estimation: Kernel-ICA [Bach & Jordan, 2002]
    • Entropy estimation: RADICAL [Learned-Miller & Fisher, 2003], FastICA [Hyvärinen, 1999]
    • ML estimation: KDICA [Chen, 2006]
    • Moment-based methods: JADE [Cardoso, 1993]
    • Correlation-based methods: [Jutten and Hérault, 1991]
  • Need to measure dependence (ISA):
    • Grouping signals recovered by ICA: Cardoso's conjecture [Cardoso, 1998]

  6. Contribution: New ICA Contrast
  • Uses a measure of dependence based on copulas, joint distributions over ranks
  • Properties:
    • Does not require density estimation
    • Non-parametric
  • Advantages:
    • Very robust to outliers
    • Frequently performs as well as state-of-the-art algorithms (and sometimes better)
    • Contrast can be used with ISA
    • Can be used to estimate dependence
    • Easy to implement (code publicly available)
  • Disadvantages:
    • Somewhat slow (although not prohibitively so)
    • Needs more samples to demix near-Gaussian sources

  7. Ranks

  8. Ranks
  • Invariant under monotonic transformations
  • Implied non-linearity
  • Not very sensitive to outliers
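
A tiny illustration of rank invariance under a strictly increasing transformation (not from the slides; uses scipy.stats.rankdata):

```python
import numpy as np
from scipy.stats import rankdata

x = np.array([0.3, -1.2, 2.5, 0.0, 7.1])
# exp() is strictly increasing, so it changes no orderings: the ranks are identical.
print(rankdata(x))          # [3. 1. 4. 2. 5.]
print(rankdata(np.exp(x)))  # [3. 1. 4. 2. 5.]
```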

  9. Probability Integral Transform
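
The slide's formula did not survive the transcript; the standard statement of the probability integral transform, which the rank/copula view relies on, is:

```latex
% If X is a continuous random variable with cdf F, then U = F(X) is
% uniformly distributed on [0,1]:
\Pr\bigl(F(X) \le u\bigr) = \Pr\bigl(X \le F^{-1}(u)\bigr) = F\bigl(F^{-1}(u)\bigr) = u,
\qquad u \in [0,1].
```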

  10. Distribution Over Ranks

  11. Copula. A bivariate copula C is a multivariate distribution (cdf) defined on the unit square with uniform univariate marginals (the defining conditions are written out below).
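
The conditions after the colon were lost in the transcript; spelled out, the uniform-marginal requirement reads:

```latex
% C is a cdf on [0,1]^2 whose univariate marginals are Uniform(0,1):
C(u, 1) = u, \qquad C(1, v) = v \qquad \text{for all } u, v \in [0,1].
```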

  12. Sklar's Theorem [Sklar, 59]: any bivariate joint cdf H with marginals F and G factors as H(x, y) = C(F(x), G(y)) for some copula C; that is, a joint distribution decomposes into its marginals plus a copula capturing the dependence.

  13. Useful Properties of Copulas
  • Preserve rank correlations between the variables
  • Preserve mutual information
  • Can be viewed as a canonical form of a multivariate distribution for the purpose of estimating multivariate dependence
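
Two standard identities behind the first two claims, not spelled out on the slide (for continuous variables with copula C and copula density c):

```latex
\rho_S(X,Y) \;=\; 12 \int_0^1\!\!\int_0^1 C(u,v)\, du\, dv \;-\; 3,
\qquad
I(X;Y) \;=\; \int_0^1\!\!\int_0^1 c(u,v)\, \log c(u,v)\, du\, dv .
```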

  14. Independence Copula
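
The formula itself is missing from the transcript; the independence (product) copula, the copula of a pair of independent random variables, is:

```latex
\Pi(u, v) = u\,v, \qquad (u, v) \in [0,1]^2 .
```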

  15. Schweizer-Wolff Measures of Dependence [Schweizer and Wolff, 81]
  • An Lp-norm distance between the copula of the distribution and the independence copula
  • Range is [0, 1]
  • 0 if and only if the variables are independent
  • Examples: σ (L1-norm) and κ (L∞-norm); formulas below
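
The formulas were lost in the transcript; the standard definitions, normalized so that both measures range over [0, 1], are:

```latex
\sigma(X,Y) \;=\; 12 \int_0^1\!\!\int_0^1 \bigl| C(u,v) - uv \bigr| \, du\, dv,
\qquad
\kappa(X,Y) \;=\; 4 \sup_{u,v \in [0,1]} \bigl| C(u,v) - uv \bigr| .
```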

  16. Empirical Copula [Deheuvels, 79]
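
The definition shown on the slide is not in the transcript; the usual rank-based empirical copula of a sample (x_1, y_1), ..., (x_N, y_N) is:

```latex
C_N\!\left(\tfrac{i}{N}, \tfrac{j}{N}\right)
\;=\;
\frac{1}{N} \sum_{k=1}^{N}
\mathbf{1}\!\bigl[\,\operatorname{rank}(x_k) \le i \ \text{and}\ \operatorname{rank}(y_k) \le j\,\bigr],
\qquad i, j = 1, \dots, N .
```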

  17. Using the Empirical Copula
  • Useful for computing sample versions of functions on copulas, such as a sample estimate of σ

  18. Schweizer-Wolff ICA (SWICA)
  • Inputs: X, a 2 x N matrix of signals; K, the number of evaluation angles
  • For θ = 0, π/(2K), …, (K−1)π/(2K):
    • Compute the rotation matrix W(θ)  — O(1)
    • Compute the rotated signals Y(θ) = W(θ)X  — O(N)
    • Compute σ(Y(θ)), a sample estimate of σ, either exactly (sort, O(N log N); compute σ, O(N^2)) or approximately (sort into B bins, O(N log B); compute a B-bin approximation to σ, O(N + B^2))
  • Find the best angle θ_m = argmin_θ σ(Y(θ))  — O(K)
  • Output: rotation matrix W = W(θ_m), demixed signal Y = Y(θ_m), and estimated dependence measure σ = σ(Y(θ_m))
  • Total cost: O(K N^2) exact, or O(K N log B) with binning
  (A Python sketch of the 2-d search follows below.)
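
Not the authors' released code; a minimal Python/NumPy sketch of the 2-d search, using a simple O(N^2) empirical-copula estimate of σ (the exact normalization constant, tie handling, and the B-bin approximation in the paper may differ):

```python
import numpy as np
from scipy.stats import rankdata

def sw_sigma(x, y):
    """Sample estimate of the Schweizer-Wolff sigma between two 1-d signals."""
    n = len(x)
    rx = rankdata(x, method="ordinal").astype(int) - 1   # 0-based ranks
    ry = rankdata(y, method="ordinal").astype(int) - 1
    grid = np.zeros((n, n))
    grid[rx, ry] = 1.0                                   # one point per (rank_x, rank_y) cell
    emp_copula = grid.cumsum(axis=0).cumsum(axis=1) / n  # C_N(i/n, j/n)
    u = np.arange(1, n + 1) / n
    indep = np.outer(u, u)                               # independence copula uv
    # 12 * mean absolute deviation approximates 12 * double integral of |C - uv|;
    # the paper's exact normalization may differ slightly.
    return 12.0 * np.abs(emp_copula - indep).mean()

def swica_2d(x, K=180):
    """Try K Givens angles in [0, pi/2) on whitened 2 x N signals and keep the
    rotation minimizing the sample Schweizer-Wolff sigma."""
    best_sigma, best_w = np.inf, np.eye(2)
    for theta in np.linspace(0.0, np.pi / 2, K, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        w = np.array([[c, -s], [s, c]])
        y = w @ x
        val = sw_sigma(y[0], y[1])
        if val < best_sigma:
            best_sigma, best_w = val, w
    return best_w, best_w @ x, best_sigma
```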

  19. Amari Error
  • Measures how close a square matrix is to a permutation matrix
  • Applied to B = WA, the product of the estimated demixing matrix W and the true mixing matrix A
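
The formula is not in the transcript; a commonly used form of the Amari error for a d x d matrix B = (b_ij) is (normalization conventions vary across papers):

```latex
r(B) \;=\; \frac{1}{2d(d-1)}
\left[
\sum_{i=1}^{d} \left( \frac{\sum_{j=1}^{d} |b_{ij}|}{\max_{j} |b_{ij}|} - 1 \right)
+
\sum_{j=1}^{d} \left( \frac{\sum_{i=1}^{d} |b_{ij}|}{\max_{i} |b_{ij}|} - 1 \right)
\right],
```

which equals 0 exactly when B is a scaled permutation matrix.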

  20. Synthetic Marginal Distributions [Bach and Jordan, 02]

  21. ICA Comparison (d = 2, N = 1000, 1000 repetitions)

  22. Outliers (d = 2, N = 1000, 10000 repetitions)

  23. Outliers: Multidimensional Comparison. [Figure: Amari error vs. number of post-whitening outliers; panels for d = 4, N = 2000, 1000 repetitions and d = 8, N = 5000, 100 repetitions]

  24. Unmixing Image Sources with Outliers (3% outliers). [Figure: Original, Mixed, SWICA, and FastICA image panels; images from http://www.cis.hut.fi/projects/ica/data/images/]

  25. Independent Subspace Analysis

  26. Summary
  • New contrast based on a measure of dependence for distributions over ranks
  • Robust to outliers
  • Comparable performance to state-of-the-art algorithms
  • Can handle a moderate number of sources (d = 20)
  • Can be used to solve ISA
  • Future work:
    • Further acceleration of SWICA
    • What types of sources does SWICA do well on, and why?
    • Non-parametric estimation of mutual information

  27. Software: http://www.cs.ualberta.ca/~sergey/SWICA
