Bregman Information Bottleneck • Koby Crammer, Hebrew University of Jerusalem • Noam Slonim, Princeton University • NIPS’03, Whistler, December 2003
Motivation • Extend the IB for a broad family of representations • Relation to the exponential family • [Diagram labels: Multinomial distribution, Vectors]
Outline • Rate-Distortion Formulation • Bregman Divergences • Bregman IB • Statistical Interpretation • Summary
Information Bottleneck • Variables: X, T, Y • X is represented by the vector [p(y=1|X) … p(y=n|X)] • T is represented by the vector [p(y=1|T) … p(y=n|T)]
Rate-Distortion Formulation • Input • Variables • Distortion
Self-Consistent Equations • Boltzmann Distribution: • Markov + Bayes • Marginal
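For reference, the three labels on this slide match the standard IB self-consistent equations. The notation below ($p(t|x)$, $Z(x,\beta)$, $\beta$) is the usual one from the IB literature and is an assumption here, not a transcription of the slide:

```latex
% Boltzmann distribution (assignment of x to cluster t)
p(t \mid x) = \frac{p(t)}{Z(x,\beta)}
              \exp\!\Big(-\beta\, D_{\mathrm{KL}}\big[\,p(y \mid x)\,\big\|\,p(y \mid t)\,\big]\Big)

% Markov + Bayes (cluster representative)
p(y \mid t) = \frac{1}{p(t)} \sum_{x} p(x)\, p(t \mid x)\, p(y \mid x)

% Marginal
p(t) = \sum_{x} p(x)\, p(t \mid x)
```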
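Bregman Divergences • $f: S \to \mathbb{R}$ • $B_f(v\|u) = f(v) - \big(f(u) + f'(u)(v-u)\big)$ • [Figure: graph of $f$ with the points $(v, f(v))$, $(u, f(u))$ and $(v,\, f(u) + f'(u)(v-u))$ marked; $B_f(v\|u)$ is the vertical gap at $v$ between $f$ and its tangent at $u$]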
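The definition above can be checked numerically. The following is a minimal sketch, assuming the vector form $B_f(v\|u) = f(v) - f(u) - \langle \nabla f(u), v-u \rangle$; the helper names (`bregman`, `neg_entropy`, `half_sq`) are illustrative and not from the talk.

```python
import math

def bregman(f, grad_f, v, u):
    """B_f(v||u) = f(v) - f(u) - <grad f(u), v - u> for equal-length vectors."""
    inner = sum(g * (vi - ui) for g, vi, ui in zip(grad_f(u), v, u))
    return f(v) - f(u) - inner

# f(x) = sum_i (x_i log x_i - x_i); assumes strictly positive entries
def neg_entropy(x):
    return sum(xi * math.log(xi) - xi for xi in x)

def neg_entropy_grad(x):
    return [math.log(xi) for xi in x]

# f(x) = (1/2) ||x||^2
def half_sq(x):
    return 0.5 * sum(xi * xi for xi in x)

def half_sq_grad(x):
    return list(x)

v = [0.8, 0.2]  # two points on the simplex (illustrative values)
u = [0.5, 0.5]

# Direct KL divergence, for comparison with the Bregman form
kl = sum(vi * math.log(vi / ui) for vi, ui in zip(v, u))
d_kl = bregman(neg_entropy, neg_entropy_grad, v, u)
d_sq = bregman(half_sq, half_sq_grad, v, u)
```

On the simplex `d_kl` agrees with `kl`, and `d_sq` is half the squared Euclidean distance between `v` and `u`.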
Bregman IB: Rate-Distortion Formulation • Functional • Bregman Function • Input • Variables • Distortion
Self-Consistent Equations • Boltzmann Distribution: • Prototypes: convex combination of input vectors • Marginal
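One way to read the three equations is as a fixed-point iteration. The sketch below assumes the natural Bregman analogue of the IB updates, namely $p(t|x) \propto p(t)\exp(-\beta B_f(v_x\|u_t))$, $p(t) = \sum_x p(x)p(t|x)$ and $u_t = \sum_x p(x|t)\,v_x$. The function `bregman_ib_sweep`, the toy data and the choice of squared Euclidean distortion are all illustrative assumptions, not the authors' algorithm.

```python
import math

def half_sq_dist(v, u):
    """Bregman divergence of f(x) = (1/2)||x||^2 (example distortion)."""
    return 0.5 * sum((vi - ui) ** 2 for vi, ui in zip(v, u))

def bregman_ib_sweep(vectors, p_x, prototypes, p_t, beta, divergence):
    """One pass over the three self-consistent equations (hypothetical helper)."""
    # Boltzmann step: p(t|x) proportional to p(t) * exp(-beta * B_f(v_x || u_t))
    p_t_given_x = []
    for v in vectors:
        w = [pt * math.exp(-beta * divergence(v, u))
             for pt, u in zip(p_t, prototypes)]
        z = sum(w)  # partition function for this x
        p_t_given_x.append([wi / z for wi in w])
    n_clusters = len(p_t)
    # Marginal step: p(t) = sum_x p(x) p(t|x)
    new_p_t = [sum(px * row[t] for px, row in zip(p_x, p_t_given_x))
               for t in range(n_clusters)]
    # Prototype step: u_t = sum_x p(x|t) v_x, a convex combination of inputs
    dim = len(vectors[0])
    new_prototypes = []
    for t in range(n_clusters):
        weights = [px * row[t] / new_p_t[t]
                   for px, row in zip(p_x, p_t_given_x)]
        new_prototypes.append(
            [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
        )
    return p_t_given_x, new_prototypes, new_p_t

# Toy data: two well-separated groups of 2-D points (illustrative values)
vectors = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
p_x = [0.25] * 4
prototypes = [[0.0, 0.1], [1.0, 0.9]]
p_t = [0.5, 0.5]
for _ in range(20):
    q, prototypes, p_t = bregman_ib_sweep(
        vectors, p_x, prototypes, p_t, 20.0, half_sq_dist
    )
```

With this distortion the sweep is a soft k-means update; swapping `half_sq_dist` for a KL-based divergence on simplex vectors would give the IB-style update instead.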
Special Cases • Information Bottleneck: • Bregman function: $f(x) = x\log(x) - x$ • Domain: Simplex • Divergence: Kullback-Leibler • Soft K-means: • Bregman function: $f(x) = \tfrac{1}{2}x^2$ • Domain: $\mathbb{R}^n$ • Divergence: Euclidean distance • [Still, Bialek, Bottou, NIPS 2003]
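Both special cases follow by substituting the Bregman function into the definition of $B_f$. A short worked derivation, assuming the functions are applied coordinate-wise and summed over coordinates $i$:

```latex
% Case 1: f(x) = \sum_i (x_i \log x_i - x_i), so \nabla f(u)_i = \log u_i
B_f(v \,\|\, u)
  = \sum_i \big( v_i \log v_i - v_i \big)
  - \sum_i \big( u_i \log u_i - u_i \big)
  - \sum_i \log u_i \,(v_i - u_i)
  = \sum_i \Big( v_i \log \frac{v_i}{u_i} - v_i + u_i \Big)
% On the simplex, \sum_i v_i = \sum_i u_i = 1, so the last two terms cancel:
  = D_{\mathrm{KL}}(v \,\|\, u)

% Case 2: f(x) = \tfrac{1}{2}\|x\|^2, so \nabla f(u) = u
B_f(v \,\|\, u)
  = \tfrac{1}{2}\|v\|^2 - \tfrac{1}{2}\|u\|^2 - \langle u,\, v - u \rangle
  = \tfrac{1}{2}\|v - u\|^2
```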
[Diagram relating: Bregman IB, Rate-Distortion, Exponential Family, Bregman Clustering, Information Bottleneck]
Exponential Family • Expectation parameters: • Examples (single dimension): Normal, Poisson
Exponential Family and Bregman Divergences • Expectation parameters: • Properties :
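The link between the two objects can be made concrete for the slide's two single-dimension examples. The worked equations below are a sketch under the assumption that the intended property is that the negative log-likelihood equals a Bregman divergence to the expectation parameter $\mu$, plus a term that does not depend on $\mu$:

```latex
% Poisson with mean \mu, and f(\mu) = \mu \log \mu - \mu
-\log p(x;\mu) = \mu - x \log \mu + \log x!
               = \underbrace{x \log \frac{x}{\mu} - x + \mu}_{B_f(x \,\|\, \mu)}
               + \underbrace{\log x! - x \log x + x}_{\text{independent of } \mu}
% (with the convention 0 \log 0 = 0 for x = 0)

% Normal with mean \mu and unit variance, and f(\mu) = \tfrac{1}{2}\mu^2
-\log p(x;\mu) = \underbrace{\tfrac{1}{2}(x - \mu)^2}_{B_f(x \,\|\, \mu)}
               + \tfrac{1}{2}\log(2\pi)
```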
Back to Distributional Clustering • Distortion: • Data vectors and prototypes: expectation parameters • Question: For which exponential-family distribution is the associated Bregman divergence the KL divergence? • Answer: Poisson
Illustration • [Figure: panels "Product of Poisson Distributions" and "Multinomial Distribution" over the symbols a and b; values shown include .8 / .2 and 60 / 40, alongside sample sequences of a's and b's]
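A standard fact that may be what this figure illustrates (an assumption, since the figure itself is not recoverable): independent Poisson counts, conditioned on their total, follow a multinomial distribution whose probabilities are the normalized rates. A minimal two-symbol check, with illustrative rates that are not taken from the slide:

```python
import math

def poisson_pmf(k, lam):
    """P(K = k) for a Poisson variable with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def binomial_pmf(k, n, p):
    """Two-symbol multinomial: P(k of n draws are 'a') with P(a) = p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

lam_a, lam_b = 8.0, 2.0  # illustrative Poisson rates for symbols a and b
n = 10                   # condition on a total of n symbols

# The sum of independent Poissons is Poisson with the summed rate
p_total = poisson_pmf(n, lam_a + lam_b)

# P(count_a = k, count_b = n - k | total = n) under the product of Poissons
conditional = [
    poisson_pmf(k, lam_a) * poisson_pmf(n - k, lam_b) / p_total
    for k in range(n + 1)
]

# Multinomial (here binomial) with probabilities proportional to the rates
multinomial = [
    binomial_pmf(k, n, lam_a / (lam_a + lam_b)) for k in range(n + 1)
]
```

The two lists agree entry by entry up to floating-point error.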
Back to Distributional Clustering • Information Bottleneck: • Distributional clustering of Poisson distributions • (Soft) k-means: • (Soft) clustering of Normal distributions
Maximum Likelihood Perspective • Distortion • Input: • Observations • Output • Parameters of Distribution • IB functional: EM [Elidan & Friedman, before]
Back to Self-Consistent Equations • Posterior: • Partition Function: Weighted β-norm of the Likelihood • β → ∞, most likely cluster governs • β → 0, clusters collapse into a single prototype
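The two limits can be seen directly from the Boltzmann form of the posterior. The sketch below assumes $p(t|x) \propto p(t)\exp(-\beta d_t)$, where $d_t$ is the distortion between $x$ and prototype $t$; the function `posterior` and the numbers are illustrative only.

```python
import math

def posterior(p_t, distortions, beta):
    """p(t|x) proportional to p(t) * exp(-beta * d_t), for one input x."""
    d_min = min(distortions)  # shift for numerical stability; cancels in the ratio
    w = [pt * math.exp(-beta * (d - d_min)) for pt, d in zip(p_t, distortions)]
    z = sum(w)
    return [wi / z for wi in w]

p_t = [0.3, 0.7]          # cluster priors (illustrative)
distortions = [0.1, 0.4]  # cluster 0 is the closer one for this x

small_beta = posterior(p_t, distortions, 1e-9)  # close to the prior p(t)
large_beta = posterior(p_t, distortions, 1e3)   # close to a hard assignment
```

For small β the posterior is nearly the prior regardless of `x`, so every prototype is pulled toward the same weighted mean of the inputs. For large β essentially all mass goes to the lowest-distortion cluster, even though its prior here is the smaller one.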
Summary • Bregman Information Bottleneck • Clustering/Compression for many representations and divergences • Statistical Interpretation • Clustering of distributions from the exponential family • EM-like formulation • Current Work: • Algorithms • Characterize distortion measures which also yield Boltzmann distributions • General distortion measures