
Information-Theoretic Listening



Presentation Transcript


  1. Information-Theoretic Listening Paris Smaragdis Machine Listening Group MIT Media Lab

  2. Outline • Defining a global goal for computational audition • Example 1: Developing a representation • Example 2: Developing grouping functions • Conclusions

  3. Auditory Goals • The goals of computational audition are all over the place; should they be? • Lack of formal rigor in most theories • Computational listening amounts to fitting psychoacoustic experiment data

  4. Auditory Development • What really made audition? • How did our hearing evolve? • How did our environment shape our hearing? • Can we evolve, rather than instruct, a machine to listen?

  5. Goals of our Sensory System • Distinguish independent events • Object formation • Gestalt grouping • Minimize thinking and effort • Perceive as few objects as possible • Think as little as possible

  6. Entropy Minimization as a Sensory Goal • Long history between entropy and perception • Barlow, Attneave, Atick, Redlich, etc. • Entropy can measure statistical dependencies • Entropy can measure economy • in both ‘thought’ (algorithmic entropy) • and ‘information’ (Shannon entropy)

  7. What is Entropy? • Shannon Entropy: H(X) = -Σᵢ P(xᵢ) log P(xᵢ) • A measure of: • Order • Predictability • Information • Correlations • Simplicity • Stability • Redundancy • ... • High entropy = Little order • Low entropy = Lots of order
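
A minimal sketch of this definition, estimating Shannon entropy from a histogram of signal samples. The bin count, log base, and test signals are illustrative choices, and a first-order sample histogram ignores temporal structure:

```python
# Minimal sketch: Shannon entropy H(X) = -sum_i P(x_i) log2 P(x_i),
# estimated from a histogram of signal samples. Bin count and test
# signals are illustrative choices, not from the talk.
import numpy as np

def shannon_entropy(x, bins=64):
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / counts.sum()   # empirical probabilities, empty bins dropped
    return -np.sum(p * np.log2(p))          # 0 log 0 treated as 0

t = np.linspace(0, 1, 8000)
sine = np.sin(2 * np.pi * 440 * t)          # lots of order -> lower entropy
noise = np.random.randn(t.size)             # little order -> higher entropy
print(shannon_entropy(sine), shannon_entropy(noise))
```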

  8. Representation in Audition • Frequency decompositions • Cochlear hint • Easier to look at data! • Sinusoidal bases • Signal processing framework

  9. Evolving a Representation • Develop a basis decomposition • Bases should be statistically independent • Satisfies the minimal-entropy idea • Decomposition should be data-driven • Accounts for different domains

  10. Method • Use short snippets of natural sounds to derive bases • Analyze these snippets with ICA (see the sketch below)
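
A minimal sketch of this method, assuming scikit-learn's FastICA as the ICA algorithm and fixed-length, non-overlapping frames as the sound snippets; the talk specifies neither, and random noise stands in here for natural audio:

```python
# Minimal sketch of slide 10: slice sound into short frames, run ICA,
# and read the unmixing directions as basis functions. FastICA, the
# frame length, and the basis count are my assumptions.
import numpy as np
from sklearn.decomposition import FastICA

def derive_bases(audio, frame_len=256, n_bases=32):
    n_frames = len(audio) // frame_len
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    ica = FastICA(n_components=n_bases, max_iter=500, random_state=0)
    ica.fit(frames)                 # frames are the observations
    return ica.components_          # rows: statistically independent bases

audio = np.random.randn(160_000)    # stand-in: load natural sound here
bases = derive_bases(audio)
print(bases.shape)                  # (32, 256); on natural sounds these
                                    # come out sinusoid-like (slide 11)
```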

  11. Results • We obtain sinusoidal bases! • Transform is driven by the environment • Uniform procedure for different domains

  12. Auditory Grouping [figure: good continuation, common AM, common FM] • Heuristics • Hard to implement on computers • Require even more heuristics to resolve ambiguity • Weak definitions • Bootstrapped to individual domains • Vision Gestalt → Auditory Gestalt → …

  13. Method • Goal: Find the grouping that minimizes scene entropy • Pipeline (sketched below): Parameterized auditory scene s(t,n) → Density estimation P_s(i) → Shannon entropy calculation
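
A minimal sketch of this pipeline; combining the per-group entropies by summation is an assumption, since the slide only names the stages:

```python
# Minimal sketch of slide 13: parameterized scene s(t, n) -> density
# estimation P_s(i) -> Shannon entropy. Summing per-group entropies is
# my assumption; the talk only names the overall pipeline.
import numpy as np

def entropy(x, bins=64):
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / counts.sum()        # density estimation P_s(i)
    return -np.sum(p * np.log2(p))               # Shannon entropy

def scene_entropy(tracks, labels):
    """tracks: (n, t) array of sinusoid signals; labels: group id per track."""
    return sum(
        entropy(tracks[labels == g].sum(axis=0))  # fuse each group into one object
        for g in np.unique(labels)
    )
```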

  14. Common Modulation - Frequency • Scene Description: [figure: frequency vs. time] • Entropy Measurement: [figure: n = 0.5]

  15. Common Modulation - Amplitude • Scene Description: [figure: Sine 1 and Sine 2 amplitude vs. time] • Entropy Measurement: [figure: n = 0.5]

  16. Common Modulation - Onset/Offset • Scene Description: [figure: Sine 1 and Sine 2 amplitude vs. time] • Entropy Measurement: [figure: n = 0.5]

  17. Similarity/Proximity - Harmonicity I • Scene Description: [figure: frequency vs. time] • Entropy Measurement: [figure]

  18. Similarity/Proximity - Harmonicity II • Scene Description: [figure: frequency vs. time] • Entropy Measurement: [figure]

  19. Simple Scene Analysis Example • Simple scene: • 5 Sinusoids • 2 Groups • Simulated Annealing Algorithm (see the sketch below) • Input: Raw sinusoids • Goal: Entropy minimization • Output: Expected grouping
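
A minimal sketch of this experiment. The scene construction (two shared AM envelopes), the move set, and the cooling schedule are illustrative assumptions; the entropy cost repeats the earlier sketch so the snippet runs on its own:

```python
# Minimal sketch of slide 19: 5 sinusoid tracks in 2 common-AM groups,
# with simulated annealing over group labels and scene entropy as the
# cost. Scene construction, moves, and schedule are my assumptions.
import numpy as np

def entropy(x, bins=64):                      # as in the earlier sketch
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

def scene_entropy(tracks, labels):
    return sum(entropy(tracks[labels == g].sum(axis=0))
               for g in np.unique(labels))

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 4000)
env_a = 1 + 0.5 * np.sin(2 * np.pi * 3 * t)   # envelope shared by group A
env_b = 1 + 0.5 * np.sin(2 * np.pi * 7 * t)   # envelope shared by group B
freqs = [440, 880, 1320, 600, 1200]           # first 3 tracks -> A, last 2 -> B
tracks = np.array([(env_a if i < 3 else env_b) * np.sin(2 * np.pi * f * t)
                   for i, f in enumerate(freqs)])

labels = rng.integers(0, 2, size=len(tracks)) # random initial grouping
cost, temp = scene_entropy(tracks, labels), 1.0
for _ in range(2000):
    cand = labels.copy()
    cand[rng.integers(len(tracks))] ^= 1      # move: flip one track's group
    c = scene_entropy(tracks, cand)
    if c < cost or rng.random() < np.exp((cost - c) / temp):
        labels, cost = cand, c
    temp *= 0.995                             # geometric cooling
print(labels)  # expectation per the talk: common-envelope tracks group together
```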

  20. Important Notes • No definition of time • Developed a concept of frequency • No parameter estimation requirement • Operations on data, not parameters • No parameter setting!

  21. Conclusions • Elegant and consistent formulation • No constraints on the data representation • Uniform over different domains (Cross-modal!) • No parameter estimation • No parameter tuning! • Biological plausibility • Barlow et al. • Insight into perceptual development

  22. Future Work • Good Cost Function? • Joint entropy vs entropy of sums • Shannon entropy vs Kolmogorov complexity • Joint-statistics (cumulants, moments) • Incorporate time • Sounds have time dependencies I’m ignoring • Generalize to include perceptual functions

  23. Teasers • Dissonance and Entropy • Pitch Detection • Instrument Recognition
