Explore the intersection of information theory and auditory perception in the context of computational audition with a focus on entropy minimization as a sensory goal. Analyze the evolution of our hearing, the role of environment, and the potential to evolve machines for listening. Discover how entropy measurement can enhance representation in audition and develop groupings for better scene analysis. Delve into modeling auditory scene density, calculating Shannon entropy, and refining frequency and time modulation for sensory processing. Gain insights into future directions and challenges in the field.
Information-Theoretic Listening • Paris Smaragdis • Machine Listening Group, MIT Media Lab
Outline • Defining a global goal for computational audition • Example 1: Developing a representation • Example 2: Developing grouping functions • Conclusions
Auditory Goals • Goals of computational audition are all over the place; should they be? • Lack of formal rigor in most theories • Computational listening is fitting psychoacoustic experiment data
Auditory Development • What really made audition? • How did our hearing evolve? • How did our environment shape our hearing? • Can we evolve, rather than instruct, a machine to listen?
Goals of our Sensory System • Distinguish independent events • Object formation • Gestalt grouping • Minimize thinking and effort • Perceive as few objects as possible • Think as little as possible
Entropy Minimization as a Sensory Goal • Long history linking entropy and perception • Barlow, Attneave, Atick, Redlich, etc. • Entropy can measure statistical dependencies • Entropy can measure economy • in both ‘thought’ (algorithmic entropy) • and ‘information’ (Shannon entropy)
What is Entropy? • Shannon Entropy: $H(X) = -\sum_i P(x_i) \log P(x_i)$ • A measure of: • Order • Predictability • Information • Correlations • Simplicity • Stability • Redundancy • ... • High entropy = Little order • Low entropy = Lots of order
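A minimal sketch of the "high entropy = little order" reading, using only the Python standard library. The function name `shannon_entropy` and the toy symbol strings are illustrative assumptions, not part of the talk; the entropy is computed from the empirical symbol frequencies and reported in bits.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy, in bits, of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # Each symbol contributes p * log2(1 / p), with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Toy sequences (hypothetical): more distinct, evenly used symbols means less order.
sequences = {
    "ordered": "aaaaaaaa",
    "two symbols": "abababab",
    "four symbols": "abcdabcd",
}
for name, seq in sequences.items():
    print(name, shannon_entropy(seq))
```

Note that this estimate looks only at how often each symbol occurs, not at the order in which symbols arrive, so it ignores temporal structure.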
Representation in Audition • Frequency decompositions • Cochlear hint • Easier to look at data! • Sinusoidal bases • Signal processing framework
Evolving a Representation • Develop a basis decomposition • Bases should be statistically independent • Satisfaction of minimal entropy idea • Decomposition should be data driven • Account for different domains
Method • Use bits of natural sounds to derive bases • Analyze these bits with ICA
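The two steps on this slide can be sketched as follows, under stated assumptions: the slide does not say which ICA algorithm, window length, or sounds were used, so this sketch swaps in a symmetric FastICA with a tanh nonlinearity, a 16-sample window, and a synthetic signal standing in for natural sound. All function names are hypothetical, and NumPy is assumed to be available. The sketch shows the mechanics only and makes no claim about the shape of the resulting bases.

```python
import numpy as np

def frame_signal(x, width, hop):
    """Cut a 1-D signal into overlapping windows, one window per row."""
    starts = range(0, len(x) - width + 1, hop)
    return np.stack([x[s:s + width] for s in starts])

def whiten(frames):
    """Center the windows and rescale them to identity covariance."""
    centered = frames - frames.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    keep = eigvals > 1e-10 * eigvals.max()      # drop numerically empty directions
    whitener = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return centered @ whitener, whitener

def sym_decorrelate(w):
    """Make the rows of w orthonormal: w <- (w w^T)^(-1/2) w."""
    vals, vecs = np.linalg.eigh(w @ w.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ w

def fast_ica(z, n_iter=200, seed=0):
    """Symmetric FastICA (tanh nonlinearity) on whitened data z of shape (n, d)."""
    rng = np.random.default_rng(seed)
    n, d = z.shape
    w = sym_decorrelate(rng.standard_normal((d, d)))
    for _ in range(n_iter):
        g = np.tanh(z @ w.T)                    # nonlinearity on each projection
        g_prime_mean = (1.0 - g ** 2).mean(axis=0)
        # Row i: E[z g(w_i . z)] - E[g'(w_i . z)] w_i
        w = sym_decorrelate((g.T @ z) / n - np.diag(g_prime_mean) @ w)
    return w

# Synthetic stand-in for a natural sound (assumption): sparse events through a decaying resonance.
rng = np.random.default_rng(0)
kernel = np.exp(-np.arange(32) / 6.0) * np.sin(2 * np.pi * np.arange(32) / 8.0)
sound = np.convolve(rng.laplace(size=8000), kernel, mode="same")
sound = sound + 0.01 * rng.standard_normal(8000)

frames = frame_signal(sound, width=16, hop=8)   # the "bits" of sound
z, whitener = whiten(frames)
w = fast_ica(z)
filters = w @ whitener.T                        # one learned analysis filter per row
components = z @ w.T                            # filter outputs for every window
```

Whitening first means the unmixing matrix only has to be a rotation, which is why the update re-orthonormalizes `w` after every step.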
Results • We obtain sinusoidal bases! • Transform is driven by the environment • Uniform procedure for different domains
Auditory Grouping • Heuristics • Hard to implement on computers • Require even more heuristics to resolve ambiguity • Weak definitions • Bootstrapped to individual domains • Vision Gestalt Auditory Gestalt … [Figure panels: Good Continuation, Common AM, Common FM]
Method • Goal: Find grouping that minimizes scene entropy • Pipeline: Parameterized Auditory Scene s(t,n) → Density Estimation Ps(i) → Shannon Entropy Calculation
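The three stages named on this slide (parameterized scene, density estimation, Shannon entropy) can be sketched as below. Everything specific here is an assumption for illustration: the slide does not define how n enters s(t,n), so the hypothetical `amplitude_scene` lets n blend the modulation of a second sine from fully shared (n = 0) to unrelated (n = 1); the density is estimated with a plain 2-D histogram; and the entropy is the joint entropy of the two amplitude tracks in bits. NumPy is assumed to be available.

```python
import numpy as np

def amplitude_scene(t, n):
    """Hypothetical s(t, n): amplitude modulation tracks of two sines.

    n = 0 gives sine 2 the same modulation as sine 1;
    n = 1 gives it an unrelated modulation.
    """
    shared = np.sin(2 * np.pi * 3.0 * t)
    other = np.sin(2 * np.pi * 5.3 * t)
    return shared, (1.0 - n) * shared + n * other

def joint_density(amp1, amp2, bins=16):
    """Histogram estimate of the joint density of the two tracks."""
    counts, _, _ = np.histogram2d(amp1, amp2, bins=bins, range=[[-1, 1], [-1, 1]])
    return counts / counts.sum()

def shannon_entropy_bits(p):
    """Shannon entropy, in bits, of a discrete density (empty bins are skipped)."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

t = np.linspace(0.0, 10.0, 20000, endpoint=False)
entropy_by_n = {
    n: shannon_entropy_bits(joint_density(*amplitude_scene(t, n)))
    for n in (0.0, 0.5, 1.0)
}
for n, h in entropy_by_n.items():
    print(f"n = {n}: {h:.2f} bits")
```

In this toy setup the joint entropy is lowest when both sines share one modulation, because all samples then fall on the diagonal of the joint histogram.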
Common Modulation - Frequency • Entropy Measurement: • Scene Description: [Figure: axes Frequency and Time; n = 0.5]
Common Modulation - Amplitude • Entropy Measurement: • Scene Description: [Figure: Sine 1 Amplitude and Sine 2 Amplitude over Time; n = 0.5]
Common Modulation - Onset/Offset • Entropy Measurement: • Scene Description: [Figure: Sine 1 Amplitude and Sine 2 Amplitude over Time; n = 0.5]
Similarity/Proximity - Harmonicity I • Entropy Measurement: • Scene Description: [Figure: axes Frequency and Time]
Similarity/Proximity - Harmonicity II • Entropy Measurement: • Scene Description: [Figure: axes Frequency and Time]
Simple Scene Analysis Example • Simple scene: • 5 Sinusoids • 2 Groups • Simulated Annealing Algorithm • Input: Raw sinusoids • Goal: Entropy minimization • Output: Expected grouping
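A rough sketch of this kind of experiment, with several loud assumptions: the slide gives neither the cost function nor the annealing schedule, so this version represents each of the 5 sinusoids by a synthetic amplitude track, scores a two-group labelling by the sum of per-group entropies under a fitted Gaussian (a swapped-in estimator, not necessarily the one used in the talk), and uses a geometric cooling schedule. All names and constants are hypothetical, and NumPy is assumed to be available.

```python
import math
import numpy as np

def gaussian_entropy(rows):
    """Entropy (nats) of a Gaussian fitted to the given tracks, one track per row."""
    cov = np.atleast_2d(np.cov(rows))
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (cov.shape[0] * math.log(2 * math.pi * math.e) + logdet)

def scene_cost(tracks, labels):
    """Scene entropy for a labelling: sum of the entropies of the two groups."""
    return sum(gaussian_entropy(tracks[labels == g]) for g in (0, 1))

def anneal_grouping(tracks, n_steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    """Simulated annealing over two-group labellings; returns the best one seen."""
    rng = np.random.default_rng(seed)
    n = tracks.shape[0]
    labels = rng.integers(0, 2, size=n)
    while labels.min() == labels.max():          # start with both groups non-empty
        labels = rng.integers(0, 2, size=n)
    cost = scene_cost(tracks, labels)
    best, best_cost = labels.copy(), cost
    for step in range(n_steps):
        temp = t_start * (t_end / t_start) ** (step / (n_steps - 1))
        trial = labels.copy()
        trial[rng.integers(n)] ^= 1              # move one sinusoid to the other group
        if trial.min() == trial.max():
            continue                             # reject: a group would become empty
        trial_cost = scene_cost(tracks, trial)
        if trial_cost < cost or rng.random() < math.exp((cost - trial_cost) / temp):
            labels, cost = trial, trial_cost
            if cost < best_cost:
                best, best_cost = labels.copy(), cost
    return best, best_cost

# Synthetic scene (assumption): tracks 0-2 share one modulation, tracks 3-4 share another.
rng = np.random.default_rng(1)
mod_a, mod_b = rng.standard_normal((2, 4000))
tracks = np.stack([mod_a, mod_a, mod_a, mod_b, mod_b])
tracks = tracks + 0.1 * rng.standard_normal(tracks.shape)

grouping, cost = anneal_grouping(tracks)
print(grouping)
```

The summed group entropy is smallest when the two groups are statistically independent of each other, so minimizing it pulls co-modulated tracks into the same group. Requiring both groups to be non-empty rules out the trivial answer of one big group.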
Important Notes • No definition of time • Developed a concept of frequency • No parameter estimation requirement • Operations on data not parameters • No parameter setting!
Conclusions • Elegant and consistent formulation • No constraint over data representation • Uniform over different domains (Cross-modal!) • No parameter estimation • No parameter tuning! • Biological plausibility • Barlow et al ... • Insight into perception development
Future Work • Good Cost Function? • Joint entropy vs entropy of sums • Shannon entropy vs Kolmogorov complexity • Joint statistics (cumulants, moments) • Incorporate time • Sounds have time dependencies I’m ignoring • Generalize to include perceptual functions
Teasers • Dissonance and Entropy • Pitch Detection • Instrument Recognition