LAM: Musical Audio Similarity Michael Casey Centre for Cognition, Computation and Culture Department of Computing Goldsmiths College, University of London
Overview • Machine Music Understanding • Features / Classes / Clusters • Real-Time Audio Matching • Feature Extraction • Feature Similarity (Indexing / Retrieval) • PD/MSP Tools • Music Similarity Applications • Sound object matching • Texture matching
Sound Understanding Signal Processing Sound Understanding
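Statistical Learning for Decision Making • Partitioning of feature space • Bayes' rule: $P(c \mid x) = \frac{p(x \mid c)\,P(c)}{p(x)}$, with $c$ the class (music or speech) and $x$ the feature vector • Decision boundary between Music and Speech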
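The Bayes rule on this slide can be sketched as a two-class decision. The one-dimensional Gaussian class-conditional densities, their parameters, and the equal priors below are illustrative assumptions, not values from the talk.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical class models: (mean, std) of one scalar feature, plus priors.
CLASS_MODELS = {"music": (0.0, 1.0), "speech": (3.0, 1.0)}
PRIORS = {"music": 0.5, "speech": 0.5}

def posteriors(x):
    """P(class | x) = p(x | class) * P(class) / p(x)."""
    joint = {c: gaussian_pdf(x, *CLASS_MODELS[c]) * PRIORS[c] for c in CLASS_MODELS}
    evidence = sum(joint.values())  # p(x), the normalising term
    return {c: j / evidence for c, j in joint.items()}

def classify(x):
    """Pick the class with the largest posterior."""
    post = posteriors(x)
    return max(post, key=post.get)
```

With equal priors and equal variances, the decision boundary in this toy setup lies midway between the two class means.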
MPEG-7 Audio Tools Audio
MPEG-7 Audio Tools Log Frequency Spectrogram Audio AudioSpectrumEnvelopeD
MPEG-7 Audio Tools Decorrelating Transform / Dimension Reduction Log Frequency Spectrogram Log Amplitude Audio AudioSpectrumEnvelopeD AudioSpectrumProjectionD
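The first two stages of this chain (log-frequency spectrum, then log amplitude) can be sketched for a single frame as below. This is a rough illustration, not the normative MPEG-7 extraction: the function names, band edges, and resolution are assumptions, and the decorrelating transform is left out.

```python
import math

def log_band_edges(lo_hz, hi_hz, bands_per_octave):
    """Band edges spaced logarithmically between lo_hz and hi_hz."""
    n_bands = int(round(math.log2(hi_hz / lo_hz) * bands_per_octave))
    return [lo_hz * 2.0 ** (i / bands_per_octave) for i in range(n_bands + 1)]

def log_spectrum_envelope(power_spectrum, sample_rate, edges, floor=1e-12):
    """Sum linear-frequency power bins into log-spaced bands, then convert to dB.

    power_spectrum holds bins 0..N/2 of an N-point FFT for one frame.
    """
    n_bins = len(power_spectrum)
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)  # spacing between FFT bins
    envelope = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        energy = sum(p for k, p in enumerate(power_spectrum) if lo <= k * bin_hz < hi)
        envelope.append(10.0 * math.log10(max(energy, floor)))  # log amplitude
    return envelope
```

Stacking one such envelope per frame gives a log-frequency spectrogram, which a decorrelating transform could then reduce to a few dimensions.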
SoundModelStatePathD Use estimated state sequence as a feature State Path
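A minimal sketch of estimating a state sequence, assuming Viterbi decoding over a hidden Markov model whose log-probabilities are already known. The slide does not say which decoder is used, so the function and its argument layout are illustrative.

```python
def viterbi(log_start, log_trans, log_emit):
    """Most likely state path for one observation sequence.

    log_start[s]    : log P(first state = s)
    log_trans[r][s] : log P(next state = s | current state = r)
    log_emit[t][s]  : log p(observation at frame t | state = s)
    """
    n_states = len(log_start)
    score = [log_start[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []
    for frame in log_emit[1:]:
        prev, score, ptrs = score, [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at this frame.
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            ptrs.append(best)
            score.append(prev[best] + log_trans[best][s] + frame[s])
        backptr.append(ptrs)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptrs in reversed(backptr):
        state = ptrs[state]
        path.append(state)
    return path[::-1]
```

The returned list holds one state index per frame, which is the kind of sequence the slide proposes to use as a feature.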
MPEG-7 Audio Tools Decorrelating Transform / Dimension Reduction Log Frequency Spectrogram Hidden Markov Model Log Amplitude Audio AudioSpectrumEnvelopeD SoundModelDS AudioSpectrumProjectionD
MPEG-7 Audio Strings: Acoustic Lexicons Decorrelating Transform / Dimension Reduction Log Frequency Spectrogram Hidden Markov Model Log Amplitude Audio AudioSpectrumEnvelopeD SoundModelDS State Path AudioSpectrumProjectionD SoundModelStatePathD ? 7 1 V 7 1 0 1 ... SYMBOL STRING
State Symbol Sequence (40 State Model) ?71V7101 ...
SoundModelStateHistogramD (figure: axes labelled state index, 0.01 s frames, and seconds)
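A state histogram can be sketched as the relative frequency of each state index over a stretch of the state path. Normalising the counts to sum to one is an assumption here, and the function name is hypothetical.

```python
from collections import Counter

def state_histogram(state_path, n_states):
    """Relative frequency of each state index over a segment of the path."""
    total = len(state_path)
    if total == 0:
        raise ValueError("state_path must contain at least one frame")
    counts = Counter(state_path)
    return [counts.get(s, 0) / total for s in range(n_states)]
```

For example, the four-frame path `[0, 0, 1, 3]` over a four-state model gives `[0.5, 0.25, 0.0, 0.25]`.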
Efficient Storage / Retrieval • Real-Time Access • Large Databases • Distributed Databases
PostgreSQL Database Representation of State Path “Strings” and Histograms
Similarity • Compute distance between feature pairs • Features == SoundModelStateHistogramD • Similarity Metric • dist(a,b) >= 0 • dist(a,b)== 0 iff a==b • dist(a,b) + dist(b,c) >= dist(a,c) • Vector Dot Product
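A small sketch of comparing two state histograms. Note that a raw dot product is a similarity score (larger means more alike), not a distance obeying the three properties listed; Euclidean distance is shown alongside as one measure that does satisfy them. All function names are illustrative.

```python
import math

def dot(a, b):
    """Vector dot product of two equal-length histograms."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product normalised by vector lengths; 1.0 for identical directions."""
    norm = math.sqrt(dot(a, a) * dot(b, b))
    return dot(a, b) / norm if norm > 0.0 else 0.0

def euclidean_distance(a, b):
    """A true metric: non-negative, zero iff a == b, triangle inequality holds."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```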
Acousticon Strings • Distance Metric • String Edit Distance (Levenshtein) • Scalable to Large Databases • PostgreSQL Implementation • Can use built-in Index Structures • Scalable to Real-Time Implementation • matching and audio streaming (< 20 ms)
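String edit distance between two state-symbol strings can be sketched with the standard dynamic-programming recurrence. This is a plain in-memory version for illustration, not the database implementation the slide refers to.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]
```

PostgreSQL's `fuzzystrmatch` module provides a `levenshtein()` function; whether the implementation described here relies on it is not stated in the slides.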
Information Retrieval for Creativity • Use an existing sound database as a source of new material • Take the structure of a music clip but replace the content • New interfaces for music creativity
Audio Information Retrieval MPEG-7 Database A pre-indexed Collection of Sounds
Audio Information Retrieval MPEG-7 Database Extract Segment Match Audio Query A Sound or Scene or List of Sounds Result List
Audio Information Retrieval MPEG-7 Database Extract Segment Match Audio Query Feature extraction from audio. Result List
Audio Information Retrieval MPEG-7 Database Extract Segment Match Audio Query Partitioning of audio into chunks. Result List
Audio Information Retrieval MPEG-7 Database Extract Segment Match Audio Query Result List Find similar chunks of Audio
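The segment and match stages can be sketched as follows, assuming feature extraction has already turned each sound into a state path. Fixed-length chunks, the in-memory dictionary standing in for the database, and ranking by histogram dot product are all simplifying assumptions.

```python
from collections import Counter

def segment(state_path, chunk_len):
    """Split a state path into consecutive fixed-length chunks."""
    return [state_path[i:i + chunk_len] for i in range(0, len(state_path), chunk_len)]

def histogram(chunk, n_states):
    """Relative frequency of each state index in a non-empty chunk."""
    counts = Counter(chunk)
    return [counts.get(s, 0) / len(chunk) for s in range(n_states)]

def match(query_chunk, database, n_states, top_k=3):
    """Rank database entries by dot product of state histograms."""
    q = histogram(query_chunk, n_states)
    scored = []
    for name, path in database.items():
        h = histogram(path, n_states)
        scored.append((sum(x * y for x, y in zip(q, h)), name))
    # Highest score first; ties broken by name for a stable result list.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]
```

Running `match` once per chunk returned by `segment` yields one result list per chunk of the query.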
Real-Time Matching Musaics