A Musical Data Mining Primer CS235 – Spring ’03 Dan Berger dberger@cs.ucr.edu
Outline • Motivation/Problem Overview • Background • Types of Music • Digital Representations • Psychoacoustics • Query (Content vs. Meta-Data) • Categorization & Clustering • Finding More • Conclusion
Motivation • More music is being stored digitally: • PressPlay offers 300,000 tracks for download • As collections grow, organizing and searching manually become hard • How to find the “right” music in a sea of possibilities? • How to find new artists given current preferences? • How to find a song you heard on the radio?
Problem Overview • Music is a high-dimensional time series: • 5 minutes @ CD quality > 13M samples! • It seems logical to apply data mining and IR techniques to this form of information: • Query, Clustering, Prediction, etc. • Application isn’t straightforward, for reasons we’ll discuss shortly.
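For scale, a quick back-of-the-envelope check of that sample count (assuming 44.1 kHz CD-quality audio; a stereo stream doubles it):

```python
# CD-quality audio is sampled at 44,100 samples per second (per channel).
SAMPLE_RATE = 44_100
SECONDS = 5 * 60
print(SAMPLE_RATE * SECONDS)  # 13,230,000 -- over 13M samples for one channel
```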
Background: Types of Music • Monophonic: one note sounds at a time. • Homophonic: multiple notes sound, all starting (and ending) at the same instant. • Polyphonic: no constraints on concurrency. The most general, and the most difficult to handle.
Background: Digital Representations • Structured (Symbolic): • MIDI – stores note duration & intensity, instructions for a synthesizer • Unstructured (Sampled): • PCM – stores quantized periodic samples • Leverages the Nyquist/Shannon sampling theorem to faithfully capture the signal. • MP3/Vorbis/AAC – discard “useless” information, reducing storage and fidelity • Use psychoacoustics • Some work aims at rediscovering musical structure from samples.
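A minimal sketch of what PCM capture means, assuming 16-bit, 44.1 kHz audio (a 44.1 kHz rate satisfies the Nyquist criterion for content below 22,050 Hz, roughly the upper limit of human hearing):

```python
import math

SAMPLE_RATE = 44_100  # Hz; faithfully captures frequencies below 22,050 Hz

def pcm_tone(freq_hz, duration_s, bits=16):
    """Quantize a pure tone into signed-integer PCM samples."""
    max_amp = 2 ** (bits - 1) - 1  # 32,767 for 16-bit audio
    n = int(SAMPLE_RATE * duration_s)
    return [round(max_amp * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE))
            for t in range(n)]

samples = pcm_tone(440.0, 0.01)  # 10 ms of concert A -> 441 samples
```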
Background: Psychoacoustics • Two main relevant results: • Limited, frequency-dependent resolution • Auditory masking • We hear different frequencies differently: • the sound spectrum is broken into “critical bands” • We “miss” signals due to spectral &/or temporal “collision”: • Loud sounds mask softer ones • Two sounds of similar frequency get blended
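To make “critical bands” concrete, here is the Zwicker & Terhardt approximation mapping frequency to critical-band rate (Bark); equal spacing in Hz is not equal perceptual spacing:

```python
import math

def bark(freq_hz):
    """Zwicker & Terhardt approximation: frequency (Hz) -> critical-band rate (Bark)."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)

# The same 500 Hz gap spans very different perceptual distances:
print(bark(1000) - bark(500))     # ~3.8 Bark -- clearly distinct bands
print(bark(10500) - bark(10000))  # ~0.2 Bark -- nearly the same band
```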
Query – Content is King • Current systems use textual meta-data to facilitate query: • Song/Album Title, Artist, Genre* • The goal is to query by the musical content: • Similarity • ‘find songs “like” the current one’ • ‘find songs “with” this musical phrase’
Result: Query By Humming • A handful of research systems have been built that locate songs in a collection based on the user humming or singing a melodic portion of the song. • Typically search over a collection of monophonic MIDI files.
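Such systems typically match on melodic contour rather than exact pitch, since hummed queries are rarely on key. A minimal sketch of one common contour representation, the Parsons code (Up/Down/Repeat), assuming melodies are given as MIDI note numbers:

```python
def parsons_code(midi_notes):
    """Reduce a melody (MIDI note numbers) to its Up/Down/Repeat contour."""
    return "".join(
        "U" if b > a else "D" if b < a else "R"
        for a, b in zip(midi_notes, midi_notes[1:])
    )

# Opening of "Twinkle, Twinkle, Little Star": C C G G A A G
print(parsons_code([60, 60, 67, 67, 69, 69, 67]))  # "RURURD"
```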
Content Based Query • Recall: music is a time series with high dimensionality. • Need robust dimensionality reduction. • Not all parts of music are equally important. • Feature extraction – remember the important features. • Which features are important?
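As one illustration of dimensionality reduction (a generic time-series technique, not one specific to the systems surveyed here), Piecewise Aggregate Approximation replaces the raw signal with the mean of each of n fixed-width frames:

```python
def paa(signal, n_frames):
    """Piecewise Aggregate Approximation: shrink a long series to n_frames frame-means."""
    frame_len = len(signal) // n_frames
    return [sum(signal[i * frame_len:(i + 1) * frame_len]) / frame_len
            for i in range(n_frames)]

# 13M+ raw samples -> a 32-dimensional sketch of the signal's shape
# reduced = paa(samples, 32)
```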
Similarity/Feature Extraction • The current “hard problem” – there are ad-hoc solutions, but little supporting theory. • Tempo (bpm), volume, spectral qualities, transitions, etc. • Sound source: is it a piano? a trumpet? • Singer recognition: who’s the vocalist? • Collectively: “Machine Listening” • These are hard problems with some positive results.
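A minimal sketch of frame-based feature extraction using two of the low-level features above (RMS energy as a volume proxy, zero-crossing rate as a crude spectral/noisiness proxy); real “machine listening” systems use far richer descriptors:

```python
import math

def frame_features(frame):
    """Two illustrative low-level features over one audio frame."""
    n = len(frame)
    rms = math.sqrt(sum(x * x for x in frame) / n)          # volume proxy
    zcr = sum((a < 0) != (b < 0)
              for a, b in zip(frame, frame[1:])) / (n - 1)  # noisiness proxy
    return (rms, zcr)

def extract(signal, frame_len=1024):
    """Slide a non-overlapping window over the signal; one feature vector per frame."""
    return [frame_features(signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]
```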
Compression Complexity • Different compression schemes (MP3/Vorbis/AAC) use psychoacoustics differently. • Different implementations of a scheme may also! • Feature extraction needs to be robust to these variations. • Seems to be an open problem.
Categorization/Clustering • Genre (rock/R&B/pop/jazz/blues/etc.) is manually assigned – and subjective. • Work is being done on automatic classification and clustering. • Relies on (and sometimes reinvents) the similarity metric work described previously.
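A minimal clustering sketch: plain k-means over per-song feature vectors (the vectors are hypothetical; any of the features sketched above could feed it):

```python
import random

def kmeans(points, k, iters=50):
    """Plain k-means over small feature vectors (tuples of floats)."""
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each song to its nearest center (squared Euclidean distance).
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Recompute each center as the mean of its cluster (keep old center if empty).
        centers = [tuple(sum(d) / len(c) for d in zip(*c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters
```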
Browsing & Visualization: • LOUD: physical exploration • Islands of Music: uses self-organizing maps to visualize clusters of similar songs.
Current Efforts • Amazon/iTunes/etc. use collaborative filtering. • If the population is myopic and predictable it works well; otherwise it doesn’t. • Hit Song Science – clusters a provided set of songs against a database of top 30 hits to predict success. • Claims to have predicted the success of Norah Jones. • Relatable – musical “fingerprint” technology – involved with “Napster 2”
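The idea behind collaborative filtering: users whose past ratings agree predict each other’s future tastes. A minimal user-based sketch with hypothetical ratings and cosine similarity over co-rated songs:

```python
import math

def cosine(u, v):
    """Similarity of two users over the songs both have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[s] * v[s] for s in common)
    norm = (math.sqrt(sum(u[s] ** 2 for s in common)) *
            math.sqrt(sum(v[s] ** 2 for s in common)))
    return dot / norm

ratings = {  # hypothetical user -> {song: rating}
    "ann": {"songA": 5, "songB": 1, "songC": 4},
    "bob": {"songA": 4, "songB": 1, "songD": 5},
}
# High similarity (~0.999): recommend bob's "songD" to ann.
print(cosine(ratings["ann"], ratings["bob"]))
```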
Finding More • Conferences: • Int. Symposium on Music IR (ISMIR) • Int. Conference on Music and AI (ICMAI) • Joint Conference on Digital Libraries • Journals: • ACM/IEEE Multimedia • Groups: • MIT Media Lab: Machine Listening Group
Conclusion • Slow steady progress is being made. • “Music Appreciation” is fuzzy • we can’t define it but we know it when we hear it. • References, and more detail, are in my survey paper, available shortly on the web. • http://www.cs.ucr.edu/~dberger
Fini • Questions?