The Problem with Music: Modeling Distance Distributions of Large Music Collections

The Problem with Music: Modeling Distance Distributions of Large Music Collections • Prof. Michael Casey • Program in Digital Musics, Dartmouth College, Hanover, NH • Comp. Sci. Colloquium

a.k.a. The Problem with Multimedia: Music, Music Videos, Videos, Images

Scalable Similarity • 8M tracks in commercial collection • 6B Images on WWW • Require scalable nearest-neighbor methods • Increase scale, decrease search complexity

Example: Hattogate

Example: Remixing / Sampling in Yahoo! Music • Original Track • Remix 1 • Remix 2 • Remix 3

Example: 3B Images in Flickr

Specificity • Partial document (sub-track) retrieval • Alternate versions: remix, cover, live, album • Task is mid-high specificity

Machine Listening

Feature Extraction

Audio Shingles • Shingles provide contextual information about features • Originally used for Internet search engines: Andrei Z. Broder, Steven C. Glassman, Mark S. Manasse, Geoffrey Zweig: “Syntactic Clustering of the Web”. Computer Networks 29(8-13): 1157-1166 (1997) • Related to N-grams, overlapping sequences of features • Applied to the audio domain by Casey and Slaney: Casey, M., Slaney, M. “The Importance of Sequences in Musical Similarity”, in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2006 • A shingle is defined by concatenating l frames of m-dimensional features
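The shingle construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `make_shingles` and the `hop` parameter (how many frames to advance between shingles) are assumptions introduced here, and the toy frames stand in for real audio features.

```python
from typing import List, Sequence


def make_shingles(
    frames: Sequence[Sequence[float]], l: int, hop: int = 1
) -> List[List[float]]:
    """Concatenate each run of l consecutive m-dimensional frames into one l*m vector.

    `hop` (hypothetical parameter) is the number of frames between shingle starts;
    hop=1 gives maximally overlapping shingles.
    """
    if l <= 0 or hop <= 0:
        raise ValueError("l and hop must be positive")
    shingles = []
    for start in range(0, len(frames) - l + 1, hop):
        shingle: List[float] = []
        for frame in frames[start:start + l]:
            shingle.extend(frame)
        shingles.append(shingle)
    return shingles


# Toy input: 5 frames of m = 2 features each.
frames = [[float(t), float(t) + 0.5] for t in range(5)]
shingles = make_shingles(frames, l=3)
```

With l = 3 and m = 2 each shingle has M = l·m = 6 dimensions, and consecutive shingles share two of their three frames, which is where the contextual information comes from.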

Audio Shingle Similarity

Audio Shingle Similarity • Shingles have M dimensions (M = l·m); m = 12, 20; l = 30, 40 • A query shingle is drawn from a query track {Q} • The database of audio tracks is indexed by n • A database shingle is drawn from track n • Shingles are normalized to unit vectors, so the squared Euclidean distance between two shingles depends only on their inner product
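For unit vectors q and s, the squared Euclidean distance equals 2 - 2(q·s). The short check below illustrates that identity on two made-up vectors; the helper names are assumptions for illustration and the slide's own notation was not preserved in this transcript.

```python
import math


def normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = normalize([1.0, 2.0, 2.0])   # stand-in for a query shingle
s = normalize([2.0, 1.0, 2.0])   # stand-in for a database shingle

direct = squared_distance(q, s)      # computed from the definition
via_dot = 2.0 - 2.0 * dot(q, s)      # computed from the inner product only
```

Because the two quantities agree, a search over normalized shingles can rank by inner product or by distance interchangeably.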

AudioDB: Shingle Nearest Neighbor Search

Whole-track similarity • Often want to know which tracks are similar • Similarity depends on specificity of task • Distortion / filtering / re-encoding (high) • Remix with new audio material (mid) • Cover song: same song, different artist (mid)


Whole-track resemblance: radius-bounded search • Compute the number of shingle collisions between two tracks • Requires a threshold for considering shingles to be related • Need a way to estimate relatedness (threshold) for the data set
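One way to read the collision count is as the number of (query shingle, track shingle) pairs whose squared distance falls within the threshold. The slide's exact formula is not preserved here, so the sketch below is an assumption along those lines; `count_collisions`, `rank_tracks`, and the toy data are hypothetical names and values.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def count_collisions(query_shingles, track_shingles, x_thresh):
    """Count (query, track) shingle pairs whose squared distance is within x_thresh."""
    return sum(
        1
        for q in query_shingles
        for s in track_shingles
        if squared_distance(q, s) <= x_thresh
    )


def rank_tracks(query_shingles, database, x_thresh):
    """Order track ids by collision count, most collisions first."""
    scores = {
        track_id: count_collisions(query_shingles, shingles, x_thresh)
        for track_id, shingles in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy 2-d "shingles"; real shingles would have hundreds of dimensions.
query = [[1.0, 0.0], [0.0, 1.0]]
database = {
    "near_copy": [[0.99, 0.05], [0.02, 1.0]],
    "unrelated": [[-1.0, 0.0], [0.0, -1.0]],
}
ranking = rank_tracks(query, database, x_thresh=0.01)
```

This brute-force version compares every pair, so it only illustrates the scoring rule; the ranking is entirely determined by the choice of `x_thresh`.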

SCALE • Mazurkas: 10,000 tracks 10-100ms features • 3s clips (30 – 300 frames per vector) • 12d – 20d features (360 – 600d vectors) • Yahoo! Music • 6M tracks • 1000 vectors per track • (6M x 1k)^2 search for near neighbours
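The cost of exhaustive search at the larger scale quoted above can be worked out directly. The figures are the ones on the slide; only the variable names are introduced here.

```python
# Back-of-envelope cost of exhaustive near-neighbour search.
tracks = 6_000_000          # collection size from the slide
vectors_per_track = 1_000   # vectors (shingles) per track from the slide

total_vectors = tracks * vectors_per_track
pairwise_comparisons = total_vectors ** 2  # every vector against every vector

print(f"{total_vectors:.1e} vectors, {pairwise_comparisons:.1e} comparisons")
```

That is about 6 × 10^9 vectors and 3.6 × 10^19 distance computations, which is the motivation for the approximate methods that follow.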

LSH

Approximate Near Neighbor Matching

Approximate near neighbors • In many applications we need only near neighbors • We can exploit this by allowing a degree of approximation in retrieval

Space partitioning

Curse of dimensionality [Figure: Pr(dist) plotted against distance for d=4, d=8, d=1024]
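The concentration of distances as d grows can be reproduced with a small simulation. This is a sketch under an assumption not stated on the slide: points are drawn uniformly from the unit cube, and the spread of the distance distribution is summarized as standard deviation divided by mean.

```python
import math
import random
import statistics


def relative_spread(d, pairs=1000, seed=0):
    """Std/mean of Euclidean distances between random point pairs in the unit cube."""
    rng = random.Random(seed)
    dists = []
    for _ in range(pairs):
        sq = 0.0
        for _ in range(d):
            diff = rng.random() - rng.random()
            sq += diff * diff
        dists.append(math.sqrt(sq))
    return statistics.pstdev(dists) / statistics.mean(dists)


# Same dimensionalities as the figure.
spread = {d: relative_spread(d) for d in (4, 8, 1024)}
```

The relative spread shrinks as d increases, so in high dimensions almost all points sit at nearly the same distance from a query, which makes "nearest" hard to separate from "typical".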

Border effects in high d

ε-NN : approximate near neighbors

Setting the range

Hashing • Types of hashes • String : put Bash vs Bush in different bins • Locality sensitive : close matches in same bin • High-dimensional and probabilistic • Nearest Neighbor implementations • Pair-wise distance computation • 1,000,000,000,000 comparisons in 2M song database • Hash bucket collisions • 1,000,000,000 hash projections

Exact matching via hashing • Audio fingerprinting • Shazam, etc. • Make the feature robust • Use exact matching on integer hash • Find a sequence of hashes to identify specific recording or image • Drawback: only exact matches possible

Locality-Sensitive Hashing (Indyk-Motwani ’98) • Hash functions are locality-sensitive if, for a randomly chosen hash function h and any pair of points p, q, we have: • Pr[h(p)=h(q)] is “high” if p is “close” to q • Pr[h(p)=h(q)] is “low” if p is “far” from q

Locality Sensitive Hashing

Random Projections • Random projections estimate distance • Multiple projections improve estimate
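The two bullets above can be illustrated with Gaussian random projections: projecting onto k random directions and scaling by 1/sqrt(k) preserves squared distances in expectation, and the estimate tightens as k grows. The slides do not say which projection scheme was used, so the Gaussian choice and all names below are assumptions.

```python
import math
import random


def project(v, directions):
    """Project v onto each random direction, scaled so squared norms are preserved in expectation."""
    scale = 1.0 / math.sqrt(len(directions))
    return [scale * sum(a * b for a, b in zip(v, r)) for r in directions]


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rng = random.Random(0)
dim = 50
p = [rng.gauss(0.0, 1.0) for _ in range(dim)]
q = [rng.gauss(0.0, 1.0) for _ in range(dim)]
true_dist = distance(p, q)

# Estimate the same distance from few and from many projections.
estimates = {}
for k in (4, 400):
    directions = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
    estimates[k] = distance(project(p, directions), project(q, directions))
```

With k = 4 the estimate can be far off; with k = 400 it is typically within a few percent of `true_dist`.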

h’s are locality-sensitive • Pr[h(p)=h(q)] = (1 - D(p,q)/d)^k • We can vary the probability by changing k [Figure: Pr plotted against distance for k=1 and k=2]
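The collision probability above holds for bit sampling, assuming D is the Hamming distance between d-bit vectors and each hash reads k coordinates chosen uniformly with replacement. The simulation below checks the formula under those assumptions; `make_hash` and `collision_rate` are illustrative names.

```python
import random


def make_hash(d, k, rng):
    """Return h(p): the bits of p at k coordinates sampled uniformly with replacement."""
    coords = [rng.randrange(d) for _ in range(k)]
    return lambda p: tuple(p[i] for i in coords)


def collision_rate(p, q, k, trials=20000, seed=0):
    """Fraction of independently drawn hash functions h with h(p) == h(q)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = make_hash(len(p), k, rng)
        if h(p) == h(q):
            hits += 1
    return hits / trials


d = 100
p = [0] * d
q = [1] * 20 + [0] * 80        # Hamming distance D(p, q) = 20
k = 2

predicted = (1 - 20 / d) ** k  # (1 - D/d)^k
observed = collision_rate(p, q, k)
```

Raising k makes the curve fall off faster with distance, so fewer far points share a bucket, at the price of also missing more near points per hash table.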

LSH Random Projections: 3d to 2d

Statistical approaches to modeling distance distributions

Distribution of minimum distances Database: 1.4 million shingles. The left bump is the minimum between 1000 randomly selected query shingles and this database. The right bump is a small sampling (1/98 000 000) of the full histogram of all distances.

Radius-bounded retrieval performance: cover song (opus task) • Performance depends critically on xthresh, the collision threshold • Want to estimate xthresh automatically from unlabelled data

Order Statistics • Minimum-value distribution is analytic • Estimate the distribution parameters • Substitute into minimum value distribution • Define a threshold in terms of FP rate • This gives an estimate of xthresh

Estimating xthresh from unlabelled data • Use theoretical statistics • Null Hypothesis: • H0: shingles are drawn from unrelated tracks • Assume elements i.i.d., normally distributed • M dimensional shingles, d effective degrees of freedom: • Squared distance distribution for H0

ML for background distribution • Likelihood for N data points (distances squared) • d = effective degrees of freedom • M = shingle dimensionality

Background distribution parameters • Likelihood for N data points (distances squared) • d = effective degrees of freedom • M = shingle dimensionality
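The slides fit the background distribution by maximum likelihood, but the likelihood expression is not preserved in this transcript. As a simpler stand-in, the sketch below fits a scaled chi-squared model, x ~ scale × χ²(d), by the method of moments; this is a swapped-in technique, and the model form, function name, and synthetic data are assumptions.

```python
import random
import statistics


def fit_scaled_chi2(squared_distances):
    """Method-of-moments fit of x ~ scale * chi-squared(d).

    Uses mean = scale * d and variance = 2 * scale**2 * d.
    Returns (d, scale), where d is the effective degrees of freedom.
    """
    mean = statistics.fmean(squared_distances)
    var = statistics.variance(squared_distances)
    d = 2.0 * mean * mean / var
    scale = var / (2.0 * mean)
    return d, scale


# Synthetic check: draw from a known scaled chi-squared and recover its parameters.
rng = random.Random(0)
true_d, true_scale = 20, 0.05   # made-up values for illustration
samples = [
    true_scale * sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(true_d))
    for _ in range(5000)
]
d_hat, scale_hat = fit_scaled_chi2(samples)
```

The fitted d is generally not an integer and can be much smaller than the shingle dimensionality M when feature dimensions are correlated, which is why it is called an effective number of degrees of freedom.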

Minimum value over N samples
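For reference, the standard order-statistics result for the minimum of N independent draws is given below. It assumes the N background squared distances are i.i.d. with density f and distribution function F; the slide's own notation is not preserved here.

```latex
% Assumes N i.i.d. background squared distances x_1, ..., x_N with CDF F and density f.
F_{\min}(x) = \Pr\!\left[\min_{1 \le i \le N} x_i \le x\right]
            = 1 - \bigl(1 - F(x)\bigr)^{N},
\qquad
f_{\min}(x) = N\, f(x)\, \bigl(1 - F(x)\bigr)^{N-1}.
```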

Minimum value distribution of unrelated shingles

Estimate of xthresh, false positive rate
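Given a fitted background model, the threshold for a target false positive rate follows from the minimum-value distribution: choose x so that the minimum of N unrelated distances falls below x with probability equal to that rate. The sketch below assumes a scaled chi-squared background and uses the Wilson-Hilferty approximation for its quantile, which is an approximation chosen here to stay within the standard library; d = 20 and scale = 0.05 are made-up values.

```python
import math
from statistics import NormalDist


def per_sample_probability(fp_rate, n):
    """Solve 1 - (1 - p)**n = fp_rate for p, stably for large n."""
    return -math.expm1(math.log1p(-fp_rate) / n)


def chi2_quantile(p, d):
    """Wilson-Hilferty approximation to the chi-squared(d) quantile at probability p."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * d)
    return d * (1.0 - c + z * math.sqrt(c)) ** 3


def estimate_x_thresh(fp_rate, n, d, scale):
    """Value that the minimum of n background distances falls below with probability fp_rate."""
    return scale * chi2_quantile(per_sample_probability(fp_rate, n), d)


# Hypothetical background parameters; n = 1.4M matches the database size quoted earlier.
x_small_db = estimate_x_thresh(fp_rate=0.01, n=1_000, d=20, scale=0.05)
x_large_db = estimate_x_thresh(fp_rate=0.01, n=1_400_000, d=20, scale=0.05)
```

The threshold tightens as the database grows, because a larger pool of unrelated shingles makes a small minimum distance more likely by chance. The approximation loses accuracy deep in the lower tail and for small d, so an exact chi-squared quantile would be preferable where one is available.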

Unlabelled data experiment • Unlabelled data set • Known to contain: • cover songs (same work, different performer) • Near duplicate recordings (misattribution, encoding) • Estimate background distance distribution • Estimate minimum value distribution • Set xthresh so FP rate is <= 1% • Whole-track retrieval based on shingle collisions

Misattributions • Joyce Hatto: 100% of known misattributions in first rank • Sergio Fiorentino • Eleven out of twenty-six Mazurkas performances on another Concert Artists/Fidelio disc, issued under the name of Sergio Fiorentino, are in fact copies of recordings by other artists. This is the first time that such practices have been found in the Concert Artists/Fidelio recordings issued other than under the name of Joyce Hatto, and prompts speculation as to how much more misattributed material remains to be found in the Concert Artists/Fidelio catalogue.
