A Music Search Engine Built upon Audio-based and Web-based Similarity Measures • P. Knees, T. Pohle, M. Schedl, G. Widmer • SIGIR 2007
INTRODUCTION • Basically all existing music search systems make use of manually assigned subjective meta-information like genre or style to index the underlying music collection. • Explicit manual annotations • A small set of meta-data • Recent approaches • Content-based analysis of the audio files • Collaborative recommendations • Incorporate information from different sources
RELATED WORK • Query-by-example • Query-by-Humming/Singing (QBHS) • Operate on MIDI • Music piece → Meta-data • Cross-media • Semantic ontology • Semantic relations • Crawler on “audio blogs” • Word sense disambiguation • Text surrounding the links to audio files • Last.fm – listening habits & tags
PREPROCESSING THE COLLECTION • ID3 tags • Artist • Album • Title • Ignored tracks • Speech-only pieces (e.g., skits in rap) • Intro / Outro • Duration below 1 minute
WEB-BASED FEATURES • Queries to Google • “artist” music • “artist” “album” music review • “artist” “title” music review -lyrics • For each query, retrieve top-ranked 100 pages • Clean HTML tags and stop words in 6 languages
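The three query patterns above can be sketched as a small helper. The function name `build_queries` and its arguments are hypothetical; only the query patterns themselves come from the slide.

```python
def build_queries(artist, album, title):
    """Return the three search query strings for one track.

    The artist, album and title values are assumed to come from the
    track's ID3 tags; quoting forces exact-phrase matching.
    """
    return [
        f'"{artist}" music',
        f'"{artist}" "{album}" music review',
        f'"{artist}" "{title}" music review -lyrics',
    ]


if __name__ == "__main__":
    for q in build_queries("Radiohead", "OK Computer", "Karma Police"):
        print(q)
```

For each returned string, the top-ranked 100 pages would then be fetched and cleaned as described above.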
WEB-BASED FEATURES (CONT.) • Term list of each music piece • Remove all terms with df_tm <= 2 • Global term list • Remove all terms that co-occur with < 0.1% of the music pieces • Resulting in about 78,000 terms (dimensions) • weight(t, m) = tf * idf • N: number of music pieces • mpf_t: music piece frequency of term t • Cosine normalization • Removes the influence of page length
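A minimal sketch of the weighting and normalization steps follows. The slide only states "tf * idf"; the log-scaled form used here, (1 + log2 tf) * log2(N / mpf), is one common variant and is an assumption, as are the function name and the dictionary layout.

```python
import math


def tfidf_vectors(term_freqs):
    """Compute cosine-normalized tf*idf weights.

    term_freqs: dict mapping track id -> dict mapping term -> raw term
    frequency (hypothetical input layout).
    """
    n_tracks = len(term_freqs)

    # Music piece frequency: number of tracks in which each term occurs.
    mpf = {}
    for counts in term_freqs.values():
        for term, tf in counts.items():
            if tf > 0:
                mpf[term] = mpf.get(term, 0) + 1

    vectors = {}
    for track, counts in term_freqs.items():
        # Assumed variant: log-scaled tf times log-scaled idf.
        weights = {
            term: (1.0 + math.log2(tf)) * math.log2(n_tracks / mpf[term])
            for term, tf in counts.items()
            if tf > 0
        }
        # Cosine normalization: scale each vector to unit length so that
        # tracks with longer web pages do not get larger weights.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors[track] = {
            t: (w / norm if norm > 0 else 0.0) for t, w in weights.items()
        }
    return vectors
```

A term that occurs for every track gets idf 0 and therefore weight 0, which is the intended effect of the idf factor.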
AUDIO-BASED SIMILARITY • MFCCs, Gaussian Mixture Model, KL divergence • Problems • Hubs: pieces that are frequently similar to others • Outliers: pieces that are never similar to others • Triangle inequality: not fulfilled • The authors' previous work addresses these problems
AUDIO-BASED SIMILARITY (CONT.) • Always similar: hubs • n_dist(A) = distance from A to its n-th nearest neighbour • g(A, P_i) = D_basic(A, P_i) / n_dist(P_i), for all i • Sort g(A, P_i) ascending and take the n-th value as f(A) • D_n-NN-norm(A, B) = D_basic(A, B) / (f(A) * f(B)) • Never similar: outliers • Handled as above • Triangle inequality • Sort D_basic(A, P_i), for all i • Interpolate D_basic(A, B) into the sorted D_basic(A, P_i) • D_P(A, B) is the rank of D_basic(A, B) among D_basic(A, P_i) • D_PV(A, B) = D_P(A, B) + D_P(B, A)
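The two corrections above can be sketched on a small distance matrix. This follows the formulas as written on the slide; the function names, the list-of-lists matrix layout, and the handling of ties are assumptions, and the sketch assumes all off-diagonal distances are positive and that there are more than n pieces.

```python
def nn_normalized(D, n):
    """n-NN normalization of a symmetric distance matrix D (list of lists)."""
    size = len(D)
    # Distance from each piece to its n-th nearest neighbour (self excluded).
    ndist = [
        sorted(D[p][q] for q in range(size) if q != p)[n - 1]
        for p in range(size)
    ]
    # f(A): n-th smallest value of g(A, P_i) = D(A, P_i) / ndist(P_i).
    f = [
        sorted(D[a][p] / ndist[p] for p in range(size) if p != a)[n - 1]
        for a in range(size)
    ]
    return [[D[a][b] / (f[a] * f[b]) for b in range(size)] for a in range(size)]


def rank_distance(D):
    """Symmetric rank-based distance: rank of B seen from A plus rank of A seen from B."""
    size = len(D)

    def rank(a, b):
        # 1 + number of other pieces strictly closer to a than b is.
        return 1 + sum(1 for p in range(size) if p != a and D[a][p] < D[a][b])

    return [
        [0 if a == b else rank(a, b) + rank(b, a) for b in range(size)]
        for a in range(size)
    ]


if __name__ == "__main__":
    D = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
    print(nn_normalized(D, 1))
    print(rank_distance(D))
```

Both results stay symmetric because f(A) * f(B) and the sum of the two ranks do not depend on argument order.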
DIMENSIONALITY REDUCTION • χ2 test • s : 100 most similar tracks • d : 100 most dissimilar tracks • Calculate χ2( t, s ) • N terms with highest value are then joined into a global list
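As an illustration of the term selection step, the sketch below uses the standard 2x2 contingency form of the χ2 statistic, treating the similar tracks s and the dissimilar tracks d as two classes. The function names and the representation of each track as a set of terms are assumptions, not details from the slides.

```python
def chi_square(term, similar_docs, dissimilar_docs):
    """Chi-square score of a term for separating similar from dissimilar tracks.

    Each doc is a set of terms associated with one track.
    """
    a = sum(term in doc for doc in similar_docs)      # in s, contains term
    b = sum(term in doc for doc in dissimilar_docs)   # in d, contains term
    c = len(similar_docs) - a                         # in s, lacks term
    d = len(dissimilar_docs) - b                      # in d, lacks term
    n = a + b + c + d
    denom = (a + b) * (a + c) * (b + d) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def top_terms(similar_docs, dissimilar_docs, k):
    """Return the k terms with the highest chi-square score (ties broken by name)."""
    vocab = set().union(*similar_docs, *dissimilar_docs)
    ranked = sorted(
        vocab,
        key=lambda t: (-chi_square(t, similar_docs, dissimilar_docs), t),
    )
    return ranked[:k]
```

Running `top_terms` once per track and taking the union of the results would give the global list mentioned above.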
VECTOR ADAPTATION • Particularly necessary for tracks where no related information could be retrieved from the web • Perform a simple smoothing
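The slide does not spell out the smoothing, so the following is only one plausible sketch: each track's term vector is replaced by a weighted average of its own vector and those of its nearest audio neighbours, with weights that decay with neighbour rank. The Gaussian weighting, the `sigma` parameter, and the function name are all assumptions.

```python
import math


def smooth_vectors(vectors, neighbours, sigma=2.0):
    """Smooth term vectors over audio neighbours.

    vectors: list of equal-length term-weight lists, one per track.
    neighbours[i]: indices of track i's nearest audio neighbours, closest first.
    """
    smoothed = []
    for i, own in enumerate(vectors):
        ranked = [i] + list(neighbours[i])  # rank 0 is the track itself
        # Assumed weighting: Gaussian decay over the neighbour rank.
        weights = [
            math.exp(-(r ** 2) / (2 * sigma ** 2)) for r in range(len(ranked))
        ]
        total = sum(weights)
        smoothed.append([
            sum(w * vectors[j][d] for w, j in zip(weights, ranked)) / total
            for d in range(len(own))
        ])
    return smoothed
```

A track with an all-zero vector (no web information retrieved) inherits non-zero weights from its audio neighbours, which is the case the slide singles out.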
QUERYING THE MUSIC SEARCH ENGINE • Original query + “music” • -site:last.fm • Google search • 10 top-most web pages • Map to vector space • Calculate Euclidean distances
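The query steps above can be sketched as follows. Fetching and vectorizing the 10 result pages is left out; `query_vector` is assumed to be the already-mapped query in the same term space as the tracks, and both function names are hypothetical.

```python
import math


def build_search_query(user_query):
    """Extend the user's query as described: add 'music', exclude last.fm."""
    return f"{user_query} music -site:last.fm"


def rank_tracks(query_vector, track_vectors):
    """Return track ids ordered by Euclidean distance to the query vector.

    track_vectors: dict mapping track id -> list of term weights.
    """
    def dist(vec):
        return math.sqrt(sum((q - x) ** 2 for q, x in zip(query_vector, vec)))

    return sorted(track_vectors, key=lambda t: dist(track_vectors[t]))


if __name__ == "__main__":
    print(build_search_query("relaxing"))
    print(rank_tracks([1.0, 0.0], {"a": [0.0, 1.0], "b": [0.9, 0.1]}))
```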
AUDIOSCROBBLER GROUND TRUTH • Common approach • genre information • several drawbacks • http://www.audioscrobbler.net • Web services to access Last.fm data • Tag information provided by Last.fm • drawbacks • Using top tags for tracks (total 227 tags)
PERFORMANCE EVALUATION • Dimensionality reduction • χ2/50 best • Random permutation • Passes significance test
PERFORMANCE EVALUATION • Vector adaptation (re-weighting) • No significance
PERFORMANCE EVALUATION • Overall • Precision after 10 documents
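Precision after 10 documents can be computed as below. Treating a track as relevant when it carries the queried tag in the ground truth is an assumption about the setup, and the function name is hypothetical.

```python
def precision_at_k(ranked_tracks, relevant, k=10):
    """Fraction of the first k returned tracks that are relevant.

    ranked_tracks: track ids in ranked order.
    relevant: set of track ids considered relevant for the query.
    """
    top = ranked_tracks[:k]
    return sum(1 for t in top if t in relevant) / k
```

For example, if 2 of the first 10 returned tracks carry the queried tag, the score is 0.2.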
EXAMPLES • "Rock with great riffs" • "Punk" • "Relaxing music"
FUTURE WORK • [System diagram: 12601 tracks, ID3 tag, Google search, Audio similarity, Web-based feature, Vector adaptation, Dimensionality reduction, Vector space, Query, Google search results] • Points annotated on the diagram across successive slides • Compilation albums, remixes • Lyrics • Indexing documents • PLSA • Computationally inefficient • Ground truth?