Music Information Retrieval With Condor • Scott McCaulay • Joe Rinkovsky • Pervasive Technology Institute, Indiana University
Overview • PFASC is a suite of applications developed at IU to perform automated similarity analysis of audio files • Potential applications include organization of digital libraries, recommender systems, playlist generators, and audio processing • PFASC is a project in the MIR field, an extension and adaptation of traditional Text Information Retrieval techniques to sound files • Elements of PFASC, specifically the file-by-file similarity calculation, have proven to be a very good fit with Condor
What We’ll Cover • Condor at Indiana University • Background on Information Retrieval and Music Information Retrieval • The PFASC project • PFASC and Condor, experience to date and results • Summary
Condor at IU • Initiated in 2003 • Utilizes 2,350 Windows Vista machines from IU’s Student Technology Clusters • Minimum 2 GB memory, 100 Mbps network • Available to students at 42 locations on the Bloomington campus, 24 x 7 • Student use is the top priority; Condor jobs are suspended immediately when a student uses the machine
Costs to Support Condor at IU • The marginal annual cost to support the Condor pool at IU is < $15K • Includes system administration, head nodes, and file servers • Purchase and support of the STC machines are funded from Student Technology Fees
Challenges to Making Good Use of Condor Resources at IU • Windows environment • The research computing environment at IU is geared to Linux, or to exotic architectures • Ephemeral resources • Machines are moderately to heavily used at all hours, so longer jobs are likely to be preempted • Availability of other computing resources • Local users are far from starved for cycles, so there is limited motivation to port
Examples of Applications Supported on Condor at IU • Hydra Portal (2003) • Job submission portal • Suite of bioinformatics apps: BLAST, MEME, fastDNAml • Condor Render Portal (2006) • Maya, Blender video rendering • PFASC (2008) • Similarity analysis of audio files
Information Retrieval - Background • Science of organizing documents for search and retrieval • Dates back to the 1880s (Hollerith) • Vannevar Bush, the first US presidential science advisor, presages hypertext in “As We May Think” (1945) • The concept of automated text document analysis, organization and retrieval was met with a good deal of skepticism until the 1990s. Some critics now grudgingly concede that it might work
Calculating Similarity: The Vector Space Model • Each feature found in a file is assigned a weight based on the frequency of its occurrence in the file and how common that feature is in the collection • Similarity between files is calculated based on common features and their weights. If two files share features not common to the entire collection, their similarity value will be very high • This vector space model (Salton) is the basis of many text search engines, and also works well with audio files • For text files, features are words or character strings. For audio files, features are prominent frequencies within frames of audio or sequences of frequencies across frames.
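Below is a minimal sketch of the vector space weighting and similarity calculation described above, assuming features have already been extracted from each file; the function names and the toy feature lists are illustrative, not PFASC's actual code.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Weight each feature by its frequency in the file (TF) and its
    rarity across the collection (IDF), as in the vector space model."""
    n = len(docs)
    df = Counter()                      # number of files containing each feature
    for feats in docs:
        df.update(set(feats))
    vectors = []
    for feats in docs:
        tf = Counter(feats)
        vectors.append({f: count * math.log(n / df[f]) for f, count in tf.items()})
    return vectors

def cosine_similarity(a, b):
    """Similarity in [0.0, 1.0]; shared rare features push the score up."""
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy example: each "file" is a list of extracted features
# (words for text, frame-based features for audio).
files = [["f1", "f2", "f2", "f7"], ["f2", "f7", "f9"], ["f3", "f4", "f5"]]
vecs = tfidf_vectors(files)
print(cosine_similarity(vecs[0], vecs[1]))
```

With this weighting, features that appear in every file contribute nothing, while features shared by only a few files drive their pairwise score toward 1.0.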
Some Digital Audio History • Uploaded to CompuServe 10/1985 • one of the most popular downloads at the time! • 10 seconds of digital audio • Time to download (300 baud): 20 minutes • Time to load: 20 minutes (tape), 2 minutes (disk) • Storage space: 42K • From this to Napster in less than 15 years
Explosion of Digital Audio • Digital audio today similar to text 15 years ago • Poised for 2nd phase of the digital audio revolution? • Ubiquitous, easy to create, access, share • Lack of tools to analyze, search or organize
How can we organize this enormous and growing volume of digital audio data for discovery and retrieval?
What’s done today • Pandora - Music Genome Project • expert manual classification of ~ 400 attributes • Allmusic • manual artist similarity classification by critics • last.fm – Audioscrobbler • collaborative filtering from user playlists • iTunes Genius • collaborative filtering from user playlists
What’s NOT done today • Any analysis (outside of research) of similarity or classification based on the actual audio content of song files
Possible Hybrid Solution • A classification/retrieval system could use elements of all three methods (automated analysis, user behavior, and manual metadata) to improve performance
Music Information Retrieval • Applying traditional IR techniques for classification, clustering, similarity analysis, pattern matching, etc. to digital audio files • A recent field of study that has accelerated since the inception of the ISMIR conference in 2000 and the MIREX evaluation in 2004
Common Basis of an MIR System • Select a very small segment of audio data (20-40 ms) • Use a fast Fourier transform (FFT) to convert it to frequency data • This ‘frame’ of audio becomes the equivalent of a word in a text file for similarity analysis • The output of this ‘feature extraction’ process is input to various analysis or classification processes • PFASC additionally combines prominent frequencies from adjacent frames to create temporal sequences as features (see the sketch below)
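A minimal sketch of this frame-based feature extraction using NumPy; the 30 ms frame size, the number of peaks kept per frame, and the way adjacent frames are paired into temporal sequences are assumptions for illustration, not PFASC's actual parameters or algorithm.

```python
import numpy as np

def extract_features(samples, sample_rate=44100, frame_ms=30, n_peaks=4):
    """Split audio into short frames, FFT each frame, and keep the most
    prominent frequency bins as 'words'; adjacent frames are then paired
    to form temporal sequence features."""
    frame_len = int(sample_rate * frame_ms / 1000)       # ~20-40 ms of audio
    n_frames = len(samples) // frame_len
    frame_words = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        spectrum = np.abs(np.fft.rfft(frame * np.hanning(frame_len)))
        peaks = np.argsort(spectrum)[-n_peaks:]          # most prominent bins
        frame_words.append(tuple(sorted(int(p) for p in peaks)))
    # Temporal sequences: combine each frame's peaks with the next frame's.
    sequences = [frame_words[i] + frame_words[i + 1]
                 for i in range(len(frame_words) - 1)]
    return frame_words + sequences

# Usage with a synthetic 440 Hz tone standing in for a decoded song file.
t = np.arange(0, 1.0, 1 / 44100)
features = extract_features(np.sin(2 * np.pi * 440 * t))
```

The resulting list of frame-level and sequence-level features can then be weighted and compared exactly like words in a text document.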
PFASC as an MIR Project • Parallel Framework for Audio Similarity Clustering • Initiated at IU in 2008 • The team includes the School of Library and Information Science (SLIS), Cognitive Science, the School of Music, and the Pervasive Technology Institute (PTI) • Have developed an MPI-based feature extraction algorithm, SVM classification, vector space similarity analysis, and some preliminary visualization • The wish list includes a graphical workflow, a job submission portal, and use in MIR classes
PFASC Philosophy and Methodology • Provide an end-to-end framework for MIR, from workflow to visualization • Recognize temporal context as a critical element of audio and a necessary part of feature extraction • Simple concept, simple implementation, one highly configurable algorithm for feature extraction • Dynamic combination and tuning of results from multiple runs, with user-controlled weighting • Make good use of available cyberinfrastructure • Support education in MIR
PFASC Feature Extraction Example • Summary of 450 files classified by genre, showing most prominent frequencies across spectrum
PFASC Similarity Matrix Example • Each audio file is summarized as a vector of feature values, and similarity is calculated between vectors • Values lie between 0.0 and 1.0: 0.0 = no commonality, 1.0 = the files are identical • In the above example, same-genre files had similarity scores 3.352 times higher than different-genre files
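A minimal sketch of building the full similarity matrix from per-file feature vectors with NumPy; the small random matrix stands in for real PFASC feature weights.

```python
import numpy as np

def similarity_matrix(X):
    """Cosine similarity between every pair of rows of X.
    With non-negative feature weights the scores fall in [0.0, 1.0],
    and identical files score exactly 1.0."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.where(norms == 0, 1, norms)
    return unit @ unit.T

# Stand-in for 5 files x 8 weighted features.
X = np.random.rand(5, 8)
S = similarity_matrix(X)
print(np.round(S, 3))   # diagonal is 1.0; off-diagonal values lie in [0, 1]
```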
Classification vs. Clustering • Most work in MIR involves classification, e.g. genre classification, an exercise that may be arbitrary and limited in value • Calculating similarity values among all songs in a library may be more practical for music discovery, playlist generation, and grouping by combinations of selected features • Calculating similarity is MUCH more computationally intensive than classification: comparing all songs in a library of 20,000 files requires 20,000 × 19,999 / 2 ≈ 200 million pairwise comparisons
Using Condor for Similarity Analysis • A good fit for IU Condor resources: a very large number of short-duration jobs • Jobs are independent, and can be restarted and run in any order • The large number of available machines provides a great wall-clock performance advantage over IU's supercomputers (a sketch of a submit description follows below)
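A hedged sketch of an HTCondor submit description that fans the comparison work out as many short, independent jobs; the executable name, argument convention, file names, and requirements expression are assumptions for illustration, not the actual PFASC submit file (the exact OpSys value, for example, depends on the pool's configuration).

```
# One comparison chunk per job; $(Process) selects which block of
# song-to-song pairs this job computes.
universe                = vanilla
executable              = pfasc_compare.exe
arguments               = --chunk $(Process) --features features.dat
transfer_input_files    = features.dat
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output                  = out/compare_$(Process).out
error                   = err/compare_$(Process).err
log                     = pfasc.log
requirements            = (OpSys == "WINDOWS")
request_memory          = 512

queue 450
```

Because each chunk is independent, a job that is suspended when a student sits down at an STC machine simply goes back in the queue and reruns elsewhere without affecting the rest of the run.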
PFASC Performance and Resources • A recent run of 450 jobs completed in 16 minutes; running serially on a desktop machine would have taken about 19 hours • The largest run to date contained 3,245 files (over 5 million song-to-song comparisons) and completed in less than eight hours; it would have taken over 11 days on a desktop • Queue wait time for 450 processors on IU's Big Red is typically several days; for 3,000+ processors it would be up to a month
PFASC Contributors • Scott McCaulay (Project Lead) • Ray Sheppard (MPI Programming) • Eric Wernert (Visualization) • Joe Rinkovsky (Condor) • Steve Simms (Storage & Workflow) • Kiduk Yang (Information Retrieval) • John Walsh (Digital Libraries) • Eric Isaacson (Music Cognition)