Music information retrieval systems Author: Amanda Cohen
Music Information Retrieval Systems • Based on content from the following webpage: http://mirsystems.info/index.php?id=mirsystems • Other good sources on MIR and MIR systems • http://www.music-ir.org - Virtual home of music information retrieval research • http://www.ismir.net - The International Symposium on Music Information Retrieval
Audentify! • Developers: F. Kurth, A. Ribbrock, M. Clausen • Relevant Publications: • Kurth, F., Ribbrock, A., Clausen, M. Identification of Highly Distorted Audio Material for Querying Large Scale Data Bases. 112th Convention of the Audio Engineering Society, May 2002, Munich, Convention Paper • Kurth, F., Ribbrock, A., Clausen, M. Efficient Fault Tolerant Search Techniques for Full-Text Audio Retrieval. 112th Convention of the Audio Engineering Society, May 2002, Munich, Convention Paper • Ribbrock, A. Kurth, F. A Full-Text Retrieval Approach to Content-Based Audio Identification. International Workshop on Multimedia Signal Processing. St. Thomas, US Virgin Islands, December 9-11, 2002 • Kurth, F. A Ranking Technique for fast Audio Identification. International Workshop on Multimedia Signal Processing. St. Thomas, US Virgin Islands, December 9-11, 2002 • Clausen, M., Kurth, F. A Unified Approach to Content-Based and Fault Tolerant Music Recognition, IEEE Transactions on Multimedia. Accepted for publication
Audentify! • System Description • Takes signal queries (1-5 seconds, 96-128 kbps) • Searches by audio fingerprint • Returns a file ID that corresponds to a song in the database • Currently part of the SyncPlayer system
SyncPlayer • Developers: F. Kurth, M. Müller, D. Damm, C. Fremerey, A. Ribbrock, M. Clausen • Relevant Publications: • Kurth, F., Müller, M., Damm, D., Fremerey, Ch., Ribbrock, A., Clausen, M. SyncPlayer - An Advanced System for Multimodal Music Access, Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR 2005), London, GB • Kurth, F., Müller, M., Ribbrock, A., Röder, T., Damm, D., Fremerey, Ch. A Prototypical Service for Real-Time Access to Local Context-Based Music Information. Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR 2004), Barcelona, Spain. http://www-mmdb.iai.uni-bonn.de/download/publications/kurth-service-ismir04.pdf • Fremerey, Ch., SyncPlayer - a Framework for Content-Based Music Navigation, Diplomarbeit at the Multimedia Signal Processing Group of Prof. Dr. Michael Clausen, University of Bonn, 2006, Bonn, Germany • URL: http://audentify.iai.uni-bonn.de/synchome/index.php?pid=01
SyncPlayer • System Description • Query type(s): audio files (mp3, wav, MIDI), lyrics, MusicXML, score scans (primary data) • Generates "derived data" from the query • extracts features • generates annotations • compiles synchronization data • Submitted to the SyncPlayer Server, which can perform three services (at present) • audio identification (through Audentify) • providing annotations for a given song • retrieval in lyrics annotations • SyncPlayer Client: an audio-visual user interface that allows the user to play back, navigate, and search the primary data
ChoirFish • Developers: A. Van Den Berg, S. Groot • Relevant Publications: • Groot, S., Van Den Berg, A., The Singing Choirfish: An application for Tune Recognition, Proceedings of the 2003 Speech Recognition Seminar, LIACS 2003 • URL: http://www.ookii.org/university/speech/default.aspx
ChoirFish • System Description: • Query by humming • Contour features used for matching • Uses Parson's Code to determine contour • Code is based on the direction of note transitions • One character for each of the three possible directions: • R: the note is the same frequency as the previous note • D: the note is lower in frequency than the previous note • U: the note is higher in frequency than the previous note • Generated by converting the audio to the frequency domain via the Fast Fourier Transform and using the highest frequency peak to determine pitch and pitch change
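The Parson's Code encoding described above can be sketched in a few lines of Python. The frequency values in the example are illustrative approximations, not ChoirFish's actual output:

```python
def parsons_code(pitches):
    """Encode a melody as Parson's Code: R (repeat), U (up), D (down).

    `pitches` is a sequence of fundamental-frequency estimates (Hz),
    e.g. the highest FFT peak per note, as ChoirFish uses.
    """
    code = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur == prev:
            code.append("R")      # same frequency as previous note
        elif cur > prev:
            code.append("U")      # higher than previous note
        else:
            code.append("D")      # lower than previous note
    return "".join(code)

# Opening of "Ode to Joy" (E E F G G F E D) as approximate Hz values:
print(parsons_code([330, 330, 349, 392, 392, 349, 330, 294]))  # RUURDDD
```

Note that the code has one character fewer than the melody has notes, since it encodes transitions, not notes.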
CubyHum • Developers: S. Pauws • Relevant Publications: • Pauws, S., CubyHum: A Fully Operational Query by Humming System, ISMIR 2002 Conference Proceedings (2002): 187-196 • PDF of paper: http://ismir2002.ismir.net/proceedings/02-FP06-2.pdf
CubyHum • System Description: • Query by Humming: user queries system by humming the desired song • Pitch is estimated by computing the sum of harmonically compressed spectra (sub-harmonic summation, or SHS). • Musical events (note onsets, gliding tones, inter-onset-intervals) are detected • Query is transformed via quantization into musical score, which is used to create a MIDI melody for auditory feedback • Approximate pattern matching used to find matching song • Distance between melodies defined based on interval sizes and duration ratios to compensate for imperfect query (people don’t always hum the correct melody in the correct key)
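The sub-harmonic summation idea can be sketched as follows: for each candidate fundamental frequency, sum the spectral magnitude at its harmonics with a decaying weight, and pick the candidate with the largest sum. The parameter values (search range, number of harmonics, decay factor) are illustrative assumptions, not CubyHum's:

```python
import numpy as np

def shs_pitch(signal, sr, fmin=80.0, fmax=500.0, n_harmonics=5, decay=0.84):
    """Toy sub-harmonic summation (SHS) pitch estimate.

    For each candidate f0, sum the spectral magnitude at its first few
    harmonics, weighted by a compression factor; the candidate with the
    largest sum wins.
    """
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)
    best_f0, best_score = 0.0, -1.0
    for f0 in np.arange(fmin, fmax, 1.0):
        score = 0.0
        for n in range(1, n_harmonics + 1):
            idx = np.argmin(np.abs(freqs - n * f0))   # nearest spectral bin
            score += decay ** (n - 1) * spectrum[idx]
        if score > best_score:
            best_f0, best_score = f0, score
    return best_f0

sr = 8000
t = np.arange(sr) / sr
# A 220 Hz tone with a second harmonic, like a hummed note:
tone = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)
print(round(shs_pitch(tone, sr)))  # 220
```

Summing over harmonics is what makes SHS robust: even when the second harmonic is stronger than the fundamental, the candidate at the true f0 accumulates energy from all harmonics and still wins.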
Fanimae • Developers: Iman S.H. Suyoto, Alexandra L. Uitdenbogerd and Justin Zobel • Relevant Publications • Suyoto, I.S.H., Uitdenbogerd, A.L., Simple efficient n-gram indexing for effective melody retrieval, Proceedings of the Annual Music Information Retrieval Evaluation eXchange, 2005 • URL: http://mirt.cs.rmit.edu.au/fanimae/
Fanimae • System Description: • Desktop Music Information Retrieval System • Search by symbolic melodic similarity • Query: a melody sequence that contains both pitch and duration information • Melody sequence is standardized • Intervals are encoded as a number of semitones, with large intervals being reduced • Coordinate matching used to detect melodic similarity • Query is split into n-grams of length 5, as are any possible answers • count the common distinct terms between query and possible answer • return results ranked by similarity
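The n-gram coordinate matching step can be sketched as follows; the interval values are made up for illustration, and Fanimae's actual encoding also reduces large intervals:

```python
def ngrams(seq, n=5):
    """All contiguous n-grams of a sequence, as a set of tuples."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def coordinate_match(query_intervals, song_intervals, n=5):
    """Fanimae-style scoring sketch: count the distinct n-grams shared
    between the interval-encoded query and a candidate song."""
    return len(ngrams(query_intervals, n) & ngrams(song_intervals, n))

# Intervals in semitones (positive = up, negative = down):
query = [2, 2, -1, -2, -2, 0, 2]
song  = [0, 2, 2, -1, -2, -2, 0, 2, 2]
print(coordinate_match(query, song))  # 3 shared 5-grams
```

Candidate songs would then be ranked by this count, highest first.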
Foafing the Music • Developers: Music Technology Group of the Universitat Pompeu Fabra • Relevant Publications: • Celma, O. Ramírez, M. Herrera, P., Foafing the music: A music recommendation system based on RSS feeds and user preferences Proceedings of 6th International Conference on Music Information Retrieval; London, UK, 2005, http://ismir2005.ismir.net/proceedings/3119.pdf • URL: http://foafing-the-music.iua.upf.edu
Foafing the Music • System Description: • Returns personalized music recommendations based on a user's profile (listening habits, location) • Bases recommendations on information gathered across the web • Similarity between artists is determined by their relationships to one another (e.g. influences, followers) • Creates an RSS feed for news related to favorite artists • Computes musical similarity between specific songs
Meldex/Greenstone • Developers: McNab, Smith, Bainbridge, Witten • Relevant Publications: • McNab, Smith, Bainbridge, Witten, The New Zealand digital library MELody inDEX, D-Lib Magazine, May 1997 • URL: http://www.nzdl.org/fast-cgi-bin/music/musiclibrary
Meldex/Greenstone • System Description: • Receives audio queries (hummed, sung, or played audio) • Filters audio to get fundamental frequency • Input sent to pitch tracker, which returns average pitch estimate for each 20ms • Note duration can optionally be taken into account, as well as user defined tuning • Results found using approximate string matching based on melodic contour
Musipedia/Melodyhound/Tuneserver • Developer: Rainer Typke • Relevant Publications: • Prechelt, L., Typke, R., An Interface for Melody Input. ACM Transactions on Computer-Human Interaction, June 2001 • URL: http://www.musipedia.org
Musipedia/Melodyhound/Tuneserver • System Description • Query by humming system • Records sound, which the system converts into a sound wave • Converts the query sound wave into Parson's Code • Matches by melodic contour • Determines the distance between the query and possible results via edit distance (the number of modifications needed to turn one string into the other) • Returns the results with the smallest distance
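The distance computation can be sketched with a standard Levenshtein (edit distance) implementation over Parson's-Code strings; Musipedia's exact variant may differ:

```python
def edit_distance(a, b):
    """Levenshtein distance between two contour strings: the minimum
    number of insertions, deletions, and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

# A query contour vs. a stored contour with one extra "D":
print(edit_distance("RUURDDD", "RUURDDDD"))  # 1
```

Results would be ranked by this distance in ascending order.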
MIDIZ • Developers: Maria Cláudia Reis Cavalcanti, Marcelo Trannin Machado, Alessandro de Almeida Castro Cerqueira, Nelson Sampaio Araujo Júnior and Geraldo Xexéo • Relevant Publications: • Cavalcanti, Maria Cláudia Reis et al. MIDIZ: content based indexing and retrieving MIDI files. J. Braz. Comp. Soc. [online]. 1999, vol. 6, no. 2 [cited 2008-11-02]. http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0104-65001999000300002&lng=&nrm=iso ISSN 0104-6500. doi: 10.1590/S0104-65001999000300002
MIDIZ • System description • Database that stores, indexes, searches for and recovers MIDI files based on the description of a short musical passage • Allows for non-exact queries • Musical sequence is based on intervals between notes • Uses wavelet transform and a sliding window in the melody • Window defines a note sequence of a given size (2^k) and moves through the song note by note • Each sequence in the window is converted into a vector storing the interval distances • First note in a sequence is assigned the value 1 • Values of the following notes are determined by their chromatic distance in relation to the first note • Those values are added together in pairs, and the result is converted into coordinates in the final vector • Songs in database are stored in a BD Tree, determined by Discriminator Zone Expression • Completed vector of query is submitted to tree, similar results are returned
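The sliding-window interval encoding can be sketched as below. This is a simplification: it shows only the window encoding step (first note anchored at 1, later notes as chromatic distances from it), and omits MIDIZ's pairwise summation, coordinate conversion, wavelet transform, and BD-tree indexing:

```python
def interval_windows(notes, k=3):
    """Slide a window of 2**k notes through a melody and encode each
    window relative to its first note, which is assigned the value 1."""
    size = 2 ** k
    vectors = []
    for i in range(len(notes) - size + 1):
        window = notes[i:i + size]
        # Chromatic (semitone) distance of each note from the first:
        vectors.append([1] + [n - window[0] for n in window[1:]])
    return vectors

melody = [60, 62, 64, 65, 67, 65, 64, 62, 60]  # MIDI note numbers
print(interval_windows(melody)[0])  # [1, 2, 4, 5, 7, 5, 4, 2]
```

Because each window is encoded relative to its own first note, the representation is transposition-invariant, which is what allows non-exact queries in a different key to match.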
Mu-seek • Developers: Darfforst LLP • URL: http://www.mu-seek.com/ • System Description: • Search by title, lyrics, tune fragment, or MIDI • Uses pitch, contour, and rhythm to find matches
MusicSurfer • Developers: Music Technology Group of the Universitat Pompeu Fabra • Relevant Publications: • Cano et al. An Industrial-Strength Content-based Music Recommendation System, Proceedings of 28th Annual International ACM SIGIR Conference; Salvador, Brazil 2005. http://mtg.upf.edu/files/publications/3ac0d3-SIGIR05-pcano.pdf • URL: http://musicsurfer.iua.upf.edu/
MusicSurfer • System Description: • Automatically extracts features from songs in database based on rhythm, instrumentation, and harmony • Uses spectral analysis to determine timbre • Uses those features to search for similar songs
NameMyTune • Developers: Strongtooth, Inc • URL: http://www.namemytune.com/ • System Description: • User hums a query into a microphone • Other users listen to the recorded query and identify the song
Orpheus • Developers: Rainer Typke • Relevant Publications: • Typke, Giannopoulos, Veltkamp, Wiering, van Oostrum, Using Transportation Distances for Measuring Melodic Similarity, ISMIR 2003 • URL: http://teuge.labs.cs.uu.nl/Ruu/?id=5
Orpheus • System Description: • Query can be example from database, hummed or whistled melody, or a MIDI file • All queries are converted into internal database format before submission • Similarity between query and results based on Earth Mover’s Distance • Two distributions are represented by signatures • Distance represents the amount of “work” required to change one signature to the other • Work = user defined distance between two signatures
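The Earth Mover's Distance idea can be sketched in one dimension: each signature is a list of (position, weight) pairs, and the distance is the total "work" (weight moved times distance moved) needed to turn one into the other. The greedy left-to-right sweep below is only valid for sorted 1-D signatures with equal total weight; Orpheus operates on richer melodic signatures with a user-defined ground distance:

```python
def emd_1d(sig_a, sig_b):
    """Earth Mover's Distance for sorted 1-D signatures of equal total
    weight.  Work = weight moved x distance moved."""
    a = [list(p) for p in sig_a]
    b = [list(p) for p in sig_b]
    work, i, j = 0.0, 0, 0
    while i < len(a) and j < len(b):
        flow = min(a[i][1], b[j][1])          # move as much as possible
        work += flow * abs(a[i][0] - b[j][0])
        a[i][1] -= flow
        b[j][1] -= flow
        if a[i][1] == 0:
            i += 1
        if b[j][1] == 0:
            j += 1
    return work

# Move weight 1 from position 0 to 2, and weight 1 from 1 to 3:
print(emd_1d([(0, 1), (1, 1)], [(2, 1), (3, 1)]))  # 4.0
```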
Probabilistic “Name That Song” • Developers: Eric Brochu and Nando de Freitas • Publications: • Brochu, E., Freitas, N.D., "Name That Song!": A Probabilistic Approach to Querying on Music and Text. NIPS. Neural Information Processing Systems: Natural and Synthetic 2002 (2003)
Probabilistic “Name That Song” • System Description: • Query is composed of note transitions (Qm) and words (Qt). A match is found when a corresponding song has all elements of Qm and Qt with a frequency of 1 or greater. • Database songs are clustered. Query is performed on each song in each cluster until a match is found
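The matching criterion can be sketched as a frequency check: a song matches when every queried note transition and every queried word occurs in it at least once. The note-transition and word values below are made up for illustration:

```python
from collections import Counter

def matches(query_notes, query_words, song_notes, song_words):
    """A song matches when all elements of Qm (note transitions) and
    Qt (words) occur in it with frequency >= 1."""
    notes, words = Counter(song_notes), Counter(song_words)
    return (all(notes[n] >= 1 for n in query_notes)
            and all(words[w] >= 1 for w in query_words))

# Query: transitions +2 and -2 semitones, plus the word "yellow":
print(matches([2, -2], ["yellow"],
              [2, 2, -2, -1], ["yellow", "submarine"]))  # True
```

In the full system this check is probabilistic and is run cluster by cluster over the database until a match is found.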
Query by Humming (Ghias et al.) • Developers: Asif Ghias, Jonathan Logan, David Chamberlin, Brian C. Smith • Relevant Publications: • Ghias, A., Logan, J., Chamberlin, D., Smith, B.C., Query by Humming - Musical Information Retrieval in an Audio Database, ACM Multimedia (1995) • URL: http://www.cs.cornell.edu/Info/Faculty/bsmith/query-by-humming.html
Query by Humming (Ghias et al.) • System Description: • Hummed queries are recorded in Matlab • Pitch tracking is performed • Query is converted into a string of intervals similar to Parson's Code (U/D/S used as characters instead of R/D/U) • Baeza-Yates/Perleberg pattern matching algorithm used to find pattern matches • Finds all instances of the query string in the result string with at most k mismatches • Results returned in order of how well they fit the query
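The k-mismatch problem the system solves can be sketched as below. The naive scan shown here gives the same answers as the Baeza-Yates/Perleberg algorithm but without its speed; it is for illustration only:

```python
def k_mismatch_positions(pattern, text, k):
    """All positions where `pattern` occurs in `text` with at most
    k character mismatches (no insertions or deletions)."""
    hits = []
    for i in range(len(text) - len(pattern) + 1):
        mismatches = sum(p != t for p, t in zip(pattern, text[i:]))
        if mismatches <= k:
            hits.append(i)
    return hits

# A hummed query contour vs. a song's contour, tolerating one error:
print(k_mismatch_positions("UUDS", "SUUDDUUDSU", k=1))  # [1, 5]
```

Results at smaller mismatch counts would be ranked ahead of those that only match at the full error budget.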
Search by Humming • Developers: Steven Blackburn • Relevant Publications: • Blackburn, S. G., Content Based Retrieval and Navigation of Music Using Melodic Pitch Contours. PhD Thesis, 2000 • Blackburn, S. G., Content Based Retrieval and Navigation of Music. Masters, 1999 • DeRoure, D., El-Beltagy, S., Blackburn, S. and Hall, W., A Multiagent System for Content Based Navigation of Music. ACM Multimedia 1999 Proceedings Part 2, pages 63-6. • Blackburn, S. G. and DeRoure, D. C., A tool for content based navigation of music. Proceedings of ACM Multimedia 1998, pages 361—368 • DeRoure, D. C. and Blackburn, S. G., Amphion: Open Hypermedia Applied to Temporal Media,Wiil, U. K., Eds. Proceedings of the 4th Open Hypermedia Workshop, 1998, pages 27--32. • DeRoure, D. C., Blackburn, S. G., Oades, L. R., Read, J. N. and Ridgway, N., Applying Open Hypermedia to Audio, Proceedings of ACM Hypertext 1998, pages 285--286. • URL: http://www.beeka.org/research.html
Search by Humming • System Description: • Takes query by humming, example, or MIDI • Queries and database contents represented by gross melodic pitch contour • Within database, each track is stored as a set of overlapping sub-contours of a constant length • Distance between songs is determined by the minimum cost of transforming one contour into another (similar to EMD) • Query is expanded into a set of all possible contours of the same length as the database’s sub-contours • A score is calculated for each file based on the number of times a contour in the expanded query set occurs in the file. Results are sorted in order of score
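The sub-contour scoring step can be sketched as below. This simplification scores a track by how often the query's sub-contours occur among the track's stored overlapping sub-contours; it omits the query-expansion step that generates fuzzy contour variants:

```python
def subcontours(contour, length):
    """All overlapping sub-contours of a fixed length."""
    return [contour[i:i + length] for i in range(len(contour) - length + 1)]

def score_track(query, track, length=4):
    """Count occurrences of each query sub-contour among the track's
    stored sub-contours; higher scores rank higher."""
    stored = subcontours(track, length)
    return sum(stored.count(q) for q in subcontours(query, length))

# Parson's-Code-style contours:
print(score_track("UUDDU", "RUUDDUUDDU"))  # 4
```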
SOMeJB (The SOM enhanced JukeBox) • Developers: Andreas Rauber, Markus Frühwirth, E. Pampalk, D. Merkl • Relevant Publications: • A. Rauber, E. Pampalk, D. Merkl, The SOM-enhanced JukeBox: Organization and Visualization of Music Collections based on Perceptual Models, Journal of New Music Research (JNMR), Swets and Zeitlinger, 2003 • E. Pampalk, A. Rauber, D. Merkl, Content-based Organization and Visualization of Music Archives In: Proceedings of ACM Multimedia 2002, pp. 570-579, December 1-6, 2002, Juan-les-Pins, France • A. Rauber, E. Pampalk, D. Merkl, Using Psycho-Acoustic Models and Self-Organizing Maps to Create a Hierarchical Structuring of Music by Musical Styles, Proceedings of the 3rd International Symposium on Music Information Retrieval (ISMIR 2002), pp. 71-80, October 13-17, 2002, Paris, France. • A. Rauber, E. Pampalk, D. Merkl, Content-based Music Indexing and Organization, Proceedings of the 25. Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 02), pp. 409-410, August 11-15, 2002, in Tampere, Finland • A. Rauber, and M. Frühwirth, Automatically Analyzing and Organizing Music Archives, Proceedings of the 5. European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2001), Sept. 4-8 2001, Darmstadt • URL: http://www.ifs.tuwien.ac.at/~andi/somejb
SOMeJB (The SOM enhanced JukeBox) • System Description: • Interface is a static, web-based map where similar pieces of music are clustered together • Music organized by a novel set of features based on rhythm patterns in a set of frequency bands and psycho-acoustically motivated transformations • Extracts features that apply to loudness sensation (intensity), and rhythm • Self-organizing map algorithm is applied to organize the pieces on a map (trained neural network)
SoundCompass • Developers: Naoko Kosugi, Yuichi Nishihara, Tetsuo Sakata, Masashi Yamamuro and Kazuhiko Kushima, NTT Laboratories • System Description: • User sets a metronome and hums melody in time with clicks • Database songs have three feature vectors • Tone Transition Feature Vector: contains the dominant pitch for each 16-beat window • Partial Tone Transition Feature Vector: Covers a time window different from the Tone Transition Feature Vector • Tone Distribution Feature Vector: histogram containing note distribution • Query is matched against each of the vectors, results are combined by determining the minimum
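The Tone Distribution Feature Vector can be sketched as a normalized pitch-class histogram; the normalization and the 12-bin layout here are assumptions for illustration:

```python
from collections import Counter

def tone_distribution(pitch_classes, size=12):
    """Histogram of pitch classes (0-11) over a hummed window,
    normalized so windows of different lengths are comparable."""
    counts = Counter(p % size for p in pitch_classes)
    total = len(pitch_classes)
    return [counts[i] / total for i in range(size)]

# A C major arpeggio hummed twice (C, E, G = pitch classes 0, 4, 7):
vec = tone_distribution([0, 4, 7, 0, 4, 7])
print(vec)
```

The query's vector would be compared against each stored feature vector, with the final ranking taking the minimum over the per-vector match results.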
Tararira • Developers: Ernesto López, Martin Rocamora, Gonzalo Sosa • Relevant Publications: • E. Lopez y M. Rocamora. Tararira: Sistema de búsqueda de música por melodía cantada. X Brazilian Symposium on Computer Music. October, 2005. • URL: http://iie.fing.edu.uy/investigacion/grupos/gmm/proyectos/tararira/
Tararira • System Description: • User submits a hummed query • Pitch tracking is applied to the query • Audio segmentation determines note boundaries • Melodic analysis adjusts pitches to the tempered scale • Results are found by coding the query note sequence, finding occurrences using flexible similarity rules (string matching), and refining the selection using the pitch time series
TreeQ • Developer: Jonathan Foote • Publications: • Foote, J.T., Content-Based Retrieval of Music and Audio, C.-C. J. Kuo et al., editor, Multimedia Storage and Archiving Systems II, Proc. of SPIE, Vol. 3229, pp. 138-147, 1997 • URL: http://sourceforge.net/projects/treeq/
TreeQ • System Description: • Primarily query by example; can also search by classification • Tree-based supervised vector quantizer is built from labeled training data • Database audio is parameterized via conversion into MFCC and energy vectors • Each resulting vector is quantized into the tree • The vector space is divided into "bins"; any MFCC vector falls into exactly one bin • A histogram of the distribution of MFCC vectors over the bins is created for the query and for each database item • Songs are matched based on histograms of feature counts at the tree leaves • Distance is determined using Euclidean distance between the corresponding templates of each audio clip • Results sorted by magnitude and returned as a ranked list
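The histogram-comparison step can be sketched as below. The tree quantizer itself is omitted; the example assumes each clip's MFCC frames have already been assigned to leaf bins, and the bin assignments are made up for illustration:

```python
import math

def histogram(bin_ids, n_bins):
    """Normalized histogram of quantizer-tree leaf ('bin') counts."""
    h = [0.0] * n_bins
    for b in bin_ids:
        h[b] += 1.0
    total = len(bin_ids)
    return [c / total for c in h]

def euclidean(h1, h2):
    """Euclidean distance between two clips' leaf histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

# Pretend each clip's MFCC frames were quantized into 4 leaf bins:
clip_a = histogram([0, 0, 1, 2, 2, 2], n_bins=4)
clip_b = histogram([0, 1, 1, 2, 2, 3], n_bins=4)
print(round(euclidean(clip_a, clip_b), 3))  # 0.333
```

Normalizing the histograms by frame count makes clips of different lengths directly comparable.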
VisiTunes • Developers: Scott McCaulay • Further Information: www.slis.indiana.edu/research/phd_forum/2006/mccaulay.doc • URL: http://www.naivesoft.com/ • System Description: • Uses audio content of songs to calculate similarity between music and creates playlists based on the results • Converts sample values of each frame to frequency data • Extracts sum total of sound energy by frequency band • Uses results to simplify audio data into 256 integer values for fast comparison
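The per-frame band-energy extraction can be sketched as below. The band layout and the scaling to small integers are assumptions made to mirror VisiTunes' reduction of each song to 256 integer values; the actual parameters are not documented in this summary:

```python
import numpy as np

def band_energies(frame, sr, n_bands=8):
    """Total spectral energy per frequency band for one audio frame,
    scaled to small integers for fast comparison."""
    spectrum = np.abs(np.fft.rfft(frame)) ** 2      # power spectrum
    bands = np.array_split(spectrum, n_bands)        # equal-width bands
    energies = np.array([b.sum() for b in bands])
    # Scale to 0..255 integers (assumed scaling, for compact storage):
    if energies.max() > 0:
        energies = energies / energies.max() * 255
    return energies.astype(int)

sr = 8000
t = np.arange(1024) / sr
frame = np.sin(2 * np.pi * 440 * t)   # a 440 Hz test tone
print(band_energies(frame, sr))       # energy concentrated in band 0
```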