Demos for QBSH
J.-S. Roger Jang (張智星)
jang@cs.nthu.edu.tw
http://mirlab.org/jang
CSIE Dept, National Taiwan University

Intro. to QBSH
• QBSH: Query by Singing/Humming
• Challenges
  • Robust pitch tracking
  • Key transposition
  • Collection of song databases
  • Efficient comparison
    • Karaoke box: ~10000 songs
    • Internet: 500M songs, 12M albums (www.jogli.com)

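A minimal sketch of why key transposition matters and one common way to neutralize it. The slides do not say how MIRACLE handles it; the mean-shift approach, the function names, and the sample vectors below are assumptions for illustration only. Pitch is converted from Hz to semitones, then each vector is shifted to zero mean so that the same melody sung in a different key gives the same contour.

```python
import math


def hz_to_semitone(freq_hz):
    """Convert a frequency in Hz to a MIDI-style semitone number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def remove_key(pitch):
    """Shift a semitone pitch vector to zero mean so the singer's key drops out."""
    mean = sum(pitch) / len(pitch)
    return [p - mean for p in pitch]


# Hypothetical data: the same four-note contour in two different keys.
query = [60.0, 62.0, 64.0, 62.0]
reference = [65.0, 67.0, 69.0, 67.0]  # five semitones higher

# After the mean shift both vectors become [-2.0, 0.0, 2.0, 0.0].
same_contour = remove_key(query) == remove_key(reference)
```

A mean shift only works when query and reference cover the same part of the melody; when they do not, a system may instead try several candidate shifts and keep the best one.
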
Efficient Retrieval in QBSH
• Methods for efficient retrieval
  • Multi-stage progressive filtering
  • Indexing for different comparison methods
  • Music phrase identification
  • Repeating pattern identification
  • Distributed & parallel computing
• Our focus
  • Parallel computing via GPU

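The idea behind multi-stage progressive filtering can be sketched as follows: a cheap distance prunes most of the database, and a costlier distance ranks only the survivors. The two distance functions, the `keep` parameter, and the toy database are hypothetical choices made for this sketch, not the stages used in MIRACLE.

```python
def coarse_distance(query, candidate, points=8):
    """Cheap first-stage score: compare only a few evenly spaced samples."""
    def sample(seq):
        return [seq[i * (len(seq) - 1) // (points - 1)] for i in range(points)]
    return sum(abs(a - b) for a, b in zip(sample(query), sample(candidate)))


def fine_distance(query, candidate):
    """Costlier second-stage score: mean absolute difference over the common prefix."""
    n = min(len(query), len(candidate))
    return sum(abs(query[i] - candidate[i]) for i in range(n)) / n


def progressive_search(query, database, keep=2, top_n=1):
    """Rank songs in two stages; database maps song name to pitch vector."""
    # Stage 1: keep only the `keep` best candidates under the cheap score.
    survivors = sorted(
        database, key=lambda name: coarse_distance(query, database[name])
    )[:keep]
    # Stage 2: rank the survivors with the more expensive score.
    ranked = sorted(
        survivors, key=lambda name: fine_distance(query, database[name])
    )
    return ranked[:top_n]


# Hypothetical three-song database of semitone contours.
database = {
    "a": [0, 2, 4, 5, 7, 5, 4, 2],
    "b": [0, 0, 7, 7, 9, 9, 7, 7],
    "c": [5, 4, 2, 0, 2, 4, 5, 7],
}
query = [0, 2, 4, 5, 7, 5, 4, 3]  # song "a" with one wrong last note
best = progressive_search(query, database)
```

The trade-off is that a song wrongly discarded in stage 1 can never be recovered, so `keep` has to be large enough for the cheap score's error rate.
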
MIRACLE
• MIRACLE
  • Music Information Retrieval Acoustically via Clustered and paralleL Engines
• Database (~13000)
  • MIDI files
  • Solo vocals (<100)
  • Melody extracted from polyphonic music (<100)
• Comparison methods
  • Linear scaling
  • Dynamic time warping
• Top-10 accuracy
  • 70~75%
• Platform
  • Single CPU+GPU

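The two comparison methods named above can be sketched in textbook form. This is not the MIRACLE implementation: the scaling factors, the anchoring of the query at the start of the reference, and the unconstrained DTW step pattern are all assumptions, and both functions expect pitch vectors in semitones with the key already normalized. Linear scaling stretches or compresses the whole query uniformly, while dynamic time warping lets the tempo vary from frame to frame.

```python
def resample(seq, new_len):
    """Linearly interpolate seq to new_len points."""
    if new_len == 1:
        return [seq[0]]
    out = []
    for i in range(new_len):
        pos = i * (len(seq) - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out


def linear_scaling_distance(query, reference,
                            factors=(0.5, 0.75, 1.0, 1.25, 1.5, 2.0)):
    """Best mean absolute difference over a set of uniform tempo factors."""
    best = float("inf")
    for f in factors:
        n = round(len(query) * f)
        if n < 2 or n > len(reference):
            continue  # scaled query would be degenerate or longer than the song
        scaled = resample(query, n)
        dist = sum(abs(a - b) for a, b in zip(scaled, reference[:n])) / n
        best = min(best, dist)
    return best


def dtw_distance(query, reference):
    """Classic DTW with steps (1,0), (0,1), (1,1), matching both sequences end to end."""
    inf = float("inf")
    n, m = len(query), len(reference)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - reference[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical example: the reference holds every note twice as long as the query.
query = [0, 2, 4, 2]
slow_reference = [0, 0, 2, 2, 4, 4, 2, 2]
dtw_score = dtw_distance(query, slow_reference)  # 0.0: warping absorbs the tempo change
```

Linear scaling costs O(n) per factor and DTW costs O(nm) per song, which is one reason the comparison stage is a natural target for GPU parallelism over a database of this size.
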
MIRACLE (II)
• References (full list)
  • J.-S. Roger Jang and Ming-Yang Gao, "A Query-by-Singing System based on Dynamic Programming", International Workshop on Intelligent Systems Resolutions (the 8th Bellman Continuum), pp. 85-89, Hsinchu, Taiwan, Dec 2000.
  • Jyh-Shing Roger Jang, Jiang-Chun Chen, Ming-Yang Kao, "MIRACLE: A Music Information Retrieval System with Clustered Computing Engines", International Symposium on Music Information Retrieval (ISMIR), 2001.
  • …
  • Chung-Che Wang and Jyh-Shing Roger Jang, "Acceleration of Query by Singing/Humming Systems on GPU: Compare from Anywhere", International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012.

MIRACLE Before Oct. 2011
• Client-server distributed computing
• Cloud computing via clustered PCs
[Figure: clients (PC, PDA/smartphone over cellular) send a pitch vector as the request to clustered servers (one master server with slave servers) and receive the search result as the response. Database size: ~12,000]

Current MIRACLE
• Single server with GPU
  • NVIDIA 560 Ti, 384 cores (speedup factor = 10)
[Figure: clients (PC, PDA/smartphone over cellular) send a pitch vector as the request to a single master server and receive the search result as the response. Database size: ~13,000]

MIRACLE in the Future
• Multi-modal retrieval
  • Singing, humming, speech, audio, tapping…
[Figure: clients (PC, PDA/smartphone over cellular) send a feature vector as the request to clustered servers (one master server with slave servers) and receive the search result as the response.]

QBSH for Various Platforms
• PC
  • Web version
• Embedded systems
  • Karaoke machines
• Smartphones
  • iPhone/Android
• Toys
  • 16-bit micro-controller

QBSH Prototype in MATLAB
• To create a QBSH prototype in MATLAB
  • Get familiar with audio processing in MATLAB
    • See audio signal processing
  • Try the programming contests on
    • Pitch tracking
    • QBSH
• Run exampleProgram/goDemo.m to test drive the QBSH prototype in MATLAB!

QBSH Demos
• QBSH demos by our lab
  • QBSH on the web: MIRACLE
  • QBSH on toys
• Existing commercial QBSH systems
  • www.midomi.com
  • www.soundhound.com

Returned Results
• Typical results of MIRACLE

Online Karaoke
• Synchronized lyrics
• Real-time pitch display
• Calorie consumption
• Recording
• Real-time score
• Live broadcast
• Automatic key adjustment

Future Work
• Multi-modal music retrieval
  • Query by user's inputs: singing, humming, whistling, speech, tapping, beatboxing
  • Query by exact examples: audio clips
• Speedup schemes
  • Repeating pattern identification, DTW indexing
• Database preparation
  • Polyphonic audio music as database: the ultimate challenge!