Demos for QBSH

Demos for QBSH. J.-S. Roger Jang (張智星), jang@cs.nthu.edu.tw, http://mirlab.org/jang, CSIE Dept, National Taiwan University. Intro. to QBSH (Query by Singing/Humming). Challenges: robust pitch tracking, key transposition, collection of song databases, efficient comparison.


Presentation Transcript


  1. Demos for QBSH J.-S. Roger Jang (張智星) jang@cs.nthu.edu.tw http://mirlab.org/jang CSIE Dept, National Taiwan University

  2. Intro. to QBSH • QBSH: Query by Singing/Humming • Challenges • Robust pitch tracking • Key transposition • Collection of song databases • Efficient comparison • Karaoke box: ~10000 songs • Internet: 500M songs, 12M albums (www.jogli.com)
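The key transposition challenge on slide 2 can be illustrated with a minimal sketch. It assumes pitch is expressed in semitones (MIDI note numbers) and that the key difference is removed by subtracting each sequence's mean, which is one common approach and not necessarily the one used in the systems described here. The function names `hz_to_semitone` and `remove_key` are hypothetical.

```python
import math

def hz_to_semitone(freq_hz):
    """Convert a frequency in Hz to a MIDI-style semitone number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)

def remove_key(pitch):
    """Shift a semitone sequence so that its mean is zero (assumed normalization)."""
    mean = sum(pitch) / len(pitch)
    return [p - mean for p in pitch]

# A short query, and the same tune sung 5 semitones higher.
query = [hz_to_semitone(f) for f in (220.0, 246.94, 261.63, 220.0)]
reference = [p + 5.0 for p in query]

# After mean removal the two contours coincide, so the key no longer matters.
q0 = remove_key(query)
r0 = remove_key(reference)
max_diff = max(abs(a - b) for a, b in zip(q0, r0))
```

Mean removal only works when the query and the reference segment cover the same notes; when they do not, a search over several candidate shifts may be needed instead.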

  3. Efficient Retrieval in QBSH • Methods for efficient retrieval • Multi-stage progressive filtering • Indexing for different comparison methods • Music phrase identification • Repeating pattern identification • Distributed & parallel computing • Our focus • Parallel computing via GPU
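The idea of multi-stage progressive filtering listed on slide 3 can be sketched as follows: a cheap comparison on a very coarse contour prunes most songs, and a more detailed comparison ranks only the survivors. The two distance measures, the parameter values, and the names `resample`, `mean_abs_dist`, and `progressive_search` are assumptions chosen for illustration, not details taken from the slides.

```python
def resample(seq, n):
    """Linearly interpolate seq to n points (requires n >= 2 and len(seq) >= 2)."""
    out = []
    for i in range(n):
        pos = i * (len(seq) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out

def mean_abs_dist(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def progressive_search(query, songs, keep=2, coarse=4, fine=32):
    """Two-stage search: coarse pruning first, finer ranking of the survivors."""
    # Stage 1: compare 4-point contours and keep only the best `keep` songs.
    q_coarse = resample(query, coarse)
    stage1 = sorted(
        songs, key=lambda name: mean_abs_dist(q_coarse, resample(songs[name], coarse))
    )
    survivors = stage1[:keep]
    # Stage 2: compare 32-point contours, but only for the survivors.
    q_fine = resample(query, fine)
    return sorted(
        survivors, key=lambda name: mean_abs_dist(q_fine, resample(songs[name], fine))
    )

# Toy database of pitch contours in semitones (hypothetical data).
songs = {
    "rise": [0, 2, 4, 5, 7, 9, 11, 12],
    "fall": [12, 11, 9, 7, 5, 4, 2, 0],
    "flat": [5, 5, 5, 5, 5, 5, 5, 5],
    "arch": [0, 4, 7, 12, 12, 7, 4, 0],
}
query = [0, 2, 4, 6, 7, 9, 11, 12]
ranked = progressive_search(query, songs)
```

In a real system the second stage would typically be a costlier method such as dynamic time warping, so the saving comes from running it on a small fraction of the database.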

  4. MIRACLE • MIRACLE • Music Information Retrieval Acoustically via Clustered and paralleL Engines • Database (~13000) • MIDI files • Solo vocals (<100) • Melody extracted from polyphonic music (<100) • Comparison methods • Linear scaling • Dynamic time warping • Top-10 Accuracy • 70~75% • Platform • Single CPU+GPU
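The two comparison methods named on slide 4 can be sketched in a few lines. This is a textbook-style illustration under stated assumptions, not MIRACLE's implementation: the scaling factors, the mean-based key removal, the anchoring of the query at the start of the reference, and the unconstrained DTW step pattern are all choices made here for brevity.

```python
def resample(seq, n):
    """Linearly interpolate seq to n points (requires n >= 2 and len(seq) >= 2)."""
    out = []
    for i in range(n):
        pos = i * (len(seq) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out

def linear_scaling_distance(query, ref, factors=(0.5, 0.75, 1.0, 1.5, 2.0)):
    """Stretch the query by each factor, compare it with the start of ref, keep the best."""
    best = float("inf")
    for f in factors:
        n = int(round(len(query) * f))
        if n < 2 or n > len(ref):
            continue  # skip factors that do not fit this reference
        stretched = resample(query, n)
        segment = ref[:n]
        # Remove the key difference by subtracting each sequence's mean.
        mq = sum(stretched) / n
        mr = sum(segment) / n
        d = sum(abs((a - mq) - (b - mr)) for a, b in zip(stretched, segment)) / n
        best = min(best, d)
    return best

def dtw_distance(a, b):
    """Classic DTW with steps (i-1, j), (i, j-1), (i-1, j-1) and absolute-difference cost."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cur[j] = abs(x - y) + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]

# A 10-frame rising reference, and a query sung twice as fast and 3 semitones higher.
ref = [60, 61, 62, 63, 64, 65, 66, 67, 68, 69]
query = [63.0, 65.25, 67.5, 69.75, 72.0]
ls = linear_scaling_distance(query, ref)

# DTW absorbs local tempo variation: repeated frames align at zero cost.
d_same = dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3])
d_diff = dtw_distance([1, 2, 3], [1, 2, 4])
```

Linear scaling is cheap because each factor needs a single pass over the sequence, while DTW costs time proportional to the product of the two lengths. The DTW sketch does not remove the key difference, so inputs would need to be key-normalized first.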

  5. MIRACLE (II) • References (full list) • J.-S. Roger Jang and Ming-Yang Gao, "A Query-by-Singing System based on Dynamic Programming", International Workshop on Intelligent Systems Resolutions (the 8th Bellman Continuum), pp. 85-89, Hsinchu, Taiwan, Dec 2000. • Jyh-Shing Roger Jang, Jiang-Chun Chen, Ming-Yang Kao, "MIRACLE: A Music Information Retrieval System with Clustered Computing Engines", International Symposium on Music Information Retrieval (ISMIR), 2001. • … • Chung-Che Wang and Jyh-Shing Roger Jang, "Acceleration of Query by Singing/Humming Systems on GPU: Compare from Anywhere", International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012.

  6. MIRACLE Before Oct. 2011 • Client-server distributed computing • Cloud computing via clustered PCs • Database size: ~12,000 • [Diagram: clients (PC, PDA/smartphone, cellular) ↔ master server ↔ clustered slave servers; request: pitch vector; response: search result]
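The master/slave arrangement on slide 6 can be sketched as below: the master splits the database into shards, each slave returns its local best matches for the pitch vector, and the master merges them. Threads stand in for separate machines, and the distance measure, the sharding scheme, and all names (`slave_search`, `master_search`, the song titles) are hypothetical; the slides do not describe the actual protocol.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

def distance(query, melody):
    """Mean absolute difference over the overlapping frames (assumed measure)."""
    n = min(len(query), len(melody))
    return sum(abs(query[i] - melody[i]) for i in range(n)) / n

def slave_search(query, shard, k):
    """One slave: score every song in its shard and return its k best hits."""
    scored = [(distance(query, melody), title) for title, melody in shard]
    return heapq.nsmallest(k, scored)

def master_search(query, database, num_slaves=3, k=2):
    """Master: split the database, query the slaves in parallel, merge the results."""
    shards = [database[i::num_slaves] for i in range(num_slaves)]
    with ThreadPoolExecutor(max_workers=num_slaves) as pool:
        partial = list(pool.map(lambda shard: slave_search(query, shard, k), shards))
    merged = [hit for hits in partial for hit in hits]
    return heapq.nsmallest(k, merged)

# Toy database of (title, pitch vector) pairs, assumed to be key-normalized already.
database = [
    ("song_a", [0, 2, 4, 5]),
    ("song_b", [5, 5, 5, 5]),
    ("song_c", [0, 2, 4, 6]),
    ("song_d", [7, 5, 3, 1]),
    ("song_e", [0, 0, 0, 0]),
]
results = master_search([0, 2, 4, 6], database)
titles = [title for _, title in results]
```

Because each slave returns only its own top k, the merge step at the master stays small no matter how large the shards are.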

  7. Current MIRACLE • Single server with GPU • NVIDIA 560 Ti, 384 cores (speedup factor = 10) • Database size: ~13,000 • [Diagram: clients (PC, PDA/smartphone, cellular) ↔ single master server; request: pitch vector; response: search result]

  8. MIRACLE in the Future • Multi-modal retrieval • Singing, humming, speech, audio, tapping… • [Diagram: clients (PC, PDA/smartphone, cellular) ↔ master server ↔ clustered slave servers; request: feature vector; response: search result]

  9. QBSH for Various Platforms • PC • Web version • Embedded systems • Karaoke machines • Smartphones • iPhone/Android • Toys • 16-bit micro-controller

  10. QBSH Prototype in MATLAB • To create a QBSH prototype in MATLAB • Get familiar with audio processing in MATLAB • See audio signal processing • Try the programming contests on • Pitch tracking • QBSH • Run exampleProgram/goDemo.m to test drive the QBSH prototype in MATLAB!

  11. QBSH Demos • QBSH demos by our lab • QBSH on the web: MIRACLE • QBSH on toys • Existing commercial QBSH systems • www.midomi.com • www.soundhound.com

  12. Returned Results • Typical results of MIRACLE

  13. Online Karaoke • Synchronized lyrics • Real-time pitch display • Calorie consumption • Recording • Real-time score • Live broadcast • Automatic key adjustment

  14. Future Work • Multi-modal music retrieval • Query by user's inputs: singing, humming, whistling, speech, tapping, beatboxing • Query by exact examples: audio clips • Speedup schemes • Repeating pattern identification, DTW indexing • Database preparation • Polyphonic audio music as database: the ultimate challenge!
