Adapted representations of audio signals for music instrument recognition
Pierre Leveau
Laboratoire d’Acoustique Musicale, Paris - France
GET - ENST (Télécom Paris), France

Summary
• Master Thesis: Music instrument recognition on solo performances with signal segmentation (transient part / release part)
• Ph. D. Thesis: Structured and sparse decompositions: application to audio indexing
Pierre Leveau - ENST - LAM

Music instrument recognition on solo performances with signal segmentation (transient part / release part)
Music Instrument Recognition
• Basic Scheme
Training: training DB (manually indexed) → feature extraction → classification model
Recognition: file to analyze → feature extraction → comparison to the model → decision
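
The basic scheme can be illustrated with a minimal sketch. The slides do not say which classification model is used, so the nearest-centroid model, the per-frame majority vote, and the function names `train_centroids` and `classify` are all assumptions chosen for brevity.

```python
import numpy as np

def train_centroids(features, labels):
    """Training stage: one mean feature vector per instrument class.

    Nearest-centroid is an assumed stand-in for the unspecified
    classification model of the slides.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(frame_features, centroids):
    """Recognition stage: compare each frame to the model, then vote."""
    names = list(centroids)
    model = np.stack([centroids[c] for c in names])
    frames = np.asarray(frame_features, dtype=float)
    # Euclidean distance from every frame to every class centroid.
    dist = np.linalg.norm(frames[:, None, :] - model[None, :, :], axis=2)
    votes = np.argmin(dist, axis=1)
    # The decision is the class that wins the most frames.
    counts = np.bincount(votes, minlength=len(names))
    return names[int(np.argmax(counts))]
```

A file is then labeled by calling `classify` on the features of its frames with the centroids learned from the manually indexed database.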

Feature Extraction
• Feature extraction on analysis frames of fixed size (30 ms)
[Figure: analysis frames feeding feature extraction]
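
Fixed-size framing can be sketched as follows. Only the 30 ms frame size comes from the slide; the hop size (non-overlapping by default) and the two example features (RMS energy and zero-crossing rate) are assumptions, since the slides do not list the features actually used.

```python
import numpy as np

def frame_signal(x, sample_rate, frame_ms=30.0, hop_ms=None):
    """Split a 1-D signal into fixed-size analysis frames.

    frame_ms=30 follows the slide; hop_ms is an assumption and
    defaults to the frame size (no overlap).
    """
    x = np.asarray(x, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = frame_len if hop_ms is None else int(round(sample_rate * hop_ms / 1000.0))
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame and hop sizes must be at least one sample")
    if len(x) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])

def frame_features(frames):
    """Two illustrative features per frame: RMS energy, zero-crossing rate."""
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return np.column_stack([rms, zcr])
```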

Music Note Scheme
Ex: strong attack instrument
[Figure: energy versus time for a single note]

Interest of transients for Music Instrument Recognition
[Figure: examples for piano, trumpet, flute, cello]

Chosen Method
• Signal segmentation into transient part / release part
• Approximation: fixed-length transients
• Need for an automatic onset detection algorithm
• Study of solo performances

Onset Detection
• Detection function (ex: high frequency content, spectral difference, phase deviation…)
• Peak-picking
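
The two stages can be sketched with one of the detection functions listed above, the spectral difference, followed by a simple peak-picker. Frame size, hop size, the half-wave rectification, and the fixed relative threshold are assumptions; the slides do not give these details.

```python
import numpy as np

def spectral_difference(x, frame_len=512, hop=256):
    """Detection function: summed squared rise of the magnitude spectrum."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len + hop:
        raise ValueError("signal too short for two analysis frames")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    mags = np.array([
        np.abs(np.fft.rfft(window * x[i * hop:i * hop + frame_len]))
        for i in range(n_frames)
    ])
    # Keep only increases in magnitude (assumed half-wave rectification).
    rise = np.maximum(np.diff(mags, axis=0), 0.0)
    # The first frame has no predecessor, so its value is set to 0.
    return np.concatenate([[0.0], np.sum(rise ** 2, axis=1)])

def pick_peaks(d, threshold_ratio=0.3):
    """Peak-picking: local maxima above a fraction of the global maximum."""
    d = np.asarray(d, dtype=float)
    if d.size < 3 or d.max() <= 0:
        return []
    thr = threshold_ratio * d.max()
    return [i for i in range(1, len(d) - 1)
            if d[i] > thr and d[i] > d[i - 1] and d[i] >= d[i + 1]]
```

Each returned index is a frame number; multiplying it by the hop size and dividing by the sample rate gives an approximate onset time.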

Evaluation of Onset Detection
• Need for a reference onset database
• ROC curves: good detections (%) versus false alarms (%)
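
One point of such a ROC curve can be computed by matching detected onsets against the reference onsets. The slides do not define the two rates, so the definitions below (good detections relative to the number of reference onsets, false alarms relative to the number of detections) and the 50 ms default tolerance are assumptions.

```python
def evaluate_onsets(detected, reference, tolerance=0.05):
    """Return (good_detection_pct, false_alarm_pct) for onset times in seconds.

    A reference onset counts as detected if an unused detection lies
    within +/- tolerance of it; each detection is used at most once.
    """
    detected = sorted(detected)
    reference = sorted(reference)
    used = [False] * len(detected)
    hits = 0
    for ref in reference:
        best = None
        for i, det in enumerate(detected):
            if used[i] or abs(det - ref) > tolerance:
                continue
            if best is None or abs(det - ref) < abs(detected[best] - ref):
                best = i
        if best is not None:
            used[best] = True
            hits += 1
    good = 100.0 * hits / len(reference) if reference else 0.0
    false_alarms = 100.0 * (len(detected) - hits) / len(detected) if detected else 0.0
    return good, false_alarms
```

Sweeping the peak-picking threshold and calling `evaluate_onsets` at each setting traces out the curve.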

Sound Onset Labeling
Reference Onset and Sound Databases
[Figure: labeling tool showing the spectrogram and the signal plot]
• Sound listening and label positioning

Onset Database
• Annotation precision depends on the file type
• Detection function evaluation must take it into account

Annotation precision: examples
[Figure: examples for trumpet and cello]

Developed Detection Function
• Complex Spectral Difference
• Delta Complex Spectral Difference
[Figure: examples for guitar and violin]
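
For orientation, here is a sketch of the complex spectral difference as it is commonly defined in the onset detection literature: each bin is predicted from the previous frame's magnitude and a linearly extrapolated phase, and the detection function sums the distance between the observed and predicted bins. This is an assumed textbook form, not necessarily the exact function used in this work, and the delta variant is not reproduced because its definition is not given here. Frame and hop sizes are also assumptions.

```python
import numpy as np

def complex_spectral_difference(x, frame_len=512, hop=256):
    """Complex-domain detection function (common textbook form)."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal shorter than one analysis frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    spec = np.array([
        np.fft.rfft(window * x[i * hop:i * hop + frame_len])
        for i in range(n_frames)
    ])
    mag = np.abs(spec)
    phase = np.angle(spec)
    d = np.zeros(n_frames)
    for n in range(2, n_frames):
        # Prediction: previous magnitude, phase advanced by the last phase step.
        predicted = mag[n - 1] * np.exp(1j * (2.0 * phase[n - 1] - phase[n - 2]))
        d[n] = np.sum(np.abs(spec[n] - predicted))
    return d
```

On a stationary sinusoid the prediction matches the observation and the function stays near zero; it rises when either magnitude or phase departs from the prediction, which is what makes it sensitive to both hard and soft onsets.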

Detection Function comparison
[Figure: ROC curves for two tolerance windows, TROC = Topt and TROC = 100 ms]

Signal segmentation
[Figure: analysis frames labeled T (transient) or R (release): R R T R T R T T]
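
Given detected onsets and the fixed-length transient approximation, labeling the analysis frames can be sketched as below. All quantities are in samples; the rule that a frame is a transient frame when its start falls inside a transient region is an assumption, as is the function name `label_frames`.

```python
def label_frames(n_frames, hop, onsets, transient_len):
    """Label each analysis frame 'T' (transient) or 'R' (release).

    hop, onsets and transient_len are in samples. Frame i starts at
    i * hop and is 'T' if that start lies in [onset, onset + transient_len)
    for some onset (assumed rule).
    """
    labels = []
    for i in range(n_frames):
        start = i * hop
        in_transient = any(on <= start < on + transient_len for on in onsets)
        labels.append("T" if in_transient else "R")
    return labels
```

Features from the 'T' frames and the 'R' frames can then be sent to separate models.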

Music Instrument recognition on transients - Results
• Music instrument recognition only on transients implies:
  • a large decrease of the learning database size
  • for a fixed duration of the test signal, less data to make a decision
• Results are worse than for recognition on all frames

Perspectives
• Increase the onset database size for a more robust evaluation
• Improve the robustness of the onset detection algorithm
• Merge decisions on transients and steady part, compare to the classical static recognition
• Select features adapted for each part of the notes

Ph. D. Thesis
Subject: Sparse and structured decompositions: application to audio indexing
Under the supervision of Gaël Richard (GET - ENST, Paris) and Laurent Daudet (Laboratoire d’Acoustique Musicale, Paris)

Sparse Representations
• Classical representations: orthogonal transform (ex: Fourier Transform, STFT, MDCT, Wavelet Transform…)
• Redundant representations: redundant dictionary
• Sparse representations: only N terms
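
The three kinds of representation can be written in a generic form. The notation below (basis vectors e_k, dictionary atoms g_m, coefficients α_m, signal length K, dictionary size M) is assumed for illustration and is not taken from the original slide.

```latex
% Orthogonal basis \{e_k\}_{k=1}^{K} of a K-sample signal x: unique expansion
x = \sum_{k=1}^{K} \langle x, e_k \rangle \, e_k

% Redundant dictionary \mathcal{D} = \{g_m\}_{m=1}^{M} with M > K: expansion not unique
x = \sum_{m=1}^{M} \alpha_m \, g_m

% Sparse approximation using only N atoms, with N \ll K
x \approx \sum_{n=1}^{N} \alpha_{m_n} \, g_{m_n}
```

Because the redundant expansion is not unique, an algorithm is needed to choose which N atoms to keep.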

Dictionary Example
• C: MDCT basis (useful to represent tonal parts of signals)
• W: DWT basis (useful to represent transient parts of signals)

Algorithms
• Matching Pursuit (and its variants):
  • Greedy algorithms
  • Based on an iterative search
  • A faster algorithm requires a suboptimal search
• Molecular Matching Pursuit:
  • Gives structured, perceptually relevant organizations of the atoms (by grouping significant coefficients)
  • Faster than standard MP
  • Fast-varying frequencies (ex: vibrato) cannot be efficiently represented
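
The greedy iterative search of standard Matching Pursuit can be sketched in a few lines. This is the plain algorithm with an exhaustive search over a generic dictionary of unit-norm atoms, not the molecular variant; the function name and the row-per-atom layout are assumptions.

```python
import numpy as np

def matching_pursuit(x, dictionary, n_atoms):
    """Plain Matching Pursuit.

    dictionary: array of shape (n_dictionary_atoms, signal_len) whose
    rows are assumed to have unit norm.
    Returns the list of (atom_index, coefficient) and the final residual.
    """
    residual = np.asarray(x, dtype=float).copy()
    atoms = np.asarray(dictionary, dtype=float)
    selected = []
    for _ in range(n_atoms):
        # Correlate the residual with every atom (the exhaustive search).
        corr = atoms @ residual
        k = int(np.argmax(np.abs(corr)))
        selected.append((k, float(corr[k])))
        # Remove the contribution of the best atom and iterate.
        residual = residual - corr[k] * atoms[k]
    return selected, residual
```

The correlation step dominates the cost, which is why faster variants restrict or approximate the search.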

Application to music instrument recognition
Classical Music Instrument Recognition: signal → feature extraction → features → comparison to statistical models → decision
Music Instrument Recognition with sparse decomposition: signal → MMP → structured representation → feature extraction (which features?) → features → comparison to statistical models (which models?) → decision

To be continued… Thank you for your attention.