A MISSING FEATURE APPROACH TO INSTRUMENT IDENTIFICATION IN POLYPHONIC MUSIC
Jana Eggink and Guy J. Brown
University of Sheffield
Automatic Music Transcription
• input: audio recording
• output: score or other symbolic representation
• needed (for every note; see the sketch after this list):
  • pitch
  • start and duration
  • instrument
  • extras: key (C major), meter (4/4), bars, loudness, expression...
• useful for:
  • musicologists
  • musicians
  • music information retrieval
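As a concrete illustration of the per-note information listed above, here is a minimal data sketch in Python; the type and field names (NoteEvent, pitch, onset, ...) are hypothetical and not taken from the original system.

```python
# A minimal sketch of the per-note information a transcription needs.
# All names here are illustrative assumptions, not the authors' design.
from dataclasses import dataclass

@dataclass
class NoteEvent:
    pitch: float      # fundamental frequency in Hz (or a MIDI note number)
    onset: float      # start time in seconds
    duration: float   # length in seconds
    instrument: str   # e.g. "oboe", "violin"

# A tiny two-note "score" as a list of note events.
score = [
    NoteEvent(pitch=440.0, onset=0.0, duration=0.5, instrument="flute"),
    NoteEvent(pitch=415.3, onset=0.5, duration=1.0, instrument="clarinet"),
]
```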
Instrument Identification
possible clues:
• method of excitation (hitting, blowing, plucked or bowed strings) causes:
  • noise during the onset
  • delayed entry of individual partials during the onset
  • spectral fluctuations during the steady state
• resonance properties of the instrument body mostly affect the steady state:
  • energy distribution among high and low partials (see the sketch after this list)
  • formant regions
  • spectral bandwidth
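To make one of these steady-state clues concrete, here is a minimal sketch of how energy distribution among low and high partials could be measured; the function name, the split point (first 5 partials), and the amplitude values are illustrative assumptions.

```python
# A minimal sketch of one steady-state clue: the share of energy carried
# by the low partials versus the high ones. Inputs are hypothetical
# partial amplitudes, not measurements from the paper.
import numpy as np

def low_partial_energy_ratio(partial_amplitudes, n_low=5):
    """Ratio of energy in the lowest n_low partials to the total energy."""
    energy = np.square(np.asarray(partial_amplitudes, dtype=float))
    return energy[:n_low].sum() / energy.sum()

# A tone with strong upper partials (lower ratio) vs. a tone dominated
# by its lowest partials (higher ratio); amplitudes are made up.
print(low_partial_energy_ratio([1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]))
print(low_partial_energy_ratio([1.0, 0.3, 0.1, 0.05, 0.02]))
```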
Example Spectrograms
[figure: spectrograms of an oboe tone and a cello tone]
Human Instrument Identification
• listeners use different clues from the onset and the steady state; an individual clue such as the static spectrum can be enough to identify some, but not all, instruments
• the onset seems most relevant for discriminating instrument families
• performance is better on musical phrases than on single tones
• experts are better than non-experts
Computer Instrument Identification

JC Brown et al. (2001):
• GMM classifier
• frame-based cepstral coefficients
• 4 woodwinds (flute, clarinet, oboe, saxophone)
• realistic, monophonic phrases
• computer: 60% correct on average, 80% with the best parameter choice
• humans: 85%

KD Martin (1999):
• hierarchical classification scheme
• different features, both temporal and spectral
• 27 different instruments
• realistic, monophonic phrases and single notes
• computer: 48% instrument correct, 75% instrument family
• humans: 57% instrument correct, 95% instrument family
Polyphonic

Kashino & Murase (1999):
• time-domain approach
• example waveforms stored for each note of each instrument
• best match found using adaptive filtering techniques
• iterative subtraction scheme
• 3 instruments: flute, violin, piano
• specially made recording
• F0s and onset times supplied
• 68% correct (max. polyphony 3)

Kinoshita et al. (1999):
• frequency-domain approach
• features measuring temporal variation at the onset and spectral energy distribution
• colliding partials are identified and the corresponding feature values are (mostly) ignored
• 3 instruments: clarinet, violin, piano
• random chord combinations made from 2 isolated tones
• 70% correct (78% if correct F0s were supplied)
Our System
• the missing feature approach works for speech recognition in the presence of noise
• GMMs trained with spectral features perform well on realistic monophonic music
• GMMs have also been used in combination with a missing feature approach for speaker identification in noise
⇒ use a GMM classifier in combination with a missing feature approach for instrument recognition in realistic, polyphonic music
F0-Analysis
• iterative approach based on harmonic sieves (Scheffers, 1983): each candidate F0 defines a sieve of harmonic positions, and the best fitting sieve determines the F0
[figure: a badly fitting sieve vs. the best fitting sieve, which determines the F0]
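Here is a minimal sketch of the harmonic-sieve idea for a single F0; the scoring rule (matched peak amplitude minus a penalty for empty sieve slots), the tolerance, and the toy peak list are illustrative assumptions, and the paper's iterative multi-F0 procedure is not reproduced.

```python
# A minimal sketch of harmonic-sieve F0 estimation (after Scheffers, 1983):
# each candidate F0 places a sieve at its harmonic positions; the sieve
# that captures the most spectral-peak amplitude wins. Penalizing empty
# sieve slots keeps subharmonics (e.g. F0/2) from tying with the true F0.
import numpy as np

def sieve_score(f0, peak_freqs, peak_amps, tol=0.03, miss_penalty=0.5):
    """Amplitude captured by the harmonic sieve of f0, minus a penalty
    for every sieve slot (within the peak range) that finds no partial."""
    score, h = 0.0, 1
    while h * f0 <= peak_freqs.max() * (1 + tol):
        target = h * f0
        dist = np.abs(peak_freqs - target) / target  # relative distance
        if dist.min() < tol:
            score += peak_amps[dist.argmin()]
        else:
            score -= miss_penalty
        h += 1
    return score

def estimate_f0(peak_freqs, peak_amps, candidates):
    scores = [sieve_score(f0, peak_freqs, peak_amps) for f0 in candidates]
    return candidates[int(np.argmax(scores))]

# Peaks of an idealized 220 Hz tone; 110 Hz is a "bad fitting sieve"
# because its odd harmonics find no support.
freqs = np.array([220.0, 440.0, 660.0, 880.0])
amps = np.array([1.0, 0.8, 0.6, 0.4])
print(estimate_f0(freqs, amps, candidates=[110.0, 220.0, 440.0]))  # -> 220.0
```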
Missing Feature Estimation
• finding reliable and unreliable features is one of the main problems
• instrument tones have an approximately harmonic overtone series
• based on the extracted F0s, all frequency regions where a partial from a non-target tone is found are marked as unreliable and excluded from the recognition process (see the sketch below)
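A minimal sketch of this masking step follows; the function name, the 60 Hz band width, and the number of harmonics considered per interfering tone are illustrative assumptions.

```python
# A minimal sketch of reliability-mask estimation: given the F0s of the
# interfering (non-target) tones, mark every frequency band that contains
# one of their partials as unreliable.
import numpy as np

def reliability_mask(band_centers, interfering_f0s,
                     bandwidth=60.0, n_harmonics=20):
    """Boolean mask: True = reliable band, False = band overlapped by a
    partial of a non-target tone."""
    mask = np.ones(len(band_centers), dtype=bool)
    for f0 in interfering_f0s:
        for h in range(1, n_harmonics + 1):
            overlapped = np.abs(band_centers - h * f0) < bandwidth / 2
            mask[overlapped] = False
    return mask

bands = np.arange(30.0, 6000.0, 60.0)   # 60 Hz-wide, linearly spaced bands
mask = reliability_mask(bands, interfering_f0s=[415.3])
print(mask.sum(), "of", len(bands), "bands remain reliable")
```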
Features
• local spectral features are required for the missing feature approach
• frame-based (exact onset detection is hard in polyphonic music)
• energy in narrow frequency bands (60 Hz wide)
• linear spacing, corresponding to the linear spacing of partials
(a feature-extraction sketch follows below)
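Here is a minimal sketch of extracting such frame-based band energies from a short-time spectrum; the frame length, sample rate, windowing, and log compression are illustrative assumptions, not the paper's exact settings.

```python
# A minimal sketch of frame-based spectral features: the energy of one
# windowed frame summed in linearly spaced 60 Hz-wide frequency bands.
import numpy as np

def band_energies(frame, sr, bandwidth=60.0):
    """Energy per linearly spaced frequency band for one frame."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    edges = np.arange(0.0, sr / 2, bandwidth)  # lower edge of each band
    return np.array([spectrum[(freqs >= lo) & (freqs < lo + bandwidth)].sum()
                     for lo in edges])

sr = 16000
t = np.arange(2048) / sr
frame = np.sin(2 * np.pi * 440 * t)                   # synthetic 440 Hz tone
features = np.log(band_energies(frame, sr) + 1e-12)   # log band energies
print(features.shape)
```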
Example Features with Mask
[figure: band-energy features for the target tone (violin D), the non-target tone (oboe G sharp), and their mixture, each shown with the reliability mask applied]
GMMs
• approximate a distribution by a combination of individual Gaussians
[figure: a 2-dimensional distribution modeled by a GMM consisting of 3 individual Gaussians]
• means and covariances trained by the EM algorithm
(a training sketch follows below)
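As a concrete example of EM training, here is a minimal sketch using scikit-learn's GaussianMixture on synthetic data; the library choice, diagonal covariances, component count, and data are our assumptions, not the authors' exact training setup.

```python
# A minimal sketch of fitting a GMM by EM with scikit-learn. Diagonal
# covariances match the feature-independence assumption used on the
# next slide; the 2-D training data is synthetic.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic 2-D feature vectors drawn from three clusters.
data = np.vstack([rng.normal(loc=m, scale=0.5, size=(200, 2))
                  for m in ([0, 0], [3, 1], [1, 4])])

gmm = GaussianMixture(n_components=3, covariance_type="diag", random_state=0)
gmm.fit(data)                    # means and covariances trained by EM
print(gmm.means_.round(1))       # recovered cluster means
```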
GMMs with Missing Features

The probability density function (pdf) of an observed D-dimensional spectral feature vector $\mathbf{x}$ is modeled as:

$$p(\mathbf{x}) = \sum_{i=1}^{N} p_i \, \Phi_i(\mathbf{x}; \boldsymbol{\mu}_i, \Sigma_i)$$

Assuming feature independence, this can be rewritten as:

$$p(\mathbf{x}) = \sum_{i=1}^{N} p_i \prod_{j=1}^{D} \Phi(x_j; \mu_{ij}, \sigma_{ij}^2)$$

Approximating the pdf from reliable data only leads to:

$$p(\mathbf{x}) \approx \sum_{i=1}^{N} p_i \prod_{j \in M'} \Phi(x_j; \mu_{ij}, \sigma_{ij}^2)$$

where $N$ = number of Gaussians in the mixture model, $p_i$ = mixture weight, $\Phi_i$ = Gaussian with mean vector $\boldsymbol{\mu}_i$ and covariance matrix $\Sigma_i$, $\mu_{ij}$ and $\sigma_{ij}^2$ = mean and variance of feature $j$ in component $i$, and $M'$ = subset of reliable features in mask $M$.
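A minimal sketch of the last formula follows: with diagonal covariances the per-component density factorizes over dimensions, so the product can simply be restricted to the reliable dimensions in $M'$. All parameter values are illustrative.

```python
# A minimal sketch of the missing-feature GMM likelihood: evaluate the
# log-density using only the dimensions marked reliable in the mask.
import numpy as np

def log_gauss_1d(x, mean, var):
    """Log of a univariate Gaussian density."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def mf_log_likelihood(x, mask, weights, means, variances):
    """log p(x) over reliable dimensions only.

    weights: (N,); means, variances: (N, D); x, mask: (D,)
    """
    per_dim = log_gauss_1d(x[None, :], means, variances)   # (N, D)
    per_comp = (per_dim * mask[None, :]).sum(axis=1)       # reliable dims only
    return np.logaddexp.reduce(np.log(weights) + per_comp)

# Two-component, three-dimensional toy model.
weights = np.array([0.6, 0.4])
means = np.array([[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]])
variances = np.ones((2, 3))
x = np.array([0.1, 5.0, -0.2])           # dimension 1 is corrupted
mask = np.array([True, False, True])     # ...so it is excluded
print(mf_log_likelihood(x, mask, weights, means, variances))
```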
Results: Monophonic
• GMMs trained for 5 instruments: flute, clarinet, oboe, violin, cello
• realistic monophonic phrases (3-4 per instrument): 83% correct
• single notes: 66% instrument correct, 85% instrument family correct
Random 2-Tone Chords
• correct F0s were provided
• 49% instrument correct, 72% instrument family correct
Realistic Duet Recording
• duet for flute and clarinet in A by H. Villa-Lobos
• F0s extracted by the system
[figure: system output, fundamental frequency (Hz) over time (frames), compared with the original score]
F0s according to the score, in Hz:
• flute: 415 - 415 - 415 - 622 - 622
• clarinet in A: 208 - 185 - 175 - 277 - 294 - 247 - 220 - 208
Conclusions
• looks promising for small ensembles
• works with realistic stimuli

Future Work
• include temporal information
• idea: one HMM for every instrument tone
• combined either with a missing feature approach comparable to the one used here, or with spectral subtraction based on templates