Birdsong Recognition 鳥類鳴聲辨識

Chang-Hsing Lee (李建興), Professor, Department of Computer Science and Information Engineering, Chung Hua University

Automatic Classification of Bird Species From Their Sounds Using Two-Dimensional Cepstral Coefficients. Chang-Hsing Lee, Chin-Chuan Han, and Ching-Chien Chuang. IEEE Trans. on Audio, Speech, and Language Processing, Vol. 16, No. 8, Nov. 2008, pp. 1541-1550.

System Framework • Training: training syllable → feature extraction → PCA → prototype vectors generation → LDA → feature database • Testing: test syllable → feature extraction → PCA transformation → LDA transformation → classification → classified bird species

Feature Extraction • Two-dimensional Mel-frequency cepstral coefficient (TDMFCC) [Figure: MFCC sequence over time → DCT → TDMFCC]
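A minimal sketch of the idea on this slide, assuming the TDMFCC matrix is obtained by applying a DCT along the time axis to each MFCC trajectory of a syllable. The function names (`dct2`, `tdmfcc`), the unscaled DCT-II normalization, and the toy MFCC values are assumptions for illustration, not taken from the paper.

```python
import math

def dct2(x):
    """Unscaled DCT-II of a 1-D sequence (normalization is an assumption)."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * k * (t + 0.5) / n) for t in range(n))
            for k in range(n)]

def tdmfcc(mfcc_frames, n_time_coeffs):
    """mfcc_frames: list of frames, each a list of MFCCs (time x cepstral index).
    Returns a matrix [cepstral index][time-DCT index], keeping only the first
    n_time_coeffs coefficients along the time axis."""
    n_ceps = len(mfcc_frames[0])
    features = []
    for q in range(n_ceps):
        trajectory = [frame[q] for frame in mfcc_frames]  # q-th MFCC over time
        features.append(dct2(trajectory)[:n_time_coeffs])
    return features

# Toy syllable: 4 frames, 2 MFCCs per frame (made-up values).
frames = [[1.0, 0.5], [1.0, -0.5], [1.0, 0.5], [1.0, -0.5]]
features = tdmfcc(frames, 2)
```

Keeping only the low-order time coefficients gives a fixed-size descriptor regardless of syllable length, which is the practical appeal of a two-dimensional cepstrum.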

Feature Extraction (cont.) • Dynamic Two-dimensional MFCC (DTDMFCC)

Prototype Vector Generation • Gaussian mixture model (GMM) vs. Vector quantization (VQ) • Acoustic Model Selection – Bayesian information criterion (BIC) • Component Number Selection – self-splitting Gaussian mixture learning (SGML)
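A hedged sketch of BIC-based acoustic model selection, assuming the common form BIC = -2 ln L + k ln n with lower values preferred; the paper may use a different sign convention. The helper names and the log-likelihoods and parameter counts below are made up for illustration.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion, lower-is-better convention (assumed)."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)

def select_model(candidates, n_samples):
    """candidates: dict mapping model name -> (log_likelihood, n_params).
    Returns the name of the candidate with the lowest BIC."""
    return min(candidates, key=lambda name: bic(*candidates[name], n_samples))

# Hypothetical fits of two candidate models to one species' training syllables.
candidates = {"VQ": (-1200.0, 40), "GMM": (-1150.0, 80)}
best = select_model(candidates, n_samples=500)
```

In this toy case the GMM fits slightly better but pays a larger complexity penalty, so the simpler model is selected.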

Experimental Results • 28 bird species • Training set – 3143 syllables, from the Yushan National Park CDs "Sound of the Mountain IV: The songs of Wild Birds" and "Sound of the Mountain V: The songs of Wild Birds" • Test set – 646 syllables, downloaded from the website of National Fonghuanggu Bird Park

Experimental Results (cont.) Comparison of classification results for different PCA thresholds
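A small sketch of how a PCA threshold can fix the number of retained components, assuming the threshold is the fraction of total variance to keep (the slides do not define it). The function name and the eigenvalues are hypothetical.

```python
def components_for_threshold(eigenvalues, threshold):
    """Smallest number of leading principal components whose cumulative
    variance ratio reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(ordered)

# Made-up covariance eigenvalues; 0.97 is the threshold quoted in the slides.
eigenvalues = [5.0, 3.0, 1.5, 0.4, 0.1]
k = components_for_threshold(eigenvalues, 0.97)
```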

Experimental Results (cont.) SUMMARIZATION OF CLASSIFICATION ACCURACY (CA), SELECTED MODEL (EVQ OR GMM), THE CLUSTER NUMBER (NS) FOR EACH BIRD SPECIES USING SDTDMFCC WHEN PCA THRESHOLD = 0.97

Experimental Results (cont.) SUMMARIZATION OF CLASSIFICATION ACCURACY (CA), SELECTED MODEL (EVQ OR GMM), THE CLUSTER NUMBER (NS) FOR EACH BIRD SPECIES USING SDTDMFCC WHEN PCA THRESHOLD = 0.97 (cont.)

Continuous Birdsong Recognition Using Gaussian Mixture Modeling of Image Shape Features. Chang-Hsing Lee, Sheng-Bin Hsu, Jau-Ling Shih, and Chih-Hsun Chou. IEEE Trans. on Multimedia, Vol. 15, No. 2, Feb. 2013, pp. 454-463.

System Framework

Feature Extraction • Angular Radial Transform (ART) Feature

Feature Extraction (cont.) • Step 1: Spectrogram Generation [Figure: waveform, with a zoomed-in view showing overlapping frames]

Feature Extraction (cont.) • Step 1: Spectrogram Generation (cont.) [Figure: frame decomposition followed by spectrum analysis of each frame; axis: frequency]
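A minimal sketch of Step 1, assuming the usual short-time analysis: split the signal into overlapping frames, window each frame, and take the magnitude of its DFT. The frame length, hop size, Hamming window, and test tone are assumptions for illustration; the slides do not give the parameters used.

```python
import cmath
import math

def split_frames(signal, frame_len, hop):
    """Overlapping frame decomposition (frames overlap when hop < frame_len)."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]

def magnitude_spectrum(frame):
    """Hamming-windowed DFT magnitudes for bins 0 .. n/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def spectrogram(signal, frame_len=16, hop=8):
    """Rows are time frames, columns are frequency bins."""
    return [magnitude_spectrum(f) for f in split_frames(signal, frame_len, hop)]

# Toy signal: a pure tone with 4 cycles per 16-sample frame.
signal = [math.sin(2 * math.pi * 4 * t / 16) for t in range(64)]
spec = spectrogram(signal)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

A direct DFT is used here only to keep the sketch dependency-free; a real system would use an FFT.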

Feature Extraction (cont.) • Step 1: Spectrogram Generation (cont.) [Figure: waveform and its spectrogram]

Feature Extraction (cont.) • Step 1: Spectrogram Generation (cont.) Example spectrograms: 火冠戴菊鳥 (Taiwan Firecrest), 白耳畫眉 (Taiwan Sibia), 黃腹琉璃 (Vivid Niltava), 鳳頭蒼鷹 (Crested Goshawk)

Feature Extraction (cont.) • Step 2: Recognition window segmentation

Feature Extraction (cont.) • Step 3: Sector image generation

Feature Extraction (cont.) • Step 3: Sector image generation (cont.)

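Feature Extraction (cont.) • Step 4: ART feature extraction • $V_{nm}(\rho, \theta)$: the ART basis function of order n and m, which is separable along the angular and radial directions: $V_{nm}(\rho, \theta) = A_m(\theta) R_n(\rho)$ • where $A_m(\theta) = \frac{1}{2\pi} e^{jm\theta}$ and $R_n(\rho) = 1$ for $n = 0$, $R_n(\rho) = 2\cos(\pi n \rho)$ for $n \neq 0$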
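A hedged sketch of Step 4, assuming the standard MPEG-7 definition of the ART basis, V_nm(ρ, θ) = (1/2π) e^{jmθ} R_n(ρ) with R_0 = 1 and R_n = 2 cos(πnρ) for n ≠ 0, and an ART coefficient F_nm equal to the integral of V*_nm(ρ, θ) f(ρ, θ) over the unit disk. The pixel-to-disk mapping and the function names are assumptions for illustration, not the authors' implementation.

```python
import cmath
import math

def art_basis(n, m, rho, theta):
    """ART basis function, separable into radial and angular parts."""
    radial = 1.0 if n == 0 else 2.0 * math.cos(math.pi * n * rho)
    angular = cmath.exp(1j * m * theta) / (2.0 * math.pi)
    return radial * angular

def art_coefficient(image, n, m):
    """Approximate F_nm for a square image (list of rows).
    Pixel centers are mapped onto [-1, 1]^2 and only pixels inside the unit
    disk contribute; each adds conj(V_nm) * f * pixel_area, since the area
    element dx dy equals rho d(rho) d(theta)."""
    size = len(image)
    cell = 2.0 / size
    total = 0j
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            x = -1.0 + (c + 0.5) * cell
            y = -1.0 + (r + 0.5) * cell
            rho = math.hypot(x, y)
            if rho <= 1.0:
                theta = math.atan2(y, x)
                total += art_basis(n, m, rho, theta).conjugate() * value * cell * cell
    return total

# Uniform 8x8 test image: only the m = 0 terms should be non-zero.
ones = [[1.0] * 8 for _ in range(8)]
f00 = art_coefficient(ones, 0, 0)
f01 = art_coefficient(ones, 0, 1)
```

In MPEG-7 practice the magnitudes |F_nm| serve as the shape descriptor, because rotating the image changes only the phase of each coefficient; whether the paper normalizes them the same way is not stated on the slides.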

Feature Extraction (cont.) • Step 4: ART feature extraction (cont.) The 12×12 (N = 12 and M = 12) complex ART basis functions: (a) real parts of ART basis functions; (b) imaginary parts of ART basis functions

Feature Extraction (cont.) • Step 4: ART feature extraction (cont.)

Experimental Results COMMON AND LATIN NAME OF BIRD SPECIES IN THE BIRDSONG DATABASE AND THE NUMBER OF BIRDSONG SEGMENTS IN THE TRAINING SET (NTr) AND TEST SET (NTe) FOR BIRDSONG SEGMENTS OF DIFFERENT DURATIONS (D)

Experimental Results (cont.)

Experimental Results (cont.) Comparison of classification accuracy for different numbers of GMM Gaussian components (G) and distinct PCA thresholds using 6×24 ART basis functions for the recognition of birdsong segments having distinct durations (D)

Experimental Results (cont.) Comparison of classification accuracy for distinct numbers of ART basis functions (N×M) for the classification of birdsong segments having different durations (D) with a fixed number of GMM components (G = 5)

Experimental Results (cont.) Comparison of various feature descriptors in terms of classification accuracy (CA)

Thanks!
