
ESTER Project Balamand work



Presentation Transcript


  1. ESTER Project: Balamand work. Rania Bayeh, Chafic Mokbel

  2. Introduction
     • Participation in two tasks:
       • Detection (SES)
       • Transcription (TRS)
     • Tools used:
       • GMM (Spro + Becars for detection)
       • HMM (HCM for transcription)

  3. Detection
     • Feature extraction (using Spro):
       • 20 MFCC including energy + first-order derivatives (40 coefficients)
       • Frame duration 64 ms, frame shift 20 ms
     • 128-Gaussian GMMs (using Becars) for:
       • Female speech, male speech, music and silence
     • Window-based detection (20 frames ~ 450 ms)
     • A simple time-smoothing algorithm:
       • A single window labeled differently from its surrounding windows is merged into them
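The window-based decision and the smoothing rule on this slide can be sketched as follows. This is only an illustration, not the Becars implementation: it assumes per-frame log-likelihoods are already available for each class, that decision windows do not overlap, and that an isolated window is merged only when its two neighbours agree. The function names are hypothetical.

```python
FRAME_MS = 64       # frame duration, from the slide
SHIFT_MS = 20       # frame shift, from the slide
WINDOW_FRAMES = 20  # frames per decision window, from the slide


def window_span_ms(n_frames=WINDOW_FRAMES, frame_ms=FRAME_MS, shift_ms=SHIFT_MS):
    """Time covered by n_frames overlapping frames."""
    return (n_frames - 1) * shift_ms + frame_ms


def classify_windows(frame_scores, window=WINDOW_FRAMES):
    """Label each window with the class whose summed log-likelihood is highest.

    frame_scores: one dict {class_name: log_likelihood} per frame.
    Assumption: windows are consecutive and non-overlapping.
    """
    labels = []
    for start in range(0, len(frame_scores), window):
        totals = {}
        for scores in frame_scores[start:start + window]:
            for cls, loglik in scores.items():
                totals[cls] = totals.get(cls, 0.0) + loglik
        labels.append(max(totals, key=totals.get))
    return labels


def smooth(labels):
    """Merge a lone window into its neighbours when both neighbours agree."""
    out = list(labels)
    for i in range(1, len(labels) - 1):
        if labels[i - 1] == labels[i + 1] != labels[i]:
            out[i] = labels[i - 1]
    return out


if __name__ == "__main__":
    print(window_span_ms())  # 19 * 20 + 64 ms
    print(smooth(["music", "music", "male", "music", "silence"]))
```

With a 64 ms frame and a 20 ms shift, 20 frames span (20 - 1) × 20 + 64 = 444 ms, which matches the "~ 450 ms" quoted on the slide.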

  4. Transcription
     • Acoustic modeling (using HCM, a full toolkit):
       • Feature extraction (using HTK): 13 MFCC + first- and second-order derivatives (39 coefficients)
       • Triphone models:
         • 3-state models with 32 Gaussian pdfs per state
         • Classified using the CART algorithm
         • Trained on the ESTER database
         • Word-boundary models are star type
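To make the 39-coefficient count concrete, the sketch below appends first- and second-order derivatives to 13 static coefficients using a common regression formula for deltas. The slides do not state the regression window or how edges are handled, so the half-width of 2 frames and the repetition of edge frames are assumptions, and the helper names are hypothetical.

```python
def deltas(frames, theta=2):
    """Regression deltas over a window of +/- theta frames.

    d[t] = sum_k k * (c[t+k] - c[t-k]) / (2 * sum_k k^2), k = 1..theta.
    Assumption: the first and last frames are repeated at the edges.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, theta + 1))
    out = []
    for t in range(n):
        row = []
        for j in range(len(frames[t])):
            num = 0.0
            for k in range(1, theta + 1):
                nxt = frames[min(t + k, n - 1)][j]
                prv = frames[max(t - k, 0)][j]
                num += k * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out


def add_derivatives(static):
    """Concatenate static, first-order and second-order coefficients per frame."""
    first = deltas(static)
    second = deltas(first)
    return [s + d + a for s, d, a in zip(static, first, second)]


if __name__ == "__main__":
    # Ten dummy frames of 13 coefficients each, rising linearly in time.
    static = [[float(t)] * 13 for t in range(10)]
    features = add_derivatives(static)
    print(len(features[0]))  # 13 static + 13 first-order + 13 second-order
```

The same bookkeeping gives the detection front end its 40 coefficients: 20 static values plus 20 first-order derivatives.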

  5. Transcription
     • Language modeling:
       • SRILM to build a bigram
       • Obtained bigram compiled to fit HCM
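For readers unfamiliar with bigram models, the sketch below estimates unsmoothed bigram probabilities P(w2 | w1) from raw counts. It is not what SRILM does internally (SRILM applies discounting and backoff), and it says nothing about the format HCM expects; the function name, the sentence markers and the toy sentences are illustrative assumptions.

```python
from collections import Counter


def train_bigram(sentences):
    """Maximum-likelihood bigram estimates: count(w1, w2) / count(w1)."""
    history = Counter()
    pairs = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        history.update(tokens[:-1])                # every token that has a successor
        pairs.update(zip(tokens[:-1], tokens[1:]))
    return {pair: count / history[pair[0]] for pair, count in pairs.items()}


if __name__ == "__main__":
    model = train_bigram(["le chat dort", "le chien dort"])
    print(model[("le", "chat")])     # "le" is followed by "chat" in one of two cases
    print(model[("dort", "</s>")])   # "dort" always ends the sentence here
```

An unsmoothed model like this assigns zero probability to any word pair absent from the training text, which is why toolkits such as SRILM add discounting and backoff.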

  6. Transcription
     • Comments:
       • HCM training is real-time
       • Decoding problems and a high error rate, which is why no results were submitted. The problems were:
         • No silence model included (error in the HCM scripts)
         • Error in the phoneme attributes provided to the CART algorithm: several phonemes are confused (grouped) together
       • Retraining is in progress and results will be submitted soon
