
ASAT Project


ellard

Presentation Transcript


  1. ASAT Project
     • Two main research thrusts
       • Feature extraction
       • Evidence combiner
     • Feature extraction
       • The classical distinctive features are well explored, but not solved.
       • Many other waveform features and events can be extracted, reflecting time properties, spectral properties, various vocal tract model parameters, glottal features, prosodic events, and combinations thereof.
       • Features may at first glance have little relevance to articulatory gestures (modulation products, etc.).
       • Successful feature sets can then be subject to perceptual interpretation.
       • This approach was successfully implemented in a thesis by Necioglu for speaker characterization.
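As a rough illustration of the "time properties" mentioned above, the sketch below computes two standard frame-level waveform features, short-time energy and zero-crossing rate. The function names, frame length, hop size, and synthetic test tone are assumptions made for this example; they are not taken from the ASAT project.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Hypothetical input: a 100 Hz sine sampled at 8 kHz (not real speech).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(800)]

# 20 ms frames with a 10 ms hop, a common but here arbitrary choice.
frames = frame_signal(tone, frame_len=160, hop=80)
features = [(short_time_energy(f), zero_crossing_rate(f)) for f in frames]
```

Spectral, vocal tract, glottal, and prosodic features would need more machinery; these two are only the simplest time-domain case.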

  2. ASAT Project
     • Feature extraction (cont’d)
       • Statistical characterizations that extract recurrent patterns can be the basis for such features.
       • One example, useful for ultra-low-bit-rate coding: ergodic HMMs that are not phonetically based but are useful for pattern extraction.
       • Take advantage of the segmentation event detectors used in the latest speech coders (despite dogma, the problems of ASR and speech coding cannot be completely orthogonal!).
       • Robust feature extraction should include confidence measures.
       • First steps: build a toolbox of feature extraction modules.
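One possible shape for such a toolbox is sketched below: every module shares an interface and returns each detection together with a confidence value. The `Detection` record, the `TOOLBOX` registry, the `energy_change` detector, and the exponential squashing used for its confidence are all hypothetical choices for illustration, not the project's design.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

@dataclass
class Detection:
    name: str          # which module produced this detection
    frame_index: int   # where in the frame sequence it was observed
    score: float       # raw detector output, on the module's own scale
    confidence: float  # module's own confidence in [0, 1]

Extractor = Callable[[Sequence[Sequence[float]]], List[Detection]]
TOOLBOX: Dict[str, Extractor] = {}

def register(name):
    """Decorator that adds an extraction module to the toolbox under a name."""
    def wrap(fn):
        TOOLBOX[name] = fn
        return fn
    return wrap

@register("energy_change")
def energy_change(frames):
    """Report frame-to-frame log-energy jumps as candidate segment boundaries."""
    eps = 1e-12  # avoids log(0) on silent frames
    log_e = [math.log(sum(x * x for x in f) / len(f) + eps) for f in frames]
    detections = []
    for i in range(1, len(log_e)):
        jump = abs(log_e[i] - log_e[i - 1])
        # Ad hoc squashing of the jump into [0, 1); an assumption, not a trained measure.
        confidence = 1.0 - math.exp(-jump)
        detections.append(Detection("energy_change", i, jump, confidence))
    return detections

def run_toolbox(frames):
    """Run every registered module and collect its detections by module name."""
    return {name: fn(frames) for name, fn in TOOLBOX.items()}

# Hypothetical input: two quiet frames followed by a loud one.
results = run_toolbox([[0.01] * 4, [0.01] * 4, [1.0] * 4])
```

No detection is discarded here; every frame boundary is reported with its confidence, so that a later stage can weigh the evidence.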

  3. ASAT Project
     • Evidence Combining / Fusion
       • Events will never be perfectly detected.
       • Phonetic/sub-word features are never going to be perfectly extracted.
       • Features can be fuzzy (e.g., nasalization has degrees).
       • Reliability is affected by speaking style, the channel, and the length of the event.
       • “Error bars” can be extremely wide.
       • Common framework: seek to represent confidence measures as probabilities for straightforward combination. Do not apply thresholding.
       • This will require each detected event and each high-order feature detected to have its own non-linear normalization, trained before the overall combination.
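A minimal sketch of this framework, under stated assumptions: each detector's raw score is mapped to a probability by its own sigmoid (one possible non-linear normalization), and the probabilities are fused by summing log-odds, which is valid only if the detectors are conditionally independent. The function names, the sigmoid form, and the independence assumption are illustrative and are not stated in the slides.

```python
import math

def sigmoid_normalize(score, a, b):
    """Map a raw detector score to a probability using per-detector slope a and offset b."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))

def combine_independent(probs, prior=0.5):
    """Fuse per-detector posteriors P(event | detector_i) into one probability.

    Assumes the detectors are conditionally independent given the event,
    so each one contributes its log-odds relative to the shared prior.
    """
    eps = 1e-9  # keeps log-odds finite for probabilities at 0 or 1
    prior_logit = math.log(prior / (1.0 - prior))
    total = prior_logit
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)
        total += math.log(p / (1.0 - p)) - prior_logit
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical detectors with made-up normalization parameters and raw scores.
p_nasal = sigmoid_normalize(1.2, a=2.0, b=-1.0)
p_voiced = sigmoid_normalize(0.3, a=4.0, b=0.5)
fused = combine_independent([p_nasal, p_voiced])
```

No threshold is applied at any point: weak evidence stays in the fused result as a probability near the prior instead of being cut off.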

  4. ASAT Project
     • Evidence Combining / Fusion (cont’d)
       • This will require each detected event and each high-order feature to have its own non-linear normalization, trained before the overall combination.
       • Some level of brute force will be required to estimate these normalizations for new contributors.
       • Will begin with simple detectors to verify the approach.
       • Will study alternate approaches as reported.
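One literal reading of "brute force" is an exhaustive grid search for a new detector's normalization parameters. The sketch below fits a sigmoid's slope and offset by minimizing log loss on labeled scores. The sigmoid form, the grids, and the toy data are assumptions for illustration; the slides do not say how the normalizations would actually be estimated.

```python
import math

def log_loss(scores, labels, a, b):
    """Average negative log-likelihood of the labels under sigmoid(a * score + b)."""
    eps = 1e-12  # keeps log() finite when a probability saturates
    total = 0.0
    for s, y in zip(scores, labels):
        p = 1.0 / (1.0 + math.exp(-(a * s + b)))
        p = min(max(p, eps), 1.0 - eps)
        total -= math.log(p) if y else math.log(1.0 - p)
    return total / len(scores)

def fit_sigmoid_grid(scores, labels, a_grid, b_grid):
    """Try every (a, b) pair on the grid and return the one with the lowest log loss."""
    best = None
    for a in a_grid:
        for b in b_grid:
            loss = log_loss(scores, labels, a, b)
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best[1], best[2], best[0]

# Toy data: raw scores from a hypothetical new detector, with ground-truth labels.
scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [0, 0, 0, 1, 1, 1]
a_fit, b_fit, loss_fit = fit_sigmoid_grid(
    scores, labels, a_grid=[0.5, 1.0, 2.0, 4.0], b_grid=[-1.0, 0.0, 1.0]
)
```

The toy data are perfectly separable, so the search simply picks the steepest slope on the grid; with real detector outputs a finer grid or a gradient-based fit would likely be preferable.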
