Covariation and weighting of harmonically decomposed streams for ASR
• Introduction
• Pitch-scaled harmonic filter
• Recognition experiments
• Results
• Conclusion
[Figure: production of /z/, periodic and aperiodic parts]
Motivation and aims
• Most speech sounds are either voiced or unvoiced, and the two have very different properties:
  • voiced: quasi-periodic signal from phonation
  • unvoiced: aperiodic signal from turbulence noise
• Do these properties allow humans to recognize speech in noise? Perhaps we can use this information to help ASR by computing separate features for the two parts.
• Are the two contributions complementary?
http://www.ee.surrey.ac.uk/Personal/P.Jackson/Columbo/
Voiced and unvoiced parts of a speech signal
[Figure: production of /z/, showing the periodic contribution and the aperiodic contribution]
Pitch-scaled harmonic filter
[Block diagram: speech waveform s(n); pitch extraction (f0raw); pitch optimisation (f0opt, optimised pitch Nopt); PSHF applied at successive time shifts; re-splicing; outputs are the periodic waveform v̂(n) and the aperiodic waveform û(n)]
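The idea behind a pitch-scaled decomposition can be illustrated on a single frame. The sketch below is a simplified illustration, not the PSHF as used in these experiments. It assumes the pitch period is already known and is a whole number of samples, and it uses a rectangular window. The function name `split_frame` and the default of four pitch periods per frame are choices made for this example. Pitch optimisation, time shifting and re-splicing are omitted.

```python
import numpy as np

def split_frame(frame, periods_per_frame=4):
    """Split one frame into periodic and aperiodic estimates.

    Assumption: the frame spans exactly `periods_per_frame` pitch periods,
    so harmonics of the pitch fall on every `periods_per_frame`-th DFT bin
    (bin 0, the DC term, is kept with the periodic part).
    """
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.rfft(frame)
    harmonic_bins = np.arange(spectrum.size) % periods_per_frame == 0
    # Keep only the harmonic bins and return to the time domain.
    periodic = np.fft.irfft(np.where(harmonic_bins, spectrum, 0.0), n=frame.size)
    # Whatever is left over is the aperiodic estimate.
    aperiodic = frame - periodic
    return periodic, aperiodic

# Synthetic test signal (hypothetical): two harmonics plus weak noise.
period = 20                       # pitch period in samples
n = np.arange(4 * period)         # frame of four pitch periods
voiced = np.sin(2 * np.pi * n / period) + 0.5 * np.sin(4 * np.pi * n / period)
noise = 0.1 * np.random.default_rng(0).standard_normal(n.size)
periodic, aperiodic = split_frame(voiced + noise)
```

By construction the two outputs sum back to the input frame. Noise that happens to fall on the harmonic bins stays in the periodic estimate, so the separation is only approximate.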
Decomposition example (waveforms)
[Figure panels: Original, Periodic, Aperiodic]
Decomposition example (spectrograms)
[Figure panels: Original, Periodic, Aperiodic]
Decomposition example (MFCC specs.)
[Figure panels: Original, Periodic, Aperiodic]
Speech database: Aurora 2.0
• From TIdigits database of connected English digit strings (male & female speakers), filtered with G.712 at 8 kHz.
[Table: TRAIN and TEST sets]
Description of the experiments
• Baseline experiment: [base]
  • standard parameterisation of the original waveforms (i.e., MFCC, +Δ, +ΔΔ)
• PCA experiments: [pca26, pca78, pca13 and pca39]
  • decorrelation of the feature vectors, and reduction of the number of coefficients
• Split experiments: [split, split1]
  • adjustment of stream weights (periodic vs. aperiodic)
Caveat: pitch values were derived from the clean speech files for the entire database.
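The decorrelation and reduction step of the PCA experiments can be sketched as below. This is a minimal illustration with NumPy, not the code used for the experiments. The helper names `fit_pca` and `apply_pca`, the random stand-in features, and the 13 coefficients per stream are all assumptions made for the example.

```python
import numpy as np

def fit_pca(features):
    """Estimate a PCA transform from features of shape (n_frames, n_dims).

    Returns the mean, the principal directions as rows (largest variance
    first), and the variance along each direction.
    """
    mean = features.mean(axis=0)
    cov = np.cov(features - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return mean, eigvecs[:, order].T, eigvals[order]

def apply_pca(features, mean, components, n_keep):
    """Project onto the first n_keep principal directions."""
    return (features - mean) @ components[:n_keep].T

# Stand-in data: two correlated 13-dimensional streams (hypothetical values).
rng = np.random.default_rng(0)
periodic_feats = rng.standard_normal((500, 13))
aperiodic_feats = 0.6 * periodic_feats + 0.4 * rng.standard_normal((500, 13))
stacked = np.hstack([periodic_feats, aperiodic_feats])              # (500, 26)

mean, components, variances = fit_pca(stacked)
decorrelated = apply_pca(stacked, mean, components, n_keep=26)  # same size
reduced = apply_pca(stacked, mean, components, n_keep=13)       # halved
```

Keeping all directions only decorrelates the concatenated vector, while keeping fewer also reduces the number of coefficients. The `variances` array is the quantity one would plot to see how much each principal component contributes.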
Parameterisations
[Block diagram, from waveform to features, with one processing chain per experiment: BASE, SPLIT, SPLIT1, PCA26, PCA78, PCA13, PCA39. Blocks used: PSHF, MFCC, +Δ, +Δ2, cat, PCA]
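The "+Δ, +Δ2" and "cat" blocks can be illustrated with a short sketch. Several points are assumptions: 13 static coefficients per stream is inferred from the experiment names (26 = 2 × 13, 78 = 2 × 39) rather than stated, and the regression formula over ±2 frames with edge replication is one common way to compute deltas, not necessarily the one used here. The order in which the blocks are chained differs between experiments and is not reproduced.

```python
import numpy as np

def deltas(static, half_window=2):
    """Regression-based time derivatives of a (n_frames, n_dims) array.

    Edge frames are replicated so the output keeps the input length.
    """
    n_frames = static.shape[0]
    padded = np.pad(static, ((half_window, half_window), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, half_window + 1))
    out = np.zeros_like(static, dtype=float)
    for k in range(1, half_window + 1):
        ahead = padded[half_window + k : half_window + k + n_frames]
        behind = padded[half_window - k : half_window - k + n_frames]
        out += k * (ahead - behind)
    return out / denom

def add_dynamics(static):
    """Append first and second derivatives: n_dims becomes 3 * n_dims."""
    d1 = deltas(static)
    d2 = deltas(d1)
    return np.hstack([static, d1, d2])

# Stand-in static features for the two streams (hypothetical values).
rng = np.random.default_rng(0)
static_periodic = rng.standard_normal((50, 13))
static_aperiodic = rng.standard_normal((50, 13))

per_stream = add_dynamics(static_periodic)                 # (50, 39)
both_streams = np.hstack([add_dynamics(static_periodic),
                          add_dynamics(static_aperiodic)])  # (50, 78)
```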
Full-sized PCA results
Variance of principal components
[Figure panels: PCA39, PCA26]
• clean + multi
PCA26 experiment results
[Figure panels: CLEAN, MULTI]
Summary of best PCA results
Sample Split results
Note: for Split, the same stream weights were used in training as in testing.
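The slides do not give the formula by which the stream weights enter the recogniser. A common choice in multi-stream HMM systems, assumed here, is to scale each stream's log-likelihood by its weight before summing. The sketch below shows this for a single diagonal-covariance Gaussian per stream. The function names, the feature dimension and the weight values are illustrative only.

```python
import numpy as np

def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of x under a Gaussian with diagonal covariance."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def weighted_stream_loglik(x_per, x_aper, model_per, model_aper, w_per, w_aper):
    """Combine periodic and aperiodic stream scores with stream weights.

    Each model is a (mean, variance) pair. Weights of (1, 1) treat the two
    streams as independent; other values emphasise one stream.
    """
    return (w_per * diag_gauss_loglik(x_per, *model_per)
            + w_aper * diag_gauss_loglik(x_aper, *model_aper))

# Stand-in observations and models (hypothetical values).
rng = np.random.default_rng(0)
x_per, x_aper = rng.standard_normal(39), rng.standard_normal(39)
model_per = (np.zeros(39), np.ones(39))
model_aper = (np.zeros(39), np.ones(39))

# Example weighting that favours the periodic stream.
score = weighted_stream_loglik(x_per, x_aper, model_per, model_aper,
                               w_per=1.2, w_aper=0.8)
```

Sweeping `w_per` against `w_aper` on held-out data is one way such weights could be adjusted.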
Summary of PCA & Split results
Conclusions
• The PSHF module split Aurora's speech waveforms into two synchronous streams (periodic and aperiodic)
• Large improvements over the single-stream Baseline
• Split was better than all PCA combinations:
  • PCA26/13 better than PCA78/39, and PCA13 best
  • Split1 marginally better than Split
• Periodic speech segments give robustness to noise.
Further work
• Modelling: how best to combine the streams?
• LVCSR: evaluate front end on TIMIT (phone recognition).
• Robust pitch tracking
COLUMBO PROJECT: Harmonic decomposition applied to ASR
Philip J.B. Jackson 1 <p.jackson@surrey.ac.uk>
David M. Moreno 2 <davidm@talp.upc.es>
Javier Hernando 2 <javier@talp.upc.es>
Martin J. Russell 3 <m.j.russell@bham.ac.uk>
http://www.ee.surrey.ac.uk/Personal/P.Jackson/Columbo/