Nancy Meeting – 6-7 July 2006 Advances in WP2 www.loquendo.com
Recent Work on NN Adaptation in WP2 • State-of-the-art LIN adaptation method implemented and evaluated on the benchmarks (m12) • Innovative LHN adaptation method implemented and evaluated on the benchmarks (m21) • Experimental results on benchmark corpora and the Hiwire database with LIN and LHN (m21) • Further advances on new adaptation methods (m24)
LIN Adaptation (figure): a Linear Input Network (LIN), a trainable linear layer inserted between the speech signal parameters (input layer) and the 1st hidden layer of the speaker-independent MLP (SI-MLP); the 2nd hidden layer feeds the output layer, which produces the emission probabilities of the acoustic-phonetic units.
LHN Adaptation (figure): a Linear Hidden Network (LHN), a trainable linear layer inserted between the 2nd hidden layer of the same SI-MLP and its output layer; the rest of the architecture (input layer with speech signal parameters, two hidden layers, emission probabilities of the acoustic-phonetic units) is unchanged.
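For concreteness, a minimal PyTorch sketch of both insertions follows. This is an assumption on my part: the original Loquendo system was not built with PyTorch, and all layer sizes and names below are illustrative. The inserted layer is initialized to the identity, so adaptation starts from the unmodified speaker-independent network, and it is the only part updated during adaptation:

```python
import torch
import torch.nn as nn

# Illustrative sizes (not the original network's): input features, two
# hidden layers, and one output unit per acoustic-phonetic unit.
FEATS, H1, H2, UNITS = 39, 512, 512, 600

# Speaker-independent MLP (SI-MLP): speech parameters -> emission probabilities
si_mlp = nn.Sequential(
    nn.Linear(FEATS, H1), nn.Sigmoid(),   # 1st hidden layer
    nn.Linear(H1, H2), nn.Sigmoid(),      # 2nd hidden layer
    nn.Linear(H2, UNITS),                 # output layer (softmax in the loss)
)

def identity_linear(size: int) -> nn.Linear:
    """Square linear layer initialized to the identity, so the adapted
    network initially behaves exactly like the SI-MLP."""
    layer = nn.Linear(size, size)
    with torch.no_grad():
        layer.weight.copy_(torch.eye(size))
        layer.bias.zero_()
    return layer

for p in si_mlp.parameters():             # SI-MLP weights stay frozen;
    p.requires_grad_(False)               # only the inserted layer is trained

# LIN: linear transform of the input features, before the 1st hidden layer
lin = identity_linear(FEATS)
lin_model = nn.Sequential(lin, si_mlp)

# LHN: linear transform of the 2nd hidden layer's output, before the output layer
lhn = identity_linear(H2)
lhn_model = nn.Sequential(si_mlp[:4], lhn, si_mlp[4])

optimizer = torch.optim.SGD(lhn.parameters(), lr=0.01)  # adapt the LHN only
```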
Papers presented: • Roberto Gemello, Franco Mana, Stefano Scanzio, Pietro Laface, Renato De Mori, “Adaptation of Hybrid ANN/HMM Models Using Hidden Linear Transformations and Conservative Training”, Proc. ICASSP 2006, Toulouse, France, May 2006 • Dario Albesano, Roberto Gemello, Pietro Laface, Franco Mana, Stefano Scanzio, “Adaptation of Artificial Neural Networks Avoiding Catastrophic Forgetting”, Proc. IJCNN 2006, Vancouver, Canada, July 2006
The “Forgetting” problem in ANN Adaptation • It is well known in connectionist learning that acquiring new information during adaptation can damage previously learned information (catastrophic forgetting) • This effect must be taken into account when adapting an ANN with a limited amount of data that does not include enough samples for all the classes • The “absent” classes may be forgotten during adaptation, because discriminative training (error back-propagation) always assigns zero targets to absent classes
“Forgetting” in ANN for ASR • When adapting an ASR ANN/HMM model, this problem can arise if the adaptation set contains no examples of some phonemes, due to the limited amount of adaptation data or the limited vocabulary • ANN training is discriminative, contrary to that of GMM-HMMs, so absent phonemes are penalized by assigning them a zero target during adaptation • This induces in the ANN a forgetting of the capability to classify the absent phonemes: while the HMM models of phonemes with no observations simply remain un-adapted, the corresponding ANN output units lose their characterization rather than staying un-adapted
Example of Forgetting (figure: F1/F2 vowel chart, F1 0.0–1.5 kHz, F2 1.0–5.0 kHz, showing the vowel classes I, E, A, e, U, O before and after adaptation) • Adaptation examples only of E, U, O (e.g. from the words uno, due, tre); no examples for the other vowels (A, I, ə) • The classes with examples adapt themselves, but tend to invade the classes with no examples, which are partially “forgotten”
“Conservative” Training • We have introduced “conservative training” to avoid forgetting the absent phonemes • The idea is to avoid a zero target for the absent phonemes, using instead the output of the original NN as their target • Let F_P be the set of phonemes present in the adaptation set, F_A the set of absent ones, and o_i the posterior computed by the original network for class i; for a frame whose correct class is c, the targets are assigned according to the following equations: • Standard policy: t_c = 1, and t_i = 0 for every other class i • Conservative policy: t_i = o_i for i ∈ F_A; t_i = 0 for i ∈ F_P, i ≠ c; t_c = 1 − Σ_{a ∈ F_A} o_a
Target assignment example (P2 is the class corresponding to the correct phoneme; Px: classes present in the adaptation set; Ax: absent classes):
Class                   A1    P1    P2    P3    A2
Standard policy         0.00  0.00  1.00  0.00  0.00
Conservative Training   0.03  0.00  0.95  0.00  0.02
In the Conservative Training row, the targets of the absent classes A1 and A2 are the posterior probabilities computed using the original network.
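A minimal sketch of the two target-assignment policies, assuming NumPy; the function names standard_targets and conservative_targets are hypothetical, and the orig array stands for the posteriors computed by the original, pre-adaptation network. The numbers reproduce the example above:

```python
import numpy as np

def standard_targets(n_classes: int, correct: int) -> np.ndarray:
    """Standard policy: one-hot target, zero for every class but the correct one."""
    t = np.zeros(n_classes)
    t[correct] = 1.0
    return t

def conservative_targets(orig_posteriors: np.ndarray, correct: int,
                         absent: list) -> np.ndarray:
    """Conservative policy: absent classes keep the posterior computed by the
    original network; present classes get zero; the correct class receives the
    remaining probability mass, so the target still sums to one."""
    t = np.zeros_like(orig_posteriors)
    t[absent] = orig_posteriors[absent]
    t[correct] = 1.0 - t[absent].sum()
    return t

# Classes [A1, P1, P2, P3, A2], correct phoneme P2 (index 2), A1 and A2 absent
orig = np.array([0.03, 0.10, 0.80, 0.05, 0.02])  # original-network outputs (illustrative)
print(standard_targets(5, correct=2))                        # [0. 0. 1. 0. 0.]
print(conservative_targets(orig, correct=2, absent=[0, 4]))  # [0.03 0. 0.95 0. 0.02]
```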
“Conservative” Training • In this way, the phonemes that are absent from the adaptation set remain “represented” by the responses of the original NN • Thus, the absent phonemes are not “absorbed” by the neighboring present phonemes • The results of adaptation with conservative training are: • Comparable performance on the target environment • Preservation of performance on the generalist environment • A large improvement in speaker adaptation, when only a few sentences are available
Adaptation tasks • Application data adaptation: Directory Assistance • 9325 Italian city names • 53713 training + 3917 test utterances • Vocabulary adaptation: Command words • 30 command words • 6189 training + 3094 test utterances • Channel-Environment adaptation: Aurora-3 • 2951 training + 654 test utterances
Mitigation of Catastrophic Forgetting using Conservative Training (table): tests using adapted models on Italian continuous speech (% WER)
Conclusions • The new LHN adaptation method, developed within the project, outperforms standard LIN adaptation • In adaptation tasks with missing classes, Conservative Training reduces the catastrophic forgetting effect, preserving performance on a generic task
Workplan • Selection of suitable benchmark databases (m6) • Baseline set-up for the selected databases (m8) • LIN adaptation method implemented and evaluated on the benchmarks (m12) • Experimental results on Hiwire database with LIN (m18) • Innovative NN adaptation methods and algorithms for acoustic modeling and experimental results (m21) • Further advances on new adaptation methods (m24) • Unsupervised adaptation: algorithms and experimentation (m33)