Explore the evolution of voice morphing techniques, from historical origins to AR-HMM analysis and re-synthesis. Learn how vocal tract area functions are crucial for smooth voice transformation. Discover the applications of voice morphing in various fields. Uncover the training and conversion phases that enable seamless voice modulation. Gain insights into the potential of voice morphing technology and its future prospects for improved conversion quality.
G.S. MOZE COLLEGE OF ENGINEERING, BALEWADI, PUNE-45. A PRESENTATION ON Voice Morphing PROJECT GUIDE : By: Anil Mahadik Prof. Sonali Ghote
Content • Title • Introduction • History • Need of Vocal tract area function • Vocal tract area function • AR-HMM Analysis • AR-HMM Diagram • Re-synthesis of Converted voice • Training Phase • Conversion and morphing phase • Application • Conclusion • References
Title • The project title is “Voice Morphing”. • It presents flexible voice morphing based on a linear combination of multiple speakers’ vocal tract area functions. • Voice morphing, or voice conversion, usually means transforming a source speaker’s speech into a target speaker’s.
Introduction • The main goal of audio morphing methods is a smooth transformation from one sound to another. • These techniques can be regarded as a kind of point-to-point mapping in a feature space. • Many applications may benefit from this sort of technology. • Research on voice morphing aims to relax the point-to-point restriction, extending it to area-to-area mapping by introducing multiple speakers.
History • Voice morphing is a technology developed at the Los Alamos National Laboratory in New Mexico, USA, by George Papcun and publicly demonstrated in 1999. • It enables speech patterns to be cloned and an accurate copy of a person's voice to be made, which can then say anything the operator wishes.
Need of Vocal tract area function • Since the 1990s, many techniques for voice conversion have been proposed [1-7]. • One successful approach uses a statistical method to map a source speaker’s voice to a target speaker’s, but a weakness of such methods is the discontinuity of formants. • The proposed method employs an estimated vocal tract area function to avoid this weakness.
Vocal tract area function (A) • Interpolation in the vocal tract area domain is considered to provide a reasonably continuous transition of formants. • Estimating the vocal tract area function implies simultaneously estimating the voice source characteristics.
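The interpolation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenters' implementation: the tube-model relation between reflection coefficients and section areas is written with one common sign convention (other texts use the opposite sign), and the function names `reflection_to_area` and `morph_log_area` are hypothetical. The combination is done on log areas, following the slides' mention of a log vocal tract area function.

```python
import numpy as np

def reflection_to_area(k, lips_area=1.0):
    """Convert reflection coefficients to a tube-model area function.

    Assumed convention: A[i+1] = A[i] * (1 - k[i]) / (1 + k[i]),
    starting from lips_area. Some texts use the opposite sign.
    """
    k = np.asarray(k, dtype=float)
    ratios = (1.0 - k) / (1.0 + k)
    return lips_area * np.concatenate(([1.0], np.cumprod(ratios)))

def morph_log_area(areas, weights):
    """Linearly combine several speakers' area functions in the log domain."""
    areas = np.asarray(areas, dtype=float)      # shape (speakers, sections)
    weights = np.asarray(weights, dtype=float)  # shape (speakers,)
    weights = weights / weights.sum()           # normalise so weights sum to 1
    return np.exp(weights @ np.log(areas))

# Two hypothetical speakers with constant areas 1 and 4, mixed half and half
print(morph_log_area([[1.0, 1.0], [4.0, 4.0]], [0.5, 0.5]))
```

Because the weights are normalised, any non-negative weight vector gives a point inside the region spanned by the speakers' area functions, which is one way to read the "area-to-area" mapping described in the introduction.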
AR-HMM analysis • To estimate the vocal tract area function, Auto-Regressive Hidden Markov Model (AR-HMM) analysis of speech is introduced. • The AR-HMM represents the vocal tract characteristics by an AR model and the glottal source wave by an HMM. • AR-HMM analysis estimates the vocal tract resonance characteristics and the vocal source waves in the maximum likelihood sense.
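To make the AR half of the model concrete, the sketch below fits an AR model with the classical Levinson-Durbin recursion. This is a deliberately simpler stand-in: it is ordinary autocorrelation-based linear prediction, not the AR-HMM analysis described on the slide, and it does not model the glottal source with an HMM. The function name `levinson_durbin` and the convention A(z) = 1 + a[1] z^-1 + ... are choices made for this example.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for an AR model.

    r     : autocorrelation values r[0], ..., r[order]
    order : AR model order
    Returns (a, k, err): predictor polynomial a (a[0] == 1),
    reflection coefficients k, and the final prediction error.
    """
    r = np.asarray(r, dtype=float)
    a = np.zeros(order + 1)
    a[0] = 1.0
    k = np.zeros(order)
    err = r[0]
    for i in range(1, order + 1):
        # Correlation of the current predictor with lag i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        ki = -acc / err
        k[i - 1] = ki
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + ki * a_prev[i - j]
        a[i] = ki
        err *= 1.0 - ki * ki
    return a, k, err

# Autocorrelation of an ideal AR(1) process x[n] = 0.5 x[n-1] + e[n]
a, k, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(a, k, err)
```

The reflection coefficients returned here are the quantities a lossless-tube model relates to area ratios between adjacent vocal tract sections.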
Re-synthesis of the converted voice • There are two phases: the training phase and the conversion and morphing phase. • The procedure for each phase is shown in the diagram.
Training phase • AR-HMM analysis: Speech samples with the same phonetic content from both the source and target speakers are analyzed. • Feature alignment: The feature vectors obtained above are time-aligned using dynamic time warping (DTW) to compensate for any differences in duration between source and target utterances. • Estimation of the conversion function: The aligned vectors are used to train a joint GMM whose parameters are then used to construct a stochastic conversion function.
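The feature alignment step can be illustrated with a basic DTW routine. This is a minimal sketch, assuming Euclidean frame distances and the simplest step pattern (match, insertion, deletion); the slides do not specify which DTW variant was used, and the name `dtw_path` is hypothetical.

```python
import numpy as np

def dtw_path(src, tgt):
    """Align two feature sequences (frames x dims) with basic DTW.

    Returns the list of aligned (src_index, tgt_index) pairs and the
    total alignment cost.
    """
    src = np.asarray(src, dtype=float)
    tgt = np.asarray(tgt, dtype=float)
    n, m = len(src), len(tgt)
    # Pairwise Euclidean distances between all source and target frames
    dist = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=2)
    # Accumulated cost with an extra border row and column of infinities
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]
            )
    # Backtrack from the end to recover the warping path
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return path, cost[n, m]

# The target repeats its middle frame, so source frame 1 is matched twice
path, total = dtw_path([[0.0], [1.0], [2.0]], [[0.0], [1.0], [1.0], [2.0]])
print(path, total)
```

The index pairs in `path` select the source and target frames that would be stacked into joint vectors for GMM training.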
Conversion and morphing phase • AR-HMM analysis: In this phase only the source speaker’s utterances are used. • Feature transformation: The GMM-based transformation function constructed during training is now used to convert every source log vocal tract area function and vocal cord cepstrum into its most likely target equivalent. • Linear interpolation, synthesis of the source wave, and LPC synthesis.
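A GMM-based conversion function is commonly written as a posterior-weighted sum of per-component linear regressions. The sketch below follows that common form and then interpolates linearly between the source and converted vectors. It is an assumption-laden illustration: the slides do not give the exact formula, the parameter names (`weights`, `mu_x`, `mu_y`, `cov_xx`, `cov_yx`) are hypothetical, and the GMM parameters are taken as already trained.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + maha)

def gmm_convert(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Map source vector x to its expected target under a joint GMM."""
    x = np.asarray(x, dtype=float)
    # Posterior probability of each mixture component given x
    log_post = np.array([
        np.log(w) + gaussian_logpdf(x, mx, cxx)
        for w, mx, cxx in zip(weights, mu_x, cov_xx)
    ])
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    # Posterior-weighted sum of per-component linear regressions
    y = np.zeros(len(mu_y[0]))
    for p, mx, my, cxx, cyx in zip(post, mu_x, mu_y, cov_xx, cov_yx):
        y += p * (my + cyx @ np.linalg.solve(cxx, x - mx))
    return y

def morph(x, y, alpha):
    """Linear interpolation: alpha=0 gives the source, alpha=1 the target."""
    return (1.0 - alpha) * np.asarray(x, dtype=float) + alpha * np.asarray(y, dtype=float)

# Single-component toy model: y = 1 + 2 * x
x = np.array([3.0])
y = gmm_convert(x, [1.0], [np.array([0.0])], [np.array([1.0])],
                [np.array([[1.0]])], [np.array([[2.0]])])
print(y, morph(x, y, 0.5))
```

With a single component the mapping reduces to one linear regression, which makes the toy example easy to check by hand; real systems use many components so the mapping can vary across phonetic classes.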
Application • Applications include the creation of peculiar voices in animated films. • Voice morphing has tremendous possibilities in military psychological warfare and subversion. • Voice morphing is a powerful battlefield weapon that can be used to issue fake orders to the enemy's troops, appearing to come from their own commanders.
Conclusion • This paper has presented a voice morphing method based on mappings in the vocal tract area space and the glottal source wave spectrum, each of which can be modified independently. • These features have been realized using AR-HMM analysis of speech. • In future, we will investigate how to improve the quality of voice conversion with interpolation techniques.
References • [1] L.M. Arslan, D. Talkin, “Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum,” Proc. Eurospeech, pp. 1347-1350, 1997. • [2] Y. Stylianou, O. Cappe, “A system for voice conversion based on probabilistic classification and a harmonic plus noise model,” Proc. ICASSP, pp. 281-284, 1998. • [3] A. Kain, “Spectral voice conversion for text-to-speech synthesis,” Proc. ICASSP, pp. 285-288, 1998. • [4] H. Ye, S. Young, “High Quality Voice Morphing,” Proc. IEEE ICASSP, pp. 9-12, 2004.