Initiation of Standardization on Network-based Speech-to-speech Translation at ITU-T SG16
National Institute of Information and Communications Technology, Japan
Satoshi Nakamura, Chiori Hori

Many Languages All Over the World
http://en.wikipedia.org/wiki/List_of_language_families
Breaking Language Boundaries
• Language boundaries are one of the barriers to mutual understanding.
• Speech-to-Speech Translation (S2ST) technologies are an effective means of removing language boundaries between people who speak different languages.
• S2ST technologies have been studied.
Speech-to-Speech Translation (S2ST)
Example: Japanese 「私は学校に行く」 to English “I go to school”
• Speech Recognition (ASR): convert the speech to a Japanese phoneme sequence (“w”, “a”, “t”, …: w a t a shi w a g a xtu k o o n i…..), then convert it to a word sequence using the lexicon and grammar: 私は 学校に行く
• Machine Translation (MT): convert to an English word sequence (「私は」⇒ “I”, 「学校に」⇒ “to school”, 「行く」⇒ “go”), then reorder the word sequence according to English grammar: “I” “to school” “go” becomes “I” “go” “to school”
• Speech Synthesis (TTS): select the appropriate waveform for the English text
Resources: corpora
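The two MT steps on this slide (phrase conversion, then reordering) can be illustrated with a toy sketch. This is not how a production MT engine works: the phrase table, the role labels, and the single word-order rule below are all hypothetical and cover only the slide's example sentence.

```python
# Toy illustration of the MT stage: phrase lookup followed by reordering.
# PHRASE_TABLE, the role labels, and ENGLISH_ORDER are hypothetical and
# handle only the example sentence from the slide.

PHRASE_TABLE = {
    "私は": ("I", "subject"),
    "学校に": ("to school", "destination"),
    "行く": ("go", "verb"),
}

# Assumed English word order for this toy grammar.
ENGLISH_ORDER = ["subject", "verb", "destination"]


def translate(japanese_phrases):
    """Translate segmented Japanese phrases into an English word sequence."""
    # Step 1: convert each phrase to its English counterpart, keeping its role.
    tagged = [PHRASE_TABLE[phrase] for phrase in japanese_phrases]
    # Step 2: reorder according to the toy English word order.
    tagged.sort(key=lambda pair: ENGLISH_ORDER.index(pair[1]))
    return " ".join(text for text, _ in tagged)


if __name__ == "__main__":
    print(translate(["私は", "学校に", "行く"]))  # I go to school
```

The input is assumed to be already segmented into phrases, which is the word sequence the ASR stage produces in the slide's example.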
Stand Alone and Client-server S2ST Systems
• Stand alone system: packages all speech translation functions into a handheld PC.
• Client-server system
Example: Japanese speech “おはようございます.” is translated into English speech “Good morning.”
Languages shown: Indonesian, Japanese, Chinese, English
Why Network-based?
• Stand alone systems have limited resources and support a limited number of language pairs.
• ASR/MT/TTS systems for many languages are available and need to be maintained by each country.
• Broadband networks are available.
Standardization on Network-based S2ST
[Diagram: an S2ST client for Language A and an S2ST client for Language B exchange speech and synthesized speech through ASR, MT, and TTS modules for each language.]
Module resources:
• ASR (Language A, Language B): lexicon, speech data
• MT (Language A → Language B, Language B → Language A): parallel corpus, lexicon
• TTS (Language A, Language B): lexicon, speech data
Standardization targets:
• Parallel corpus, speech data, lexicon
• Data format for ASR and MT results
• Communication protocol among modules
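To make "data format for ASR and MT results" concrete, here is a minimal sketch of what a shared result message could look like. The slides do not define any format, so the class name, field names, and the choice of JSON are all assumptions made purely for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class RecognitionResult:
    """Hypothetical ASR result message; field names are illustrative only."""
    language: str      # language tag of the recognized speech, e.g. "ja"
    text: str          # recognized word sequence
    confidence: float  # recognizer score, assumed to lie in [0, 1]


def encode(result):
    """Serialize a result so that a module in another country can parse it."""
    # ensure_ascii=False keeps non-Latin scripts readable in the payload.
    return json.dumps(asdict(result), ensure_ascii=False)


def decode(payload):
    """Rebuild the result object from its serialized form."""
    return RecognitionResult(**json.loads(payload))
```

If every ASR module emitted a message of an agreed shape, any MT module could consume it without pairwise integration work, which is the motivation for standardizing the format.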
Lexicon for overall S2ST systems
• Maintaining a reliable lexicon for overall S2ST systems requires global standardization of the lexicon format and a system to collect and provide lexicons for all languages.
[Figure: An example of a lexicon for overall modules in S2ST systems]
Asian Network-Based S2ST System by A-STAR Consortium
1. National Institute of Information and Communications Technology (NICT), Japan
2. Electronics and Telecommunications Research Institute (ETRI), Korea
3. Chinese Academy of Sciences (CASIA), China
4. National Electronics and Computer Technology Center (NECTEC), Thailand
5. Agency for the Assessment and Application of Technology (BPPT), Indonesia
6. Center for Development of Advanced Computing (CDAC), India
7. Institute of Information Technology (IOIT), Vietnam
8. Institute for Infocomm Research (I2R), Singapore
Speech Translation using Distributed Service Servers
Example: Korean-to-Thai speech translation
① Speech recognition (Korean): the speech translation service client sends speech (Korean) to an ASR server, which returns text (Korean).
② Language translation (Korean→Thai): an MT server converts the text (Korean) into translated text (Thai).
③ Speech synthesis (Thai): a TTS server converts the translated text (Thai) into synthesized speech (Thai).
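The three-step Korean-to-Thai flow can be sketched as a client that chains three independently hosted services. In this sketch each "server" is modelled as a plain Python function; the function names and stub behaviour are hypothetical, and a real deployment would replace them with network calls using whatever protocol the standard defines.

```python
from typing import Callable

# Each server is modelled as a callable so the chain can be shown without
# any networking. These type aliases are assumptions for illustration.
AsrServer = Callable[[bytes], str]   # speech -> source-language text
MtServer = Callable[[str], str]      # source text -> translated text
TtsServer = Callable[[str], bytes]   # translated text -> synthesized speech


def speech_translate(speech: bytes, asr: AsrServer, mt: MtServer,
                     tts: TtsServer) -> bytes:
    """Run the three steps in order: recognition, translation, synthesis."""
    text = asr(speech)        # step 1: speech recognition
    translated = mt(text)     # step 2: language translation
    return tts(translated)    # step 3: speech synthesis


# Stub servers standing in for remote Korean ASR, Korean-to-Thai MT,
# and Thai TTS services. They ignore real audio entirely.
def stub_korean_asr(speech: bytes) -> str:
    return "안녕하세요"


def stub_ko_th_mt(text: str) -> str:
    return {"안녕하세요": "สวัสดี"}[text]


def stub_thai_tts(text: str) -> bytes:
    return text.encode("utf-8")  # placeholder for synthesized audio


if __name__ == "__main__":
    audio = speech_translate(b"\x00", stub_korean_asr, stub_ko_th_mt,
                             stub_thai_tts)
    print(audio.decode("utf-8"))
```

Because the client only depends on the three call signatures, any server could be swapped for one maintained by a different institute, as long as both sides agree on the interface.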
Scope of Standardization
Table: Draft roadmap to develop standards for network-based S2ST
Conclusion
• We would like to invite more people to join the standardization activities on network-based S2ST systems.
• With standardization, network-based S2ST systems can cover more languages.