Some Activities in Japan
Policy Considerations for Development and Deployment of Local Language Computing and Content
Hitoshi ISAHARA
National Institute of Information and Communications Technology (NICT)
• JEITA (Japan Electronics and Information Technology Industries Association)
• GSK
• NII-SRC
• Knowledge Information Processing Technologies Committee
• Language Resource Sub-committee
• TCL Natural Language Processing Portal Site
• SHACHI: Language Resource Metadata DB
NII: National Institute of Informatics
NICT: National Institute of Information and Communications Technology
Regional Conference on Localized ICT Development and Dissemination across Asia Jan. 15, Vientiane, Laos
Organizations for Language Corpora and Standardization
• Language Resource Association (GSK)
• National Institute for Japanese Language (NIJL)
• National Institute of Informatics, Speech Resources Consortium (NII-SRC)
• National Institute of Informatics, Test Collection for IR Systems (NTCIR)
• National Institute of Information and Communications Technology (NICT)
• ATR Spoken Language Communication Research Laboratories (ATR-SLC)
• NTT Advanced Technology Corporation (NTT-AT)
• Nichigai Associates, Incorporated
• Japan Electronics and Information Technology Industries Association (JEITA)
    • Speech Input-Output Systems Standardization Expert Committee
    • Knowledge Information Processing Technology Expert Committee
Speech and Text Corpora in Japan
            GSK  NIJL  NTCIR  SRC  NICT  ATR  NTT-AT  Nichigai  Total
Speech        1     1      -   28     -   11      11         -     52
Sound         -     -      -    2     -    1       1         -      4
Text          2     2      -    -     7    2       -        16     29
T-collect.    -     -     26    -     -    -       -         -     26
Lexicon       2     4      -    -     5    1       -        29     41
Tool          -     5      -    -     -    -       -         -      5
Total         5    12     26   30    12   15      12        45    157
JST Project with Prof. Mikami
Construction of networks for Asian linguistic information technology resources (2005-2007)
• Standard contract forms
• Electronic version of the IPAL dictionary of Japanese
• Treebank of the Urdu language
• International and domestic workshops organized
JST: Japan Science and Technology Agency
R&D of Japanese-Chinese Machine Translation System
• Five-year national project starting in FY 2006.
• Our objectives are:
    • Making scientific and technological information from China and other Asian countries easily usable in Japan.
    • Promoting the distribution to China and other countries of documents on the science and technology in which Japan is at the forefront.
    • Contributing to the development of science and technology in Asian countries through information exchange via machine translation.
Linguistic Resources
• Dictionaries (providing glosses for specialist terminology and general vocabulary, case frames, semantic systems)
• Corpora (with syntactic annotations, parallel corpora)
• Expansion of linguistic resources
• Analysis engine
• Translation engine
• Scientific and technical papers (Chinese/Japanese, Japanese/Chinese)
• Example-based translation with deep consideration of syntactic structure
• Delivering a practical machine translation system within a new framework
Language Grid Project with Prof. Ishida
The Language Grid is a new infrastructure that treats existing language services as atomic components and enables users to create new language services by combining appropriate components. It aims to increase the accessibility and usability of language services.
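The idea of building a new service out of atomic components can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LanguageService` type, the `compose` helper, and the toy word-by-word translators with their made-up romanized vocabulary are hypothetical and are not the Language Grid's actual interface.

```python
from typing import Callable

# Assumption for illustration: a "language service" is any function from
# text to text. This is not the Language Grid's real interface.
LanguageService = Callable[[str], str]


def compose(*services: LanguageService) -> LanguageService:
    """Chain atomic services left to right into one new service."""
    def combined(text: str) -> str:
        for service in services:
            text = service(text)
        return text
    return combined


def make_word_translator(table: dict[str, str]) -> LanguageService:
    """Toy stand-in for a machine translation service: word-by-word lookup."""
    def translate(text: str) -> str:
        return " ".join(table.get(word, word) for word in text.split())
    return translate


# Two hypothetical atomic services with tiny made-up vocabularies.
ja_to_en = make_word_translator({"byouin": "hospital", "uketsuke": "reception"})
en_to_de = make_word_translator({"hospital": "Krankenhaus", "reception": "Empfang"})

# A new Japanese-to-German service created purely by combination.
ja_to_de = compose(ja_to_en, en_to_de)

print(ja_to_de("byouin uketsuke"))
```

Because every component shares one calling convention, any number of services can be chained without the components knowing about each other.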
Role of the Language Grid
Language Support for Multicultural and Global Societies
• Supporting Students from Overseas
• Translation Services at Hospital Receptions
• Sharing Multilingual Information
• Universal Playground
Education, Disaster Management, Medical Care
Language Grid: sharing language resources such as dictionaries and machine translators around the world
• German Research Center for Artificial Intelligence
• National Institute of Informatics
• Stuttgart University
• NICT
• National Research Council, Italy
• Chinese Academy of Sciences
• NTT Research Labs
• Asian Disaster Reduction Center
Language Resources Web Services
Language resources are registered as Web services equipped with a standard interface based on the “language service ontology.”
[Diagram: language resources (Machine Translation, Morphological Analyzer, Parallel Texts, Community Dictionary) connected through the Standard Interface and the Language Grid to the Language Service User]
Available machine translations (as of June 2007): German, French, Korean, English, Japanese, Spanish, Chinese, Portuguese, Italian
Language Grid
• Language Service User
• Language Resource Provider
• Computation Resource Provider
• Language Grid Operator
Language Grid Project (2/3)
Two main benefits:
• It makes it possible to combine language resources or language processing functions.
• It makes it possible for users to add their own language resources and so create new language services for their own intellectual activities.
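The second benefit, adding a user's own resource to an existing service, might look like the following minimal Python sketch. Everything here is an assumption made for illustration: `with_user_dictionary`, `toy_translate`, and the romanized sample vocabulary are hypothetical, and the placeholder-substitution approach is one simple way to let user terms override a translator, not a description of how the Language Grid does it.

```python
from typing import Callable

# Assumption for illustration: a language service maps text to text.
LanguageService = Callable[[str], str]


def with_user_dictionary(translate: LanguageService,
                         user_terms: dict[str, str]) -> LanguageService:
    """Wrap a translator so that user-supplied terms override its output."""
    def combined(text: str) -> str:
        # Replace each user term with a placeholder the translator leaves alone.
        placeholders: dict[str, str] = {}
        for i, (source, target) in enumerate(user_terms.items()):
            token = f"__TERM{i}__"
            if source in text:
                text = text.replace(source, token)
                placeholders[token] = target
        translated = translate(text)
        # Put the user's preferred target-language terms back in.
        for token, target in placeholders.items():
            translated = translated.replace(token, target)
        return translated
    return combined


def toy_translate(text: str) -> str:
    """Hypothetical stand-in for a general-purpose translation service."""
    table = {"kusuri": "medicine", "uketsuke": "reception"}
    return " ".join(table.get(word, word) for word in text.split())


# A community supplies its own terminology on top of the generic service.
hospital_terms = {"kusuri": "prescription drug"}
hospital_translate = with_user_dictionary(toy_translate, hospital_terms)

print(toy_translate("kusuri uketsuke"))
print(hospital_translate("kusuri uketsuke"))
```

The wrapped service has the same shape as the original one, so it can itself be combined with other components.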
Language Grid Project (3/3)
• The LGP was launched in 2006 as a 5-year project at Kyoto University.
• The LG engine was developed by NICT.
• GSK is expected to manage the system after the project ends.
• GSK must seek financial support to maintain the activities of the LGP.