
Infrastructures in Taiwan and for the Chinese Languages Chu-Ren Huang Institute of Linguistics

Infrastructures in Taiwan and for the Chinese Languages. Chu-Ren Huang, Institute of Linguistics, Academia Sinica (churen@sinica.edu.tw). ACL 2000 Workshop: Infrastructures for Global Collaboration, Saturday, October 7, Hong Kong.





Presentation Transcript


  1. Infrastructures in Taiwan and for the Chinese Languages • Chu-Ren Huang, Institute of Linguistics, Academia Sinica • churen@sinica.edu.tw • ACL 2000 Workshop: Infrastructures for Global Collaboration • Saturday, October 7, Hong Kong

  2. Types of Infrastructures • Sharable resources (for Chinese computational linguistics) • Mechanisms for international collaboration • Mechanisms for scholarly exchange

  3. Host Institutes • The Association for Computational Linguistics and Chinese Language Processing (ACLCLP, a.k.a. ROCLING) • Academia Sinica • National Science Council (NSC)

  4. Sharable Resources for Chinese Computational Linguistics • Corpora • Lexicons • Procedures • http://rocling.iis.sinica.edu.tw/ROCLING/

  5. Sharable Resources for Chinese Computational Linguistics: Corpora • Academia Sinica Balanced Corpus of Mandarin Chinese (Sinica Corpus) • Sinica Treebank • Standard Segmentation Corpus • ROCLING Corpus • Mandarin-Across-Taiwan (MAT) Speech Database

  6. Academia Sinica Balanced Corpus of Mandarin Chinese (Sinica Corpus) • 5 million words, segmented and tagged • Direct WWW access: http://www.sinica.edu.tw/~tibe/2-words/modern-words/index.html or http://www.sinica.edu.tw/ftms-bin/kiwi.sh • License information: http://rocling.iis.sinica.edu.tw/ROCLING/corpus98/sinicor_E.htm

  7. Sinica Treebank 1.0 • 38,725 trees • 239,532 words • Direct WWW access (1000 sample trees): http://godel.iis.sinica.edu.tw/CKIP/trees1000.htm • License information: http://rocling.iis.sinica.edu.tw/ROCLING/Treebank/Treebank-E.htm
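
The two counts on this slide imply an average tree size of roughly six words, and the 1000 online sample trees cover a small fraction of the treebank. The arithmetic can be checked with a few lines of Python; the variable names are illustrative only and nothing here comes from the treebank distribution itself.

```python
# Figures quoted on the Sinica Treebank 1.0 slide.
TREES = 38_725
WORDS = 239_532
SAMPLE_TREES = 1000  # trees available through direct WWW access

# Average number of words per tree.
avg_words_per_tree = WORDS / TREES

# Share of the treebank covered by the online sample.
sample_share = SAMPLE_TREES / TREES

print(f"{avg_words_per_tree:.2f} words per tree on average")
print(f"{sample_share:.1%} of all trees are in the online sample")
```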

  8. Mandarin-Across-Taiwan (MAT) Speech Database • Speech files are collected through telephone networks. The content includes spontaneous speech (short answering statements) and read speech (numbers, Mandarin syllables, words of 2 to 4 syllables, phonetically balanced sentences). • MAT-160 (160 speakers) • MAT-2000 • http://rocling.iis.sinica.edu.tw/ROCLING/MAT/index_cf.htm

  9. Sharable Resources for Chinese Computational Linguistics: Procedures • Segmentation Standard for Chinese Language Processing • Segmentation Standard: http://godel.iis.sinica.edu.tw/ROCLING/juhuashu1.htm • Standard Segmentation Corpus (2 million words, segmented): http://godel.iis.sinica.edu.tw/ROCLING/corpus98/segcorp_E.htm • Standard Segmentation Lexicon (42,138 entries, with frequency): http://godel.iis.sinica.edu.tw/ROCLING/corpus98/segdic_E.htm • Segmentation Program (free download): http://godel.iis.sinica.edu.tw/CKIP/ws/
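
To illustrate how a segmentation lexicon can drive word segmentation, here is a minimal sketch of forward maximum matching, a common textbook baseline. It is not the algorithm of the CKIP Segmentation Program listed above, and it does not follow the Segmentation Standard. The function name `forward_max_match`, the `max_len` parameter, and the toy lexicon with its frequency counts are all invented for this example; the real Standard Segmentation Lexicon is not reproduced here.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring (up to max_len characters)
    that is a lexicon entry; fall back to a single character otherwise.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon mapping entries to made-up frequency counts.
# This greedy variant only tests membership; the counts are unused.
TOY_LEXICON = {
    "中央": 120,
    "研究": 300,
    "研究院": 80,
    "中央研究院": 40,
    "語言": 150,
    "語言學": 60,
    "研究所": 90,
}

segments = forward_max_match("中央研究院語言學研究所", TOY_LEXICON, max_len=5)
print(" / ".join(segments))
```

Greedy matching ignores the frequency column entirely. A frequency-aware segmenter could instead score competing segmentations by the counts of their words, which is one reason a lexicon distributed with frequencies is useful.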

  10. Sharable Resources in Languages Other than Modern Mandarin • Classical Chinese Corpora: http://www.sinica.edu.tw/~tibe/2-words/old-words/index.html • Corpus of Formosan Austronesian Languages (under construction, part of the National Digital Archive Initiative) • Lexical Databases of other Sino-Tibetan and Tibeto-Burman Languages

  11. Mechanisms for International Collaboration • Major Sponsors of International Collaboration Involving Taiwan • The Chiang Ching-kuo Foundation for International Scholarly Exchange: http://www.cckf.org, http://www.cckf.org.tw • The National Science Council • Academia Sinica

  12. Synchronic and Diachronic Chinese Corpora • Three Projects Sponsored by the CCK Foundation (1990-1995) • Chu-Ren Huang, Keh-jiann Chen and Pei-chuan Wei, Academia Sinica • Paul Thompson, SOAS, University of London • Chaofen Sun, Stanford University

  13. Mechanisms for Scholarly Exchange and Collaboration • Department of International Programs, NSC • http://www.nsc.gov.tw/int/2_cooperation/index_02.html • Canada: NRC • France: CNRS • Japan: EAACST • Germany: DFG, DAAD, DKFG • Netherlands: NWO, IIAS • USA: NSF, NIH • UK: Royal Society of London, etc.

  14. An NSF/NSC International Joint Project • NSF: Asian Language Digital Library Project • Ching-Chih Chen, Simmons College • NSC International Digital Library Collaborative Projects • Lexicon-based Knowledge Linking - Approaches Towards a WordNet Infrastructure for Multilingual Digital Library • Chu-Ren Huang, Academia Sinica • Linguistic Technology and Resources for English-Chinese Bilingual Information System • Hsin-Hsi Chen, National Taiwan University

  15. Mechanisms for International Collaboration: Bilateral Projects • Case-by-case negotiation • Academia Sinica with the Chinese University of Hong Kong, LDC, Stanford, UCSB, etc.

  16. Mechanisms for Scholarly Exchange: Conferences • ROCLING (annually since 1988) • PACLIC [Pacific Asia Conference on Language Information and Computation] (regional conference involving Hong Kong, Japan, Korea, Singapore, and Taiwan): http://www.rcl.cityu.edu.hk/paclic15 • COLING2002: http://www.COLING2002.sinica.edu.tw

  17. Mechanisms for Scholarly Exchange: Exchange Scholars • Academia Sinica and EHESS: yearly exchange • Academia Sinica and University of Pennsylvania (under negotiation) • NSC and CNRS, NSC and NWO: Cognitive Science

  18. Mechanisms for Scholarly Exchange: Post-doctoral Fellows • Academia Sinica Post-doctoral Fellowships: application through project PIs or directly by applicants • NSC Post-doctoral Fellowships

  19. Mechanisms for Scholarly Exchange: International Students • Computational Linguistics and Chinese Language Processing: an international graduate (PhD) program (proposal under review) • Visiting Students • Internships
