
Weekly Report

Weekly Report. 2010. 2. 17 Duhyeon Jin Semantic Web Research Center. Contents. Last issues To-do-list Works. Last Issues. The scope of words to give case frames All verbs in CoreNet which don’t have case frames. Predicate nominal ( 서술성 명사 ) Doing the experiment



  1. Weekly Report • 2010. 2. 17 • Duhyeon Jin • Semantic Web Research Center

  2. Contents • Last issues • To-do list • Works

  3. Last Issues • The scope of words to be given case frames • All verbs in CoreNet that do not have case frames • Predicate nominals (서술성 명사) • Running the experiment • Building a model for constructing case frames

  4. To-do list • Selecting word entries • Experiment • Extracting usages • Doing the construction • Selecting sample words • Assigning an appropriate word sense • Assigning an appropriate concept to each argument • Measuring the time required and building a model • Writing the problem specification • Writing instructions for case frame construction

  5. Works: Selecting word entries • Selected 2,308 words in total • CoreNet verbs (no case frame): 1,593 • Predicate nominals: 675 • CoreNet adjectives (no case frame): 40 • Korean verbs: 3,200 word senses from the "Survey of Modern Korean Usage Frequency" (2002), National Institute of Korean Language • Headwords: 2,014 • Entries in CoreNet: 1,021

  6. Works: Extracting Usages • Extracting POS-tagged sentences • From the Sejong POS-tagged corpus (1,006,777 sentences, 969 MB) • For the selected words (2,308 words) • Using a Java program run on a local computer
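
The extraction step on this slide can be illustrated with a minimal Python sketch. It is not the Java program mentioned in the slide. It assumes a simplified, hypothetical corpus format in which each line is one sentence, word units are separated by spaces, and each unit is a `+`-joined list of `morph/TAG` pairs; the real Sejong corpus files may be laid out differently. The function name `extract_usages` and the sample sentences are made up for illustration.

```python
from collections import defaultdict


def extract_usages(lines, targets):
    """Group sentences by the target (morpheme, tag) pairs they contain.

    Assumed format (hypothetical): one sentence per line, space-separated
    word units, each unit a '+'-joined list of 'morph/TAG' pairs.
    """
    usages = defaultdict(list)
    for line in lines:
        sentence = line.strip()
        if not sentence:
            continue
        found = set()
        for unit in sentence.split():
            for morph_tag in unit.split("+"):
                # Split on the last '/' so the tag is always the final field.
                morph, _, tag = morph_tag.rpartition("/")
                if (morph, tag) in targets:
                    found.add((morph, tag))
        # Record the sentence once per target word, even if it occurs twice.
        for key in found:
            usages[key].append(sentence)
    return usages


sample = [
    "기업/NNG+이/JKS 심사/NNG+를/JKO 통과/NNG+하/XSV+다/EF",
    "날씨/NNG+가/JKS 좋/VA+다/EF",
]
targets = {("통과", "NNG"), ("좋", "VA")}
result = extract_usages(sample, targets)
```

With roughly a million sentences, a single pass like this keeps memory use proportional to the matched sentences rather than to the whole corpus, provided `lines` is an open file iterated lazily.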

  7. Works: Extracting Usages • Problems • Consider the cases of 'predicate nominal' + '되' (passive) and 'predicate nominal' + '시키' (causative) • Can we handle 500 usages per word? • We must filter out trivial usages or set a limit
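
Both problems on this slide can be sketched in Python under stated assumptions. The first function looks for a predicate nominal immediately followed by a light verb inside one word unit; it assumes the same hypothetical `morph/TAG` format as above and assumes that the nominal is tagged `NNG` and the light verb `XSV`, which should be checked against the actual corpus tag set. The second function caps the number of usages per word; dropping exact duplicate sentences is only a stand-in for "trivial usages", since the slide does not define them.

```python
LIGHT_VERBS = {"하", "되", "시키"}


def derived_predicates(sentence, nominals):
    """Return (nominal, light verb) pairs such as ('통과', '되')."""
    pairs = []
    for unit in sentence.split():
        morphs = []
        for morph_tag in unit.split("+"):
            morph, _, tag = morph_tag.rpartition("/")
            morphs.append((morph, tag))
        # Compare each morpheme with the one that follows it.
        for (m1, t1), (m2, t2) in zip(morphs, morphs[1:]):
            if (m1 in nominals and t1 == "NNG"
                    and m2 in LIGHT_VERBS and t2 == "XSV"):
                pairs.append((m1, m2))
    return pairs


def cap_usages(sentences, limit=500):
    """Keep at most `limit` distinct sentences, preserving corpus order."""
    seen = set()
    kept = []
    for sentence in sentences:
        if len(kept) >= limit:
            break
        if sentence in seen:
            continue  # exact duplicates add no new evidence
        seen.add(sentence)
        kept.append(sentence)
    return kept


pairs = derived_predicates(
    "기업/NNG+이/JKS 심사/NNG+에/JKB 통과/NNG+되/XSV+다/EF", {"통과"}
)
capped = cap_usages(["a", "a", "b", "c"], limit=2)
```

Keeping the first `limit` sentences biases the sample toward the start of the corpus; random sampling would be an alternative if corpus order matters.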

  8. Works: Doing the construction • Model 1 (tested) • Manual construction • Using a text editor + spreadsheet + CoreNet Browser • Time required: 25 usages in 30 min. • Extremely time-consuming • Model 2 (assumed) • Manual construction with tools • Database + tool + CoreNet library • Time required (assumed): 180 usages in 30 min. • Suppose 100 usages per word: • Model 3 • Automatic case frame extraction • Requires a literature survey or outside help
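
The two throughput figures can be turned into a rough total-effort estimate. The sketch below uses only numbers from the slides (25 and 180 usages per 30 minutes, 100 usages per word, 2,308 selected words); applying the 100-usage assumption uniformly to all 2,308 words is an assumption made here, not a claim from the slides, and `hours_needed` is a hypothetical helper name.

```python
def hours_needed(words, usages_per_word, usages_per_half_hour):
    """Total hours to process every usage at a given 30-minute throughput."""
    total_usages = words * usages_per_word
    half_hours = total_usages / usages_per_half_hour
    return half_hours * 0.5


WORDS = 2308            # selected words (slide 5)
USAGES_PER_WORD = 100   # assumption stated on this slide

model1_hours = hours_needed(WORDS, USAGES_PER_WORD, 25)    # measured rate
model2_hours = hours_needed(WORDS, USAGES_PER_WORD, 180)   # assumed rate
speedup = model1_hours / model2_hours

print(f"Model 1: {model1_hours:.0f} h, Model 2: {model2_hours:.0f} h, "
      f"speedup x{speedup:.1f}")
```

Under these assumptions Model 1 needs about 4,616 hours and Model 2 about 641 hours, a factor of 7.2, which is simply the ratio 180 / 25.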

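  9. Works: Instructions for case frame construction • Writing the instructions on the web • http://sysx2.kaist.ac.kr/wiki/index.php/격틀구축지침 • Issues: • Modifying clauses: 개혁을 강조한 사람 ("a person who emphasized reform") • Syntactic differences among '하다', '되다', and '시키다'; e.g., the case of '통과하다' ("to pass") • 기업(NOM)이 심사(ACC)를 통과하다 ("the company passes the screening") • 기업(NOM)이 심사(DAT)에 통과되다 ("the company is passed in the screening") • 기업(ACC)을 심사(DAT)에 통과시키다 ("[someone] makes the company pass the screening")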
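
The three '통과' examples differ only in which case each argument takes, which suggests storing one frame per derived predicate. The following Python sketch shows one possible representation; the class names `Argument` and `CaseFrame` and the field layout are hypothetical and are not taken from CoreNet or from the construction guidelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Argument:
    noun: str       # example filler from the slide
    case: str       # case label as written on the slide: NOM, ACC, DAT
    particle: str   # case particle attached to the noun


@dataclass(frozen=True)
class CaseFrame:
    predicate: str
    arguments: tuple

    def cases(self):
        """Return the case pattern, e.g. ('NOM', 'ACC')."""
        return tuple(arg.case for arg in self.arguments)


frames = {
    "통과하다": CaseFrame("통과하다", (
        Argument("기업", "NOM", "이"), Argument("심사", "ACC", "를"))),
    "통과되다": CaseFrame("통과되다", (
        Argument("기업", "NOM", "이"), Argument("심사", "DAT", "에"))),
    "통과시키다": CaseFrame("통과시키다", (
        Argument("기업", "ACC", "을"), Argument("심사", "DAT", "에"))),
}

for name, frame in frames.items():
    print(name, frame.cases())
```

Storing the frames separately makes the syntactic difference explicit: the same two nouns appear with three different case patterns depending on whether the nominal combines with '하다', '되다', or '시키다'.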

  10. Plan • Finish organizing the database of usages • Build a construction tool that uses the database and the CoreNet library
