Weekly Report 2010. 2. 17 Duhyeon Jin Semantic Web Research Center
Contents • Last issues • To-do-list • Works
Last Issues • Scope of words to be given case frames • All verbs in CoreNet that do not have case frames • Predicate nominals (서술성 명사) • Running the experiment • Building a model for constructing case frames
To-do-list • Selecting word entries • Experiment • Extracting usages • Doing the construction • Selecting sample words • Assigning an appropriate word sense • Assigning an appropriate concept to arguments • Measuring the time taken and building a model • Writing the problem specification • Writing instructions for case frame construction
Works: Selecting word entries • Selected 2,308 words • Korean verbs: 3,200 word senses from the ‘Survey of Usage Frequency in Modern Korean (2002)’ (현대 국어 사용 빈도 조사), National Institute of the Korean Language (국립국어원) • Headwords: 2,014 • Entries in CoreNet: 1,021 • CoreNet verbs (no case frame): 1,593 • Predicate nominals: 675 • CoreNet adjectives (no case frame): 40
Works: Extracting Usages • Extracting POS-tagged sentences • From the ‘Sejong POS-tagged corpus’ (1,006,777 sentences, 969 MB) • For the selected words (2,308 words) • Using a Java program on a local computer
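The extraction step could look roughly like the sketch below. It is an illustration only, not the program actually used. The line format (tokens separated by spaces, morphemes joined by `+`, each written `form/TAG`) is an assumption, and `UsageExtractor`, `lemmas`, and `extract` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the corpus line format is an assumption,
// not the documented Sejong file format.
class UsageExtractor {

    // Returns the morpheme forms of one sentence, assuming tokens are
    // separated by spaces, morphemes by '+', and each morpheme is "form/TAG".
    static List<String> lemmas(String taggedSentence) {
        List<String> result = new ArrayList<>();
        for (String token : taggedSentence.trim().split("\\s+")) {
            for (String morph : token.split("\\+")) {
                int slash = morph.lastIndexOf('/');
                if (slash > 0) {
                    result.add(morph.substring(0, slash));
                }
            }
        }
        return result;
    }

    // Groups sentences by the selected word they contain.
    // Each sentence is recorded at most once per word.
    static Map<String, List<String>> extract(List<String> sentences, Set<String> selected) {
        Map<String, List<String>> usages = new LinkedHashMap<>();
        for (String sentence : sentences) {
            Set<String> seen = new HashSet<>();
            for (String lemma : lemmas(sentence)) {
                if (selected.contains(lemma) && seen.add(lemma)) {
                    usages.computeIfAbsent(lemma, k -> new ArrayList<>()).add(sentence);
                }
            }
        }
        return usages;
    }

    public static void main(String[] args) {
        // Two made-up sentences in the assumed format.
        List<String> corpus = List.of(
            "기업/NNG+이/JKS 심사/NNG+를/JKO 통과/NNG+하/XSV+다/EF",
            "사람/NNG+이/JKS 가/VV+다/EF");
        System.out.println(extract(corpus, Set.of("통과", "가")).keySet());
    }
}
```

For a corpus of about 1 GB, the sentences would be streamed line by line from disk instead of being held in a list. The in-memory list here only keeps the sketch short.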
Works: Extracting Usages • Problems • Consider the case of ‘predicate nominal’ + ‘되’ or ‘predicate nominal’ + ‘시키’ • Can we handle 500 usages per word? • Must filter out trivial usages or set a limit
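One possible way to handle both problems is sketched below: detect which verbal suffix follows a predicate nominal, and cap the number of usages kept per word. This is an assumption-laden illustration, not a decided design. `UsageFilter`, `suffixAfter`, and `cap` are hypothetical names, and the input is assumed to be a sentence already split into morpheme forms.

```java
import java.util.List;

// Hypothetical sketch; names and the morpheme-list input are assumptions.
class UsageFilter {

    // Verbal suffixes that may follow a predicate nominal:
    // 하 (do), 되 (become), 시키 (cause).
    static final List<String> SUFFIXES = List.of("하", "되", "시키");

    // Returns the suffix that directly follows `nominal`, or null if none does.
    static String suffixAfter(List<String> morphemes, String nominal) {
        for (int i = 0; i + 1 < morphemes.size(); i++) {
            if (morphemes.get(i).equals(nominal) && SUFFIXES.contains(morphemes.get(i + 1))) {
                return morphemes.get(i + 1);
            }
        }
        return null;
    }

    // Keeps at most `limit` usages (the first ones encountered).
    static <T> List<T> cap(List<T> usages, int limit) {
        return usages.subList(0, Math.min(limit, usages.size()));
    }

    public static void main(String[] args) {
        List<String> sentence = List.of("기업", "이", "심사", "에", "통과", "되", "다");
        System.out.println(suffixAfter(sentence, "통과"));
        System.out.println(cap(List.of("u1", "u2", "u3"), 2).size());
    }
}
```

Taking the first usages up to the limit is the simplest policy. Sampling, or a filter for trivial usages, could replace `cap` once it is decided what counts as trivial.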
Works: Doing the construction • Model 1 (tested experimentally) • Manual construction • Using a text editor + spreadsheet + CoreNet Browser • Time taken: 25 usages in 30 min. • Extremely time-consuming • Model 2 (assumed) • Manual construction with tools • Database + tool + CoreNet library • Time taken (assumed): 180 usages in 30 min. • Suppose 100 usages per word: • Model 3 • Automatic case frame extraction • Need to survey the literature or get help from someone
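To compare the two manual models, the rates above can be turned into a rough effort estimate. The sketch below assumes, purely for illustration, that all 2,308 selected words get 100 usages each. The slides do not state that total, and `EffortEstimate` and `hours` are hypothetical names.

```java
// Hypothetical back-of-the-envelope estimate; the workload
// (2,308 words x 100 usages) is an assumption for illustration.
class EffortEstimate {

    // Hours needed to process `usages` at a rate given per 30 minutes.
    static double hours(long usages, int usagesPerHalfHour) {
        return usages * 0.5 / usagesPerHalfHour;
    }

    public static void main(String[] args) {
        long totalUsages = 2308L * 100;                    // 230,800 usages
        System.out.println(hours(totalUsages, 25));    // Model 1: measured rate
        System.out.println(hours(totalUsages, 180));   // Model 2: assumed rate
    }
}
```

Under these assumptions Model 1 needs about 4,616 hours and Model 2 about 641 hours. That is a 7.2-fold difference, which follows directly from the ratio 180 / 25.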
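Works: Instructions for case frame construction • Writing the instructions on the web • http://sysx2.kaist.ac.kr/wiki/index.php/격틀구축지침 • Issues: • Modifying clauses: 개혁을 강조한 사람 (‘a person who emphasized reform’) • Syntactic differences among ‘하다’, ‘되다’, and ‘시키다’, e.g. the case of ‘통과하다’ (‘to pass’) • 기업(NOM)이 심사(ACC)를 통과하다 (‘the company passes the review’) • 기업(NOM)이 심사(DAT)에 통과되다 (‘the company is passed in the review’) • 기업(ACC)을 심사(DAT)에 통과시키다 (‘[someone] gets the company through the review’)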
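The three ‘통과’ examples can be written down as data to make the differences in case marking explicit. The sketch below is only one possible representation, not the format used in CoreNet or in the construction tool. `CaseFrame`, `slot`, and `examples` are hypothetical names, and the slots hold the example nouns, not CoreNet concepts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical representation of a case frame: a predicate plus
// an ordered mapping from case label to an example argument.
class CaseFrame {
    final String predicate;
    final Map<String, String> slots = new LinkedHashMap<>();

    CaseFrame(String predicate) {
        this.predicate = predicate;
    }

    // Adds one argument slot and returns this frame for chaining.
    CaseFrame slot(String caseLabel, String argument) {
        slots.put(caseLabel, argument);
        return this;
    }

    @Override
    public String toString() {
        return predicate + " " + slots;
    }

    // The three example sentences from the slide, encoded as frames.
    static CaseFrame[] examples() {
        return new CaseFrame[] {
            new CaseFrame("통과하다").slot("NOM", "기업").slot("ACC", "심사"),
            new CaseFrame("통과되다").slot("NOM", "기업").slot("DAT", "심사"),
            // The subject is omitted in the example sentence, so no NOM slot.
            new CaseFrame("통과시키다").slot("ACC", "기업").slot("DAT", "심사"),
        };
    }

    public static void main(String[] args) {
        for (CaseFrame frame : examples()) {
            System.out.println(frame);
        }
    }
}
```

In this encoding the same noun ‘심사’ fills an ACC slot with ‘하다’ but a DAT slot with ‘되다’ and ‘시키다’, which is the syntactic difference the instructions have to cover.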
Plan • Finish organizing the database of usages • Build a construction tool using the database and the CoreNet library