
Research Report

Research Report. Semantic Web Research Center, Duhyeon Jin, 2011-5-6. Contents: last discussion; how can we extract argument patterns?; josa and thematic role; current progress.


Presentation Transcript


  1. Research Report Semantic Web Research Center Duhyeon Jin 2011-5-6

  2. Contents • Last discussion • How can we extract argument patterns? • Josa & thematic role • Current progress

  3. Last discussion • 1. Show the raw argument data of the Treebank → not N1, N2.. Show every connected argument with its ‘josa’ • 2. Decide on representative ‘josa’ and find derivative ‘josa’ • 3. Classify ‘josa’ for a specific purpose (consider the translation case)

  4. How can we extract argument patterns? • The Sejong Treebank has only 4 argument types • No other types of arguments
     [Diagram: the predicate 생각했다 with its four arguments, each captioned "NP which can be subject"]
     • NP_SBJ: 철수가
     • NP_AJT: 예전에
     • NP_OBJ: 영희를
     • NP_CMP: 예쁘다고

  5. 생각하다: NP_SBJ 형/은, S_CMP 인생이…/다고
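
The pattern on this slide pairs each argument tag with a word and its josa, written as word/josa. A minimal Python sketch of reading that notation follows. The names `Argument`, `parse_argument`, and `argument_pattern` are hypothetical, and the input is assumed to be already segmented as on the slide; this is not the Sejong Treebank file format.

```python
from typing import NamedTuple


class Argument(NamedTuple):
    tag: str   # syntactic tag, e.g. NP_SBJ
    word: str  # head word (may contain an ellipsis, as on the slide)
    josa: str  # josa attached to the word


def parse_argument(tag: str, annotated: str) -> Argument:
    """Split a 'word/josa' string at its last slash."""
    word, _, josa = annotated.rpartition("/")
    return Argument(tag, word, josa)


def argument_pattern(args: list[Argument]) -> list[tuple[str, str]]:
    """Keep only (tag, josa) pairs, dropping the concrete words."""
    return [(a.tag, a.josa) for a in args]


# The example from the slide for the predicate 생각하다.
args = [
    parse_argument("NP_SBJ", "형/은"),
    parse_argument("S_CMP", "인생이…/다고"),
]
pattern = argument_pattern(args)
```

Dropping the words and keeping only the tag and josa is one possible abstraction step toward an argument pattern; the slides do not state how the author stores patterns.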

  6. What kinds of Korean stems are involved in each syntactic category? (Extracted from the Sejong Treebank)
     • NP_SBJ → 이/가, 은/는, 께서, 도, 이란, 이야말로, 만, 까지, 마다, 나, 란, 조차, 밖에, 부터, 두, 이나, 마저, 와
     • NP_OBJ → 을/를, 조차, 까지, 도, 은/는, 이나마, 마저, 나/이나, 두, 부터, 과
     • NP_AJT → 으로, 에서, 도, 로써, 에, 은/는, 로, 로부터, 밖에, 과, 처럼, 에서부터, 부터, 만, 으로서, 까지, ……
     • NP_CMP → 라고, 다고, 이라고, 라, 라며, 리라, ….
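
The listing above shows that a single josa can appear under several syntactic categories. A small Python sketch of a reverse lookup makes that overlap explicit. `JOSA_BY_CATEGORY`, `expand`, and `categories_for` are hypothetical names, and the NP_AJT and NP_CMP lists hold only the entries visible on the slide, which truncates both.

```python
# Josa listings copied from the slide. NP_AJT and NP_CMP are truncated
# on the slide, so only the entries actually shown are included here.
JOSA_BY_CATEGORY = {
    "NP_SBJ": "이/가, 은/는, 께서, 도, 이란, 이야말로, 만, 까지, 마다, 나, 란, "
              "조차, 밖에, 부터, 두, 이나, 마저, 와",
    "NP_OBJ": "을/를, 조차, 까지, 도, 은/는, 이나마, 마저, 나/이나, 두, 부터, 과",
    "NP_AJT": "으로, 에서, 도, 로써, 에, 은/는, 로, 로부터, 밖에, 과, 처럼, "
              "에서부터, 부터, 만, 으로서, 까지",
    "NP_CMP": "라고, 다고, 이라고, 라, 라며, 리라",
}


def expand(listing: str) -> set[str]:
    """Split on commas, then split variant pairs such as 이/가."""
    forms: set[str] = set()
    for item in listing.split(","):
        forms.update(v for v in item.strip().split("/") if v)
    return forms


def categories_for(josa: str) -> list[str]:
    """Return the categories (sorted by name) whose listing contains josa."""
    return sorted(
        cat for cat, listing in JOSA_BY_CATEGORY.items()
        if josa in expand(listing)
    )
```

For instance, `categories_for("도")` returns three categories, which suggests why josa alone cannot decide the argument type.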

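  7. Related research on Korean ‘josa’ and thematic roles
     • 조정미 · 김길창 (1996), 한국어 의미 해석 시 중의성 해소에 대한 연구 [A study on ambiguity resolution in Korean semantic interpretation]
     • 강신재 · 박정혜 (2003), 대규모 말뭉치와 전산 언어 사전을 이용한 의미역 결정 규칙의 구축 [Construction of thematic-role determination rules using a large-scale corpus and a computational dictionary]
     • 임동훈 (2004), 한국어 조사의 하위 부류와 결합 유형 [Subclasses and combination types of Korean josa] → syntactic stems, semantic stems, postpositions, delimiters and combinations of stems
     • Syntactic: 이/가, 을/를
     • Semantic: 에, 에게, 에서, 으로/로, 와/과
     • Delimiters, etc.: 만, 까지, 은/는, 나, 도…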

  8. Postpositions, delimiters and particles • Postpositions: 만, 까지 (to), 다가, 밖에, 부터 (from), 조차, 처럼 (like), 같이 (like), 보다 (than), 만큼, 뿐, 대로.. → can make an adjunct and have a thematic role • Particles (delimiters): 은/는, 이야/야, 도, 이나/나, 이라도/라도.. → can be ignored at the case and thematic-role level
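
Since the delimiters on this slide can be ignored at the case and thematic-role level, one way to apply that is to filter them out of a josa combination before any role lookup. The Python sketch below assumes the josa sequence is already segmented into a list; `DELIMITERS` and `strip_delimiters` are hypothetical names, and the set contains only the delimiters the slide lists (the slide's list ends with "..").

```python
# Delimiters (particles) listed on the slide, with variant pairs split.
DELIMITERS = {"은", "는", "이야", "야", "도", "이나", "나", "이라도", "라도"}


def strip_delimiters(josa_sequence: list[str]) -> list[str]:
    """Drop delimiters, keeping the remaining josa in their original order."""
    return [josa for josa in josa_sequence if josa not in DELIMITERS]


# A combination such as 에서 + 는 reduces to the postposition alone.
core = strip_delimiters(["에서", "는"])
```

A bare delimiter such as 은 reduces to an empty list, which signals that the case has to be recovered from something other than the josa.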

  9. Cho & Kim 1996

  10. Kang & Park (2003)
     • Agent: 이/가, 에서, 에게
     • Theme: 이/가, 을/를, 에
     • Experiencer: 이/가, 에게
     • Companion: 와/과
     • Source: 에서
     • Goal: 에, 로/으로, 에게
     • Instrument: 로/으로
     • Reason: 에, 로/으로
     • Recipient: 이/가, 에게
     • Appraisee: 로/으로
     • Criterion: 에서, 와/과
     • Degree: 만큼
     • Direction: 로/으로
     • Time: 에, 로/으로
     • Path: 로/으로
     • Material: 로/으로
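
The role listing above can be inverted to ask which thematic roles a given josa might signal. The Python sketch below does only that table lookup; `ROLE_JOSA` and `role_candidates` are hypothetical names, and the sketch does not reproduce the decision rules of Kang & Park (2003), only the josa lists shown on the slide.

```python
# Thematic roles and their josa as listed on the slide.
ROLE_JOSA = {
    "Agent": "이/가, 에서, 에게",
    "Theme": "이/가, 을/를, 에",
    "Experiencer": "이/가, 에게",
    "Companion": "와/과",
    "Source": "에서",
    "Goal": "에, 로/으로, 에게",
    "Instrument": "로/으로",
    "Reason": "에, 로/으로",
    "Recipient": "이/가, 에게",
    "Appraisee": "로/으로",
    "Criterion": "에서, 와/과",
    "Degree": "만큼",
    "Direction": "로/으로",
    "Time": "에, 로/으로",
    "Path": "로/으로",
    "Material": "로/으로",
}


def role_candidates(josa: str) -> list[str]:
    """Return every role whose josa list contains the given form."""
    result = []
    for role, listing in ROLE_JOSA.items():
        forms = {v for item in listing.split(",") for v in item.strip().split("/")}
        if josa in forms:
            result.append(role)
    return result
```

The lookup shows how ambiguous the mapping is: 로/으로 alone is listed under eight roles, so a josa narrows the candidates but rarely fixes the role by itself.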

  11. The list of thematic roles in CoreNet • The previous version of the CoreNet case frame database assumed the following thematic roles. Only a minimal set of ‘josa’ is presented. • 이/가, 로/으로, 에서, 에게서, 에, 을, 에게, 한테, 와, 에 대해, 보다, 만, 로부터, 라고, 를 향해

  12. Josa & Thematic role matrix

  13. All cases of josa in the data • Excluding delimiters, we can find 61 types of josa and combinations of josa. • More combinations will appear as more data is added.

  14. All cases of josa in the data

  15. Example • Constructed patterns are → A가 B를 동반하다 → A가 B를 C에 동반하다
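
The two constructed patterns share the same shape: a sequence of placeholder arguments, each with its josa, followed by the verb. A minimal Python sketch of rendering such a pattern is given below. `render_pattern` and the (placeholder, josa) slot format are assumptions made for illustration; the slides do not describe how patterns are stored.

```python
def render_pattern(slots: list[tuple[str, str]], verb: str) -> str:
    """Join placeholder+josa pairs and the verb with single spaces."""
    parts = [f"{name}{josa}" for name, josa in slots]
    return " ".join(parts + [verb])


# The two patterns shown for the verb 동반하다.
two_args = render_pattern([("A", "가"), ("B", "를")], "동반하다")
three_args = render_pattern([("A", "가"), ("B", "를"), ("C", "에")], "동반하다")
```

Keeping the slots as data rather than as a fixed string makes it straightforward to add or drop an optional argument such as C에.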

  16. Current progress • Patterns constructed for 78 of 675 verbs • Planning to construct patterns for 20 verbs per day
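
As a rough check on the schedule, the remaining effort can be worked out from the figures on this slide. The Python sketch below assumes the slide means that 78 of the 675 verbs are finished and that the planned rate of 20 verbs per day is sustained; both readings are assumptions.

```python
import math

TOTAL_VERBS = 675    # from the slide
DONE_VERBS = 78      # assumed to be the number already finished
VERBS_PER_DAY = 20   # planned rate from the slide

remaining = TOTAL_VERBS - DONE_VERBS
# Round up: a partly filled last day still counts as a working day.
days_needed = math.ceil(remaining / VERBS_PER_DAY)
```

Under these assumptions 597 verbs remain, which takes 30 working days at the planned rate.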
