
E-HowNet- a Lexical Knowledge Representation System

Keh-Jiann Chen, Principal Investigator, Core Platforms for Digital Contents Project, TELDAP; Research Fellow, Research Center for Information Technology Innovation & Institute of Information Science, Academia Sinica.


Presentation Transcript


  1. E-HowNet- a Lexical Knowledge Representation System • Keh-Jiann Chen • Principal Investigator, Core Platforms for Digital Contents Project, TELDAP • Research Fellow, Research Center for Information Technology Innovation & Institute of Information Science, Academia Sinica

  2. Outline • What is E-HowNet? • E-HowNet- Sense Representation • Major Features • Current status of E-HowNet • Automatic Construction of Ontology • Apply the Framework to Metadata Representation of Digital Collections • Conclusion and Future Work

  3. What is E-HowNet? • E-HowNet is an entity-relation model for lexical semantic representation, extended from HowNet. • E-HowNet is designed for automatic semantic composition and decomposition.

  4. E-HowNet- Sense Representation • Word sense definition: decompose a sense into simpler senses and sense relations. • 果盤 fruit plate def: {plate|盤:telic={put|放置:location={~},patient={fruit|水果}}} • 玻璃盤 glass plate def: {plate|盤:material={glass|玻璃}} • 圓盤 round plate def: {plate|盤:shape={round|圓}}
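The bracketed definitions above have a regular shape: a head concept, optionally followed by a colon and comma-separated role=value pairs whose values are themselves bracketed expressions. The following is a minimal sketch of a reader for that shape. The names `Sense` and `parse` are hypothetical and are not part of any E-HowNet tool, and the sketch assumes the simple grammar seen on this slide only (it does not handle the `.and.` conjunction that appears on a later slide).

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One bracketed expression: a head concept plus role -> Sense features."""
    head: str                                     # e.g. "plate|盤", or "~" for self-reference
    features: dict = field(default_factory=dict)  # role name -> nested Sense


def parse(text: str) -> Sense:
    """Parse one brace-delimited expression (assumed grammar, see lead-in)."""
    pos = 0

    def skip_ws():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def expect(ch):
        nonlocal pos
        skip_ws()
        if pos >= len(text) or text[pos] != ch:
            raise ValueError(f"expected {ch!r} at offset {pos}")
        pos += 1

    def read_until(stops):
        # Read raw characters up to (not including) the first stop character.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in stops:
            pos += 1
        return text[start:pos].strip()

    def expression():
        nonlocal pos
        expect("{")
        node = Sense(read_until(":}"))
        skip_ws()
        if pos < len(text) and text[pos] == ":":
            pos += 1
            while True:
                role = read_until("=")
                expect("=")
                node.features[role] = expression()
                skip_ws()
                if pos < len(text) and text[pos] == ",":
                    pos += 1
                else:
                    break
        expect("}")
        return node

    result = expression()
    skip_ws()
    if pos != len(text):
        raise ValueError(f"unexpected trailing text at offset {pos}")
    return result


fruit_plate = parse("{plate|盤:telic={put|放置: location={~},patient={fruit|水果}}}")
glass_plate = parse("{plate|盤:material={glass|玻璃}}")
```

Both parsed definitions share the head `plate|盤`; they differ only in their features, which is what makes the decomposition useful for comparing senses.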

  5. Principles for sense definitions • Use hypernyms and prominent properties to define concepts. • Qualia structure: agentive, telic, formal, and constitutive roles. • Use well-defined/primitive concepts and relations to define new concepts.

  6. Telic • 狗食 dog food • def: {食物: telic={餵:target={狗},patient={~}}} • def: { food|食品: telic={feed|餵: target={livestock|牲畜:telic={TakeCare|照料:patient={family|家庭},agent={~}}}, patient={~}}}

  7. Agentive • 早產兒 premature baby • def: {嬰兒:agentive={早產:patient={~}}} • def: {human|人:age={child|少兒}, agentive={labour|臨產:manner={early|早}, patient={~}}}

  8. Formal • 彩霞 rosy clouds • def: {CloudMist|雲霧:color={colored|彩}} • 酸辣湯 spicy and sour soup • def: {湯:taste={酸}.and.{辣}} • def: {food|食品:material={StateLiquid|液態},taste={sour|酸}.and.{peppery|辣}}

  9. Constitutive • 草裙 grass skirt • def: {裙:material={草}} • def: {clothing|衣物:telic={PutOn|穿戴: instrument={~},location={leg|腿: whole={human|人:gender={female|女}}}}, material={FlowerGrass|花草}}

  10. Major Features • Lexical senses are expressed by either primitive concepts (sememes) or basic concepts. • Semantic relations are explicitly expressed in E-HowNet representations. • A uniform representation for function words, content words and phrases. • Taxonomy for both entities and relations. • Semantic composition and decomposition capabilities.

  11. Uniform representation and compositional semantics • Preposition: 把|ba def: goal={} • Noun: 文章|article def: {text|語文} • Verb: 寫好|have written • def: {write|寫:aspect={Vachieve|達成}} • Phrase: 把文章寫好|The article has been written. • {write|寫:goal={text|語文}, aspect={Vachieve|達成}}
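The composition on this slide can be read as two steps: the function word 把 contributes a `goal` role with an empty slot, the noun fills that slot, and the result is unified into the verb's sense. The sketch below illustrates that reading with plain dictionaries. The helper names (`sense`, `fill_role`, `attach`, `render`) are hypothetical and the unification rule (reject a role that is already filled) is an assumption, not something stated in the slides.

```python
def sense(head, **features):
    """Build a sense as a plain dict: a head concept plus role -> sense features."""
    return {"head": head, "features": dict(features)}


def fill_role(role, filler):
    """A function word such as 把 is modelled as a bare role with an empty
    slot (goal={}); filling the slot yields a one-entry feature mapping."""
    return {role: filler}


def attach(head_sense, new_features):
    """Unify extra features into a head sense without mutating the input.

    Assumption: a role that is already filled on the head is an error.
    """
    clash = set(new_features) & set(head_sense["features"])
    if clash:
        raise ValueError(f"role(s) already filled: {sorted(clash)}")
    return {
        "head": head_sense["head"],
        "features": {**new_features, **head_sense["features"]},
    }


def render(s):
    """Print a sense back in the bracketed notation used on the slides."""
    feats = ",".join(f"{role}={render(val)}" for role, val in s["features"].items())
    return "{" + s["head"] + (":" + feats if feats else "") + "}"


article = sense("text|語文")                                     # 文章
have_written = sense("write|寫", aspect=sense("Vachieve|達成"))   # 寫好
ba_article = fill_role("goal", article)                          # 把 + 文章
phrase = attach(have_written, ba_article)                        # 把文章寫好

print(render(phrase))
```

Because function words and content words use the same representation, the phrase-level result is just another sense expression and can be composed further in the same way.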

  12. Taxonomy of E-HowNet • http://ehownet.iis.sinica.edu.tw • All|全 • entity|事物 • event|事件 • state|狀態 • Act|行動 • AttributeValue|屬性值 • object|物體 • thing|萬物 • time|時間 • space|空間 • relation|關係 • Semantic Role|語意角色 • function|函數

  13. Current status of E-HowNet • Coarse-grained E-HowNet sense representations for about 95,000 word-sense entries of the CKIP Chinese dictionary. • About 45,000 different sense expressions • About 2,600 semantic primitives (sememes 義原) • About 200 semantic roles for objects • About 70 semantic roles for events • An automatically constructed ontology, built by attaching all word senses to the HowNet top-level ontology and structuring them.

  14. Automatic construction of ontology • Starting from the top-level ontology (modified from the HowNet ontology), the lower levels of the ontology are created from the subsumption relations among E-HowNet expressions. • Attach lexical senses: Words and their associated sense expressions are first attached to the top-level ontology nodes according to their head concepts. • Sub-categorization by attribute-values: Lexical concepts with the same semantic head are further sub-categorized (creating a new node) according to their attribute-values. • Repeat the sub-categorization step if a node still contains many lexical concepts that share the same extended feature values.
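One way to read "subsumption relations of E-HowNet expressions" is structural: an expression A subsumes an expression B when both have the same head and every feature of A also appears, recursively, in B. The sketch below implements only that reading. It is an assumption rather than the project's actual algorithm; in particular it ignores hypernym links between different heads. Senses are written as hypothetical `(head, features)` tuples.

```python
def subsumes(general, specific):
    """Return True if `general` is at least as general as `specific`.

    Assumed rule: identical heads, and every role of `general` is present
    in `specific` with a value that is itself subsumed.
    """
    g_head, g_feats = general
    s_head, s_feats = specific
    if g_head != s_head:
        return False
    return all(
        role in s_feats and subsumes(value, s_feats[role])
        for role, value in g_feats.items()
    )


# Definitions taken from the clothing examples in this presentation.
clothing = ("clothing|衣物", {})
trousers = ("clothing|衣物", {"location": ("leg|腿", {})})
sport_trousers = ("clothing|衣物", {
    "location": ("leg|腿", {}),
    "while": ("exercise|鍛鍊", {}),
})
boots = ("clothing|衣物", {
    "location": ("foot|腳", {}),
    "length": ("LengthLong|長", {}),
})

print(subsumes(clothing, trousers))        # clothing is more general than trousers
print(subsumes(trousers, sport_trousers))  # trousers is more general than sport trousers
print(subsumes(trousers, boots))           # different location values, so no subsumption
```

Under this rule, each expression can be placed beneath the most specific existing node that subsumes it, which yields the lower levels of the hierarchy.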

  15. Examples: • 衣衫, {clothing|衣物} • 木屐, {clothing|衣物:location={foot|腳},material={wood|木}} • 木鞋, {clothing|衣物:location={foot|腳},material={wood|木}} • 球鞋, {clothing|衣物:location={foot|腳},while={exercise|鍛鍊}} • 溜冰鞋, {clothing|衣物:location={foot|腳},while={slide|滑:location={ice|冰},purpose={exercise|鍛鍊:domain={sport|體育}}}} • 靴子, {clothing|衣物:location={foot|腳},length={LengthLong|長}} • 運動褲, {clothing|衣物:location={leg|腿},while={exercise|鍛鍊}} • 褲子, {clothing|衣物:location={leg|腿}} • 內衣, {clothing|衣物:qualification={private|私}} • 禮服, {clothing|衣物:qualification={formal|正式}} • 白紗, {clothing|衣物:qualification={formal|正式},owner={human|人:gender={female|女},predication={GetMarried|結婚:agent={~}}}} • 婚紗, {clothing|衣物:qualification={formal|正式},owner={human|人:gender={female|女},predication={GetMarried|結婚:agent={~}}}}

  16. Attach all lexical senses: • {clothing|衣物} [衣衫, 木屐, 木鞋, 球鞋, 溜冰鞋, 靴子, 運動褲, 褲子, 內衣, 禮服, 白紗, 婚紗]

  17. Sub-categorization by attribute-values: • {clothing|衣物} [衣衫] • 鞋子|shoes [木屐, 木鞋, 球鞋, 溜冰鞋, 靴子] • 褲子|trousers [褲子, 運動褲] • 內衣|underwear [內衣] • 禮服|ceremonial robe/dress [禮服, 白紗, 婚紗]

  18. Repeat sub-categorization step: • {clothing|衣物} [衣衫] • 鞋子|shoes [球鞋, 溜冰鞋, 靴子] • {木屐} [木屐, 木鞋] • 褲子|trousers [褲子, 運動褲] • 內衣|underwear [內衣] • 禮服|ceremonial robe/dress [禮服] • {白紗} [白紗, 婚紗]
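The three steps shown for the clothing examples (attach, sub-categorize, repeat) can be reproduced with a small grouping routine. This is a sketch under stated assumptions, not the authors' implementation: features are kept in slide order, nested values are treated as opaque strings, the first pass groups on the first attribute-value, and the repeat pass only creates a node when at least two words share the same first two attribute-values. `ENTRIES`, `subcategorize`, `key_len`, and `min_members` are hypothetical names.

```python
from collections import defaultdict

# Word -> ordered (role, value) pairs, copied from the examples slide.
# All entries share the head clothing|衣物; nested values stay as raw strings.
BRIDE = "human|人:gender={female|女},predication={GetMarried|結婚:agent={~}}"
SKATE = "slide|滑:location={ice|冰},purpose={exercise|鍛鍊:domain={sport|體育}}"
ENTRIES = {
    "衣衫": (),
    "木屐": (("location", "foot|腳"), ("material", "wood|木")),
    "木鞋": (("location", "foot|腳"), ("material", "wood|木")),
    "球鞋": (("location", "foot|腳"), ("while", "exercise|鍛鍊")),
    "溜冰鞋": (("location", "foot|腳"), ("while", SKATE)),
    "靴子": (("location", "foot|腳"), ("length", "LengthLong|長")),
    "運動褲": (("location", "leg|腿"), ("while", "exercise|鍛鍊")),
    "褲子": (("location", "leg|腿"),),
    "內衣": (("qualification", "private|私"),),
    "禮服": (("qualification", "formal|正式"),),
    "白紗": (("qualification", "formal|正式"), ("owner", BRIDE)),
    "婚紗": (("qualification", "formal|正式"), ("owner", BRIDE)),
}


def subcategorize(words, key_len, min_members):
    """Split the words of one node into (kept_here, child_nodes).

    Words are grouped by their first `key_len` attribute-values; a group
    becomes a child node only if it has at least `min_members` words.
    """
    groups = defaultdict(list)
    kept = []
    for word in words:
        feats = ENTRIES[word]
        if len(feats) < key_len:
            kept.append(word)          # not specific enough to move down
        else:
            groups[feats[:key_len]].append(word)
    children = {}
    for key, members in groups.items():
        if len(members) >= min_members:
            children[key] = members
        else:
            kept.extend(members)       # too few words to justify a new node
    return kept, children


# Step 1: attach every word to its head node {clothing|衣物}.
attached = list(ENTRIES)
# Step 2: sub-categorize by the first attribute-value.
root_words, level1 = subcategorize(attached, key_len=1, min_members=1)
# Step 3: repeat inside each new node, requiring shared extended features.
level2 = {key: subcategorize(words, key_len=2, min_members=2)
          for key, words in level1.items()}

for key, (kept, children) in level2.items():
    print(key, kept, children)
```

With these settings the routine leaves 衣衫 at the root, groups the footwear, trousers, underwear, and formal-dress words as on the sub-categorization slide, and then splits off the 木屐/木鞋 and 白紗/婚紗 pairs as on the repeat slide.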

  19. Apply the Framework to Metadata Representation of Digital Collections • 奉華紙槌瓶={瓷瓶:Time={北宋},Type={汝窯}} • 瓷瓶={瓶子:material={瓷}} • 奉華紙槌瓶={瓶子:material={瓷},Time={北宋},Type={汝窯}}

  20. Apply the Framework to Metadata Representation of Digital Collections • 青瓷水仙盆={瓷盆:Time={北宋},Type={汝窯},Telic={水仙}} • 瓷盆={盆:material={瓷}} • 青瓷水仙盆={盆:material={瓷},Time={北宋},Type={汝窯},Telic={水仙}}

  21. Apply the Framework to Metadata Representation of Digital Collections • 奉華紙槌瓶={瓶子:material={瓷},Time={北宋},Type={汝窯}} • 青瓷水仙盆={盆:material={瓷},Time={北宋},Type={汝窯},Telic={水仙}}
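The metadata slides expand a collection item's definition by replacing its head (for example 瓷瓶) with that head's own definition, so that inherited attributes such as material become explicit. A minimal sketch of that expansion follows. The names `LEXICON`, `expand`, and `render` are hypothetical, attribute values are kept as flat strings, and the lexicon is assumed to contain no circular definitions. The English glosses in the comments are added here for readability and do not come from the slides.

```python
# Item -> (head, own attributes), as given on the metadata slides.
LEXICON = {
    # 奉華紙槌瓶: a vase; 瓷瓶 = porcelain vase, 北宋 = Northern Song, 汝窯 = Ru ware
    "奉華紙槌瓶": ("瓷瓶", {"Time": "北宋", "Type": "汝窯"}),
    "瓷瓶": ("瓶子", {"material": "瓷"}),
    # 青瓷水仙盆: a basin; 瓷盆 = porcelain basin, 水仙 = narcissus
    "青瓷水仙盆": ("瓷盆", {"Time": "北宋", "Type": "汝窯", "Telic": "水仙"}),
    "瓷盆": ("盆", {"material": "瓷"}),
}


def expand(word):
    """Follow head definitions down to an undefined (basic) concept and
    collect attributes along the way; a word's own attributes override
    inherited ones with the same name."""
    if word not in LEXICON:
        return word, {}
    head, own = LEXICON[word]
    base_head, inherited = expand(head)
    return base_head, {**inherited, **own}


def render(head, attributes):
    """Print an expanded definition in the bracketed slide notation."""
    body = ",".join(f"{name}={{{value}}}" for name, value in attributes.items())
    return "{" + head + (":" + body if body else "") + "}"


vase_head, vase_attrs = expand("奉華紙槌瓶")
basin_head, basin_attrs = expand("青瓷水仙盆")
print(render(vase_head, vase_attrs))
print(render(basin_head, basin_attrs))

# Attributes the two expanded records have in common.
shared = {k: v for k, v in vase_attrs.items() if basin_attrs.get(k) == v}
print(shared)
```

Once both records are expanded to basic concepts, their shared material, period, and type become directly comparable, even though the original records used different head words.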

  22. Conclusion and Future Work • E-HowNet sense representations are updated from time to time. • The ontology can be rebuilt automatically from the refined expressions. • New categories in the taxonomy can be identified and characterized by their specific attribute-values. • Uniform representations of function words and content words facilitate semantic composition and decomposition. • Because of E-HowNet’s semantic decomposition capability, the primitive representations of surface sentences with the same deep semantics are nearly canonical.
