Semantic Inference at the Lexical-Syntactic Level. Roy Bar-Haim and Ido Dagan, Computer Science Department, Bar-Ilan University. Iddo Greental, Linguistics Department, Tel Aviv University. Eyal Shnarch, Computer Science Department, Bar-Ilan University. AAAI 2007.
Outline • Abstract • Introduction • Inference Framework • Rules for Generic Linguistic Structures • Lexical-Syntactic Rules • Evaluation • Result • Conclusion
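Abstract • Semantic inference is an important problem in natural language understanding • The authors propose a framework that performs semantic inference directly on the syntactic dependency tree of a single sentence
Introduction • In recent years, the PASCAL Recognizing Textual Entailment (RTE) competition for English has been a major challenge • The competition requires recognizing whether the information in a hypothesis (h) can be inferred from a text (t), which is written • t entails h
Introduction • A possible practical application is a QA system: • Question: Who killed Kennedy? • Converted into a hypothesis h: X killed Kennedy. • Search the corpus for a suitable sentence that proves the hypothesis: • The assassination of Kennedy by Oswald shook the nation • Hence X is Oswald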
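Inference Framework • The system's semantic inference framework consists of propositions and inference (entailment) rules • Propositions: t (the assumption) -> h (the goal). The predicate is first extracted from the dependency tree, then t -> h is proved through a sequence of proof steps (each applying an entailment rule)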
Inference Framework • Dependency tree (applying the passive rule):
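Inference Framework • Inference (entailment) rules: • A rule combines two templates, L and R, each of which is a dependency subtree • L matching: • There is a function f from L to s (the source tree) such that, for every node u in L, f(u) has the same feature values as u • For every edge u -> v in L, the edge f(u) -> f(v) exists in s with the same dependency relation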
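The L matching conditions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Node` class, the `match_l` function, the relation labels `subj`/`obj`, and the brute-force search are all hypothetical, and the sketch additionally assumes that f maps distinct template nodes to distinct source nodes.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Optional


@dataclass(frozen=True)
class Node:
    lemma: Optional[str]  # None marks a variable, which matches any lemma
    pos: str              # part of speech, the only other feature modelled here


def node_matches(u: Node, s_node: Node) -> bool:
    """A template node matches when its features agree with the source node."""
    return u.pos == s_node.pos and (u.lemma is None or u.lemma == s_node.lemma)


def match_l(l_nodes, l_edges, s_nodes, s_edges):
    """Return a mapping f from L node ids to s node ids, or None if L does not match.

    Nodes are dicts {id: Node}; edges are lists of (head_id, relation, child_id).
    Brute force over all injective assignments (an assumption made for brevity).
    """
    l_ids = list(l_nodes)
    s_edge_set = set(s_edges)
    for image in permutations(s_nodes, len(l_ids)):
        f = dict(zip(l_ids, image))
        nodes_ok = all(node_matches(l_nodes[u], s_nodes[f[u]]) for u in l_ids)
        edges_ok = all((f[u], rel, f[v]) in s_edge_set for (u, rel, v) in l_edges)
        if nodes_ok and edges_ok:
            return f
    return None


# Source tree for "John bought books" (relation labels are hypothetical).
s_nodes = {1: Node("buy", "V"), 2: Node("John", "N"), 3: Node("book", "N")}
s_edges = [(1, "subj", 2), (1, "obj", 3)]

# Template L: "X buy Y", where X and Y are variables.
l_nodes = {"v": Node("buy", "V"), "x": Node(None, "N"), "y": Node(None, "N")}
l_edges = [("v", "subj", "x"), ("v", "obj", "y")]

f = match_l(l_nodes, l_edges, s_nodes, s_edges)
print(f)  # {'v': 1, 'x': 2, 'y': 3}
```

Enumerating every assignment is exponential in the template size; it is used here only because templates are tiny and it keeps the two conditions (node features, edge relations) visible.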
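Inference Framework • R instantiation: after L matching, the R subtree is built by copying the variables and the root matched by the L subtree; in the passive rule, the variables other than the root swap positions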
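A minimal sketch of R instantiation, assuming the variable bindings have already been produced by L matching. The function name `instantiate_r`, the simplified relation labels, and the reduction of each node to a bare lemma are assumptions made for illustration; a real rule would copy whole matched nodes with all their features.

```python
def instantiate_r(r_nodes, r_edges, binding):
    """Build the edges of the instantiated R template.

    r_nodes maps node id -> fixed lemma, or None for a variable.
    binding maps variable id -> lemma of the node matched in the source tree.
    Returns a list of (head_lemma, relation, child_lemma) triples.
    """
    resolved = {
        node_id: (binding[node_id] if lemma is None else lemma)
        for node_id, lemma in r_nodes.items()
    }
    return [(resolved[h], rel, resolved[d]) for (h, rel, d) in r_edges]


# Passive rule, simplified: L is "X be V-ed by Y", R is "Y V X".
# Bindings as they might come from matching
# "We have been approached by the investment banker."
binding = {"v": "approach", "x": "we", "y": "banker"}

# In R every node is a variable; X and Y have swapped grammatical roles.
r_nodes = {"v": None, "x": None, "y": None}
r_edges = [("v", "subj", "y"), ("v", "obj", "x")]

print(instantiate_r(r_nodes, r_edges, binding))
# [('approach', 'subj', 'banker'), ('approach', 'obj', 'we')]
```

A lexical substitution rule fits the same function: giving the root of R a fixed lemma instead of `None` replaces the predicate while the variables keep their bindings.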
Inference Framework • Alignment copying: • L matching and R instantiation capture only the predicate, its subject, and its object; at the end, the predicate's direct child nodes are added back • Derived tree generation by rule type: • Two operations: substitution and introduction • Substitution example: by a lexical rule, buy -> purchase
Inference Framework • Introduction: nodes that are not needed (the predicate's parent, etc.) are ignored (see the figure) • Annotation Rules: • Applied before anything else, these set two features on any node: • Negation and modality
Rules for Generic Linguistic Structures • Syntactic-Based Rules: • (1) Simplifying the source tree: • Passive: • Original: We have been approached by the investment banker. • Rewritten as: The investment banker approached us. • Genitive modifier: • Malaysia’s crude palm oil output is estimated to have risen by up to six percent. • The crude palm oil output of Malaysia is estimated to have risen by up to six percent.
Rules for Generic Linguistic Structures • (2) Extracting only part of the information as propositions: • Conjunctions: • Helena’s very experienced and has played a long time on the tour. • Helena has played a long time on the tour. • Clausal modifiers: • But celebrations were muted as many Iranians observed a Shi’ite mourning month. • Many Iranians observed a Shi’ite mourning month.
Rules for Generic Linguistic Structures • Relative clauses: • The assailants fired six bullets at the car, which carried Vladimir Skobtsov. • The car carried Vladimir Skobtsov. • (3) • Appositives: • Frank Robinson, a one-time manager of the Indians, has the distinction for the NL. • Frank Robinson is a one-time manager of the Indians.
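Rules for Generic Linguistic Structures • Polarity-Based Rules: • John knows that Mary is here -> Mary is here. • John believes that Mary is here does not imply Mary is here. • (Nairn, Condoravdi, & Karttunen 2006) analyze the context in which a verb appears (positive, negative, unknown). The authors extract the verbs that carry polarity into a verb list, and add two verbs that are common in news articles (say, announce), whose reported content is usually reliable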
Rules for Generic Linguistic Structures • Examples: • Polarity: • Yadav was forced to resign. • Yadav resigned. • Negation and Modality Annotation Rules: • Modal verbs: such as should, can, might… • Negation: • What we’ve never seen is actual costs come down. • (the sentence itself is unchanged; the predicate "seen" is annotated as negated)
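The polarity verb list and the negation/modality annotation can be approximated with simple lookups. The sketch below is a rough illustration only: the table holds just the verbs named on these slides, "not" is an added assumption, all names are hypothetical, and the annotation works on a flat token list, whereas the framework described here annotates nodes of the dependency tree.

```python
# Polarity of the embedded clause when the verb occurs in a positive context:
# "+" means the complement is entailed, "?" means nothing can be concluded.
COMPLEMENT_POLARITY = {
    "know": "+",      # John knows that Mary is here -> Mary is here
    "force": "+",     # Yadav was forced to resign -> Yadav resigned
    "say": "+",       # reporting verbs treated as reliable in news text
    "announce": "+",
    "believe": "?",   # John believes that Mary is here: no entailment
}

MODAL_VERBS = {"should", "can", "might"}
NEGATION_WORDS = {"never", "not"}  # "not" is an assumption, not from the slides


def complement_polarity(verb_lemma: str) -> str:
    """Look up the verb; verbs missing from the list default to unknown."""
    return COMPLEMENT_POLARITY.get(verb_lemma, "?")


def annotate(tokens):
    """Flag negation and modality anywhere in a tokenized clause."""
    lowered = [t.lower() for t in tokens]
    return {
        "negation": any(t in NEGATION_WORDS for t in lowered),
        "modality": any(t in MODAL_VERBS for t in lowered),
    }


print(complement_polarity("know"))     # +
print(complement_polarity("believe"))  # ?
print(annotate(["What", "we", "have", "never", "seen"]))
# {'negation': True, 'modality': False}
```

Defaulting unlisted verbs to "?" is the conservative choice: the prover then refuses to extract the embedded proposition rather than risk a false entailment.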
Rules for Generic Linguistic Structures • Others: • Determiners (words that qualify a noun): • The plaintiffs filed their lawsuit last year in U.S. District Court in Miami. • The plaintiffs filed a lawsuit last year in U.S. District Court in Miami. • Generic Default Rules: • Remove modifiers (mod)
Lexical-Syntactic Rules • Nominalization Rules: • These rules were derived automatically (Ron 2006) from Nomlex (NOMinalization Lexicon) • Example: X’s acquisition of Y yields X acquired Y • Automatically Learned Rules: • DIRT (Lin & Pantel 2001) and TEASE (Szpektor et al. 2004) are two state-of-the-art unsupervised algorithms that learn lexical-syntactic inference rules. • Example: X file lawsuit against Y yields X accuse Y
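Evaluation • The PASCAL RTE datasets are not used, because they are small and already contain many similar sentence pairs • Instead, the evaluation uses information-extraction-style relation extraction (RE) templates, e.g. X buy Y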
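Evaluation process • Experimental templates are needed, so 9 verbs are first chosen from the verb resource produced by TEASE: • approach, approve, consult, lead, observe, play, seek, sign, strike. These serve as the RE predicates • Next, DIRT/TEASE are used with these 9 predicates to learn more experimental templates • Example: X approve Y yields X confirm Y
Evaluation process • Next, sentences that match a template's predicate are retrieved from the Reuters RCV1 corpus, and a series of generic linguistic rules is applied to find the subject and object that best fit the template • The authors use Minipar (Lin 1998) for parsing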
Conclusion • Future work will move toward multi-sentence inference and add more lexical rules, such as dog -> animal