Slides by Mathieu Lafourcade (LIRMM, France) for PAPILLON 2002. They cover conceptual vectors, bilingual dictionaries, and heuristics for linking, decision-making, and quality assessment, and describe how an acception lexical database is populated by combining bilingual dictionary data with conceptual vectors. The system learns continuously, so its results evolve over time, with the longer-term aim of a community of lexical agents that negotiate lexical knowledge.
Automatically Populating Acception Lexical Database through Bilingual Dictionaries and Conceptual Vectors PAPILLON 2002 Mathieu Lafourcade LIRMM - France
Overview & Objectives • Lexical soup: what? • Bilingual dictionaries & conceptual vectors: which heuristics? For what? Linking, decision and quality assessment
Conceptual vectors: vector space • An idea = a combination of concepts = a vector • Idea space = vector space • A concept = an idea = a vector V, with augmentation: V + neighborhood • Meaning space = vector space + {v}*
Conceptual vectors: thesaurus • H: thesaurus hierarchy with K concepts (Thesaurus Larousse: K = 873 concepts) • V(Ci) = <a1, …, ai, …, a873> with aj = 1 / (2 ** Dum(H, i, j)) • Values shown in the figure: 1 for the concept itself, then 1/4, 1/16 and 1/64 at distances 2, 4 and 6
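The formula aj = 1 / (2 ** Dum(H, i, j)) can be illustrated with a minimal sketch. It assumes that Dum(H, i, j) is the number of edges between leaf concepts i and j in the hierarchy H, a reading consistent with the values 1, 1/4, 1/16 and 1/64 on the slide but not confirmed by it. The toy hierarchy and the names `PARENT`, `LEAVES`, `tree_distance` and `concept_vector` are hypothetical, not part of the original system.

```python
# Toy hierarchy (hypothetical): each node maps to its parent; None marks the root.
PARENT = {
    "root": None,
    "A": "root", "B": "root",
    "A1": "A", "A2": "A", "B1": "B",
    "c1": "A1", "c2": "A1", "c3": "A2", "c4": "B1",
}
# Leaf concepts, in the order used for the vector components.
LEAVES = ["c1", "c2", "c3", "c4"]


def ancestors(node):
    """Return the path from node up to the root, node included."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(a, b):
    """Number of edges between a and b (assumed reading of Dum)."""
    steps_from_a = {n: k for k, n in enumerate(ancestors(a))}
    for steps_from_b, n in enumerate(ancestors(b)):
        if n in steps_from_a:
            # n is the lowest common ancestor of a and b.
            return steps_from_a[n] + steps_from_b
    raise ValueError("nodes are not in the same tree")


def concept_vector(concept):
    """Component j is 1 / 2**distance(concept, leaf j)."""
    return [1 / 2 ** tree_distance(concept, other) for other in LEAVES]
```

With this toy tree, `concept_vector("c1")` has component 1 for c1 itself, 1/4 for its sibling c2, 1/16 for its cousin c3 and 1/64 for c4, which sits under a different top-level branch.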
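Conceptual vectors: concept c4:peace • [Figure: vector of the concept c4:peace; labels: peace, conflict, relations, hierarchical relations, society, the world, manhood]
Conceptual vectors: term “peace” • [Figure: vector of the term “peace”, with c4:peace marked]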
Angular distance • DA(x, y) = angle(x, y) • 0 ≤ DA(x, y) ≤ π • if DA(x, y) = 0, then x and y are collinear: same idea • if DA(x, y) = π/2, then x and y have nothing in common • DA(x, −x) = π, with −x the anti-idea of x
Angular distance • DA(x, y) = acos(sim(x, y)) • DA(x, y) = acos(x.y / (|x||y|)) • DA(x, x) = 0 • DA(x, y) = DA(y, x) • DA(x, y) + DA(y, z) ≥ DA(x, z) • DA(0, 0) = 0 and DA(x, 0) = π/2 by definition • DA(λx, μy) = DA(x, y) with λμ > 0 • DA(λx, μy) = π − DA(x, y) with λμ < 0 • DA(x+x, x+y) = DA(x, x+y) ≤ DA(x, y)
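As a minimal sketch, the angular distance and its zero-vector conventions can be written in a few lines of Python. The function name `angular_distance` and the list-based vector representation are choices made for this illustration; the clamping of the cosine is an added numerical safeguard, not something stated on the slide.

```python
import math


def angular_distance(x, y):
    """DA(x, y) = acos(x.y / (|x||y|)), with the slide's zero-vector conventions."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 and norm_y == 0:
        return 0.0              # DA(0, 0) = 0 by definition
    if norm_x == 0 or norm_y == 0:
        return math.pi / 2      # DA(x, 0) = pi/2 by definition
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    # Clamp to [-1, 1] so rounding errors never push acos out of its domain.
    cosine = max(-1.0, min(1.0, cosine))
    return math.acos(cosine)
```

Scaling both vectors by factors of the same sign leaves the distance unchanged, while factors of opposite sign turn DA(x, y) into π − DA(x, y), matching the properties listed above.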
Thematic distance • Examples • DA(tit, tit) = 0 • DA(tit, passerine) = 0.4 • DA(tit, bird) = 0.7 • DA(tit, train) = 1.14 • DA(tit, insect) = 0.62 tit = insectivorous passerine bird …
[Figure: database architecture. A conceptual vector base (vectors v) sits alongside the English entry base (river, map, card.1), the French entry base (fleuve, rivière, carte.1, carte.2), the Malay entry base (sungai) and the Japanese entry base; entries are connected through the Acception base (acc x, acc y, acc z).]
[Figure: French entry base and conceptual vector base. The entries fleuve, rivière, carte.1 and carte.2 each carry a vector v; the Acception base shows acc x, acc y and acc z; one label reads "empty".]
[Figure: Associations between definitions, vectors, glosses and equivalents. For the entry "demand", the English vectorized monolingual dictionary lists senses demand.1 to demand.5, each with a definition (def) and a vector (v) in the vector space; the English-French bilingual dictionary lists senses, each with equivalents and glosses; an association step connects the two; one label reads "left over meaning".]
[Figure: linking source-language and target-language senses through acceptions. A source sense W-SL i has equiv = {W-TL}; the target senses of W-TL have equiv = {W-SL, …} or {…}, each with a vector v. Labels: "already existing link", "link created", "vectors are close", "warning if not close", "left over acceptions".]
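One possible reading of the labels in this figure ("vectors are close", "link created", "warning if not close") is sketched below. This is an illustration under stated assumptions, not the author's implementation: the `Sense` class, the `link_sense` function, the mutual-equivalence filter and the threshold value 0.8 are all hypothetical, and the sketch assumes that source and target vectors live in the same concept space so that they can be compared directly.

```python
import math
from dataclasses import dataclass


@dataclass
class Sense:
    sense_id: str      # e.g. "carte.1" (hypothetical identifier format)
    word: str          # headword of the entry
    equivalents: set   # translation equivalents from the bilingual dictionary
    vector: list       # conceptual vector of this sense


def angular_distance(x, y):
    """Angle between x and y; pi/2 if exactly one vector is null, 0 if both."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0 and ny == 0:
        return 0.0
    if nx == 0 or ny == 0:
        return math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    return math.acos(max(-1.0, min(1.0, cosine)))


def link_sense(source, target_senses, threshold=0.8):
    """Pick the target sense with the closest vector among mutual equivalents.

    Returns (best_target, distance, warning). warning is True when even the
    closest candidate is farther than the (hypothetical) threshold.
    Returns (None, None, False) when no target sense lists the source word.
    """
    candidates = [
        t for t in target_senses
        if t.word in source.equivalents and source.word in t.equivalents
    ]
    if not candidates:
        return None, None, False
    best = min(candidates, key=lambda t: angular_distance(source.vector, t.vector))
    distance = angular_distance(source.vector, best.vector)
    return best, distance, distance > threshold
```

Target senses that are never selected by any source sense would correspond to the "left over acceptions" of the figure; how those are handled is not specified on the slide and is left out of the sketch.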
[Figure: source language, acceptions, target language. A source sense W-SL i with vector v has equivalents {W-TL1, W-TL2, …}; each target word W-TL1, W-TL2, … has several senses, some with Equiv = {W-SL, …} and a vector v, others with Equiv = {…}.]
[Figure: several source senses W-SL i with equiv = W-TL. Labels: "created acception", "refinement links", "closest vector".]
[Figure: source senses W-SL i and W-SL j, each with equiv = W-TL, shown in two steps (1, 2). Label: "closest vector".]
Conclusion • System in continuous learning • Evolving results • Hopefully converging • Assisting and being assisted by: vectorized lexical functions, human annotators • Toward: a community of lexical agents, lexical knowledge negotiation