Automatically Populating Acception Lexical Database through Bilingual Dictionaries and Conceptual Vectors. PAPILLON 2002 Mathieu Lafourcade LIRMM - France. Overview & Objectives. Lexical soup: what? Bilingual dictionaries & conceptual vectors: which heuristics?
Automatically Populating Acception Lexical Database through Bilingual Dictionaries and Conceptual Vectors PAPILLON 2002 Mathieu Lafourcade LIRMM - France
Overview & Objectives • Lexical soup: what? • Bilingual dictionaries & conceptual vectors: which heuristics? For what? • Linking decision and quality assessment
Conceptual vectors: vector space • An idea = a combination of concepts = a vector • Idea space = vector space • A concept = an idea = a vector V, with augmentation: V + neighbourhood • Meaning space = vector space + {v}*
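Conceptual vectors: thesaurus • H: thesaurus hierarchy with K concepts (Thesaurus Larousse: K = 873 concepts) • V(Ci) = <a1, …, aj, …, a873> with aj = 1 / 2^Dum(H, i, j) • [Figure: example hierarchy; Ci itself gets component 1, and leaves at distance Dum = 2, 4, 6 from Ci get components 1/4, 1/16, 1/64]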
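The formula on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the slide does not define Dum, so the sketch reads it as the path length between two leaves of the hierarchy through their lowest common ancestor, with all concepts at the same depth. The tuple encoding of leaves and the names `ultrametric_distance` and `concept_vector` are hypothetical, and the toy hierarchy has 8 leaves, not the 873 Larousse concepts.

```python
from typing import List, Sequence, Tuple

# Assumption: a leaf concept is encoded as its path of branch indices from the root.
Leaf = Tuple[int, ...]


def ultrametric_distance(a: Leaf, b: Leaf) -> int:
    """Path length between two equal-depth leaves via their lowest common ancestor.

    This is one plausible reading of Dum(H, i, j); the slide does not define it.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes all leaves are at the same depth")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return 2 * (len(a) - common)


def concept_vector(i: int, leaves: Sequence[Leaf]) -> List[float]:
    """V(Ci): component j is 1 / 2 ** Dum(H, i, j)."""
    return [1.0 / 2 ** ultrametric_distance(leaves[i], leaf) for leaf in leaves]


# Toy binary hierarchy of depth 3 (8 leaf concepts).
leaves = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
v = concept_vector(3, leaves)
print(v)
```

With this toy hierarchy, the concept itself gets component 1, its sibling 1/4, its cousins 1/16, and every leaf in the other half of the tree 1/64, which matches the values visible in the slide's figure.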
Conceptual vectors: concept c4:peace • [Figure: vector of the concept c4:peace; labelled regions: peace, conflict, relations, hierarchical relations, society, the world, manhood]
Conceptual vectors: term “peace” • [Figure: vector of the term “peace”; labelled region: c4:peace]
Angular distance • DA(x, y) = angle(x, y) • 0 ≤ DA(x, y) ≤ π • if DA(x, y) = 0 then x and y are colinear: same idea • if DA(x, y) = π/2 then nothing in common • if DA(x, y) = π then y = -x, the anti-idea of x: DA(x, -x) = π • [Figure: vectors x, x’ and y]
Angular distance • DA(x, y) = acos(sim(x, y)) = acos(x·y / (|x| |y|)) • DA(x, x) = 0 • DA(x, y) = DA(y, x) • DA(x, y) + DA(y, z) ≥ DA(x, z) • DA(0, 0) = 0 and DA(x, 0) = π/2 by definition • DA(αx, βy) = DA(x, y) with αβ > 0 • DA(αx, βy) = π - DA(x, y) with αβ < 0 • DA(x+x, x+y) = DA(x, x+y) ≤ DA(x, y)
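A minimal Python sketch of the angular distance defined on this slide, using only the standard library. The function name `angular_distance` is hypothetical. Clamping the cosine to [-1, 1] is an added safeguard against floating-point rounding and is not part of the slide's definition.

```python
import math
from typing import Sequence


def angular_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """DA(x, y) = acos(x.y / (|x| |y|)), in radians, with the slide's zero-vector conventions."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 and norm_y == 0.0:
        return 0.0            # DA(0, 0) = 0 by definition
    if norm_x == 0.0 or norm_y == 0.0:
        return math.pi / 2    # DA(x, 0) = pi/2 by definition
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cosine)))


x = [1.0, 2.0, 3.0]
y = [3.0, 0.5, 1.0]
neg_x = [-a for a in x]

print(angular_distance(x, y))       # some angle between 0 and pi
print(angular_distance(x, neg_x))   # close to pi: -x is the anti-idea of x
```

Scaling either vector by a positive factor leaves the distance unchanged, which is why vector norms play no role when comparing ideas with DA.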
Thematic distance • Examples • DA(tit, tit) = 0 • DA(tit, passerine) = 0.4 • DA(tit, bird) = 0.7 • DA(tit, train) = 1.14 • DA(tit, insect) = 0.62 tit = insectivorous passerine bird …
[Figure: architecture. A conceptual vector base holds one vector v per entry. Monolingual entry bases (English: river, map, card.1; French: fleuve, rivière, carte.1, carte.2; Malay: sungai; Japanese) are connected through an acception base (acc x, acc y, acc z).]
[Figure: starting state. The French entry base (fleuve, rivière, carte.1, carte.2), each entry with its vector v from the conceptual vector base; acception base with acc x, acc y, acc z, marked "empty".]
Associations between definitions, vectors, glosses and equivalents • [Figure: English vectorized monolingual dictionary: entry "demand" with several senses, each with a definition (def) and a vector v in the vector space. English-French bilingual dictionary: senses demand.1 to demand.5, each with glosses and equivalents. Associations link bilingual senses to monolingual senses; some are labelled "left over meaning".]
Linking: source language, acceptions, target language • [Figure: a source-language sense W-SL i with equiv = {W-TL} and vector v; target-language senses of W-TL, each with a vector v and equiv = {W-SL, …} or {…}. Legend: already existing link, created link. Annotations: "vectors are close", "warning if not close", "left over acceptions".]
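One possible reading of the linking decision in this figure, sketched in Python. Everything here is an assumption made for illustration: the `Sense` record, the `link` function, the reciprocity test on equivalents, the `threshold` of 0.8 radians, and the toy 3-dimensional vectors are all hypothetical. The slide only says that linked senses should have close vectors and that a warning is raised otherwise.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Sense:
    word: str                   # headword, e.g. "carte"
    index: int                  # sense number, e.g. 2 for carte.2
    vector: Tuple[float, ...]   # conceptual vector of this sense
    equiv: FrozenSet[str]       # equivalents given by the bilingual dictionary


def angle(x: Sequence[float], y: Sequence[float]) -> float:
    """Angular distance in radians, with DA(0, 0) = 0 and DA(x, 0) = pi/2."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 and ny == 0.0:
        return 0.0
    if nx == 0.0 or ny == 0.0:
        return math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    return math.acos(max(-1.0, min(1.0, cosine)))


def link(source: Sense, targets: Sequence[Sense],
         threshold: float = 0.8) -> Tuple[Optional[Sense], bool]:
    """Pick the reciprocal target sense with the closest vector.

    Returns (chosen sense or None, warning flag). The warning flag is True
    when the chosen sense is farther than `threshold` (an assumed value).
    """
    candidates = [t for t in targets
                  if t.word in source.equiv and source.word in t.equiv]
    if not candidates:
        return None, False   # nothing to link: the source sense is left over
    best = min(candidates, key=lambda t: angle(source.vector, t.vector))
    return best, angle(source.vector, best.vector) > threshold


# Toy data: words taken from the slides, vectors invented for the example.
card_1 = Sense("card", 1, (0.1, 0.9, 0.2), frozenset({"carte"}))
carte_1 = Sense("carte", 1, (0.9, 0.1, 0.1), frozenset({"map", "card"}))
carte_2 = Sense("carte", 2, (0.1, 0.8, 0.3), frozenset({"card"}))

best, warn = link(card_1, [carte_1, carte_2])
print(best.word, best.index, warn)
```

Both French senses list "card" as an equivalent, so the equivalents alone cannot decide; the vectors break the tie in favour of carte.2. Lowering `threshold` far enough turns the same link into a warning case for a human annotator to review.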
Linking with several equivalents: source language, acceptions, target language • [Figure: a source-language sense W-SL i with vector v and equivalents {W-TL1, W-TL2, …}; target-language senses of W-TL1 and of W-TL2, each with a vector v and Equiv = {W-SL, …} or Equiv = {…}.]
[Figure: three successive states for a source-language sense W-SL i with equiv = W-TL. Labels: created acception, refinement links, closest vector.]
[Figure: two source-language senses W-SL i and W-SL j, both with equiv = W-TL, shown in three panels: case 1, case 2, and selection by closest vector.]
Conclusion • System in continuous learning: evolving results, hopefully converging • Assisting and being assisted by: vectorized lexical functions, human annotators • Toward: a community of lexical agents, lexical knowledge negotiation