Semantic Addressable Encoding
Cheng-Yuan Liou, Jau-Chi Huang, and Wen-Chie Yang
Department of Computer Science and Information Engineering, National Taiwan University
TC402, Oct. 5, ICONIP 2006, Hong Kong
Web: red.csie.ntu.edu.tw • Sentence generating function • The semantic world of Mark Twain • Semantic search under Shakespeare
Outline • Introduction • Encoding Method • Elman network • The word corpus – Elman’s idea • Review semantic search • Multidimensional Scaling (MDS) space • Representative vector of a document • Iterative re-encoding • Example • Summary
Introduction • A central problem in semantic analysis is how to effectively encode and extract the contents of word sequences. • The traditional way of creating a prime semantic space is extremely expensive and complex because experienced linguists are required to analyze a huge number of words. • This paper presents an automatic encoding process.
Elman Network • Uoh: Lh x Lo weight matrix • Uhi: Li x Lh weight matrix • Uhc: Lc x Lh weight matrix • Lo = # neurons in output layer • Lh = # neurons in hidden layer • Li = # neurons in input layer • Lc = # neurons in context layer • The context layer carries memory • The hidden layer activates the output layer and refreshes the context layer • The desired behavior is obtained after the training process
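A minimal sketch of one Elman forward step, using the slide's dimension names (Li, Lh, Lc, Lo) and weight names (Uhi, Uhc, Uoh). The sigmoid nonlinearity, initialization scale, and dimension values are assumptions for illustration, not taken from the paper.

```python
import numpy as np

# Dimensions as on the slide; the values here are placeholders.
Li, Lh, Lo = 100, 200, 100
Lc = Lh  # context layer mirrors the hidden layer

rng = np.random.default_rng(0)
Uhi = rng.normal(scale=0.1, size=(Lh, Li))  # input   -> hidden
Uhc = rng.normal(scale=0.1, size=(Lh, Lc))  # context -> hidden
Uoh = rng.normal(scale=0.1, size=(Lo, Lh))  # hidden  -> output

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elman_step(x, context):
    """One time step: the hidden layer activates the output and refreshes the context."""
    h = sigmoid(Uhi @ x + Uhc @ context)  # hidden activation
    y = sigmoid(Uoh @ h)                  # predicted next-word code
    return y, h                           # h becomes the next context

context = np.zeros(Lc)       # the context layer carries memory across steps
x = rng.normal(size=Li)      # code of the current word
y, context = elman_step(x, context)
```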
The word corpus – Elman’s idea • All words are coded with certain given lexical codes, and all word sequences in corpus D follow the syntax (Noun + Verb + Noun). • After training, input all sequences again and record all hidden-layer outputs for each individual input. • Obtain the new code for the nth word by averaging all hidden vectors recorded for that word. • Construct a word tree based on the new codes to explore the relationships between words.
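The re-coding step can be sketched as follows, reusing the `elman_step` function from the previous sketch. The data layout (`sequences` as lists of `(word_id, code)` pairs) is an assumption made for this illustration.

```python
from collections import defaultdict
import numpy as np

def recode_words(sequences, elman_step, Lc):
    """Elman's re-coding: average the hidden outputs recorded for each word."""
    hidden_by_word = defaultdict(list)
    for sentence in sequences:
        context = np.zeros(Lc)                 # reset memory per sequence
        for word_id, code in sentence:
            _, h = elman_step(code, context)
            hidden_by_word[word_id].append(h)  # record the hidden output
            context = h
    # New code for the nth word = mean of all hidden vectors collected for it.
    return {n: np.mean(hs, axis=0) for n, hs in hidden_by_word.items()}
```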
Review semantic search • The conventional semantic search constructs a semantic model and a semantic measure. • A semantic code set manually designed by experts is used in the model. (main focus) • One can build a raw semantic matrix W for all N different words, where the code of each word is a column vector of R features. • One may use the orthogonal space configured by the characteristic decomposition of the matrix WWᵀ.
The semantic search • Since WWᵀ is a symmetric matrix, all its eigenvalues are real and nonnegative. • Each eigenvalue λi equals the variance of the N projections of the codes onto the ith eigenvector fi, that is, λi = fiᵀWWᵀfi = Σn (fiᵀwn)², which is N times that variance when the codes are zero-mean.
Multidimensional Scaling (MDS) space • Select a set of Rs eigenvectors {fr, r = 1~Rs} from all R eigenvectors to build a reduced feature space Fs. • The MDS space is MDS = span{Fs}. • These selected features are independent and significant. The new code of each word in this space is its projection onto the selected eigenvectors, ŵn = Fsᵀwn.
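A sketch of the decomposition and projection described on the last two slides. Selecting the Rs largest eigenvalues as the "significant" features is an assumption; the slides do not spell out the selection rule.

```python
import numpy as np

def mds_codes(W, Rs):
    """Project word codes into the reduced MDS space.

    W is R x N (one R-feature column per word, as on the slides);
    Rs is the number of eigenvectors kept.
    """
    C = W @ W.T                              # R x R, symmetric
    eigvals, eigvecs = np.linalg.eigh(C)     # real, nonnegative spectrum
    order = np.argsort(eigvals)[::-1][:Rs]   # keep the Rs largest (assumed rule)
    Fs = eigvecs[:, order]                   # R x Rs selected features
    # Each lambda_i equals the sum of squared projections onto f_i:
    #   f_i^T W W^T f_i = sum_n (f_i^T w_n)^2
    return Fs.T @ W                          # Rs x N new codes
```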
Representative vector of a document • A representative vector for a document D should capture the semantic meaning of the whole document. • Two measures are defined: • the peak preferred measure • the average preferred measure • The magnitude is normalized.
Representative vector of a document • The normalized measure vD is used to represent a whole document, and a representative vector vQ for a whole query can be obtained in the same way. • The relation score is defined on these two vectors (see the sketch below).
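A sketch of the two measures and the relation score. The slides do not give the exact formulas, so the following are assumptions: "peak" keeps each feature's largest-magnitude value over the document's word codes, "average" takes the per-feature mean, and the relation score is the inner product of the two normalized vectors.

```python
import numpy as np

def doc_vector(codes, measure="average"):
    """Representative vector of a document from its word codes (Rs x M).

    Both measure definitions below are assumed stand-ins for the paper's
    peak- and average-preferred measures. The magnitude is normalized.
    """
    if measure == "peak":
        idx = np.argmax(np.abs(codes), axis=1)            # peak per feature
        v = codes[np.arange(codes.shape[0]), idx]
    else:
        v = codes.mean(axis=1)                            # average per feature
    return v / np.linalg.norm(v)

def relation_score(vD, vQ):
    """Relation score of a document and a query, taken here as the inner
    product of the two normalized representative vectors (an assumption)."""
    return float(vD @ vQ)
```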
Iterative re-encoding • Elman’s method for sentence generation with the fixed syntax Noun + Verb + Noun cannot be applied to more complex sentences. • We modify his method: each word initially has a random lexical code. • After the jth training epoch, a new raw code for each word wn is calculated by averaging all vectors in the set sn (defined next).
Iterative re-encoding • The set sn contains all predictions for the word wn based on its preceding words. • After each epoch, all the codes are normalized by two equations. The normalization prevents the diminishing solution that the backpropagation algorithm would otherwise produce.
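One re-encoding step might look as follows. The slides do not reproduce the two normalization equations, so zero-mean centering plus unit-length scaling is used here as an assumed stand-in that likewise keeps the codes from shrinking toward zero.

```python
import numpy as np

def reencode(predictions):
    """One re-encoding step after a training epoch.

    `predictions[n]` is the set s_n: all network predictions for word w_n
    given its preceding words. The normalization below (centering, then
    unit-length scaling) is an assumption, not the paper's two equations.
    """
    # New raw code: average of all predictions collected for each word.
    raw = np.stack([np.mean(s, axis=0) for s in predictions])
    raw -= raw.mean(axis=0)                            # center across words
    raw /= np.linalg.norm(raw, axis=1, keepdims=True)  # fix the magnitude
    return raw
```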
Example • Test the ability to classify 36 of Shakespeare’s plays. We take each play as the query input and calculate the relation score between it and each other play. The figure below shows the resulting relation tree. • [Figure: relation tree of the 36 plays] Legend – c: comedy, r: romance, h: history, t: tragedy; the number denotes the publication year. • Model parameters: Di=1…36, Qi=1…36, N=10000, Lh=Lc=200, Lo=Li=Rs=R=64
Example • We provide a semantic search tool using a corpus from Shakespeare’s comedies and tragedies at http://red.csie.ntu.edu.tw/demo/literal/SAS.htm • Example search result with parameters Di=1…7777, N=10000, Lo=Li=R=100, Lh=Lc=200, Rs=64
Summary • We have explored the concept of semantic addressable encoding and completed a design for it that includes automatic encoding methods. • We have presented the results of applying this method to the study of literary works. • The trained semantic codes can facilitate other research, such as linguistic analysis, authorship identification, and categorization. • The method can be modified to accommodate polysemous words.