Presenter: Bo-Sheng Wang. Authors: Majid Yazdani, Andrei Popescu-Belis. AI, 2013. Computing text semantic relatedness using the contents and links of a hypertext encyclopedia.
Outlines Motivation Objectives Methodology Empirical analyses Experiments Conclusions Comments
Motivation • Existing measures of semantic relatedness based on lexical overlap, though widely used, are of little help when text similarity is not based on identical words.
Objectives • Computing text semantic relatedness based on concepts and their relations, which have linguistic as well as extra-linguistic dimensions, remains a challenge, especially in the general domain and/or over noisy texts. • The authors therefore aim to compute text semantic relatedness from concepts and the relations between them.
Methodology-build concept network
• Concepts
• The authors removed all Wikipedia articles in the following namespaces: Talk, File, Image, Template, Category, Portal, and List.
• Disambiguation pages were also removed.
• They set a cut-off limit of 100 non-stop words; articles shorter than this were discarded.
• For each hyperlink, they extracted the anchor text and treated it as a possible secondary title for the linked article.
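The filtering steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the tiny stop-word list, the title-prefix test for namespaces, and the reading of the cut-off as "at least 100 non-stop words" are all assumptions.

```python
from collections import defaultdict

# Assumption: excluded pages can be recognised by a title prefix.
EXCLUDED_PREFIXES = ("Talk:", "File:", "Image:", "Template:",
                     "Category:", "Portal:", "List of ")
# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = frozenset({"the", "a", "an", "of", "and", "in", "to", "is"})
MIN_NON_STOP_WORDS = 100  # cut-off named on the slide


def count_non_stop_words(text):
    """Count whitespace-separated tokens that are not stop words."""
    return sum(1 for w in text.lower().split() if w not in STOP_WORDS)


def is_concept(title, text, is_disambiguation=False):
    """Return True if the article should be kept as a concept node."""
    if title.startswith(EXCLUDED_PREFIXES):
        return False
    if is_disambiguation:
        return False
    return count_non_stop_words(text) >= MIN_NON_STOP_WORDS


def collect_secondary_titles(links):
    """Map each linked article to the anchor texts used to point at it.

    `links` is an iterable of (anchor_text, target_title) pairs.
    """
    titles = defaultdict(set)
    for anchor, target in links:
        if anchor and anchor != target:
            titles[target].add(anchor)
    return titles
```

For example, `collect_secondary_titles([("USA", "United States")])` records "USA" as a secondary title of the article "United States".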
Methodology-build concept network
• Relations
• The present study focuses on two kinds of links: hyperlinks, and links computed from similarity of content or of category.
• Lexical similarity between articles was computed as the cosine similarity between the vectors derived from the articles' texts, after stopword removal and stemming using Snowball.
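A minimal sketch of the content-similarity computation described above. The slide does not say how the vectors are weighted, so raw term counts are assumed here; the stop-word list is illustrative, and the stemmer is a pluggable parameter that defaults to the identity function (a Snowball stemmer's `stem` method could be passed in instead).

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (assumption).
STOP_WORDS = frozenset({"the", "a", "an", "of", "and", "in", "to", "is"})


def term_vector(text, stem=lambda w: w):
    """Build a bag-of-words vector after stop-word removal and stemming."""
    tokens = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return Counter(stem(w) for w in tokens if w and w not in STOP_WORDS)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)
```

Two articles with no shared terms score 0, and an article compared with itself scores 1, which is why this measure alone cannot relate texts that use different words for the same concept.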
Methodology-Approximation • T-truncated • ε-truncated
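The slide only names the two approximations. The sketch below assumes they apply to a random-walk visiting probability (the "VP" compared with cosine similarity in the experiments): the T-truncated variant stops the walk after T steps, and the ε-truncated variant drops probability mass that falls below ε. The graph layout, function name, and per-node pruning rule are assumptions for illustration, not the authors' algorithm.

```python
def truncated_visiting_probability(graph, source, target, max_steps, eps=0.0):
    """Approximate the probability that a random walk from `source`
    reaches `target` within `max_steps` steps.

    graph: dict mapping node -> dict of neighbour -> transition probability.
    eps:   mass below this threshold at a node is dropped (0.0 disables it).
    """
    if source == target:
        return 1.0
    mass = {source: 1.0}   # probability mass that has not yet reached target
    visited = 0.0          # mass absorbed at target so far
    for _ in range(max_steps):
        next_mass = {}
        for node, p in mass.items():
            for nbr, w in graph.get(node, {}).items():
                q = p * w
                if nbr == target:
                    visited += q  # first arrival: the walk stops here
                else:
                    next_mass[nbr] = next_mass.get(nbr, 0.0) + q
        # epsilon truncation: discard negligible mass
        mass = {n: p for n, p in next_mass.items() if p >= eps}
        if not mass:
            break
    return visited
```

On a small graph where A goes to B or C with probability 0.5 each and B always goes to C, the probability of reaching C from A is 0.5 with `max_steps=1` and 1.0 with `max_steps=2`; setting `eps=0.6` prunes the path through B and leaves 0.5.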
Empirical analyses Convergence of the T-truncated approximation
Empirical analyses Convergence of the ε-truncated approximation
Experiments Average training error
Experiments Word Similarity
Experiments Document similarity
Experiments Document clustering
Experiments Comparison of VP and cosine similarity
Experiments Text classification
Comments • Advantages • Disadvantage • Applications • Text categorization