Relation extraction and the influence of automatic named-entity recognition

Presenter: Shao-Wei Cheng • Authors: Claudio Giuliano, Alberto Lavelli, and Lorenza Romano • TSLP 2007

Outline • Motivation • Objective • Methodology • Named-entity recognition • Kernel Methods for Relation Extraction • Experiments • Conclusion • Personal Comments

Motivation • Information extraction aims at extracting structured information from unstructured or semi-structured textual documents. • NER performance is far from perfect, and its influence on relation-extraction performance is still an open area of investigation. • Named Entity Recognition • Relation Extraction

Objectives • The authors present an approach for extracting relations between named entities from natural-language documents. • They evaluate the effect of automatic named-entity recognition on this novel approach to relation extraction. • Relation Extraction • Named Entity Recognition • If the relation holds, the instance is labeled 1; otherwise, it is labeled -1.
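
The labeling rule above can be illustrated with a minimal sketch. The input format, the function name `make_instances`, and the choice to treat relations as directed (so both orderings of a pair become candidates) are assumptions for illustration, not details taken from the slides.

```python
from itertools import combinations


def make_instances(entities, gold_relations):
    """Pair up entity mentions in one sentence and label each ordered pair.

    entities: list of (entity_id, entity_type) tuples (hypothetical format).
    gold_relations: set of (arg1_id, arg2_id) pairs annotated as related.
    Returns a list of ((arg1_id, arg2_id), label) with label in {1, -1}.
    """
    instances = []
    for (id1, _), (id2, _) in combinations(entities, 2):
        # Assumption: relations are directed, so both orders are candidates.
        for pair in ((id1, id2), (id2, id1)):
            label = 1 if pair in gold_relations else -1
            instances.append((pair, label))
    return instances


entities = [("e1", "PER"), ("e2", "ORG"), ("e3", "LOC")]
gold = {("e1", "e2")}
instances = make_instances(entities, gold)
```

Each labeled pair then becomes one training or test example for a binary classifier.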

Methodology • Named Entity Recognition • Method: CRFs, as provided in MALLET. • Processing (features for each token): • (a) the word itself, • (b) the PoS tag of the token, • (c) orthographic predicates, • (d) gazetteers of locations, people names and organizations, • (e) character-n-gram predicates for 2 ≤ n ≤ 3. • MO: Correct entities. • MC: Entity boundaries known, but classification unknown. • MR&C: Neither entity boundaries nor classification known. • Example: “The [New Deal]LOC describes the program of US president Franklin [D. Roosevelt]PER”
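
The feature list (a) to (e) can be sketched as a per-token feature extractor. This is plain Python rather than MALLET's own API; the function name, the feature-string format, the particular orthographic predicates, and the lower-cased gazetteer lookup are all assumptions made for illustration.

```python
def token_features(tokens, pos_tags, i, gazetteers):
    """Return a set of string features for the token at position i."""
    word = tokens[i]
    feats = {
        "word=" + word,        # (a) the word itself
        "pos=" + pos_tags[i],  # (b) the PoS tag of the token
    }
    # (c) a few orthographic predicates (an illustrative selection)
    if word[:1].isupper():
        feats.add("is_capitalized")
    if word.isupper():
        feats.add("is_all_caps")
    if any(ch.isdigit() for ch in word):
        feats.add("has_digit")
    # (d) gazetteer membership; lookup is lower-cased by assumption
    for name, entries in gazetteers.items():
        if word.lower() in entries:
            feats.add("in_gazetteer=" + name)
    # (e) character n-grams for 2 <= n <= 3
    for n in (2, 3):
        for k in range(len(word) - n + 1):
            feats.add("char%d=%s" % (n, word[k:k + n]))
    return feats


tokens = ["Franklin", "D.", "Roosevelt"]
pos_tags = ["NNP", "NNP", "NNP"]
gazetteers = {"people": {"franklin", "roosevelt"}, "locations": {"washington"}}
feats = token_features(tokens, pos_tags, 2, gazetteers)
```

A sequence labeler such as a CRF would consume one such feature set per token, together with the token's entity label during training.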

Methodology • Relation Extraction • Method: SVM. • Kernel methods: • KGC: Global Context Kernel • KLC: Local Context Kernel • KSL: Shallow Linguistic Kernel
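
To make the relationship between the three kernels concrete, here is a rough sketch in which a combined kernel is built from a global-context part and a local-context part. The bag-of-words stand-ins, the instance dictionary keys (`sentence`, `window1`, `window2`), the normalization, and the plain sum used to combine the parts are all assumptions for illustration; the slides name the kernels but do not define them.

```python
import math
from collections import Counter


def bow_kernel(tokens_a, tokens_b):
    """Dot product of two bag-of-words count vectors."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    return float(sum(ca[t] * cb[t] for t in ca if t in cb))


def normalized(kernel, a, b):
    """Normalize a kernel value: K(a, b) / sqrt(K(a, a) * K(b, b))."""
    denom = math.sqrt(kernel(a, a) * kernel(b, b))
    return kernel(a, b) / denom if denom > 0 else 0.0


def global_context_kernel(x, y):
    # Hypothetical stand-in: compare the token bags of the whole sentence.
    return normalized(bow_kernel, x["sentence"], y["sentence"])


def local_context_kernel(x, y):
    # Hypothetical stand-in: compare the windows around each argument entity.
    k1 = normalized(bow_kernel, x["window1"], y["window1"])
    k2 = normalized(bow_kernel, x["window2"], y["window2"])
    return k1 + k2


def shallow_linguistic_kernel(x, y):
    # Assumed combination: a plain sum of the two component kernels.
    return global_context_kernel(x, y) + local_context_kernel(x, y)


x = {"sentence": ["Roosevelt", "was", "president", "of", "the", "US"],
     "window1": ["Roosevelt", "was"], "window2": ["the", "US"]}
y = {"sentence": ["Lincoln", "was", "president", "of", "the", "US"],
     "window1": ["Lincoln", "was"], "window2": ["the", "US"]}
k_xy = shallow_linguistic_kernel(x, y)
k_xx = shallow_linguistic_kernel(x, x)
```

Because sums of valid kernels are themselves valid kernels, the resulting values can fill a Gram matrix for any SVM implementation that accepts a precomputed kernel.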

Experiments • Dataset • From the papers of Roth and Yih • Evaluation • Cross-validation: Precision, Recall and F-measure • Statistical significance: approximate randomization. • Confidence interval: percentile bootstrap. • The effectiveness of the kernel method. • The influence of the noise. • Compare this approach against the method proposed in Roth and Yih.
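
The evaluation measures listed above can be sketched as follows. The code assumes labels in {1, -1}, the balanced F-measure (F1), and a simple index-based percentile rule; the function names and the toy data are hypothetical and are not results from the paper.

```python
import random


def precision_recall_f1(gold, pred):
    """Precision, recall and balanced F-measure for the positive class (1)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == -1 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == -1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def bootstrap_f1_interval(gold, pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for F1."""
    rng = random.Random(seed)
    n = len(gold)
    scores = []
    for _ in range(n_resamples):
        # Resample test instances with replacement and rescore.
        idx = [rng.randrange(n) for _ in range(n)]
        g = [gold[i] for i in idx]
        p = [pred[i] for i in idx]
        scores.append(precision_recall_f1(g, p)[2])
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


gold = [1, 1, 1, -1, -1, -1, 1, -1]   # toy labels, not real data
pred = [1, 1, -1, -1, 1, -1, 1, -1]
precision, recall, f1 = precision_recall_f1(gold, pred)
lower, upper = bootstrap_f1_interval(gold, pred)
```

On the toy labels there are three true positives, one false positive and one false negative, so precision, recall and F1 all equal 0.75.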

Experiments • The effectiveness of the kernel method. • Relation extraction trained and tested on the correct entities. • Testing on MC • Training on the correct entities. • * Training on MC. • Testing on MR&C • Training on the correct entities. • * Training on MR&C.

Experiments • The influence of the noise. • Training on MO • Training on MR&C

Experiments • Compare this approach against the method proposed in Roth and Yih. • The entities are correctly identified. • The entity boundaries are known.
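
When two systems are compared on the same test instances, the approximate randomization test named in the evaluation setup can be sketched as below. The use of the F1 difference as test statistic, the per-instance swap probability of 0.5, the add-one smoothing of the p-value, and all function names are assumptions for illustration; the toy labels are not results from the paper.

```python
import random


def f1_score(gold, pred):
    """Balanced F-measure for the positive class (label 1)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == -1 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == -1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def approximate_randomization(gold, pred_a, pred_b, n_shuffles=1000, seed=0):
    """Estimate a p-value for the F1 difference between systems A and B."""
    rng = random.Random(seed)
    observed = abs(f1_score(gold, pred_a) - f1_score(gold, pred_b))
    at_least_as_large = 0
    for _ in range(n_shuffles):
        shuf_a, shuf_b = [], []
        for a, b in zip(pred_a, pred_b):
            # Swap the two systems' outputs on this instance half the time.
            if rng.random() < 0.5:
                a, b = b, a
            shuf_a.append(a)
            shuf_b.append(b)
        diff = abs(f1_score(gold, shuf_a) - f1_score(gold, shuf_b))
        if diff >= observed:
            at_least_as_large += 1
    return (at_least_as_large + 1) / (n_shuffles + 1)


gold = [1, 1, 1, -1, -1, -1, 1, -1]     # toy labels, not real data
system_a = [1, 1, 1, -1, -1, -1, 1, -1]
system_b = [1, -1, -1, -1, 1, -1, 1, 1]
p_value = approximate_randomization(gold, system_a, system_b)
p_same = approximate_randomization(gold, system_a, system_a)
```

A small p-value suggests that the observed difference is unlikely to arise if the two systems were interchangeable; comparing a system with itself yields a p-value of 1.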

Conclusion • The method has already demonstrated state-of-the-art performance when applied to the extraction of protein-protein interactions from biomedical literature. • The experiments show that, when applied to the newswire domain, the combined kernel is still consistently superior to its basic parts, mainly in terms of precision, and that it significantly outperforms previously proposed approaches even in the presence of noise introduced by an automatic entity tagger. • Future work: • Evaluate the contribution of syntactic information to relation extraction. • Extend the application of the proposed methodology to a different and wider set of relations. • Investigate the possibility of reducing the dimension of the training set using unsupervised techniques.

Personal Comments • Advantage • … • Drawback • … • Application • Relation extraction • Named-entity recognition
