INTRODUCTION TO ARTIFICIAL INTELLIGENCE

Massimo Poesio
Supervised Relation Extraction

RE AS A CLASSIFICATION TASK • Binary relations • Entities already recognized, manually or automatically • Examples are generated for all sentences with at least 2 entities • Number of examples generated per sentence is C(N, 2) = N(N-1)/2 – combinations of N distinct entities selected 2 at a time
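The candidate count on this slide can be checked with a short sketch. It assumes the entity mentions of a sentence are already available as a list of strings; the function name `candidate_pairs` and the sample entities are illustrative and do not come from the slides.

```python
from itertools import combinations
from math import comb

def candidate_pairs(entities):
    """Return every unordered pair of distinct entity mentions in one sentence."""
    if len(entities) < 2:
        return []  # sentences with fewer than 2 entities yield no examples
    return list(combinations(entities, 2))

# Hypothetical sentence with N = 4 recognized entities.
entities = ["protein_A", "protein_B", "protein_C", "protein_D"]
pairs = candidate_pairs(entities)

# The number of generated examples matches N choose 2.
assert len(pairs) == comb(len(entities), 2)
```

Each pair then becomes one example for the classifier, which decides whether the relation holds between the two entities.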

GENERATING CANDIDATES TO CLASSIFY

RE AS A BINARY CLASSIFICATION TASK

NUMBER OF CANDIDATES TO CLASSIFY – SIMPLE-MINDED VERSION

THE SUPERVISED APPROACH TO RE • Most current approaches to RE are kernel-based • Different information is used • Sequences of words, e.g., through the GLOBAL CONTEXT / LOCAL CONTEXT kernels of Bunescu and Mooney / Giuliano, Lavelli & Romano • Syntactic information through the TREE KERNELS of Zelenko et al. / Moschitti et al. • Semantic information in recent work

KERNEL METHODS: A REMINDER • Embedding the input data in a feature space • Using a linear algorithm for discovering non-linear patterns • Coordinates of images are not needed, only pairwise inner products • Pairwise inner products can be efficiently computed directly from X using a kernel function K: X × X → R
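A minimal sketch of the last two bullets, using a degree-2 polynomial kernel on 2-dimensional inputs as an example. The choice of kernel and the names `phi`, `dot`, and `poly_kernel` are assumptions made for illustration; the slides do not name a specific kernel here.

```python
def phi(x):
    """Explicit degree-2 feature map for a 2-dimensional input (x1, x2)."""
    x1, x2 = x
    return (x1 * x1, 2 ** 0.5 * x1 * x2, x2 * x2)

def dot(u, v):
    """Ordinary inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, z):
    """K(x, z) = (x . z)^2, computed without building phi(x) or phi(z)."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, 0.5)
explicit = dot(phi(x), phi(z))   # inner product in the feature space
implicit = poly_kernel(x, z)     # same value, computed directly from the inputs
assert abs(explicit - implicit) < 1e-9
```

The linear learning algorithm only ever needs values such as `implicit`, so the coordinates produced by `phi` never have to be materialized.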

MODULARITY OF KERNEL METHODS

THE WORD-SEQUENCE APPROACH • Shallow linguistic information: • Tokenization • Lemmatization • Sentence splitting • PoS tagging Claudio Giuliano, Alberto Lavelli, and Lorenza Romano (2007), FBK-IRST: Kernel methods for relation extraction, Proc. of SEMEVAL-2007

LINGUISTIC REALIZATION OF RELATIONS Bunescu & Mooney, NIPS 2005

WORD-SEQUENCE KERNELS • Two families of “basic” kernels • Global Context • Local Context • Linear combination of kernels • Explicit computation • Extremely sparse input representation
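The last three bullets can be illustrated with a minimal sketch, assuming each context is represented as a sparse bag-of-words vector stored in a dictionary. The function names, the normalization step, the equal weights, and the toy sentences are assumptions for illustration; they are not taken from the slides or from the cited papers.

```python
from collections import Counter
from math import sqrt

def sparse_dot(u, v):
    """Inner product of two sparse vectors stored as {feature: count} dicts."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller dict
    return sum(count * v.get(feat, 0) for feat, count in u.items())

def normalized_kernel(u, v):
    """Cosine-style normalization so that longer contexts do not dominate."""
    denom = sqrt(sparse_dot(u, u) * sparse_dot(v, v))
    return sparse_dot(u, v) / denom if denom else 0.0

def combined_kernel(a, b, weights=(1.0, 1.0)):
    """Weighted sum of a 'global' and a 'local' kernel.

    A sum of kernels with non-negative weights is itself a valid kernel.
    """
    k_global = normalized_kernel(a["global"], b["global"])
    k_local = normalized_kernel(a["local"], b["local"])
    return weights[0] * k_global + weights[1] * k_local

def make_example(sentence_tokens, window_tokens):
    """Build the explicit sparse representation of one candidate (hypothetical layout)."""
    return {"global": Counter(sentence_tokens), "local": Counter(window_tokens)}

a = make_example("A binds to B".split(), ["binds", "to"])
b = make_example("C binds with D".split(), ["binds", "with"])
score = combined_kernel(a, b)
```

Because only the features that actually occur are stored, the cost of each kernel evaluation depends on the number of tokens in the contexts and not on the vocabulary size.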

THE GLOBAL CONTEXT KERNEL

THE LOCAL CONTEXT KERNEL

LOCAL CONTEXT KERNEL (2)

KERNEL COMBINATION

EXPERIMENTAL RESULTS • Biomedical data sets • AIMed • LLL • Newspaper articles • Roth and Yih • SEMEVAL 2007

EVALUATION METHODOLOGIES

EVALUATION (2)

EVALUATION (3)

EVALUATION (4)

RESULTS ON AIMED

OTHER APPROACHES TO RE • Using syntactic information • Using lexical features

SYNTACTIC INFORMATION FOR RE • Pros: • more structured information useful when dealing with long-distance relations • Cons: • not always robust • (and not available for all languages)

ZELENKO ET AL., JMLR 2003 • TREE KERNEL defined over a shallow parse tree representation of the sentences • approach vulnerable to unrecoverable parsing errors • data set: 200 news articles (not publicly available) • two types of relations: person-affiliation and organization-location

ZELENKO ET AL

CULOTTA & SORENSEN 2004 • generalized version of Zelenko’s kernel based on dependency trees (smallest dependency tree containing the two entities of the relation) • a bag-of-words kernel is used to compensate for syntactic errors • data set: ACE 2002 & 2003 • results: syntactic information improves performance w.r.t. bag-of-words (good precision but low recall)
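One simple reading of "smallest dependency tree containing the two entities" can be sketched as follows, assuming the parse is given as a list of head indices with -1 marking the root. The function names, the head-array encoding, and the toy sentence are illustrative assumptions; this is not Culotta and Sorensen's implementation, and the sketch keeps only the nodes on the paths joining each entity to their lowest common ancestor.

```python
def path_to_root(heads, node):
    """Return the nodes from `node` up to the root, following head links."""
    path = [node]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path

def smallest_connecting_subtree(heads, e1, e2):
    """Nodes on the paths from e1 and e2 up to their lowest common ancestor."""
    path1 = path_to_root(heads, e1)
    path2 = path_to_root(heads, e2)
    ancestors2 = set(path2)
    # First node on e1's upward path that is also an ancestor of e2.
    lca = next(n for n in path1 if n in ancestors2)
    nodes = path1[: path1.index(lca) + 1] + path2[: path2.index(lca)]
    return sorted(nodes)

# Toy parse (hypothetical): "Smith works for Acme in Boston"
tokens = ["Smith", "works", "for", "Acme", "in", "Boston"]
heads = [1, -1, 1, 2, 1, 4]  # heads[i] is the index of token i's head
subtree = smallest_connecting_subtree(heads, 0, 3)
words = [tokens[i] for i in subtree]
```

For the entity pair "Smith" and "Acme", the tokens attached only to "in Boston" fall outside the selected subtree, which is the kind of pruning the slide describes.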

CULOTTA AND SORENSEN (2)

EVALUATION CAMPAIGNS FOR RE • Much modern evaluation of methods is done by competing with other teams in evaluation campaigns like MUC and ACE • Modern evaluation campaigns for RE: SEMEVAL (now *SEM) • It is also interesting to look at the problems of • DATA CREATION • EVALUATION METRICS

SEMEVAL 2007 • 4th International Workshop on Semantic Evaluations • Task 04: Classification of Semantic Relations between Nominals • organizers: Roxana Girju, Marti Hearst, Preslav Nakov, Vivi Nastase, Stan Szpakowicz, Peter Turney, Deniz Yuret • 14 participating teams

SEMEVAL 2007: THE RELATIONS

SEMEVAL 2007: DATASET CREATION

SEMEVAL 2007: DATASET CREATION (2)

SEMEVAL 2007 – DATASET CREATION (3)

SEMEVAL 2007 – DATASET CREATION (4)

SEMEVAL 2007: DATASET

SEMEVAL 2007: COMPETITION

SEMEVAL 2007: COMPETITION (2)

SEMEVAL 2007: BEST RESULTS

INFLUENCE OF NER ON RE

INFLUENCE OF NER ON RE (2)

GENERATING CANDIDATES

ACKNOWLEDGMENTS • Many slides borrowed from • Roxana Girju • Alberto Lavelli
