Learning Effective Patterns for Information Extraction Gijs Geleijnse (gijs.geleijnse@philips.com)
Overview • my view on Ontology Population / Information Extraction • short discussion of the global approach with respect to Ontology Population • a subproblem: learning relation patterns • experiments with learned patterns • conclusions
What’s the problem? Information is freely accessible on the web ... but the information on the ‘traditional’ web is not interpretable by machines. Goal of my research: find, extract and combine information on the web into a machine-interpretable format
What’s the problem? (2) • 1. Come up with a model for the concept information: ontologies • 2. Come up with algorithms to populate this model
Populating an ontology • 1. Formulate queries with an instance: ‘U2’s album’. • 2. Collect Google search results: ‘U2’s album Pop ..’, ‘U2’s album on a flash card’, ‘U2’s album How to Dismantle..’ • 3. Identify instances in the results: albums (Boy), (HtDaAB), ... giving relation instances (U2, Boy), (U2, HtDaAB) [Figure: ontology with classes artist and album, relation producer]
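The three steps above can be sketched in Python as below. This is a minimal sketch, not the talk's implementation: the search call is a stub returning canned excerpts (a real system would call a search-engine API), and the extraction rule (the run of capitalized words directly after the query text) is a deliberately naive hypothetical one.

```python
import re

def excerpts_for(query):
    """Stub standing in for a web search API (hypothetical):
    returns text excerpts that Googling the query might yield."""
    canned = {
        "U2's album": [
            "U2's album Pop was released in 1997",
            "U2's album on a flash card",
            "U2's album How to Dismantle an Atomic Bomb debuted at #1",
        ],
    }
    return canned.get(query, [])

def populate(instance):
    """Step 1: formulate a query with the instance; step 2: collect
    search results; step 3: identify instances -- here naively, as
    the capitalized word run directly following the query text."""
    query = f"{instance}'s album"
    pairs = set()
    for excerpt in excerpts_for(query):
        tail = excerpt[len(query):].strip()
        m = re.match(r"(?:[A-Z]\w*\s?)+", tail)
        if m:
            pairs.add((instance, m.group(0).strip()))
    return pairs
```

Note how the naive extractor truncates "How to Dismantle an Atomic Bomb" to "How" at the first lowercase word: identifying instances in the retrieved texts is itself a subproblem, as the next slide points out.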
Subproblems of OP How to • identify patterns expressing relations? Amsterdam – Netherlands: ‘is the capital of’ • identify instances in the Googled texts? ‘buy i still know what you did last summer on dvd’ • define acceptance functions for instances and relations? ‘they think Amsterdam is the capital of Germany hahahaha’
Identifying effective relation patterns • We want patterns that give many useful results. Three criteria for effectiveness: • 1. A pattern must frequently occur on the web, i.e. it must return many results. • 2. A pattern must be precise, i.e. it must return many useful results. • 3. When relation R is one-to-many, a pattern must be wide-spread, i.e. it must return diverse results.
Identifying effective relation patterns • Approach: • Compose a training set with related items. • Google them to get a set of patterns. • Compute scores for the patterns. • Constraint: don’t Google too often!
Retrieving relation patterns We formulate queries with the elements in the training set: “Michael Jackson * Thriller”, “Thriller * Michael Jackson”. We retrieve all inner-sentence fragments between the instances and normalize them (remove punctuation marks and capitals).
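This fragment extraction can be sketched as follows, assuming the excerpts are already retrieved. The helper name `pattern_between` is mine, not from the talk, and the single-sentence check (rejecting fragments containing a full stop) is a simplifying assumption.

```python
import string

def pattern_between(excerpt, inst1, inst2):
    """Return the normalized inner-sentence fragment between two
    instances, or None. Both orders are tried, mirroring the two
    queries "A * B" and "B * A"."""
    for a, b in ((inst1, inst2), (inst2, inst1)):
        i, j = excerpt.find(a), excerpt.find(b)
        if i == -1 or j == -1 or i + len(a) > j:
            continue
        fragment = excerpt[i + len(a):j]
        if "." in fragment:  # crosses a sentence boundary -- reject
            continue
        # normalize: remove punctuation marks and capitals
        fragment = fragment.translate(str.maketrans("", "", string.punctuation))
        return " ".join(fragment.lower().split())
    return None
```

For example, "Thriller, the hit album by Michael Jackson, sold millions" yields the pattern "the hit album by", while "Michael Jackson's Thriller" yields "s", i.e. the slide's "[artist]'s [album]" pattern.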
Evaluate relation patterns We now have a (long) list of patterns: [album] by [artist] ; [artist]’s [album] ; [album] album cover by [artist] ; [album] di [artist] ; ......... Now to compute scores: frequency, precision, wide-spreadness
Evaluate relation patterns • Frequency: we take the frequency of the pattern in the list obtained. • Precision: we google the pattern in combination with an instance and observe the fraction of useful results. E.g. if we google “ABBA’s new album” we divide the number of excerpts containing an album title by the total number of excerpts found.
Evaluate relation patterns Wide-spreadness: we count the number of different instances found with the query. Score = freq * prec * spread. We only compute the scores of the N most frequent patterns. Number of queries: 2 * |training set| + N * |instance set|
Case-study: Hearst Patterns Are the Hearst Patterns indeed the most effective patterns for the is-a relation? O = ((country, hyponym), ({all countries}, {‘country’, ‘countries’}), is_a, {(Afghanistan, country), (Afghanistan, countries), (Akrotiri, country), (Akrotiri, countries), ...})
Case-study: Hearst Patterns Both the common Hearst Patterns and relations typical for this setting (countries) perform well.
Case-study: Burger King TREC QA question: In which countries can Burger King be found? O = ((country, restaurant), ({all countries}, {McDonald’s, KFC}), located_in, {(McDonald’s, USA), (KFC, China), ...})
Case-study: Burger King We first find patterns using the method described:
Case-study: Burger King • ... and simultaneously find names of restaurants: • candidates are capitalized words • a candidate X is accepted if Noh(“restaurants like X and”) >= 50, i.e. the query returns at least 50 hits
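A sketch of this acceptance check, with Noh read as the number of hits a search engine reports for the query. The hit-count function is a stub with made-up numbers, and "Tablecloth" is a hypothetical non-restaurant candidate.

```python
def hit_count(query):
    """Stub standing in for a search engine's reported hit count
    (hypothetical numbers; a real system would issue the query)."""
    canned = {
        '"restaurants like Burger King and"': 3400,
        '"restaurants like Tablecloth and"': 2,
    }
    return canned.get(query, 0)

def accept_restaurant(name, threshold=50):
    """Accept a candidate X only if it is capitalized and the phrase
    "restaurants like X and" has at least `threshold` hits."""
    if not name[:1].isupper():
        return False
    return hit_count(f'"restaurants like {name} and"') >= threshold
```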
Case-study: Burger King • Finally, we use the patterns found in combination with ‘Burger King’ to find relations. • Precision: 80% • Recall: 85% • Most errors are due to countries in which Burger King plans to open restaurants.
Conclusions • Automatic pattern selection is successful • Simple methods again lead to good results • Recognition of instances and the filtering of erroneous patterns remain a big challenge • Ontology Population is fun