
Scaling Textual Inference to the Web

Scaling Textual Inference to the Web. Stefan Schoenmackers, Oren Etzioni, and Daniel S. Weld. Presented by Kristine Monteith, CS 652 - 5/8/09. The Problem. Lots of information on the web, but answers to questions aren’t always stated explicitly


Presentation Transcript


  1. Scaling Textual Inference to the Web Stefan Schoenmackers, Oren Etzioni, and Daniel S. Weld Presented by Kristine Monteith CS 652 - 5/8/09

  2. The Problem • Lots of information on the web, but answers to questions aren’t always stated explicitly • Query: “What vegetables help prevent osteoporosis?” • Not going to find “Kale prevents osteoporosis” • Need to infer this from: • kale is a vegetable • kale contains calcium • calcium helps prevent osteoporosis

  3. Overview • HOLMES Architecture (performs textual inference) • Scaling Inference to the Web • Experimental Results • Related Work

  4. The HOLMES Architecture • Information from knowledge bases, e.g. IsHighIn(kale, calcium), Prevents(calcium, osteoporosis) • Inference rules, e.g. Prevents(X,Z) :- IsHighIn(X,Y) ^ Prevents(Y,Z) • Queries, e.g. query(X) :- IS-A(X,vegetable) ^ Prevents(X,osteoporosis)
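A minimal sketch of how the three inputs fit together, assuming the rule is read as chaining IsHighIn(X,Y) with Prevents(Y,Z). It is purely logical and ignores the probabilities HOLMES attaches to facts; the function names and the tuple encoding are illustrative assumptions, not part of HOLMES.

```python
# Toy knowledge base. The IS-A fact is taken from the kale example on slide 2.
facts = {
    ("IS-A", "kale", "vegetable"),
    ("IsHighIn", "kale", "calcium"),
    ("Prevents", "calcium", "osteoporosis"),
}


def apply_prevents_rule(facts):
    """One application of: Prevents(X,Z) :- IsHighIn(X,Y) ^ Prevents(Y,Z)."""
    derived = set()
    for (r1, x, y) in facts:
        if r1 != "IsHighIn":
            continue
        for (r2, y2, z) in facts:
            if r2 == "Prevents" and y2 == y:
                derived.add(("Prevents", x, z))
    return derived


def answer_query(facts):
    """query(X) :- IS-A(X, vegetable) ^ Prevents(X, osteoporosis)."""
    closed = set(facts)
    while True:
        # Keep applying the rule until no new facts appear (a fixpoint).
        new = apply_prevents_rule(closed) - closed
        if not new:
            break
        closed |= new
    return sorted(
        x
        for (r, x, y) in closed
        if r == "IS-A" and y == "vegetable"
        and ("Prevents", x, "osteoporosis") in closed
    )


print(answer_query(facts))  # ['kale']
```

The answer "kale" is never stated in the knowledge base; it appears only after the rule has been applied.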

  5. Partial proof tree (DAG) for the query “What vegetables help prevent osteoporosis?”

  6. Incremental Expansion • Exact probabilistic inference is #P-complete • To deal with this, HOLMES • Uses approximate methods (loopy belief propagation) • Relies on focused queries to keep probabilistic inference manageable • Creates networks incrementally (searches for additional proof trees and updates the network if there is more time) • Exploits standard Datalog optimizations (e.g. only expands proofs of recently added nodes)
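The last optimization on this slide corresponds to what Datalog systems call semi-naive evaluation: each round joins only the facts derived in the previous round against the base facts. A minimal sketch for a transitive Part-Of relation follows; the function name and the pair encoding are illustrative assumptions, not HOLMES code.

```python
def transitive_part_of(edges):
    """Close Part-Of under transitivity, expanding only newly derived facts."""
    known = set(edges)
    delta = set(edges)  # facts added in the previous round
    while delta:
        new = set()
        for (a, b) in delta:
            for (c, d) in edges:  # join new facts with the base facts only
                if b == c:
                    new.add((a, d))
        delta = new - known  # keep only facts not seen before
        known |= delta
    return known


base = {("Seattle", "Washington"), ("Washington", "USA")}
print(sorted(transitive_part_of(base)))
```

Because `delta` shrinks to the empty set once nothing new is derived, older facts are never re-expanded.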

  7. Markov Logic Inference Rules 1. Observed relations are likely to be true: • R(X,Y) :- ObservedInCorpus(X, R, Y) 2. Synonym substitution preserves meaning: • RTR(X’,Y) :- RTR(X,Y) ^ Synonym(X, X’) • RTR(X,Y’) :- RTR(X,Y) ^ Synonym(Y, Y’) 3. Generalizations preserve meaning: • RTR(X’,Y) :- RTR(X,Y) ^ IS-A(X, X’) • RTR(X,Y’) :- RTR(X,Y) ^ IS-A(Y, Y’) 4. Transitivity of Part Meronyms: • RTR(X,Y’) :- RTR(X,Y) ^ Part-Of(Y, Y’) where RTR matches ‘* in’ (e.g., ‘born in’).

  8. Scaling Inference to the Web • For textual inference to be practical at Web scale, inference time must grow linearly with the size of the corpus • Assumptions: • Number of ground assertions |A| grows linearly with the size of the corpus (true for assertions extracted by TextRunner) • Size of every proof tree is bounded by some constant m (seems to be true in practice; could be enforced by terminating the search for proof trees at a certain depth) • Need to show that constructing proof trees takes O(|A|) time

  9. Constructing proof trees in O(|A|) time • Using function-free Horn clauses means that logical inference can be done in polynomial time • Still not good enough to scale to the Web • Need to ensure two more things: • Number of different types of proofs doesn’t grow too quickly (e.g. a fixed number of rules results in a constant number of first-order search trees) • Number of tuples participating in each relation doesn’t grow too quickly
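The counting argument behind these two conditions can be sketched as follows. The symbols c and k are assumptions introduced here for illustration: c is the constant number of first-order proof templates, and k is an assumed constant bound on how many tuples can match each join step once its shared variable is bound. With m the bound on proof-tree size from slide 8, the first leaf of a proof can be any of the |A| ground assertions and each remaining leaf has at most k choices:

```latex
\[
\#\text{ground proofs}
\;\le\;
\underbrace{c}_{\text{templates}}
\cdot
\underbrace{|A|}_{\text{first leaf}}
\cdot
\underbrace{k^{\,m-1}}_{\text{remaining leaves}}
\;=\; O(|A|)
\qquad \text{for constant } c,\ k,\ m .
\]
```

If k were allowed to grow with |A|, the same product would no longer be linear, which is why the second condition on the slide is needed.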

  10. Approximately Pseudo-Functional

  11. Experimental Results • Uses two knowledge bases: • TextRunner (183 million ground assertions from 117 million web pages) • WordNet (159 thousand manually created IS-A, Part-Of, and Synonym assertions) • Twenty queries in three domains • Geography • Business • Nutrition

  12. Geography Queries • “Who was born in one of the following countries?” • Q(X) :- BornIn(X,{country}) • Possible countries: France, Germany, China, Thailand, Kenya, Morocco, Peru, Colombia, Guatemala • Example: • Ground assertion: BornIn(Alberto Fujimori, Lima) • Background knowledge: LocatedIn(Lima, Peru) • New conclusion: BornIn(Alberto Fujimori, Peru)
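The example on this slide can be reproduced with a one-step lookup. This is a simplified sketch: it follows only a single LocatedIn hop and ignores probabilities, and the function name is an illustrative assumption.

```python
born_in = {("Alberto Fujimori", "Lima")}
located_in = {("Lima", "Peru")}


def born_in_country(born_in, located_in, countries):
    """Answer Q(X) :- BornIn(X, country), using one LocatedIn hop."""
    results = set()
    for person, place in born_in:
        if place in countries:
            # The birthplace is itself one of the query countries.
            results.add((person, place))
        for inner, outer in located_in:
            if inner == place and outer in countries:
                # BornIn(person, inner) ^ LocatedIn(inner, outer)
                results.add((person, outer))
    return results


print(born_in_country(born_in, located_in, {"Peru", "France"}))
```

Without the background fact LocatedIn(Lima, Peru), the query returns nothing, which is the "no inference" situation.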

  13. Business Queries • Which companies are acquiring software companies? • Q(X) :- Acquired(X, Y) ^ Develops(Y, ‘software’) • This query tests HOLMES’s ability to scalably join a large number of assertions from multiple pages. • Which companies are headquartered in the USA? • Q(X) :- HeadquarteredIn(X, ‘USA’) ^ IS-A(X, ‘company’) • Join on HeadquarteredIn and IS-A • Transitive inference: • Seattle is PartOf Washington which is PartOf the USA • Microsoft IS-A software company which IS-A company
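The first query is a join on the shared variable Y. A minimal sketch using a set lookup is shown below; the company names are hypothetical placeholders, not data from the experiments, and the function name is an illustrative assumption.

```python
# Hypothetical placeholder facts, for illustration only.
acquired = [("CompanyA", "CompanyB"), ("CompanyC", "CompanyD")]
develops = [("CompanyB", "software"), ("CompanyD", "hardware")]


def acquirers_of_software_companies(acquired, develops):
    """Q(X) :- Acquired(X, Y) ^ Develops(Y, 'software')."""
    # Build the set of Y values once so each Acquired tuple is checked in O(1).
    software_makers = {y for (y, product) in develops if product == "software"}
    return sorted({x for (x, y) in acquired if y in software_makers})


print(acquirers_of_software_companies(acquired, develops))  # ['CompanyA']
```

Indexing one side of the join first keeps the work proportional to the number of assertions rather than to the product of the two relation sizes.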

  14. Nutrition Queries • “What foods prevent disease?” • Q(X, {disease}) :- Prevents(X, {disease}) ^ IS-A(X, {food}) • Possible foods: fruit, vegetable, grain • Possible diseases: anemia, scurvy, or osteoporosis.

  15. Effect of Inference on Recall • Baseline: Number of query answers derived from information explicitly stated in the knowledge bases (TextRunner and WordNet) • Inference increases the number of query answers by 102% for the Geography domain, and considerably more for the other two domains

  16. Prevalence of APF Relations • Examined 500 binary relations selected randomly from TextRunner’s assertions • Largest two relations had over 1.25 million unique instances • 52% of the relations had more than 10,000 instances • Found the smallest value Kmin such that each relation was APF with degree Kmin • 80% of relations were APF with degree less than 496

  17. Related Work • Van Durme and Schubert (2008) • Use highly expressive representations (e.g. negation, temporal information) • HOLMES is less expressive but more scalable • Open-domain Question-Answering Systems • Attempt to find individual documents or sentences containing the answer • HOLMES can infer from multiple texts, but is not well suited to answering more abstract or open-ended questions • Statistical Relational Learning • Techniques for combining logical and probabilistic inference • HOLMES uses more restrictive inference rules, but again is more scalable

  18. Conclusions 1. We introduce and evaluate the HOLMES system, which leverages KBMC methods in order to scale a class of TI methods to the Web. 2. We define the notion of Approximately Pseudo-Functional (APF) relations and prove that, for APF relations, HOLMES’s inference time increases linearly with the size of the input corpus. We show empirically that APF relations appear to be prevalent in our Web corpus and that HOLMES’s runtime does scale linearly with the size of its input, taking only a few CPU minutes when run over 183 million distinct ground assertions. 3. We present experiments demonstrating that, for a set of queries in the domains of geography, business, and nutrition, HOLMES substantially improves the quality of answers (measured by AuC) relative to a “no inference” baseline.

  19. Questions???
