YAGO: Yet Another Great Ontology

PhD defense by Fabian M. Suchanek (Max-Planck Institute for Informatics, Saarbrücken). Overview: motivation (why would anybody need ontologies?), building a core ontology (YAGO), extending the core ontology (SOFIE).

Presentation Transcript


  1. YAGO: Yet Another Great Ontology. PhD Defense, Fabian M. Suchanek (Max-Planck Institute for Informatics, Saarbrücken). YAGO - A Core of Semantic Knowledge

  2. Overview • Motivation: Why would anybody need ontologies? • Building a Core Ontology: YAGO • Extending the Core Ontology: SOFIE

  3. Santa Claus in Need [Figure: world population]

  4. The Search for a Second Santa Claus. Query: "strong, tall guy, australian". Result: "Seeking strong, tall Australian man. I'm 27, blue eyes, looking for a tall strong Australian man." (girls-seek-guys.com/london/42)

  5. The Search for a Second Santa Claus. Query: "strong person, > 1.90, Australian". Result: "Seeking strong, tall Australian man. I'm 27, blue eyes, looking for a tall strong Australian man. ... I'm 190 kg" (girls-seek-guys.com/london/42)

  6. The Search for a Second Santa Claus. Query: "Hi Larry, it's me, Santa Claus. I think you misunderstood wh". Result: "Seeking strong, tall Australian man. I'm 27, blue eyes, looking for a tall strong Australian man." (girls-seek-guys.com/london/42)

  7. Solution: An Ontology [Diagram: a graph with the nodes physical entity, person, continent, Australia and 1.90m, connected by edges labeled is a, isFrom and height]

  8. Solution: An Ontology [Diagram: the same graph, annotated with three kinds of elements: Classes (physical entity, person, continent), Relations (is a, isFrom), Individuals (Australia)]

  9. Vision: Gathering the knowledge of this world in a structured ontology. • Semantic Search • Question answering • Machine Translation • Document classification • … [Sample document text: "The world, I'd like to say, even though some may contradict, is not as it seems. It rather seems as if the world seems not what it seems"]

  10. Plan of Attack • Motivation ✓ • Building a Core Ontology: YAGO • Extending the Core Ontology: SOFIE

  11. YAGO: Goal. Goal: Build a Large Ontology. Previous Approaches: • Assemble the ontology manually (WordNet, SUMO, Cyc, GeneOntology). Problem: Usually low coverage (MPI is in none of these). • Use community work (Semantic Wikipedia, Freebase). Problem: We don't know yet whether it takes off.

  12. YAGO: Goal. Goal: Build a Large Ontology. Our Approach: • Extract knowledge from Wikipedia and WordNet (securing high coverage) • Use extensive quality control techniques (securing high consistency)

  13. YAGO: Infoboxes. Exploit infoboxes: the infobox entry "Born in: Sydney ..." yields Claus K bornIn Sydney. [Article body shown as placeholder text: "blah blah blub (don't read this! Better listen to the talk!) ..."]

  14. YAGO: Categories. Exploit infoboxes. Exploit relational categories: the category 1980_births yields Claus K born 1980.
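
To illustrate how a relational category such as 1980_births could be turned into a fact, here is a minimal sketch using a regular expression. The function name, the single pattern and the triple layout are assumptions made for this example; only the category name and the relation label "born" come from the slide, and YAGO's actual extraction rules are not shown in the talk.

```python
import re

# Hypothetical pattern: a four-digit year followed by "_births".
BIRTH_CATEGORY = re.compile(r"^(\d{4})_births$")

def facts_from_categories(entity, categories):
    """Return (entity, relation, value) triples for relational categories."""
    facts = []
    for category in categories:
        match = BIRTH_CATEGORY.match(category)
        if match:
            facts.append((entity, "born", int(match.group(1))))
    return facts

# Only the relational category produces a fact; the other one is ignored here.
print(facts_from_categories("Claus K", ["1980_births", "Australian_boxers"]))
```

Categories that match no relational pattern are simply skipped in this sketch; the following slides treat them as conceptual or thematic categories instead.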

  15. YAGO: Categories. Exploit infoboxes. Exploit relational categories. Exploit conceptual categories: the category Australian Boxers yields Claus K isA Australian Boxer.

  16. YAGO: Categories. Exploit infoboxes. Exploit relational categories. Exploit conceptual categories. Avoid thematic categories: the category Kick boxing must not yield Claus K isA Kick boxing.

  17. YAGO: Upper Model [Diagram: the individual (born 1980) is a Australian boxer; how Australian boxer connects to person and entity is marked with "?"]

  18. YAGO: Upper Model [Diagram: candidate superclasses Business, Social_group and People_by_occupation above Australian boxer, marked with "?"; the individual (born 1980) is a Australian boxer]

  19. YAGO: Upper Model [Diagram: Australian boxer (from Wikipedia) subclass Boxer subclass Person (from WordNet); the individual (born 1980) is a Australian boxer] [Suchanek et al.: WWW 2007]

  20. YAGO: Quality Control. 1. Canonicalization: 1.1 ... of entities (Santa Klaus, Santa Clause, Santa Claus, Santa)

  21. YAGO: Quality Control. 1. Canonicalization: 1.1 ... of entities

  22. YAGO: Quality Control. 1. Canonicalization: 1.1 ... of entities; 1.2 ... of facts (born 1980, born 1980-12-19)

  23. YAGO: Quality Control. 1. Canonicalization: 1.1 ... of entities; 1.2 ... of facts. 2. Type Checks: 2.1 Reductive Type Checking. Example: range(bornOnDate, timepoint) rules out bornOnDate(Claus_Kent, Sydney).
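
The reductive type check can be pictured as a filter over candidate facts. The sketch below is an assumption-laden illustration, not YAGO's implementation: the dictionaries, the class name "city" and the function name are invented for the example, while range(bornOnDate, timepoint) and the rejected fact come from the slide.

```python
# Hypothetical range constraints: relation -> required class of the second argument.
RANGE = {"bornOnDate": "timepoint", "bornIn": "city"}

# Hypothetical type assignments for a few values.
TYPE_OF = {"Sydney": "city", "1980-12-19": "timepoint"}

def passes_range_check(relation, subject, obj):
    """Keep a candidate fact only if its object has the class the relation's range demands."""
    required = RANGE.get(relation)
    if required is None:
        return True  # no constraint known for this relation
    return TYPE_OF.get(obj) == required

candidates = [
    ("bornOnDate", "Claus_Kent", "Sydney"),      # violates range(bornOnDate, timepoint)
    ("bornOnDate", "Claus_Kent", "1980-12-19"),
    ("bornIn", "Claus_Kent", "Sydney"),
]
accepted = [fact for fact in candidates if passes_range_check(*fact)]
print(accepted)
```

The check is "reductive" in the sense that it only ever removes candidates; it never repairs or retypes a fact.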

  24. YAGO: Quality Control. 1. Canonicalization: 1.1 ... of entities; 1.2 ... of facts. 2. Type Checks: 2.1 Reductive Type Checking; 2.2 Type Coherence Checking [Diagram: Entity with the subclasses Person and Artifact; an individual typed as Boxer, Swimmer, Flight instructor and Airplane]

  25. YAGO: Quality Control. 1. Canonicalization: 1.1 ... of entities; 1.2 ... of facts. 2. Type Checks: 2.1 Reductive Type Checking; 2.2 Type Coherence Checking. Result: Every fact and every entity occurs exactly once. Every fact fulfills its type constraints. [Suchanek et al.: JWS 2008]

  26. YAGO: Numbers. Relations: 100 (bornIn, actedIn, hasInflation, ...). Entities: 2 million. Facts: 19 million. Accuracy: 95%. One of the largest public free ontologies. Unprecedented quality among automatically constructed ontologies.

  27. YAGO: Model. #1 (ClausKent, is_a, boxer); #2 (#1, since, 1990); #3 (#1, source, Wikipedia) [Diagram: the is a edge to boxer carries the annotations since 1990 and source Wikipedia]

  28. YAGO: Model • A YAGO ontology over a set of relations R, a set of common entities C and a set of fact identifiers I is a function I → (R ∪ C ∪ I) × R × (R ∪ I ∪ C). Example: #1 (ClausKent, is_a, boxer); #2 (#1, since, 1990); #3 (#1, source, Wikipedia) • We can talk about facts: (#1, source, Wikipedia); additional arguments: (#1, since, 1990); relations: (since, hasRange, time_interval). Still: Decidable Consistency
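
The definition above maps fact identifiers to triples whose subject or object may itself be a fact identifier. As a minimal sketch, assuming a plain Python dictionary as the function and a hypothetical helper name `annotations`, the three example facts can be stored and queried like this:

```python
# Fact identifiers map to (subject, relation, object) triples. A subject or
# object may itself be a fact identifier, which is how facts about facts are stored.
ontology = {
    "#1": ("ClausKent", "is_a", "boxer"),
    "#2": ("#1", "since", "1990"),
    "#3": ("#1", "source", "Wikipedia"),
}

def annotations(fact_id, onto):
    """Collect (relation, object) pairs of all facts whose subject is fact_id."""
    return sorted((rel, obj) for subj, rel, obj in onto.values() if subj == fact_id)

# Everything the ontology says about fact #1 itself.
print(annotations("#1", ontology))
```

Because every statement, including a statement about another statement, is an ordinary triple with its own identifier, no special syntax is needed for the time and source annotations.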

  29. YAGO: Summary. YAGO is an ontology that is • large (combining Wikipedia and WordNet) • accurate (using extensive quality control) • computationally tractable (with a decidable consistency)

  30. Plan of Attack • Motivation ✓ • Building a Core Ontology: YAGO ✓ • Extending the Core Ontology: SOFIE

  31. SOFIE: Goal Statement. Goal: Extending the ontology. Text: "Saint Nicholas was born in Patara." Target fact: Saint Nicholas bornIn Patara.

  32. SOFIE: Goal Statement. Goal: Extending the ontology. Text: "Saint Nicholas се е родил в Patara." (Bulgarian for "was born in") Target fact: Saint Nicholas bornIn Patara.

  33. SOFIE: Goal Statement. Goal: Extending the ontology. Text: "Saint Nicholas was born in Patara." Target fact: Saint Nicholas bornIn Patara. Previous Approaches: • Extract knowledge from corpora, e.g. the Web (Text2Onto, Espresso, Snowball, TextRunner). Problems: Low accuracy, non-canonicity. Example extractions: recoverWithout(most_people, medication), areUnder(0%, the_age_of_18), support(these_findings, the_notion).

  34. SOFIE: Goal Statement. Goal: Extending the ontology. Text: "Saint Nicholas was born in Patara." Target fact: Saint Nicholas bornIn Patara. Our Approach (1): • LEILA - Combining Linguistic and Statistical Analysis [Suchanek et al.: KDD 2006]. Has high accuracy, but does not deliver canonicity.

  35. SOFIE: Goal Statement. Goal: Extending the ontology. Text: "Saint Nicholas was born in Patara." Target fact: Saint Nicholas bornIn Patara. Our Approach (2): • SOFIE: Use logical reasoning to guarantee canonicity.

  36. SOFIE: Example. YAGO: bornInYear 1935 (Elvis Presley). Text "~ Worshipped People ~": "Saint Nicholas was born in the year 1417. Elvis Presley was born in the year 1935." Pattern occurrence ~~> pattern meaning: "was born in the year" expresses bornInYear.

  37. SOFIE: Example. YAGO: bornInYear 1935 (Elvis Presley). Text "~ Worshipped People ~": "Saint Nicholas was born in the year 1417. Elvis Presley was born in the year 1935." Pattern occurrence ~~> pattern meaning: "was born in the year" expresses bornInYear. Pattern occurrence ~~> sentence meaning: bornInYear 1417 (Saint Nicholas).

  38. SOFIE: Example. YAGO: bornInYear 1935 (Elvis Presley), diedInYear 347 (Saint Nicholas). Text "~ Worshipped People ~": "Saint Nicholas was born in the year 1417. Elvis Presley was born in the year 1935." Pattern occurrence ~~> pattern meaning: "was born in the year" expresses bornInYear. Pattern occurrence ~~> sentence meaning: bornInYear 1417 (Saint Nicholas). People should be born before they die.

  39. SOFIE: Example. YAGO: bornInYear 1935 (Elvis Presley), diedInYear 347 (Saint Nicholas). Text "~ Worshipped People ~": "Saint Nicholas was born in the year 1417. Elvis Presley was born in the year 1935." Pattern occurrence ~~> pattern meaning: "was born in the year" expresses bornInYear. Pattern occurrence ~~> sentence meaning: bornInYear 1417 (Saint Nicholas). People should be born before they die.

  40. SOFIE: Example. YAGO: bornInYear 1935 (Elvis Presley), diedInYear 347 (Saint Nicholas). Text: "Saint Nicholas was born in the year 1417. Elvis Presley was born in the year 1935." • Task 1: Find Patterns • Task 2: Use semantic reasoning • Task 3: Disambiguate entities. Pattern occurrence ~~> pattern meaning. Pattern occurrence ~~> sentence meaning: bornInYear 1417 (Saint Nicholas). People should be born before they die.

  41. SOFIE: It's all logical formulae! • Task 1: Find Patterns • Task 2: Use semantic reasoning • Task 3: Disambiguate entities. YAGO facts: bornInYear(ElvisPresley,1935); diedInYear(NicholasOfMyra,347). Text facts: occurs("was born in the year", SaintNicholas,1417); occurs("was born in the year", ElvisPresley,1935). Rules: occurs(P,X,Y) /\ expresses(P,R) => R(X,Y); bornInYear(X,B) /\ diedInYear(X,D) => B<D. Weighted facts: means(SaintNicholas,NicholasOfMyra) 0.8; means(SaintNicholas,NicholasOfFlüe) 0.2. Open hypotheses: refersTo(SaintNicholas,NicholasOfFlüe) ?; bornOnDate(NicholasOfFlüe, 1417) ?
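
The interplay of the formulae can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: it applies the pattern rule and the born-before-died rule as hard filters and tries candidate entities in a fixed order, whereas SOFIE treats the whole set as a weighted satisfiability problem. The function names and data structures are hypothetical; the facts and rules are those on the slide.

```python
# Facts and rules taken from the slide; the disambiguation step is simplified
# to trying each candidate entity in turn (an assumption of this sketch).
yago = {("bornInYear", "ElvisPresley", 1935), ("diedInYear", "NicholasOfMyra", 347)}
occurrences = [("was born in the year", "SaintNicholas", 1417),
               ("was born in the year", "ElvisPresley", 1935)]
expresses = {"was born in the year": "bornInYear"}
means = {"SaintNicholas": ["NicholasOfMyra", "NicholasOfFlüe"],
         "ElvisPresley": ["ElvisPresley"]}

def born_before_died(fact, known):
    """Rule: bornInYear(X,B) and diedInYear(X,D) imply B < D."""
    relation, entity, year = fact
    if relation != "bornInYear":
        return True
    return all(year < d for (r, e, d) in known if r == "diedInYear" and e == entity)

def derive(occurrences, expresses, means, known):
    """Rule: occurs(P,X,Y) and expresses(P,R) imply R(X,Y); keep consistent readings only."""
    derived = []
    for pattern, word, value in occurrences:
        relation = expresses.get(pattern)
        if relation is None:
            continue
        for entity in means[word]:
            fact = (relation, entity, value)
            if born_before_died(fact, known):
                derived.append(fact)
                break  # take the first consistent candidate
    return derived

print(derive(occurrences, expresses, means, yago))
```

In this toy run the reading NicholasOfMyra is dropped because a birth in 1417 would follow the death in 347, so the sentence is attached to NicholasOfFlüe even though that reading has the lower prior weight.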

  42. SOFIE: Information Extraction as MAX SAT. We have a Weighted MAX SAT Problem: r(x,y) /\ s(x,z) => t(x,z) [w] ... Problem: • The Weighted MAX SAT Problem is NP-hard • Our instance contains YAGO (19 million facts) and textual facts (e.g. 10,000 facts) • The best-known approximation algorithm cannot deal well with our specific instance
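
To make the problem statement concrete, here is a brute-force solver for a tiny Weighted MAX SAT instance. It is a sketch for illustration only: exhaustive search is exponential in the number of variables, which is exactly why it cannot be used on an instance of the size named above, and it is not the algorithm SOFIE uses. The variable names, the clause encoding and the integer weights (the 0.8 and 0.2 priors scaled to 8 and 2, with 100 for rules) are assumptions of this example.

```python
from itertools import product

def weighted_max_sat(variables, clauses):
    """Exhaustively find the assignment maximizing the total weight of satisfied clauses.

    Each clause is (weight, literals); a literal is (variable, wanted_truth_value).
    A clause is satisfied if at least one of its literals holds.
    """
    best_weight, best_assignment = -1, None
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        weight = sum(w for w, literals in clauses
                     if any(assignment[v] == wanted for v, wanted in literals))
        if weight > best_weight:
            best_weight, best_assignment = weight, assignment
    return best_weight, best_assignment

# Hypothetical toy instance: two competing readings of "Saint Nicholas".
variables = ["refersToMyra", "refersToFlue", "bornMyra1417"]
clauses = [
    (8, [("refersToMyra", True)]),                              # prior for one reading
    (2, [("refersToFlue", True)]),                              # prior for the other
    (100, [("refersToMyra", False), ("refersToFlue", False)]),  # at most one reading
    (100, [("refersToMyra", False), ("bornMyra1417", True)]),   # reading implies the birth fact
    (100, [("bornMyra1417", False)]),                           # born 1417 contradicts died 347
]
print(weighted_max_sat(variables, clauses))
```

The heavy rule clauses outweigh the prior of the more likely reading, so the best assignment in this toy instance picks the second reading and leaves the contradictory birth fact false.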

  43. SOFIE: A Unifying Framework. • Task 1: Find Patterns • Task 2: Use semantic reasoning • Task 3: Disambiguate entities. Rules of the form r(a,b) => s(x,y). Functional MAX SAT: Polynomial time Algorithm, Approximation Guarantee. [Diagram residue: FOR i=1 TO 42 ... NEXT i; 1417; NicholasOfFlüe] [Suchanek et al.: TR 2009]

  44. SOFIE: Experiments

  45. SOFIE: Summary. SOFIE unifies 3 tasks in a single framework: • Task 1: Find Patterns • Task 2: Use semantic reasoning • Task 3: Disambiguate entities. SOFIE delivers • canonicalized facts • of high precision

  46. But back to the original question... Is there any Australian guy taller than 1.90m who could help me out?

  47. Conclusion: Good News • We made a great step towards gathering the knowledge of this world in a structured ontology (YAGO, SOFIE) • Christmas is safe!

  48. References
     • [Suchanek et al.: KDD 2006] Fabian M. Suchanek, Georgiana Ifrim and Gerhard Weikum: "Combining Linguistic and Statistical Analysis to Extract Relations from Web Documents". Conference on Knowledge Discovery and Data Mining (KDD 2006)
     • [Suchanek et al.: WWW 2007] Fabian M. Suchanek, Gjergji Kasneci and Gerhard Weikum: "YAGO - A Core of Semantic Knowledge". International World Wide Web conference (WWW 2007)
     • [Suchanek et al.: JWS 2008] Fabian M. Suchanek, Gjergji Kasneci and Gerhard Weikum: "YAGO - A Large Ontology from Wikipedia and WordNet". Journal of Web Semantics (JWS 2008)
     • [Suchanek et al.: TR 2009] Fabian M. Suchanek, Mauro Sozio and Gerhard Weikum: "SOFIE - A Self-Organizing Framework for Information Extraction". Submitted to the International World Wide Web conference (WWW 2009). See Technical Report or my PhD Thesis on http://mpii.de/~suchanek

  49. Acronyms
     • LEILA: Learning to Extract Information by Linguistic Analysis
     • YAGO: Yet Another Great Ontology
     • SOFIE: Self-Organizing Framework for Information Extraction
     • NAGA: Not another Google Answer

  50. YAGO: Thematic vs Conceptual Categories. Conceptual: "Australian boxers of German origin". Thematic: "Kick boxing in Australia". Shallow linguistic noun phrase parsing splits each category name into Premodifier, Head and Postmodifier (Australian | boxers | of German origin; Kick | boxing | in Australia). Heuristics: If the head is a plural word, the category is conceptual.
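
The plural-head heuristic can be tried out with a very crude stand-in for noun phrase parsing. Everything in the sketch below apart from the heuristic and the two example categories is an assumption: the preposition list, the rule that the head is the last word before the first preposition, and the naive plural test are simplifications, not the parser used for YAGO.

```python
import re

# Hypothetical, very crude stand-in for shallow noun phrase parsing:
# everything before the first preposition is premodifier + head,
# and the head is the last word of that part.
PREPOSITIONS = {"of", "in", "by", "from", "for", "at", "on"}

def split_category(name):
    """Split a category name into (premodifier words, head, postmodifier words)."""
    words = re.split(r"[ _]+", name.strip())
    cut = next((i for i, w in enumerate(words)
                if i > 0 and w.lower() in PREPOSITIONS), len(words))
    pre_and_head, post = words[:cut], words[cut:]
    return pre_and_head[:-1], pre_and_head[-1], post

def is_conceptual(name):
    """Heuristic from the slide: a plural head marks a conceptual category."""
    _, head, _ = split_category(name)
    # Naive plural test (assumption): ends in "s" but not in "ss".
    return head.lower().endswith("s") and not head.lower().endswith("ss")

for category in ["Australian boxers of German origin", "Kick boxing in Australia"]:
    print(category, "->", "conceptual" if is_conceptual(category) else "thematic")
```

A real implementation would need a proper plural check (irregular plurals, singular nouns ending in "s"), which this sketch deliberately leaves out.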
