
INTRODUCTION TO ARTIFICIAL INTELLIGENCE

INTRODUCTION TO ARTIFICIAL INTELLIGENCE. Massimo Poesio. LECTURE 10: Knowledge and The Social Web. 'CYC convinced the AI community that creating a commonsense knowledge base by hand is impossible' (Massimo, Lecture 1). That may depend on how many people you put on to it! THE SOCIAL WEB.



Presentation Transcript


  1. INTRODUCTION TO ARTIFICIAL INTELLIGENCE Massimo Poesio LECTURE 10: Knowledge and The Social Web

  2. 'CYC convinced the AI community that creating a commonsense knowledge base by hand is impossible' (Massimo, Lecture 1) That may depend on how many people you put on to it!

  3. THE SOCIAL WEB • Increasingly, the Web is becoming not just a way to facilitate information exchange or commercial transactions, but also a tool to facilitate socialization (Facebook, LinkedIn, etc.) • It is also a place where information can be collectively created

  4. SOCIAL CREATION OF KNOWLEDGE

  5. WIKIPEDIA The free encyclopedia that anyone can edit • Wikipedia is a free, multilingual encyclopedia project supported by the non-profit Wikimedia Foundation. • Wikipedia's articles have been written collaboratively by volunteers around the world. • Almost all of its articles can be edited by anyone who can access the Wikipedia website. (Source: http://en.wikipedia.org/wiki/Wikipedia)

  6. WIKIPEDIA • Wikipedia is: 1. domain independent • it has a large coverage 2. up-to-date • to process current information 3. multilingual • to process information in many languages

  7. Title • Abstract • Infoboxes • Geo-coordinates • Categories • Images • Links • Other languages • Other wiki pages • To the web • Redirects • Disambiguates

  8. Encyclopedic knowledge in coreference resolution [The FCC] took [three specific actions] regarding [AT&T]. By a 4-0 vote, it allowed AT&T to continue offering special discount packages to big customers, called Tariff 12, rejecting appeals by AT&T competitors that the discounts were illegal. ….. [The agency] said that because MCI's offer had expired AT&T couldn't continue to offer its discount plan.

  9. Why Wikipedia may help address the encyclopedic knowledge problem http://en.wikipedia.org/wiki/FCC: The Federal Communications Commission (FCC) is an independent United States government agency, created, directed, and empowered by Congressional statute (see 47 U.S.C. § 151 and 47 U.S.C. § 154).

  10. Another interesting scenario A fresh mandate for [Mr Ahmadinejad] would, say his critics, consecrate the “revolution within a revolution” he has been trying to effect since his surprise electoral triumph in 2005. Best known to outsiders for his bellicose grandstanding, [the incumbent] is more familiar to Iranians as a radical and hyperactive populist who has used the tacit backing of his fellow conservative, Mr Khamenei, greatly to expand the powers of the presidency. Source: It could make a big difference, The Economist, Mar 19th 2009

  11. Why Wikipedia may help addressing the encyclopedic knowledge problem

  12. Wikipedia as Ontology • Unlike standard ontologies such as WordNet and MeSH, Wikipedia itself is not a structured thesaurus. • However, it is more… • Comprehensive: it contains 12 million articles (2.8 million in the English Wikipedia) • Accurate: a study by Giles (2005) found that Wikipedia can compete with Encyclopædia Britannica in accuracy*. • Up to date: current and emerging concepts are absorbed quickly. * Giles, J. 2005. Internet encyclopaedias go head to head. Nature 438: 900–901.

  13. Wikipedia as Ontology • Moreover, Wikipedia has a well-formed structure • Each article only describes a single concept. • The title of the article is a short and well-formed phrase like a term in a traditional thesaurus.

  14. Wikipedia Article that describes the Concept Artificial intelligence

  15. Wikipedia as Ontology • Moreover, Wikipedia has a well-formed structure • Each article only describes a single concept • The title of the article is a short and well-formed phrase like a term in a traditional thesaurus. • Equivalent concepts are grouped together by redirected links.

  16. AI is redirected to its equivalent concept Artificial Intelligence

  17. Wikipedia as Ontology • Moreover, Wikipedia has a well-formed structure • Each article only describes a single concept • The title of the article is a short and well-formed phrase like a term in a traditional thesaurus. • Equivalent concepts are grouped together by redirected links. • It contains a hierarchical categorization system, in which each article belongs to at least one category.

  18. The concept Artificial Intelligence belongs to four categories: Artificial intelligence, Cybernetics, Formal sciences & Technology in society

  19. Wikipedia as Ontology • Moreover, Wikipedia has a well-formed structure • Each article only describes a single concept • The title of the article is a short and well-formed phrase like a term in a traditional thesaurus. • Equivalent concepts are grouped together by redirected links. • It contains a hierarchical categorization system, in which each article belongs to at least one category. • Polysemous concepts are disambiguated by Disambiguation Pages.

  20. The different meanings that Artificial intelligence may refer to are listed in its disambiguation page.
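The structural features listed in slides 13 to 19 can be pictured as simple lookup tables. The following is only a minimal sketch: the table names and the `resolve` and `categories_of` helpers are hypothetical, and the data is limited to what slides 16 and 18 state (the AI redirect and the four categories).

```python
# Redirect table: alternative title -> canonical article title (slide 16).
REDIRECTS = {"AI": "Artificial intelligence"}

# Category table: article title -> categories it belongs to (slide 18).
CATEGORIES = {
    "Artificial intelligence": [
        "Artificial intelligence",
        "Cybernetics",
        "Formal sciences",
        "Technology in society",
    ],
}


def resolve(title, redirects):
    """Follow redirect links until a non-redirect title is reached."""
    seen = set()  # guards against redirect cycles
    while title in redirects and title not in seen:
        seen.add(title)
        title = redirects[title]
    return title


def categories_of(title):
    """Return the categories of the article that `title` resolves to."""
    return CATEGORIES.get(resolve(title, REDIRECTS), [])


print(resolve("AI", REDIRECTS))
print(categories_of("AI"))
```

Resolving redirects first means that equivalent surface forms share one concept entry, which is what lets the titles behave like terms in a thesaurus.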

  21. SEMANTIC NETWORK KNOWLEDGE IN WIKIPEDIA • Taxonomic information: category structure • Attributes: infobox, text

  22. Wikipedia category network

  23. Deriving a taxonomy from Wikipedia (AAAI 2007) • Start with the category tree

  24. Deriving a taxonomy from Wikipedia (AAAI 2007) • Induce a subsumption hierarchy

  25. INFOBOXES • Collaborative content • Semi-structured data {{Infobox Writer | bgcolour = silver | name = Edgar Allan Poe | image = Edgar_Allan_Poe_2.jpg | caption = This [[daguerreotype]] of Poe was taken in 1848 ... | birth_date = {{birth date|1809|1|19|mf=y}} | birth_place = [[Boston, Massachusetts]] [[United States|U.S.]] | death_date = {{death date and age|1849|10|07|1809|01|19}} | death_place = [[Baltimore, Maryland]] [[United States|U.S.]] | occupation = Poet, short story writer, editor, literary critic | movement = [[Romanticism]], [[Dark romanticism]] | genre = [[Horror fiction]], [[Crime fiction]], [[Detective fiction]] | magnum_opus = The Raven | spouse = [[Virginia Eliza Clemm Poe]] ...
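Because infoboxes are semi-structured, their fields can be pulled out with very little machinery. The sketch below is illustrative only: it assumes one `| key = value` field per line, uses a shortened copy of the Poe infobox from the slide, and the helper names `parse_infobox` and `wiki_links` are hypothetical. Real infobox extraction has to handle nested templates and multi-line values, which this does not.

```python
import re

# Shortened copy of the infobox wikitext shown on the slide,
# laid out with one field per line (an assumption of this sketch).
INFOBOX = """{{Infobox Writer
| name = Edgar Allan Poe
| birth_date = {{birth date|1809|1|19|mf=y}}
| birth_place = [[Boston, Massachusetts]] [[United States|U.S.]]
| movement = [[Romanticism]], [[Dark romanticism]]
| magnum_opus = The Raven
}}"""


def parse_infobox(wikitext):
    """Return a dict mapping each field name to its raw wikitext value."""
    fields = {}
    for line in wikitext.splitlines():
        match = re.match(r"\|\s*(\w+)\s*=\s*(.*)", line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def wiki_links(value):
    """Extract link targets from [[Target]] or [[Target|label]] markup."""
    return re.findall(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]", value)


fields = parse_infobox(INFOBOX)
print(fields["name"])
print(wiki_links(fields["birth_place"]))
```

The link targets are the useful part for knowledge extraction, since each one names another Wikipedia article and therefore another concept.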

  26. DBPEDIA DBpedia.org is an effort to: • extract structured information from Wikipedia • make this information available on the Web under an open license • interlink the DBpedia dataset with other datasets on the Web

  27. The DBpedia Dataset • 1,600,000 concepts, including 58,000 persons, 70,000 places, 35,000 music albums, and 12,000 films • described by 91 million triples • using 8,141 different properties • 557,000 links to pictures • 1,300,000 links to external web pages • 207,000 Wikipedia categories • 75,000 YAGO categories

  28. REPRESENTING EXTRACTED INFORMATION The DBpedia.org project uses the Resource Description Framework (RDF) as a flexible data model for representing extracted information and for publishing it on the Web. It uses the SPARQL query language to query this data. The Developers Guide to Semantic Web Toolkits lists development toolkits for processing DBpedia data in your preferred programming language.

  29. Extracting Infobox Data (RDF Representation): http://en.wikipedia.org/wiki/Calgary http://dbpedia.org/resource/Calgary dbpedia:native_name "Calgary"; dbpedia:altitude "1048"; dbpedia:population_city "988193"; dbpedia:population_metro "1079310"; mayor_name dbpedia:Dave_Bronconnier ; governing_body dbpedia:Calgary_City_Council; ...
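The Calgary example can be read as a set of subject, predicate, object triples. The sketch below stores the values from the slide as plain Python tuples and adds a hypothetical `match` helper that treats `None` as a wildcard. It is a toy stand-in for an RDF store, not how DBpedia itself stores or serves data.

```python
SUBJECT = "http://dbpedia.org/resource/Calgary"

# (subject, predicate, object) triples, copied from the slide as written.
TRIPLES = [
    (SUBJECT, "dbpedia:native_name", "Calgary"),
    (SUBJECT, "dbpedia:altitude", "1048"),
    (SUBJECT, "dbpedia:population_city", "988193"),
    (SUBJECT, "dbpedia:population_metro", "1079310"),
    (SUBJECT, "mayor_name", "dbpedia:Dave_Bronconnier"),
    (SUBJECT, "governing_body", "dbpedia:Calgary_City_Council"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None matches anything."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


print(match(TRIPLES, predicate="dbpedia:altitude"))
```

Matching triple patterns with wildcards is also the basic idea behind a SPARQL `WHERE` clause, where variables play the role of `None`.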

  30. SPARQL : • SPARQL is a query language for RDF. • RDF is a directed, labeled graph data format for representing information in the Web. • This specification defines the syntax and semantics of the SPARQL query language for RDF. • SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware.

  31. The DBpedia SPARQL Endpoint • http://dbpedia.org/sparql • hosted on an OpenLink Virtuoso server • can answer SPARQL queries like: • Give me all sitcoms that are set in NYC • All tennis players from Moscow • All films by Quentin Tarantino • All German musicians who were born in Berlin in the 19th century
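As a rough illustration of the "all films by Quentin Tarantino" question, the sketch below writes one possible SPARQL query and encodes it as an HTTP GET URL for the endpoint named on the slide. The vocabulary (`dbo:director`, `dbr:Quentin_Tarantino`) is an assumption based on DBpedia's current naming and may not match the dataset version described in the lecture. The helper `build_request_url` is hypothetical, and no network request is made.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named on the slide

# Assumed vocabulary: dbo:director and dbr:Quentin_Tarantino.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?film WHERE {
  ?film dbo:director dbr:Quentin_Tarantino .
}
"""


def build_request_url(endpoint, query):
    """Encode a SPARQL query in the 'query' parameter of a GET URL."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, QUERY)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(url)
print(decoded == QUERY)
```

Fetching the URL with any HTTP client would return the result bindings for `?film`, in a format that depends on the server's defaults and the request headers.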

  32. WEB COLLABORATION FOR KNOWLEDGE ACQUISITION • Efforts such as Wikipedia indicate that many Web surfers may be willing to participate in collective resource-producing efforts • Other initiatives: Citizen Science, Cognition and Language Laboratory, … • This has been taken advantage of in AI: • Open Mind Commonsense (Singh), for collecting facts • Semantic Wikis www.phrasedetectives.com

  33. WEB COLLABORATION PROJECTS • Open Mind Common Sense – Singh • Crater mapping (results) – Kanefsky • Learner / Learner2 / 1001 Paraphrases – Chklovski • FACTory – Cycorp • Hot or Not – 8 Days • ESP / Phetch / Verbosity / Peekaboom – von Ahn • Galaxy Zoo – Oxford University

  34. OPEN MIND COMMONSENSE • A project started in 2000 by Push Singh to take advantage of people's collaboration to collect commonsense knowledge

  35. WHAT’S IN OPEN MIND COMMONSENSE: CAR

  36. OPEN MIND COMMONSENSE: ADDING KNOWLEDGE

  37. OMCS ADDING KNOWLEDGE, 2

  38. OPEN MIND COMMONSENSE: CHECKING KNOWLEDGE

  39. FROM OPENMIND COMMONSENSE TO CONCEPT NET • ConceptNet (Havasi et al., 2009) is a semantic network extracted from OpenMind Commonsense assertions using simple heuristics

  40. CONCEPT NET

  41. FROM OPENMIND COMMONSENSE FACTS TO CONCEPTNET • A lime is a very sour fruit • isa(lime,fruit) • property_of(lime,very_sour)
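One way to picture such a simple heuristic is a single pattern over sentences of the form "A X is a (modifiers) Y". The sketch below is a hypothetical illustration, not the actual ConceptNet extraction rules: the regular expression and the `extract_assertions` helper are assumptions made for this example.

```python
import re

# Hypothetical pattern: "A/An <concept> is a/an <optional modifiers> <head>".
PATTERN = re.compile(r"^an? (\w+) is an? ((?:\w+ )*)(\w+)\.?$", re.IGNORECASE)


def extract_assertions(sentence):
    """Turn one matching sentence into isa / property_of assertions."""
    m = PATTERN.match(sentence.strip())
    if not m:
        return []  # sentence does not fit the pattern
    concept = m.group(1).lower()
    modifiers = m.group(2).strip().lower()
    head = m.group(3).lower()
    assertions = [f"isa({concept},{head})"]
    if modifiers:
        # Join multi-word modifiers with underscores, e.g. "very sour".
        assertions.append(f"property_of({concept},{modifiers.replace(' ', '_')})")
    return assertions


print(extract_assertions("A lime is a very sour fruit"))
```

A real system would need many such patterns, one per sentence template, plus normalization of the extracted words.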

  42. GAMES WITH A PURPOSE • Luis von Ahn pioneered a new approach to resource creation on the Web: GAMES WITH A PURPOSE, or GWAP, in which people, as a side effect of playing, perform tasks ‘computers are unable to perform’ (sic)

  43. GWAP vs OPEN MIND COMMONSENSE vs MECHANICAL TURK • GWAP do not rely on altruism or financial incentives to entice people to perform certain actions • The key property of games is that PEOPLE WANT TO PLAY THEM

  44. EXAMPLES OF GWAP • Games at www.gwap.com • ESP • Verbosity • TagATune • Other games • Peekaboom • Phetch

  45. ESP • The first GWAP, developed by von Ahn and his group (2003 / 2004) • The problem: obtain accurate descriptions of images, to be used • To train image search engines • To develop machine learning approaches to vision • The goal: label the majority of the images on the Web

  46. ESP: the game
