Semantic Web and Linked Data. By Harsh Pareek 07005007 Raman Sharma 07005010 Sumit Somani 07005012 Shiv Shankar 07005026. Outline. Outline Motivation Semantic Web: History and development Linked Open Data Linked Open Data Technologies DBpedia : An example of LOD Accessing LOD.
Semantic Web and Linked Data By Harsh Pareek 07005007 Raman Sharma 07005010 Sumit Somani 07005012 Shiv Shankar 07005026
Outline • Outline • Motivation • Semantic Web: History and development • Linked Open Data • Linked Open Data Technologies • DBpedia: An example of LOD • Accessing LOD
Motivation • Limitations of NLP • “In 2003, President of US ordered Iraq invasion. George believed it to be a great decision.” • How do we know that “George” here refers to George Bush, who was then President of the US? • We know it from world knowledge • The Semantic Web supplies this world knowledge to machines and so helps in processing language. In this case, it resolves the co-reference.
Motivation • Query: “List all phones which have a battery life of 12 hours and cost less than Rs. 10000” • The answer may not be stated explicitly anywhere on the web • But the information on the web is enough to answer this query; the lack of structured data is the bottleneck • We need to represent information about phones in a database-like format and run SQL-like queries over it
Motivation • Query: “List all phones which have a battery life of 12 hours and cost less than Rs. 10000” • Many valid pages may not contain the word “10000”, but we should be able to infer that the price is < 10000 • The Semantic Web could be used to build such query systems.
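A minimal sketch of what the phone query looks like once the data is structured. The phone records, field names and the `find_phones` helper are all hypothetical and invented for illustration; the point is only that the numeric comparison "price < 10000" becomes trivial when the price is a typed value rather than a word on a page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phone:
    name: str
    battery_hours: float
    price_rs: int


# Hypothetical records; names and values are invented for illustration.
PHONES = [
    Phone("Phone A", 12, 8500),
    Phone("Phone B", 12, 14000),
    Phone("Phone C", 8, 6000),
    Phone("Phone D", 12, 9999),
]


def find_phones(phones, battery_hours, max_price_rs):
    """Return names of phones with the given battery life costing less than max_price_rs."""
    return [
        p.name
        for p in phones
        if p.battery_hours == battery_hours and p.price_rs < max_price_rs
    ]


matches = find_phones(PHONES, battery_hours=12, max_price_rs=10000)
print(matches)
```

Note that "Phone D" is returned even though the string "10000" never appears in its record, which is exactly the inference a keyword search cannot make.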
Motivation Original Doctor Appointment: Thu 9:00-10:00am New Constraint: Thu 9:30 am onwards • If data on the web is stored in a semantic format, a machine could detect the conflict and search for other free slots with the same doctor or for similar doctors. • Tim Berners-Lee calls this “optimization”. The Semantic Web could make the web smarter, machine-usable and more accurate.
Semantic Web “The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” [1] Courtesy: Berners-Lee, T., Hendler, J., Lassila, O. (2001) The Semantic Web. Scientific American 284(5):34-43 Pic Courtesy: wikipedia.org, www.mechanicsnationalbank.com
Semantic Web: History Courtesy : http://novaspivack.typepad.com/nova_spivacks_weblog/2007/10/web-30----the-a.html
Ontology • An ontology is a formal representation of knowledge as a set of concepts within a domain and the relationships among those concepts. • We need ontologies for: • Sharing a common understanding of information • Reuse of domain knowledge • Making domain assumptions explicit
Ontology • Camera Ontology Courtesy : Minsoo Kim, Minkoo Kim: Developing Protégé Plug-in: OWL Ontology Visualization using Social Network. JIPS 4(2): 61-66 (2008)
Ontology • Examples: • WordNet • FOAF (Friend of a Friend) • Gene Ontology • Geopolitical ontology • ThoughtTreasure ontology • Cyc • Jamendo • Customer Complaint Ontology Courtesy : http://en.wikipedia.org/wiki/Ontology_(information_science)#Examples_of_published_ontologies
From Ontology to Linked Data • Ontologies are domain-specific • But semantic search needs all ontologies to be used together • How can we use all the available ontologies? • The answer is to link all of them together, creating a meta-ontology Courtesy : http://en.wikipedia.org/wiki/Ontology_(information_science)#Examples_of_published_ontologies
Linked Open Data • A way of linking these ontologies so as to • encourage reuse • reduce redundancy • maximize inter-connectedness • enable network effects to add value to data
Linked Open Data Technology (1/2) • URI (Uniform Resource Identifier) -> The unique name by which something is referred to • HTTP (Hypertext Transfer Protocol) -> Provides the basic access mechanism, using the WWW for lookup • RDF (Resource Description Framework) -> Data format to describe relationships among entities • OWL (Web Ontology Language) -> Provides a common understanding of concepts, aiding reasoning
Linked Open Data Technology (2/2) • Use URIs as unique names for things – anything, not just web pages – all kinds of information resources • Use HTTP URIs – provides globally unique names – allows using the existing web for lookup • Encode useful information in RDF – when servicing a URI lookup • Include RDF links to other URIs – enables discovery of related information • Encode further information using OWL – enables reasoning about information across domains
RDF - OWL <rdf:Description rdf:about="subject"> <predicate rdf:resource="object"/> <predicate>literal value</predicate> </rdf:Description> Courtesy : http://www.linkeddatatools.com/introducing-rdf
RDF - OWL Courtesy : http://www.linkeddatatools.com/introducing-rdf-part-2
RDF – OWL : An Example (1/3) <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:feature="http://www.linkeddatatools.com/clothing-features#"> <rdf:Description rdf:about="http://www.linkeddatatools.com/clothes#t-shirt"> <feature:size>12</feature:size> <feature:color rdf:resource="http://www.linkeddatatools.com/colors#white"/> </rdf:Description> </rdf:RDF> Courtesy : http://www.linkeddatatools.com/introducing-rdf-part-2
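To make the subject-predicate-object reading of this example concrete, here is a rough sketch that pulls the triples out of the t-shirt document using only Python's standard XML parser. It is not a full RDF/XML parser: the `extract_triples` helper is hypothetical and assumes the simple shape shown on the slide (one level of `rdf:Description` elements with `rdf:about`, and properties that carry either a literal or an `rdf:resource`).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:feature="http://www.linkeddatatools.com/clothing-features#">
  <rdf:Description rdf:about="http://www.linkeddatatools.com/clothes#t-shirt">
    <feature:size>12</feature:size>
    <feature:color rdf:resource="http://www.linkeddatatools.com/colors#white"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for simple rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like "{namespace}local"; namespace + local
            # gives the full predicate URI.
            namespace, local = prop.tag[1:].split("}")
            predicate = namespace + local
            resource = prop.get(f"{{{RDF_NS}}}resource")
            # An rdf:resource attribute means the object is another URI;
            # otherwise the element text is a literal value.
            obj = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


triples = extract_triples(RDF_XML)
for triple in triples:
    print(triple)
```

The two resulting triples say that the t-shirt has size "12" (a literal) and color `colors#white` (a link to another resource).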
RDF – OWL : An Example (2/3) Courtesy: NeonTool
RDF - OWL: An Example (3/3) • Critique • Aka Semantic Modelling • Requires human intelligence • Difficult for machines to do <owl:Class rdf:ID="SpaceTimeThing"> <rdfs:label xml:lang="en">things in our time and space</rdfs:label> <rdfs:comment xml:lang="en">A specialisation of #$SpatialThing and #$TemporalThing. A collection of things that physically exist in our universe.</rdfs:comment> <rdfs:subClassOf rdf:resource="#SpatialThing"/> <rdfs:subClassOf rdf:resource="#TemporalThing"/> </owl:Class> Courtesy: http://www.qrst.de/ontology/owl.xml Pic Courtesy: http://www.pctechs.biz, www.thedoublethink.com
Linked Open Data Courtesy: http://linkeddata.org/
DBpedia • Wikipedia contains structured information such as • "infobox" tables • categorisation information • images • geo-coordinates • links to external Web pages • DBpedia lets us treat Wikipedia as a database which can be queried Courtesy: http://en.wikipedia.org/wiki/DBpedia
Infobox Courtesy: http://en.wikipedia.org/wiki/Sachin_Tendulkar
DBpedia • Contains: • 3.4 million things • Abstracts in up to 92 different languages • 1,460,000 links to images • 5,543,000 links to external web pages • 4,887,000 external links into other RDF datasets • 565,000 Wikipedia categories
How to access Linked Data: Querying DBpedia • Offline: Linked Open Data Crawl • Billion Triple Challenge Dataset • SPARQL PREFIX dbprop: <http://dbpedia.org/property/> PREFIX db: <http://dbpedia.org/resource/> SELECT ?who ?work ?genre WHERE { db:Tokyo_Mew_Mew dbprop:illustrator ?who . ?work dbprop:author ?who . OPTIONAL { ?work dbprop:genre ?genre } . }
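The way this query is evaluated can be sketched with a toy triple-pattern matcher. This is only an illustration of the idea of binding `?variables` against triples, not how a real SPARQL engine works internally. The sample triples (`db:Artist_X`, `db:Work_1` and so on) are invented placeholders, not actual DBpedia content, and the helper names are hypothetical.

```python
# Invented sample data; these are NOT real DBpedia triples.
TRIPLES = [
    ("db:Tokyo_Mew_Mew", "dbprop:illustrator", "db:Artist_X"),
    ("db:Work_1", "dbprop:author", "db:Artist_X"),
    ("db:Work_1", "dbprop:genre", "db:Fantasy"),
    ("db:Work_2", "dbprop:author", "db:Artist_X"),
    ("db:Work_3", "dbprop:author", "db:Artist_Y"),
]


def match_pattern(pattern, triple, bindings):
    """Unify one triple pattern with one triple; return extended bindings or None."""
    new_bindings = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must agree with any value it was bound to earlier.
            if term in new_bindings and new_bindings[term] != value:
                return None
            new_bindings[term] = value
        elif term != value:
            return None
    return new_bindings


def match_all(patterns, triples, bindings=None):
    """Yield every variable binding that satisfies all patterns together."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        extended = match_pattern(first, triple, bindings)
        if extended is not None:
            yield from match_all(rest, triples, extended)


def run_query(triples):
    """Mimic the slide's query: two required patterns plus an OPTIONAL genre."""
    required = [
        ("db:Tokyo_Mew_Mew", "dbprop:illustrator", "?who"),
        ("?work", "dbprop:author", "?who"),
    ]
    optional = ("?work", "dbprop:genre", "?genre")
    rows = []
    for binding in match_all(required, triples):
        with_genre = list(match_all([optional], triples, binding))
        # OPTIONAL keeps the row even when no genre triple exists.
        for row in with_genre or [binding]:
            rows.append((row["?who"], row["?work"], row.get("?genre")))
    return rows


rows = run_query(TRIPLES)
for row in rows:
    print(row)
```

The second result row has no genre, which shows what OPTIONAL buys: works without a genre triple are still returned instead of being dropped.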
SPARQL Courtesy: http://dbpedia.org/sparql
Document Web vs Linked Data
Web of Linked Documents • A global filesystem • Human usage • Primary objects: documents • Links between documents • Low degree of structure • Implicit semantics of content and links
Web of Linked Data • A global database • Machine interpretation • Primary objects: entities or things • Links between entities • High degree of structure • Explicit semantics of content and links
Conclusion • Imposing structure and standards on available information increases its usability and value • As the Semantic Web spreads, it would become priceless, allowing machines to analyze all the data on the Web – the content, links, and even transactions between people and computers • Searching over all of Linked Data is possible but, at the current stage, not effective. As the structure becomes larger and more widely accepted, it would become easier • Ontology creation still requires human intelligence. But by the "bolstering human intelligence" definition of AI, we could win the battle
References • [1] Berners-Lee, T., Hendler, J., Lassila, O. (2001) The Semantic Web. Scientific American 284(5):34-43 • Bizer, C., Heath, T., Berners-Lee, T. Linked Data – The Story So Far. IJSWIS • http://linkeddata.org/
Further Reading • NLP and the Semantic Web: http://www.csc.villanova.edu/~nlp/pres1/presentation.pdf • Proceedings of the NLP4SW conference: http://www.dcs.shef.ac.uk/~diana/courses/lrec-nlp-semweb-tutorial.html
Ontology Learning • Semantic annotation – annotate in the texts all mentions of instances relating to concepts in the ontology • Ontology learning – automatically derive an ontology from texts • Ontology population – given an ontology, populate the concepts with instances derived automatically from a text
Ontology Learning: Hearst Patterns [1992] • such NP as {NP,}* {or|and} NP • “such games as baseball and cricket” • NP {,NP}* {,} {and|or} other NP • “rabbits and other animals” • But, “rabbits and other pets” • NP {,} including {NP,}* {or|and} NP • “fruits including apples and pears” • NP {,} especially {NP,}* {or|and} NP • “Europeans, especially Italians” • But, “US Presidents, especially democrats” • Extended by newer systems such as KnowItAll
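The four patterns above can be approximated with regular expressions. The sketch below makes a strong simplifying assumption: a noun phrase (NP) is a single word, whereas a real system would use a chunker or parser to find NPs. The `extract_is_a` helper and the regexes are illustrative, not Hearst's original implementation, and lists with a comma before "and" are not handled.

```python
import re

# Simplifying assumption: a noun phrase is one word.
NP = r"\w+"
# "A", "A and B", "A, B or C" (no comma before and/or).
NP_LIST = rf"{NP}(?:, {NP})*(?: (?:and|or) {NP})?"

PATTERNS = [
    re.compile(rf"\bsuch (?P<hyper>{NP}) as (?P<hypos>{NP_LIST})", re.I),
    re.compile(rf"\b(?P<hypos>{NP}(?:, {NP})*),? (?:and|or) other (?P<hyper>{NP})", re.I),
    re.compile(rf"\b(?P<hyper>{NP}),? including (?P<hypos>{NP_LIST})", re.I),
    re.compile(rf"\b(?P<hyper>{NP}),? especially (?P<hypos>{NP_LIST})", re.I),
]


def extract_is_a(text):
    """Return (hyponym, hypernym) pairs found by the four patterns."""
    pairs = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            # Split the hyponym list on commas and on "and"/"or".
            for hyponym in re.split(r", | (?:and|or) ", match.group("hypos")):
                pairs.append((hyponym, match.group("hyper")))
    return pairs


for phrase in [
    "such games as baseball and cricket",
    "rabbits and other animals",
    "fruits including apples and pears",
    "Europeans, especially Italians",
]:
    print(phrase, "->", extract_is_a(phrase))
```

The same code would also accept “rabbits and other pets” and “US Presidents, especially democrats”, which is the weakness the "But" bullets point out: the patterns are purely lexical and cannot tell a reliable is-a relation from a context-dependent one.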
NLP for Semantic Web So how does Natural language processing fit in? • Semantic Web requires machine-interpretable semantics in order to process textual information on the internet • Natural language processing is vital to the success of the semantic web because it is the method of communication between humans and software agents • Parsing, knowledge representation, information extraction, disambiguation, term recognition and semantic analysis are used in many semantic web technologies
NLP for Semantic Web • Linked Open Data is mostly academic and volunteer work • Converting the current snapshot of the web to Semantic Web requires effort and time • This is disregarding the fact that the Web is growing at very high rates • Semi-automated mechanisms using NLP techniques are required to keep up with the increasing content
Semantic Web for NLP • Entity Disambiguation • Word Sense Disambiguation using ontologies • Adds context to information • Allows using a richer lexicon • Uses world knowledge • E.g. “Senator Green gave the green light for the green bill in parliament” • E.g. “Moses led the Jews to the banks of Jordan”
Semantic Web for NLP • Question Answering • “Sir Edward Heath died from pneumonia” • Sir Edward Heath -> UK Prime Minister->politician • Died from -> killed by • Pneumonia->disease • “Has a politician died of a lung disease?”
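The inference chain on this slide can be sketched as a walk up a small class hierarchy. Everything in the block is a toy model: the class names, the `diedFrom` predicate and the helper functions are hypothetical, and the link from pneumonia to "lung disease" is an added assumption, since the slide only maps pneumonia to "disease".

```python
# Toy ontology: each class has at most one direct superclass.
SUBCLASS_OF = {
    "UKPrimeMinister": "Politician",
    "Pneumonia": "LungDisease",  # assumption: the slide only says "disease"
    "LungDisease": "Disease",
}
INSTANCE_OF = {"Sir_Edward_Heath": "UKPrimeMinister"}

# Fact extracted from "Sir Edward Heath died from pneumonia".
FACTS = [("Sir_Edward_Heath", "diedFrom", "Pneumonia")]


def ancestors(cls):
    """Return the class itself plus every superclass reachable via SUBCLASS_OF."""
    seen = []
    while cls is not None and cls not in seen:
        seen.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return seen


def is_a(entity, cls):
    """True if the entity (an instance or a class) falls under cls."""
    start = INSTANCE_OF.get(entity, entity)
    return cls in ancestors(start)


def has_died_of(person_class, cause_class):
    """Answer "Has a <person_class> died of a <cause_class>?" with supporting facts."""
    return [
        (subject, obj)
        for subject, predicate, obj in FACTS
        if predicate == "diedFrom"
        and is_a(subject, person_class)
        and is_a(obj, cause_class)
    ]


answers = has_died_of("Politician", "LungDisease")
print(answers)
```

The question never mentions Heath or pneumonia; the match comes entirely from the class links, which is the world knowledge the ontology contributes.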
Would Web Search + NLP win Jeopardy? Source: Stephen Wolfram’s Blog (http://blog.stephenwolfram.com/2011/01/jeopardy-ibm-and-wolframalpha/)