1 / 101

Semantic Web, owl & Protégé

Matthew J Wood CS 570 – Topics in Artificial Intelligence Spring 2013. Semantic Web, owl & Protégé. Today’s Web. Most of today’s Web content is suitable for human consumption

Download Presentation

Semantic Web, owl & Protégé

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Matthew J Wood CS 570 – Topics in Artificial Intelligence Spring 2013 Semantic Web, owl & Protégé

  2. Today’s Web • Most of today’s Web content is suitable for human consumption • Even Web content that is generated automatically from databases is usually presented without the original structural information found in databases • Typical Web uses today people’s • seeking and making use of information, searching for and getting in touch with other people, reviewing catalogs of online stores and ordering products by filling out forms

  3. Keyword-Based Search Engines • Current Web activities are not particularly well supported by software tools • Except forkeyword-based search engines(e.g. Google, AltaVista, Yahoo) • The Web would not have been the huge success it was, were it not for search engines

  4. Problems of Keyword-Based Search Engines • High recall, low precision. • Low or no recall • Results are highly sensitive to vocabulary • Results are single Web pages • Human involvement is necessary to interpret and combine results • Results of Web searches are not readily accessible by other software tools

  5. The Key Problem of Today’s Web • The meaning of Web content is not machine-accessible: lack of semantics • It is simply difficult to distinguish the meaning between these two sentences: I am a professor of computer science. I am a professor of computer science, you may think. Well, . . .

  6. The Semantic Web Approach • Represent Web content in a form that is more easily machine-processable. • Use intelligent techniques to take advantage of these representations. • The Semantic Web will gradually evolve out of the existing Web, it is not a competition to the current WWW

  7. What is Semantic Web? • Semantic web is a term used more specifically to refer to format and technologies that enable it. • It defines the collection, structuring and recovery of linked data that are enabled by the technologies that provide a formal description of concepts, terms, and relationships within a given knowledge domain.

  8. What is Semantic Web? • Sematic Web is the extension of the World Wide Web that enables people to share content beyond the boundaries of applications and websites • Semantic web was term coined by Tim Berners-Lee, the inventor of World Wide Web and the director of the World Wide Web Consortium. • He defines it as “a web of data that can be processed directly and indirectly by machines .

  9. Why Semantic Web? • Semantic Web is about common formats for integration and combination of data from diverse sources. • Secondly, it is about language for recording the relationship data and the real world objects.

  10. Semantic Web Components • Resource Description Framework (RDF) • RDF Schema (RDFS) • Simple Knowledge Organization System (SKOS) • SPARQL, an RDF query language • Notation3 (N3) • N-Triples • Turtle (Terse RDF Triple Language) • Web Ontology Language (OWL)

  11. Explicit Metadata • This representation is far more easily processable by machines • Metadata: data about data • Metadata capture part of the meaning of data • Semantic Web does not rely on text-based manipulation, but rather on machine-processable metadata

  12. On HTML • Web content is currently formatted for human readers rather than programs • HTML is the predominant language in which Web pages are written (directly or using tools) • Vocabulary describes presentation

  13. An HTML Example <h1>Agilitas Physiotherapy Centre</h1> Welcome to the home page of the Agilitas Physiotherapy Centre. Do you feel pain? Have you had an injury? Let our staff Lisa Davenport, Kelly Townsend (our lovely secretary) and Steve Matthews take care of your body and soul. <h2>Consultation hours</h2> Mon 11am - 7pm<br> Tue 11am - 7pm<br> Wed 3pm - 7pm<br> Thu 11am - 7pm<br> Fri 11am - 3pm<p> But note that we do not offer consultation during the weeks of the <a href=". . .">State Of Origin</a> games.

  14. Problems with HTML • Humans have no problem with this • Machines (software agents) do: • How distinguish therapists from the secretary, • How determine exact consultation hours • They would have to follow the link to the State Of Origin games to find when they take place.

  15. XML (eXtended Markup Language) • A better representation of data • XML is a flexible text format that is widely used to structure, store, and transport data. • XML is different from HTML because it is not about displaying data. • In XML (differently from HTML) you create your own tags to annotate data. • XML is used to create other languages such as: XHTML, RSS, RDF, OWL, etc.

  16. An XML Example <bookstore> <book category="COOKING"> <title lang="en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price> </book> <book category="CHILDREN"> <title lang="en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> </bookstore>

  17. RDF (Resource Description Framework) • RDF: a standard for describing resources on the Web • The meaning of data is encoded in sets of triples. • Triples are “subject, predicate, object” statements. • Each element of a triple is identified by a URI. • URIs represent both resources and relations. • RDF is written in XML • RDF is to Semantic Web what HTML was to the Web. Harry Potter has as author J. K. Rowling.

  18. An RDF Example http://en.wikipedia.org/wiki/J._K._Rowling dc:creator http://en.wikipedia.org/wiki/ Harry_Potter <rdf:RDF xmlns:rdf=http://www.w3.org/1999/02/22-rdf-syntax-ns# xmlns:dc=http://purl.org/dc/elements/1.1/> <rdf:Description rdf:about=“http://en.wikipedia.org/wiki/Harry_Potter”> <dc:creator=“http://en.wikipedia.org/wiki/J._K._Rowling”> </rdf:Description> </rdf:RDF>

  19. Ontologies and OWL • An ontology is an explicit description of things and their relations. • OWL serves to write ontologies for the Web. • OWL is written in XML and built on top of RDF. • You can think of OWL as an object-oriented language that defines classes, hierarchy of classes, attributes, relations, etc. • OWL is designed to support inference (subsumption and classification) • OWL is more expressive than RDF.

  20. Typical Components of Ontologies • Terms denote important concepts (classes of objects) of the domain • e.g. professors, staff, students, courses, departments • Relationships between these terms: typically class hierarchies • a class C to be a subclass of another class C' if every object in C is also included in C' • e.g. all professors are staff members

  21. Further Components of Ontologies • Properties: • e.g. X teaches Y • Value restrictions • e.g. only faculty members can teach courses • Disjointness statements • e.g. faculty and general staff are disjoint • Logical relationships between objects • e.g. every department must include at least 10 faculty

  22. Example of a Class Hierarchy

  23. The Role of Ontologies on the Web • Ontologies provide a shared understanding of a domain: semantic interoperability • overcome differences in terminology • mappings between ontologies • Ontologies are useful for the organization and navigation of Web sites

  24. The Role of Ontologies in Web Search • Ontologies are useful for improving the accuracy of Web searches • search engines can look for pages that refer to a precise concept in an ontology • Web searches can exploit generalization/specialization information • If a query fails to find any relevant documents, the search engine may suggest to the user a more general query. • If too many answers are retrieved, the search engine may suggest to the user some specializations.

  25. Web Ontology Languages (2) OWL • A richer ontology language • relations between classes • e.g., disjointness • cardinality • e.g. “exactly one” • richer typing of properties • characteristics of properties (e.g., symmetry)

  26. Software Agents • Software agents work autonomously and proactively • They evolved out of object oriented and compontent-based programming • A personal agent on the Semantic Web will: • receive some tasks and preferences from the person • seek information from Web sources, communicate with other agents • compare information about user requirements and preferences, make certain choices • give answers to the user

  27. Intelligent Personal Agents A Semantic Web Primer

  28. Semantic Web Agent Technologies • Metadata • Identify and extract information from Web sources • Ontologies • Web searches, interpret retrieved information • Communicate with other agents • Logic • Process retrieved information, draw conclusions

  29. Drawbacks of XML • XML is a universal metalanguage for defining markup • It provides a uniform framework for interchange of data and metadata between applications • However, XML does not provide any means of talking about the semantics (meaning) of data • E.g., there is no intended meaning associated with the nesting of tags • It is up to each application to interpret the nesting.

  30. Requirements for Ontology Languages • Ontology languages allow users to write explicit, formal conceptualizations of domain models • The main requirements are: • a well-defined syntax • efficient reasoning support • a formal semantics • sufficient expressive power • convenience of expression

  31. Tradeoff between Expressive Power and Efficient Reasoning Support • The richer the language is, the more inefficient the reasoning support becomes • Sometimes it crosses the border of noncomputability • We need a compromise: • A language supported by reasonably efficient reasoners • A language that can express large classes of ontologies and knowledge.

  32. Reasoning About Knowledge in Ontology Languages • Class membership • If x is an instance of a class C, and C is a subclass of D, then we can infer that x is an instance of D • Equivalence of classes • If class A is equivalent to class B, and class B is equivalent to class C, then A is equivalent to C, too

  33. Reasoning About Knowledge in Ontology Languages (2) • Consistency • X instance of classes A and B, but A and B are disjoint • This is an indication of an error in the ontology • Classification • Certain property-value pairs are a sufficient condition for membership in a class A; if an individual x satisfies such conditions, we can conclude that x must be an instance of A

  34. Uses for Reasoning • Reasoning support is important for • checking the consistency of the ontology and the knowledge • checking for unintended relationships between classes • automatically classifying instances in classes • Checks like the preceding ones are valuable for • designing large ontologies, where multiple authors are involved • integrating and sharing ontologies from various sources

  35. Reasoning Support for OWL • Semantics is a prerequisite for reasoning support • Formal semantics and reasoning support are usually provided by • mapping an ontology language to a known logical formalism • using automated reasoners that already exist for those formalisms • OWL is (partially) mapped on a description logic, and makes use of reasoners such as FaCT and RACER • Description logics are a subset of predicate logic for which efficient reasoning support is possible

  36. Three Species of OWL • W3C’sWeb Ontology Working Group defined OWL as three different sublanguages: • OWL Full • OWL DL • OWL Lite • Each sublanguage geared toward fulfilling different aspects of requirements

  37. OWL DL • OWL DL is so named due to its correspondence with description logic. • Description logic (DL) is a family of formal knowledge representation languages. It is more expressive than propositional logic • It models concepts, roles and individuals, and their relationships • It is used in AI for formal reasoning on the concept of application domain known as Terminological knowledge

  38. Protege Tutorial

  39. What is protege? • Protege is a free, open-source platform to construct domain models and knowledge-based applications with ontologies. • Ontologies range from taxonomies, classifications, database schemas to fully axiomatized theories. • Ontologies are now central to many applications such as scientific knowledge portals, information management and integration systems, electronic commerce and web services

  40. What can you do with Protégé? • Create a new ontology from scratch • Download and extend an existing ontology • Export ontologies in a variety of formats • OWL • RDF • XML Schema

  41. How it works • Objects in the domain are expressed through a series of interrelated classes • Class hierarchy is similar to that used by object-oriented languages such as Java • Superclasses • Subclasses • Sibling Classes • Ancestor Classes, etc. • Heavy reliance on inheritance • Unlike Java, Protégé supports multiple inheritance

  42. Individuals and Properties • Individual domain objects are expressed as class members • Class members have object properties that relate them to members of the same class or other classes • Also have data properties • Expressed by data type – integer, String, etc. • If particular data property does not fit a predefined data type it can be entered as text • Ex. Members of the class Soccer_Players have ‘has fanpage’ data property that is expressed as a URL

  43. Annotations Provide definitions and comments on the ontology and its contents Can provide annotations for the entire ontology, a specific class, a member of a class, etc.

  44. How It Works (cont.) Using Protégé in conjunction with GraphViz software allows users to view the ontology as a semantic network Protégé relies on semantic reasoners to implement description logic

  45. Semantic Reasoners Inference engine for description logics Inference process is carried out via forward and backward chaining Protégé supports various reasoners such as FaCT++ and HermiT

  46. FaCT++ Semantic reasoner used for OWL DL ontologies Implemented in C++ and supported by Protege Converts the KB into an internal representation using various optimization techniques Uses dependency directed backtracking to check satisfiability of the KB

  47. Install Protege • Go to http://protege.stanford.edu/doc/owl/getting-started.html to download protege (version 3.x) • Protege OWL editor is built with the full installation of protege platform. During the install process, choose the “Basic+OWL” option. • For more details: http://protege.stanford.edu/doc/owl/getting-started.html

  48. Protege • There are two main ways of modelling ontologies: • Frame-based • OWL • Each has its own user interface • Protege Frames editor: enables users to build and populate ontologies that are frame-based, in accordance with OKBC (Open Knowledge Base Connectivity Protocol). • Classes • Slots for properties and relationships • Instances for class • Protege OWL editor: enables users to build ontology for the Semantic Web, in particular to OWL • Classes • Properties • Instances • reasoning

  49. Building an OWL Ontology • E2: Create a new OWL project • Start protege • File – New Project – OWL/RDF files – Ontology URI (http://www.pizza.com/ontologies/pizza.owl) – OWL DL – Properties View • A new empty Protege-OWL project has been created. • Save it in your local file as pizza.owl

  50. Named Classes • Go to OWL Classes tab • The empty class tree contains one class called owl:Thing, which is superclass of everything. • E3: Create subclasses Pizza, PizzaTopping and PizzaBase. They are subclasses of owl:Thing. • Naming convention • no special naming convention • consistency

More Related