
Introduction to Protégé for Absolute Beginners

Introduction to Protégé for Absolute Beginners. University at Buffalo, August 11-12, 2012. Goal and Content of Tutorial: the goal of the tutorial is to explain how to translate ontologies into a language that can be processed by computers.


Presentation Transcript


  1. Introduction to Protégé for Absolute Beginners • University at Buffalo, August 11-12, 2012

  2. Goal and Content of Tutorial • The goal of the tutorial is to explain how to translate ontologies into a language that can be processed by computers • Three main sections by content: • Overview of the Web Ontology Language (OWL) • Hands-on training in Protégé, an OWL editor • Overview of SPARQL Protocol and RDF Query Language (SPARQL), a query language for retrieving and modifying ontologically grounded information

  3. Is the Goal Worthwhile?

  4. The Current State of Data Integration on the Web • Search engines return some remarkably precise results but the precision degrades as the topics become less standardized

  5. A Query Containing Standardized Terms…

  6. …Yields Very Good Results

  7. But as the Terms Become Less Standardized…

  8. …the Results Become Less Precise

  9. The Current State of Data Integration in the Enterprise • Using more than one software application risks added cost when the information they create has to be combined. • Databases carry very little metadata about the content of the information they contain • Spreadsheets usually carry even less

  10. In the Social Network, Hashtags Cluster Information Into Categories • But the ambiguities of language reappear in the categories • And the lack of rigor in relating one category to another is an obstacle to machine-based validation of usage.

  11. The Value Added by OWL Ontologies to Data Integration • Ontologies endow terms with machine-processable definitions and disambiguate different senses of the same expression • Ontologies place restrictions on how terms can be related to other terms so that misuse and inconsistencies can be detected.

  12. The Ontologized Web, Enterprise and Social Network • What if creators of web pages, databases, and blogs used terminology from curated ontologies to annotate their content? • Standardized ways of describing the structures that represent data are accepted, so why not extend that acceptance to the annotation of content? • Expected Benefits: • The precision of search increases dramatically • Data from different sources can be merged • Gaps in information can be identified • Falsehoods and incoherent expressions can be detected

  13. Overview of Resource Description Framework (RDF)

  14. Resource Description Framework (RDF) • Designed to be a language for making assertions about resources • A Resource* is • an electronic document, an image, a source of information with a consistent purpose • not necessarily accessible via the Internet; e.g., human beings, corporations, and books in a library can also be resources. • an abstract concept such as the operators and operands of a mathematical equation or types of a relationship (e.g., “parent” or “employee”) * Derived from RFC 3986, Uniform Resource Identifier (URI): Generic Syntax, http://tools.ietf.org/html/rfc3986

  15. Expressing Information in RDF • Statements are always expressed in the form of a triple: • Subject – Predicate – Object (a.k.a. RDF Triple) • Translating the statement “Austria’s GDP per capita is 30,500 Euros” into RDF requires breaking it into triples
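
The decomposition can be pictured with a minimal Python sketch. The intermediate resource `example:Austrian_GDPperCapita` and the two predicates are taken from the RDF graph slide later in this tutorial; representing triples as plain tuples of strings is an assumption made for illustration, not the API of any RDF library.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# Prefixed names are kept as plain strings purely for illustration.
triples = [
    # Austria has an economic indicator: its GDP per capita.
    ("dbpedia:Austria", "example:has_economic_indicator", "example:Austrian_GDPperCapita"),
    # That indicator has a literal value (typed as in the slides).
    ("example:Austrian_GDPperCapita", "example:has_value", '"30,500Euros"^^string'),
]

for subject, predicate, obj in triples:
    print(subject, predicate, obj, ".")
```

The object of the first triple is the subject of the second, which is what links the two statements into one graph.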

  16. Uniform Resource Identifiers (URIs) and Literals • URIs are unique names of resources • http://dbpedia.org/page/Austria • http://en.wikipedia.org/wiki/Austria • Literals • can be a simple raw text value • can be annotated with a language tag as in “Austria”@en • can be typed with a datatype as in “30,500Euros”^^string

  17. Rules for RDF Statements • Subject and Predicate have to be URI named resources • Object – can be either a URI named resource or a literal
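
A minimal sketch of these two rules as a Python check. The `URI` and `Literal` wrapper types and the function name `is_well_formed` are hypothetical, introduced only for this illustration. The check encodes the simplified rules on the slide; full RDF additionally allows blank nodes as subjects and objects, which this sketch ignores.

```python
from typing import NamedTuple

class URI(NamedTuple):
    """Hypothetical wrapper marking a value as a URI-named resource."""
    value: str

class Literal(NamedTuple):
    """Hypothetical wrapper marking a value as a literal."""
    value: str

def is_well_formed(subject, predicate, obj) -> bool:
    # Rule 1: subject and predicate must be URI-named resources.
    # Rule 2: the object may be a URI-named resource or a literal.
    return (
        isinstance(subject, URI)
        and isinstance(predicate, URI)
        and isinstance(obj, (URI, Literal))
    )

austria = URI("http://dbpedia.org/page/Austria")
has_value = URI("http://www.myexample.com/resource/has_value")

print(is_well_formed(austria, has_value, Literal("30,500Euros")))  # literal as object
print(is_well_formed(Literal("Austria"), has_value, austria))      # literal as subject
```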

  18. Applying the Rules • Using “dbpedia:”, “ro:”, and “example:” as prefixes for http://dbpedia.org/page, http://www.obofoundry.org/ro, and http://www.myexample.com/resource respectively, which of the following are well-formed RDF statements?
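
As a sketch of how such prefixes abbreviate full URIs, the following Python snippet expands a prefixed name. The helper name `expand` is hypothetical, and joining the namespace and the local name with a "/" is an assumption made here because the namespaces on the slide have no trailing separator.

```python
# Prefix table copied from the slide.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/page",
    "ro": "http://www.obofoundry.org/ro",
    "example": "http://www.myexample.com/resource",
}

def expand(prefixed_name: str) -> str:
    """Turn 'prefix:local' into a full URI (assumes a '/' separator)."""
    prefix, local = prefixed_name.split(":", 1)
    return f"{PREFIXES[prefix]}/{local}"

print(expand("dbpedia:Austria"))
```

With these assumptions, `dbpedia:Austria` expands to the DBpedia URI listed on the URIs and Literals slide.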

  19. RDF Graphs • Nodes: dbpedia:Austria, example:Austrian_GDPperCapita, “30,500Euros”^^string • Edges: example:has_economic_indicator, example:has_value • The direction of the edges is always away from the subject and towards the object of the statement

  20. Graphing RDF • How would the following be represented in an RDF graph?

  21. Graphing RDF [Figure: RDF graph with nodes game1:MonopolyTokenBoot_Game1, game1:MonopolyGame_Game1, game1:MonopolyPlayer_1, mnply:MonopolyPlayer, and game1:MonopolyBanker_Game1, connected by edges labeled mnply:represented_by, mnply:competes_in, rdf:type, and mnply:has_role]

  22. How far does RDF take us toward our goal? • The value of RDF lies in the use of URIs, as it allows distinct information sources to share a common meaning for terms • Every occurrence of the same URI is a reference to the same resource • There is no inference with RDF, no way to validate use of URIs.

  23. Overview of RDF Schema (RDFS)

  24. RDF Schema (RDFS) • “RDF Schema defines classes and properties that may be used to describe classes, properties and other resources”* • RDFS defines terms that can describe classes of things and the relationships that hold between these classes * RDF Vocabulary Description Language 1.0: RDF Schema, http://www.w3.org/TR/rdf-schema/

  25. The Need for RDFS • RDF can name, but not define, resources or the relationships that hold between them • But what about…

  26. The Need for RDFS • Machines cannot process elements of an expression that lie outside of RDF. To a machine our example looks like: • We need language elements that enable a machine to process relationships between entities

  27. RDFS Types • Allows a resource to be typed as a class (i.e. a collection of individuals) • Allows a class to be defined as a subclass of another class (i.e. all individuals that it contains are contained in the other) • Allows a property to be defined as a subproperty of another property

  28. RDFS Taxonomies • Enables the creation of taxonomies of both classes and properties Class Taxonomy Property Taxonomy

  29. RDFS Vocabulary • rdfs:Resource, rdfs:Class, rdfs:Literal, rdfs:Datatype, rdfs:range, rdfs:domain, rdfs:subClassOf, rdfs:subPropertyOf, rdfs:label, rdfs:comment, rdfs:ContainerMembershipProperty, rdfs:member, rdfs:seeAlso, rdfs:isDefinedBy

  30. RDFS Vocabulary in Action • rdfs:subClassOf is used to assert that every instance of one class is also an instance of another. • If a resource is rdf:type dbpedia:Apple and dbpedia:Apple is rdfs:subClassOf dbpedia:Fruit, a reasoner will infer that the resource is also rdf:type dbpedia:Fruit. [Figure: example:NewtonsApple rdf:type dbpedia:Apple; dbpedia:Apple rdfs:subClassOf dbpedia:Fruit; inferred: example:NewtonsApple rdf:type dbpedia:Fruit]
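
The subclass inference can be sketched in a few lines of Python. This is a hand-rolled illustration over tuples of strings, not how any particular reasoner is implemented; the function name `infer_types` is hypothetical.

```python
triples = {
    ("example:NewtonsApple", "rdf:type", "dbpedia:Apple"),
    ("dbpedia:Apple", "rdfs:subClassOf", "dbpedia:Fruit"),
}

def infer_types(triples):
    """Repeatedly apply: (x type C) and (C subClassOf D) gives (x type D)."""
    inferred = set(triples)
    while True:
        new = {
            (x, "rdf:type", d)
            for (x, p, c) in inferred if p == "rdf:type"
            for (c2, q, d) in inferred if q == "rdfs:subClassOf" and c2 == c
        } - inferred
        if not new:          # nothing more to derive: fixed point reached
            return inferred
        inferred |= new

for triple in sorted(infer_types(triples) - triples):
    print("inferred:", triple)
```

Looping until no new triples appear means chains of subclasses (Apple, Fruit, and any further superclass) are followed as well.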

  31. RDFS Vocabulary in Action • rdfs:subPropertyOf is used to assert that every pair of resources related by one property is also related by another. • If Ann is the sister of Ben and “is sister of” is a subproperty of “is sibling of”, then a reasoner will infer that Ann is a sibling of Ben
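
A comparable sketch for the subproperty rule. The property names `example:is_sister_of` and `example:is_sibling_of` are illustrative spellings of the relations named on the slide, and `apply_subproperty` is a hypothetical helper, not a library function.

```python
triples = {
    ("example:Ann", "example:is_sister_of", "example:Ben"),
    ("example:is_sister_of", "rdfs:subPropertyOf", "example:is_sibling_of"),
}

def apply_subproperty(triples):
    """Repeatedly apply: (s P o) and (P subPropertyOf Q) gives (s Q o)."""
    inferred = set(triples)
    while True:
        new = {
            (s, sup, o)
            for (s, p, o) in inferred
            for (sub, rel, sup) in inferred
            if rel == "rdfs:subPropertyOf" and sub == p
        } - inferred
        if not new:
            return inferred
        inferred |= new

for triple in sorted(apply_subproperty(triples) - triples):
    print("inferred:", triple)
```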

  32. RDFS Vocabulary in Action • rdfs:domain is used to assert that the subjects of a property are always instances of one or more classes. • If Ann is related to Ben via the ex:is_sister_of property, and the rdfs:domain of ex:is_sister_of is ex:Female, a reasoner will infer that Ann is rdf:type ex:Female [Figure: example:Ann example:is_sister_of example:Ben; inferred: example:Ann rdf:type example:Female]

  33. RDFS Vocabulary in Action • rdfs:range is used to assert that the values (objects) of a property are always instances of one or more classes or datatypes • If Newton’s apple is related to Newton’s apple tree via the ex:is_borne_by property, and the rdfs:range of ex:is_borne_by is dbpedia:Plant, a reasoner will infer that Newton’s apple tree is rdf:type dbpedia:Plant [Figure: example:NewtonsApple example:is_borne_by example:NewtonsAppleTree; inferred: example:NewtonsAppleTree rdf:type dbpedia:Plant]
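
The domain and range rules from the last two slides can be sketched together. The declarations that `example:is_sister_of` has domain `example:Female` and that `example:is_borne_by` has range `dbpedia:Plant` are assumed here from the slides' diagrams; `apply_domain_and_range` is a hypothetical helper.

```python
triples = {
    ("example:Ann", "example:is_sister_of", "example:Ben"),
    ("example:is_sister_of", "rdfs:domain", "example:Female"),
    ("example:NewtonsApple", "example:is_borne_by", "example:NewtonsAppleTree"),
    ("example:is_borne_by", "rdfs:range", "dbpedia:Plant"),
}

def apply_domain_and_range(triples):
    """rdfs:domain types the subject of a property; rdfs:range types its object."""
    domains = [(s, o) for (s, p, o) in triples if p == "rdfs:domain"]
    ranges = [(s, o) for (s, p, o) in triples if p == "rdfs:range"]
    inferred = set(triples)
    for (s, p, o) in triples:
        for prop, cls in domains:
            if p == prop:
                inferred.add((s, "rdf:type", cls))   # subject gets the domain class
        for prop, cls in ranges:
            if p == prop:
                inferred.add((o, "rdf:type", cls))   # object gets the range class
    return inferred

for triple in sorted(apply_domain_and_range(triples) - triples):
    print("inferred:", triple)
```

Note that domain and range drive inference rather than validation: nothing is rejected, and Ben is not typed as anything because no range was declared for `example:is_sister_of`.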

  34. RDFS Vocabulary in Action • rdfs:label is used to provide a human readable version of a resource’s name. • If a GUID is used as the identifier for the class of Apple, then use rdfs:label to assign as many human readable versions as desired.

  35. RDFS Vocabulary in Action • rdfs:comment is used to provide a human-readable description of a resource Both comments are reused from http://dbpedia.org/page/Apple

  36. RDFS Vocabulary in Action • rdfs:seeAlso is used to assert that a resource provides additional information about the subject resource.

  37. RDFS Vocabulary in Action • rdfs:isDefinedBy is used to assert that a resource defines the subject resource.

  38. How far does RDFS take us toward our goal? • Contains elements that enable machine inferencing on necessary conditions (e.g. every apple is the fruit of an apple tree) • Doesn’t allow restrictions on classes that would enable inferencing on sufficient conditions (e.g. whatever is the fruit of an apple tree is an apple) • Doesn’t provide a way to exclude resources from class membership, so it can’t validate assertions.

  39. Overview of the Web Ontology Language (OWL)

  40. Web Ontology Language (OWL*) • OWL descends from knowledge representation languages of the 1990s such as Simple HTML Ontology Extensions (SHOE) and Ontology Inference Layer (OIL), and from the DARPA Agent Markup Language (DAML) • The initial version of OWL became a formal W3C Recommendation on February 10, 2004 • OWL 2 became a W3C Recommendation on October 27, 2009 * Why “OWL” instead of “WOL”? http://lists.w3.org/Archives/Public/www-webont-wg/2001Dec/0169.html

  41. The Need for OWL • RDFS lacks the expressive power to allow inferences about individuals beyond their class membership. • Based on this equivalence a machine can infer only that the two classes have the same instances. • We want to enable a machine to infer the attributes of an individual based upon the definition of the class of which it is a member

  42. OWL Usage “The W3C OWL 2 Web Ontology Language (OWL) is a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. OWL is a computational logic-based language such that knowledge expressed in OWL can be reasoned with by computer programs either to verify the consistency of that knowledge or to make implicit knowledge explicit.”* * http://www.w3.org/TR/owl2-primer/

  43. Defining Classes - Enumeration • Use owl:oneOf to enumerate the members of a class • In Manchester Syntax: Class: MonopolyToken EquivalentTo: {Battleship, Boot, Car, Dog, Thimble, Top_Hat, Wheelbarrow, Iron} SubClassOf: Thing
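
An enumerated class is closed: its members are exactly the listed individuals. A minimal Python analogy, using a frozen set (the names `MONOPOLY_TOKENS` and `is_monopoly_token` are hypothetical):

```python
# The eight individuals listed in the Manchester Syntax definition above.
MONOPOLY_TOKENS = frozenset({
    "Battleship", "Boot", "Car", "Dog",
    "Thimble", "Top_Hat", "Wheelbarrow", "Iron",
})

def is_monopoly_token(individual: str) -> bool:
    """Membership in an enumerated class is membership in the listed set."""
    return individual in MONOPOLY_TOKENS

print(is_monopoly_token("Boot"))
print(is_monopoly_token("Cat"))
```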

  44. Defining Classes - Restrictions • owl:Restriction creates a class defined using an object property and either: • a value constraint which places a constraint on the range of the property when applied to this particular class • e.g. the rdfs:range of the is_borne_by property might be plant, but when defining apple we would constrain the range to the class of apple trees • a cardinality constraint which places a constraint on the number of values a property can take in the context of a particular class • e.g. there can be no more than 8 players in a game of Monopoly
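
The cardinality example can be sketched as a simple count. The property name `mnply:has_player` and the helper `violates_max_cardinality` are hypothetical. The sketch is a closed-world check that treats differently named individuals as distinct; an OWL reasoner works under the open-world assumption and would only report a violation if the individuals were also known to be different from one another.

```python
def violates_max_cardinality(triples, subject, prop, limit):
    """True if `subject` has more than `limit` distinct values for `prop`."""
    values = {o for (s, p, o) in triples if s == subject and p == prop}
    return len(values) > limit

# A game with nine players (hypothetical data).
game = {
    ("game1:MonopolyGame_Game1", "mnply:has_player", f"game1:MonopolyPlayer_{i}")
    for i in range(1, 10)
}

print(violates_max_cardinality(game, "game1:MonopolyGame_Game1", "mnply:has_player", 8))
```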

  45. Additional Inferences Gained Through Restrictions • Without a restriction, all that can be inferred about an improved property is that it must also be a property: Class: MonopolyImprovedProperty SubClassOf: MonopolyProperty • Adding a restriction adds the information that an improved property must be a property and that it must be the location of some building: Class: MonopolyImprovedProperty EquivalentTo: location_of some MonopolyBuilding SubClassOf: MonopolyProperty

  46. rdfs:subClassOf vs. owl:equivalentClass [Figure: two diagrams. In the first, “improved property” is a subclass of “property that is the location of a building”; given that Virginia Place is the location of House 1, whether Virginia Place is an improved property is left open (marked “?”). In the second, “improved property” is an equivalent class of “property that is the location of a building”; given that Virginia Place is the location of House 1, Virginia Place is an improved property.]
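
The difference between the two axioms can be sketched as follows. A subclass axiom states only a necessary condition, so nothing new can be classified; an equivalence also states a sufficient condition. The identifiers `game1:VirginiaPlace`, `game1:House1`, and `mnply:location_of` are illustrative spellings of the names on the slide, and `classify_improved` is a hypothetical helper, not a reasoner API.

```python
triples = {
    ("game1:VirginiaPlace", "mnply:location_of", "game1:House1"),
    ("game1:House1", "rdf:type", "mnply:MonopolyBuilding"),
}

def classify_improved(triples, equivalent: bool):
    """Return the individuals that can be inferred to be improved properties."""
    if not equivalent:
        # SubClassOf: being the location of a building is required of improved
        # properties, but it is not enough to conclude membership.
        return set()
    buildings = {
        s for (s, p, o) in triples
        if p == "rdf:type" and o == "mnply:MonopolyBuilding"
    }
    # EquivalentTo: anything that is the location of some building qualifies.
    return {s for (s, p, o) in triples if p == "mnply:location_of" and o in buildings}

print(classify_improved(triples, equivalent=False))
print(classify_improved(triples, equivalent=True))
```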

  47. owl:allValuesFrom vs. owl:someValuesFrom • owl:allValuesFrom constrains the property so that all of its values must come from the specified class or data range • Example: A mortgaged property is one such that it is owned only by the bank • owl:someValuesFrom constrains the property so that at least one of its values must come from the specified class or data range • Example: An improved property is the location of some building
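
The contrast between "only" and "some" maps onto Python's `all` and `any`. In this sketch the individuals `game1:BalticAvenue`, `game1:Bank`, `game1:Player_1`, the property `mnply:owned_by`, and the class `mnply:MonopolyBank` are all hypothetical, and the check is closed-world (it looks only at the triples given), unlike an OWL reasoner.

```python
def values_of(triples, subject, prop):
    return {o for (s, p, o) in triples if s == subject and p == prop}

def has_type(triples, individual, cls):
    return (individual, "rdf:type", cls) in triples

def satisfies_some_values_from(triples, subject, prop, cls):
    """At least one value of `prop` is an instance of `cls`."""
    return any(has_type(triples, v, cls) for v in values_of(triples, subject, prop))

def satisfies_all_values_from(triples, subject, prop, cls):
    """Every value of `prop` is an instance of `cls` (true when there are no values)."""
    return all(has_type(triples, v, cls) for v in values_of(triples, subject, prop))

triples = {
    ("game1:BalticAvenue", "mnply:owned_by", "game1:Bank"),
    ("game1:BalticAvenue", "mnply:owned_by", "game1:Player_1"),
    ("game1:Bank", "rdf:type", "mnply:MonopolyBank"),
}

print(satisfies_some_values_from(triples, "game1:BalticAvenue", "mnply:owned_by", "mnply:MonopolyBank"))
print(satisfies_all_values_from(triples, "game1:BalticAvenue", "mnply:owned_by", "mnply:MonopolyBank"))
```

As in OWL, the "all values" condition holds vacuously for an individual with no values for the property, which is why it is often paired with a "some values" condition.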

  48. owl:hasValue • The owl:hasValue constraint limits a property to a given value, which can be either an individual or a data value. For example, we could use this constraint to assert that all Monopoly railroads have a price of 200: Class: MonopolyRailroad SubClassOf: has_price value 200, MonopolyProperty • Given a resource that is a Monopoly railroad, a reasoner will infer that its price is 200. [Figure: game1:ReadingRailroad rdf:type mnply:MonopolyRailroad; mnply:MonopolyRailroad rdfs:subClassOf (mnply:has_price = 200); inferred: game1:ReadingRailroad mnply:has_price 200]

  49. owl:hasValue • To define the class of New York City buildings we can use owl:hasValue on the property located_in and the individual NewYorkCity: Class: NewYorkCityBuilding SubClassOf: located_in value NewYorkCity, Building • Given a resource that is a New York City building, a reasoner will infer that its location is New York City. [Figure: example:EmpireStateBuilding rdf:type example:NewYorkCityBuilding; example:NewYorkCityBuilding rdfs:subClassOf (example:located_in NYC); inferred: example:EmpireStateBuilding example:located_in example:NewYorkCity]
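
Both hasValue examples follow the same pattern: membership in the class implies a specific property value. A minimal sketch, where the table `HAS_VALUE_AXIOMS` and the helper `apply_has_value` are hypothetical stand-ins for what a reasoner derives from the class definitions:

```python
# class -> (property, value) implied for every member of that class
HAS_VALUE_AXIOMS = {
    "mnply:MonopolyRailroad": ("mnply:has_price", 200),
    "example:NewYorkCityBuilding": ("example:located_in", "example:NewYorkCity"),
}

def apply_has_value(triples):
    """For each (x rdf:type C) with a hasValue axiom on C, add (x property value)."""
    inferred = set(triples)
    for (s, p, o) in triples:
        if p == "rdf:type" and o in HAS_VALUE_AXIOMS:
            prop, value = HAS_VALUE_AXIOMS[o]
            inferred.add((s, prop, value))
    return inferred

triples = {
    ("game1:ReadingRailroad", "rdf:type", "mnply:MonopolyRailroad"),
    ("example:EmpireStateBuilding", "rdf:type", "example:NewYorkCityBuilding"),
}

for triple in sorted(apply_has_value(triples) - triples, key=str):
    print("inferred:", triple)
```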

  50. Cardinality Constraints • Useful in expressing that a class has an exact number of relationships to another class or data range. Example: A turn has exactly one player as a participant and exactly one integer as its ordinal value Class: MonopolyTurn Annotations: rdfs:label "Monopoly turn"^^xsd:string SubClassOf: has_ordinal_value exactly 1 xsd:integer, has_participant exactly 1 MonopolyPlayer, occurs_containing some MonopolyRollOfDice, occurs_during some MonopolyRound, MonopolyEvent
