500 likes | 515 Views
What is the Semantic Web?. “The Semantic Web is an extension of the current web in which information is given a well-defined meaning, better enabling computers and people to work in cooperation.” – Tim Berners-Lee, et al [The Semantic Web, Scientific American, 2001.].
E N D
What is the Semantic Web? “The Semantic Web is an extension of the current web in which information is given a well-defined meaning, better enabling computers and people to work in cooperation.” – Tim Berners-Lee, et al [The Semantic Web, Scientific American, 2001.] “A set of standards and best practices for sharing data and the semantics of that data over the Web for use by applications” -- Bob DuCharme [Learning SRARQL, 2013.] Standards: 1. RDF data model 2. SPARQL query language 3. RDFS and OWL standards for storing vocabularies and ontologies. Best practices include the use of URIs (IRIs) to refer to entities on the web and use of standards. 1
Limitations of the WWW To get a specific question answered, a person must: 1. Browse extensively, i.e. collect all the relevant information – very tedious and time consuming, because there is too much information with too little structure on the web. 2. Interpret the collected information, accounting for the fact that web contents is heterogeneous in terms of contents, structure, and character encoding. 3. Integrate that information to derive the answer – the result is only as good as the integration process itself AND only people can carry out this task as of now BECAUSE only people can derive new information from multiple possibly heterogeneous pieces of information. Note: Integration requires not only understanding of the meaning of the information to be integrated, but also ability to perform reasoning, i.e. what follows from what we know. 2
Limitations of the WWW -- Examples Example: google Public Universities in CT -- returns a pretty focused results, because the query is very specific. Example: google which public universities in CT offer a computer science program -- returns 5,370,000 results, none of which are specific enough Example: [Berners-Lee, 2009] google proteins involved in signal transduction that are also related to pyramidal neurons – returned 223,000 unhelpful hits, BUT the same query posted on a [linked data network] returned 32 highly relevant hits. 3
Limitations of the WWW – More Examples 1. Find that landmark article on data integration written by an Indian researcher in the 1990s. Are lobsters spiders? Which car is called a “duck” in German? 2. 3. Finding answers to these questions require intelligent integration of content from multiple websites, as well as background knowledge. Note that all the required knowledge is on the web, but it is not machine- processible (for the most part) -- it is intended to be processed by people. How about the so-called “intelligent personal assistants” like Siri, Cortana and Google Assistant? Not as “intelligent” as they seem to be – they are intended to help general user navigate and collect information, but the interpretation and integration of this information is up to the user. 4
What is missing on the web to make it accessible to computers? 1. Open standards for describing information so that web applications can use it to address queries, not just do search. That is, we need suitable representation languages. Example: We want to be able to say that Every student takes courses. Every CS student is a student. Every CS student writes code. Bob is a CS student. 2. Methods for obtaining further information from such descriptions, such as logical deduction which is the basis for automated reasoning. Example: Bob is a student. All students work hard. Therefore, Bob works hard. 5
Example: A (smart) data integration agent (adapted from Liyang Yu – A Developer’s Guide to the Semantic web) You want to learn about the topics covered in CS 407 and you ask your “personal web agent” to get that information for you. Assume it starts with the published course schedule at https://ssb.ccsu.edu/pls/ssb_cPROD/byskcsob.P_TermSel, where it finds the following information: ns0:cs407 ns0:name “CS 407” . ns0:cs407 ns0:CRN <https://...CS&crse_in=407&crn_in=41939 > . ns0:cs407 ns0:taughtBy <ns0:_x> . ns0:_x ns0:name “Neli P. Zlatareva” . ns0:_x ns0:homepage <http://www.cs.ccsu.edu/~neli> . … and a few more statements 6
Example contd. Next, your agent visits the instructor’s homepage, where it collects more information: ns1:zlatareva ns1: firstname “Neli” . ns1:zlatareva ns1: lastname “Zlatareva” . ns1:zlatareva ns1:homepage <http://www.cs.ccsu.edu/~neli> . ns1:zlatareva ns1: teaches <ns1:_c> . ns1:_c ns1:name “Semantic Web Technologies” . ns1:_c ns1:CRN <https://...CS&crse_in=407&crn_in=41939 > . ns1:_c ns1:textbook <ns1:_b> . ns1:_b ns1:ISBN “978-1-4200-9050-5” . … and so on At this point, the agent can conclude that the instructor referred to by <ns0:_x> has first name “Neli” and last name “Zlatareva”, and CS 407 and Semantic Web Technologies refer to the same class because teaches and taughtBy are inverse properties. Thus, the following statement will be added: ns0:cs407 sameAs ns1:_c . 7
Example contd. From here, your agent will go to www.amazon.com to get information ns2:textbook ns2:ISBN “978-1-4200-9050-5” . ns2:textbook ns2:title “ Foundations of SWT” . ns2:textbook ns2:part_1 “ RDF” . ns2:textbook ns2:part_2 “ OWL” . ns2:textbook ns2:part_3 “ Rules and Queries” . ns2:textbook ns2:part_4 “ Beyond Foundations: Ontology Engineering, Foundations” . ns2:textbook ns2:part_5 “ Appendices: XML, Sets, Logic” . Now the agent can conclude that the book referred to by < ns1:_b > and ns2:textbook refer to the same book, and the following statement will be added: ns2:textbook sameAs ns1:_b . The properties associated with ns2:textbook, namely title, part_1, etc. provide the answer to the initial query. 8
What does it take to implement our agent? 1. Since the agent browses different web sites to collect information, each web site must describe that information in an uniform way (in RDF format as in hypothetical example, BUT this is not the case now). The information at different web sites cannot be arbitrary – it should be consistent with agreed common terms and relations. For example, to describe a person, we use some common terms such as name, birthday, homepage, etc. See FOAF (Friend Of A Friend) vocabulary. The agent must “understand” each chunk of information it collects. The agent must be able to conduct reasoning based on its understanding of the common terms and relations. Example: knowing that resources ns2:textbook and ns1:_b have the same ISBN numbers, the agent must conclude that they are the same resource. The agent must be able to address queries about the information it has collected. … and more stuff, of course. 2. 3. 4. 5. 6. 9
Linked Data Our hypothetical agent visited 3 web sites. This can be possible only if there are links between the entities in these three sites, as shown here: “Neli” ns1:firstname ns0:cs407 ns1:zlatareva ns1:lastneme ns0:taughtBy ns1:teaches “Zlatareva” ns0:_x ns1:_c ns0:homepage ns1:homepage <http://www.cs.ccsu.edu/~neli> <http://www.cs.ccsu.edu/~neli> Merge these two nodes 10
Linked Open Data Cloud (2007) Over 500 million data chunks (RDF triples) with 120,000 links between them 11
Linked Data [http://lod-cloud.net ] For Linked Open Data cloud diagram in 2009 see http://lod-cloud.net/versions/2009-07-14/lod-cloud.pdf For Linked Open Data cloud diagram in 2011 see http://lod-cloud.net/versions/2011-09-19/lod-cloud.pdf For Linked Open Data cloud diagram in 2014 see http://lod-cloud.net/versions/2014-08-30/lod-cloud.pdf Note: All data in the LOD cloud is linked together and is publicly available. In August, 2016 the LOD cloud contained [http://stats.lod2.eu]: • 9,960 datasets • More than 85 billion facts • More than 800 million links 12
The Semantic Web is a REALITY Currently, the Semantic Web encompasses almost 10000 databases, >85 billion facts, > 800 million links. These are publicly available data, identifiable via URI and accessible via HTTP. Example: DBPedia -- Wikipedia for the Semantic Web, which can be used by both, humans and computers. For humans, information is returned as an HTML document, for computers – information is returned in machine understandable RDF format. The link http://dbpedia.org/resource/Central_Connecticut_State_University http://dbpedia.org/page/Central_Connecticut_State_University (returns the web page) http://dbpedia.org/data/Central_Connecticut_State_University (returns machine-understandable representation) 13
About: Central Connecticut State University An Entity of Type : Public university, from Named Graph : http://dbpedia.org, within Data Space : dbpedia.org Property Value •Central Connecticut State University is a regional, comprehensive public university in New Britain, Connecticut. Founded in 1849 as Connecticut Normal School, CCSU is Connecticut's oldest publicly funded university. CCSU is made up of four schools: the Ammon School of Arts & Science, the School of Business, the School of Education & Professional Studies, and the School of Engineering & Technology. Attended by over 11,000 students, 9,200 are undergraduates, and 2,000 are graduate students. It is part of the Connecticut State University System , which also oversees Eastern, Western, and Southern Connecticut State Universities. Together they have a student body of over 34,000. As a commuter school, more than half of students live off campus and ninety percent are in-state students. •dbr:Connecticut_State_University_System •dbr:National_Collegiate_Athletic_Association •dbr:Suburb •dbr:New_Britain,_Connecticut •dbr:United_States •4.7E7 •Central Connecticut State College •Connecticut Normal School •New Britain Normal School •Teachers College of Connecticut •Blue Devil •2094 (xsd:integer) •9771 (xsd:integer) •BlueandWhite dbo:abstract Central Connecticut State University is a regional, comprehensive public university in New Britain, Connecticut. Founded in 1849 as Connecticut Normal School, CCSU is Connecticut's oldest publicly funded university. CCSU is made up of four schools: the A dbo:affiliation dbo:athletics dbo:campus dbo:city dbo:country dbo:endowment dbo:formerName dbo:mascot dbo:numberOfPostgraduateStudents dbo:numberOfUndergraduateStudents dbo:officialSchoolColour 14
CCSU info as a machine-understandable document <?xml version="1.0" encoding="utf-8" ?><rdf:RDF rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:georss="http://www.georss.org/georss/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:ns7="http://www.w3.org/ns/prov#" xmlns:dbo="http://dbpedia.org/ontology/" xmlns:dct="http://purl.org/dc/terms/" > <rdf:Description rdf:about="http://dbpedia.org/resource/Yan_Klukowski__3"> <dbo:team rdf:resource="http://dbpedia.org/resource/Central_Connecticut_State_University" /> </rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/resource/Dan_Gaspar__8"> <dbo:team rdf:resource="http://dbpedia.org/resource/Central_Connecticut_State_University" /> </rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/resource/1999%E2%80%932000_Los_Angeles_Clippers_season"> <dbp:college rdf:resource="http://dbpedia.org/resource/Central_Connecticut_State_University" /> </rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/resource/List_of_Phi_Beta_Sigma_chapters"> <dbp:school rdf:resource="http://dbpedia.org/resource/Central_Connecticut_State_University" /> </rdf:Description> …….. xmlns:rdf="http://www.w3.org/1999/02/22- xmlns:dbp="http://dbpedia.org/property/" 15
Semantic Web Layer Cake Source: http://www.semanticfocus.com/blog/entry/title/introduction-to-the-semantic-web-vision-and-technologies-part-2- foundations/
XML (eXtended Markup Language) XML is a flexible text format that is used to structure, store, and transport data over the Web. Contrary to HTML, which is about displaying data, XML is about describing data, BUT there is no one standard way to describe the same data. Example: Consider the concept COURSE, and its instance CS462. HTML description XML description <H1> CS462: AI</H1> <course> <UL> <title> CS462: AI </title> <LI> CRV: 4185 <CRV> 4185 </CRV> <LI> Level: undergrad/grad <level> undergrad/grad </level> <LI> Professor: NZ, office hours … <Professor> <LI> Website: www.cs.ccsu.edu/~neli </UL> <office hours> …</office hours> <name> NZ </name> <Website> … </Website> </Professor> </course> 17
XML documents are labeled trees Course Professor Title Level CRN Name Web site 18
XML (contd.) XML documents are easily readable and understandable by humans, because their tags are familiar terms, but • XML lacks semantics, and • XML makes no commitment to ontological vocabulary, nor to ontological modelling , i.e. can not serve as knowledge representation language. Because XML is a universal meta markup language, the same term can be given different meanings by different sources (for example title can mean “book title” or “person title”) . To resolve such inconsistencies, the so-called namespaces are used. For example: xmlns:dc=“http://purl.org/dc/elements/1.1/” defines namespace dc (Dublin Core) and <dc:title>Artificial Intelligence</dc:title> suggests that term title refers to a book. xmlns:v=“http://www.w3.org/2006/vcard/” describes people, and <v:title>Doctor</v:title> suggests that term title refers to a person. 19
RDF (Resource Description Framework) • RDF is the foundation for representing and processing knowledge on the web. It is a graph-based data model, where knowledge is represented as a list of statements called triples. • Each triple has the form “subject, predicate, object”. Example: “Jones TEACHES Math101” • Each element of a triple (the resource) is identified by a URI. Example: <http://myUniv.edu/people/Jones> <http://myUniv.edu/terms/teaches> <http://myUniv.edu/courses/Math101> --- in N-triples format. RDF can be implemented in various ways (called serializations), one of which has XML- based syntax to support syntactic interoperability. Example: <rdf:RDF xmlns:rdf=“http://www.w3.org/1999/02/22-rdf-syntax-ns#” xmlns:myUniv=“http:/myUniv.edu/terms/”> <rdf:Description rdf:about=“http://myUniv.edu/jones”> <myUniv:teaches> <rdf:Description rdf:about=“http://myUniv.edu/courses/Math101”> </rdf:Description> </rdf:Description> </rdf:RDF> 20
RDF statements are directed labeled graphs <http://www.math.ccsu/jones> <http://www.cs.ccsu.edu/~neli/univ.owl#teaches> <http://www.ccsu.edu/catalog/Math101> • RDF is provided with a model-theoretic semantics that defines the notion of entailment between two RDF statements. • RDF graphs are finite sets of RDF triples. • This types of graphs are very similar to semantic nets. 21
Another example Consider the following set of triples: { <?p1 foaf:name “Jones”>, <?p1 foaf:knows ?p2>, <?p1 myUniv:teaches ?c1>, <?p2 myUniv:studies ?c1>, <?p2 foaf:name “Bob”>, <?p2 foaf:mbox “Bob@mygmail.com”>, <?c1 rdf:type myUniv:course>, <?c1 foaf:name “Math101”>} where foaf : <http://xmlns.com/foaf/0.1/> rdf: <http://www.w3.org/1999/02/22-rdf-syntav-ns#type> “Jones” “Bob” foaf:name foaf:name foaf: knows foaf:mbox myUniv:teaches foaf:name myUniv:studies “Bob@mgmail.com” “Math101” rdf:type “myUniv:course” _:p1 _:p2 _:c1 22
RDF Schema (RDF Vocabulary Description Language) • RDF is a universal language that allow users to describe their own domains, but it does not make assumptions about any particular domain. RDF Schema defines the vocabulary, specifies object properties and their values, and describes the relations between objects. RDF Schema organizes this vocabulary in a typed class hierarchy. Example (for short, in N3 format, which is a superset of N-Triples; it allows us to define a URI prefix and identify entity URIs wrt a set of prefixes at the beginning of the document) @prefix univ: <http://www.cs.ccsu.edu/~neli/univ.owl> . @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . univ:professor rdfs:subClassOf univ:staff . univ:staff rdf: type rdfs: Class . univ: professor rdf:type rdfs:Class . univ:Jones rdf:type univ:professor . • • 23
RDF/RDFS Example teaches Jones Math101 RDF Professor TeachingAss RDFS Staff 24
RDF and RDFS Axiomatic Semantics • All language primitives are represented by constants, such as Resource, Class, Property, subClassOf, Literal, etc. A few predefined predicates are used to represent relations between constants, such as: – An RDF triple is represented as PropVal (P, R, V), where P is a property, R is a resource, and V is a value; – Predicate Type (R, T) states that resource R has the type T, and it is equivalent to PropVal (type, R, T). All classes are instances of Class and have the type Class, i.e Type (Class, Class), Type (Property, Class), Type (Resource, Class), etc. Resource is the most general class – every class and every property is a resource. Predicates in RDF statements are properties. • • • 25
RDF and RDFS Semantics (contd) In RDFS, we also have subclasses , subproperties, and constrains. • subClassOf is a property, i.e. Type(subClassOf, Property). • If class C is a subclass of class C’, then all instances of C are also instances of C’, i.e. PropVal (subClassOf, ?c, ?c’) (Type(?c, Class) & Type(?c’, Class) & ?x (Type (?x, ?c) Type (?x, ?c’))) • Property P is a subproperty of property P’, if P’(x, y) whenever P (x, y), i.e. Type (subPropertyOf, Property) PropVal (subPropertyOf, ?p, ?p’) (Type(?p, Property) & Type(?p’, Property) & ?r ?v (PropVal (?p, ?r, ?v) PropVal (?p’, ?r, ?v))) Every constraint resource is a resource, i.e. PropVal (subclassOf, ConstraintResourse, Resourse) Constraint properties are all properties that are also constraint resourses, i.e (Type (?cp, ConstraintProperty) (Type (?cp, ConstraintResource) & • • Type (?cp, Property)) 26
RDF and RDFS Semantics (contd) • domain and range are constraint properties, i.e. Type (domain, ConstraintProperty) Type (range, ConstraintProperty). Domain of a property is a set of all object to which P applies, i.e. PropVal (domain, ?p, ?d) ?x ?y (PropVal (?p, ?x, ?y) Type (?x, ?d)) Range of property P is the set of all values that P can take, i.e. PropVal (range, ?p, ?r) ?x ?y (PropVal (?p, ?x, ?y) Type (?y, ?r)) • • Given all these axioms, we can derive the following formulas: PropVal (domain, range, Property) PropVal (range, range, Class ) PropVal (domain, domain, Property) PropVal (range, domain, Class) Example. Given PropVal (subClassOf, Professor, Staff), PropVal (domain, teaches, Professor), PropVal (teaches, Jones, Math1) we can derive Type (Jones, Staff). 27
A Direct Inference System for RDF and RDFS • Based on rules of the form If E contains certain triples Then add to E certain triples, where E is a set of RDF triples. • Example rules (from W3C RDF recommendations): If E contains the triple (?x, ?p, ?y) Then E also contains the triple (?p, rdf : type, rdf : property) If E contains the triples (?u, rdfs : subClassOf, ?v) and (?v, rdfs : subClassOf, ?w) Then E also contains the triple (?u, rdfs : subClassOf, ?w) If E contains the triples (?x, rdf : type, ?u) and (?u, rdfs : subClassOf, ?v) Then E also contains the triple (?x, rdf : type, ?v) If E contains the triples (?x, ?p, ?y) and (?p, rdfs : range, ?u) Then E also contains the triple (?y, rdf : type, ?u) 28
How inference in RDF is different from inference in RDFS? Consider the following triples • myUni:Student1 rdf: type myUni: TeachingAssistant . • myUniv: TeachingAssistant rdfs: subClassOf myUniv: Staff . RDF inference will not return an answer to the query to retrieve all staff members, i.e. (?x , rdf : type, myUniv : Staff) because there is no triple matching this pattern. RDFS will return the instances of the TeachingAssistant class using the rule (called the “type propagation” rule) If E contains the triples (?x, rdf : type, ?u) and (?u, rdfs : subClassOf, ?v) Then E also contains the triple (?x, rdf : type, ?v) 29
Types of inferences in SW applications 1. Class membership: if x is an instance of class C, and C is a subclass of D, we want to infer that x is an instance of D. Equivalence of classes: If class A is equivalent to class B, and class B is equivalent of class C, then A is equivalent to C. Classification: if a property-value pair is declared to be a sufficient condition for membership in class A, then if individual x satisfies this condition, x must be an instance of A. Consistency: if x is declared to be an instance of class A where A B C, A D , and B D = , then the ontology is inconsistent because class A must be empty but instead x A. 2. 3. 4. 30
Multiple inheritance and RDFS Consider the rule If E contains the triples (?x, rdf : type, ?u) and (?u, rdfs : subClassOf, ?v) Then E also contains the triple (?x, rdf : type, ?v) Assume the following triples a. “Bob” rdf : type myUniv : TeachingAssistant . b. “Bob” rdf : type myUniv : Student . c. myUniv: TeachingAssistant rdfs : subClassOf myUniv: Staff . From a. and c. and the above rule will can derive “Bob” rdf : type myUni : Staff . The later may be inconsistent with b. if Staff and Student are supposed to be disjoint classes. But disjointness of classes cannot be expressed in RDFS. In RDFS, if ?A is subClassOf ?B, and ?A is subClassOf ?C, then any individual ?x that is a member of ?A will also be a member of ?B and ?C. That is, the range definitions in RDFS are not used to restrict the range of a property, but to infer the membership of the range. 31
What can we deduce in RDFS? In summary, the inference capabilities of RDFS are limited to the following : 1. Given the domain and the range of a property, we can deduce: • Class membership from the domain of a property. • Class membership from the range of a property. Example: given that “Course isTaughtBy Professor” and “Math101 isTaughtBy Jones”, we can derive that Math101 Professor. Given a class hierarchy, we can deduce superclass membership. Example: given that Professor Staff and Jones derive that Jones Staff. Given a property hierarchy, we can deduce new facts from subproperty relationships. Example: from teachAt “Jones teachAt CCSU” we can derive that “Jones employedBy CCSU” Course, and Jones 2. Professor, we can 3. emplyedBy and 32
What cannot we deduce in RDFS? 1. We can’t say that two classes are disjoint, i.e. we can define Student and Staff as subclasses to Person class, but can say that they are disjoint. Property range is defined globally for all classes, we can’t declare range restrictions that apply to some classes only, i.e. exceptions are not allowed. We can’t build Boolean combinations of classes. For example, we may want to declare a new class, person, which is disjoint union of classes male and female. Cardinality restrictions are not allowed. For example, we can’t say that a person has exactly two parents, or a class has exactly one instructor. We can’t declare a property to be inverse of another property, transitive, functional, etc. 2. 3. 4. 5. 33
SPARQL: a query language for RDF and RDFS SPARQL is based on RDF Turtle serialization and basic graph pattern matching algorithm. Example: PREFIX rdf : <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs : <http://www.w3.org/2000/01/rdf-schema#> SELECT ?c WHERE { ?c rdf : type rdfs:Class . } This query retrieves all classes. 34
Using SPARQL 35
What can we do with SPARQL? 1. Extract data as RDF subgraphs, URIs, blank nodes, etc. 2. Explore data via query for unknown relations. 3. Transform RDF data from one vocabulary into another 4. Construct new RDF graphs based on RDF query graphs 5. Update RDF graphs 6. Do logical entailment for RDF, RDFS, and OWL 7. Do federated queries over deferent SPARQL endpoints 36
Basic query 37
… and the result 1. Comes next 38
Ontologies and the Semantic Web “An ontology is an explicit, formal specification of a shared conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of existence. For AI systems, what ‘exists’ is that which can be represented.” T. Gruber, 1993. Ontologies are represented via Classes, Relationships, and Instances. Constrains can be imposed on relationships to define allowed values. There are several types of ontologies (see textbook, pp. 462 – 468: Internet shopping world example) • Upper (top-level) ontologies, representing concepts such as space, time, event, etc. that are universally valid. • Domain ontologies, representing concepts in a generic domain. • Task ontologies, representing concepts related to a particular task. • Application ontologies, representing specific application task-oriented domains. 41
An Ontology Example • Visit http://protege.stanford.edu/ to learn about creating ontologies. Source: http://www.sei.cmu.edu/isis/guide/gifs/fruit-ontology.gif 42
OWL: The Web Ontology Language The original OWL language, OWL 1, was intended to provide a richer expressiveness compared to RDFS which is why it was based on SHOIN(D) logic. More expressive power, however, may lead to undesirable computational properties which is why OWL 1 was designed in 3 different flavors to address different knowledge representation needs: 1. OWL Full: fully compatible with RDFS which is further extended with cardinality constraints and other means for maximum expressivity, BUT the language is undecidable. OWL DL: subset of OWL Full to allow for efficient reasoning, but is not fully compatible with RDFS. OWL Lite: subset of OWL DL that does not allow for enumerated classes, disjointness and arbitrary cardinality. 2. 3. The latest version of OWL, OWL 2, is based on SROIQ(D) logic. It also comes in different flavors: OWL EL, OWL RL, OWL QL, which are all subsets of OWL 2 DL, which turn is a subset of OWL 2 Full. OWL is based on the Open World Assumption, which states that the absence of information is not the reason to assume that this information is false. OWL does not rely on the Unique Name Assumption (which is the case with data bases), i.e. if two names are not explicitly stated to be different, they may refer to the same individual. 43
OWL RDF/RDFS relation rdfs: Resource rdfs:Class rdf:Property owl:Class owl:ObjectProperty owl:DatatypeProperty OWL uses RDF syntax. owl:Class, owl:DatatypeProperty, and owl:ObjectProperty are specializations of rdfs:Class and rdf:Property, respectively. 44
OWL 1 Syntax OWL 1 is based on the SHOIN(D) logic, which provides for the following expressiveness: The TBox defines subsumption relationships between classes (ex. C ⊑ D) The ABox contains facts about class membership (ex. C(a), C(b)), properties relations (ex. R(a, b)), equality and difference relations between individuals (ex. a = b, a b) The RBox defines subsumption/inclusion relationships between properties (ex. R ⊑ S), inverse properties (ex. R -), and transitivity properties (ex. R ⊑ + R) Class constructors: conjunction (C ⊓ D), disjunction (C ⊔ D), negation ( Property restrictions – universal ( R . C) and existential ( Number restrictions -- n R (maxCardinality) and n R (minCardinality) Examples: 1 hasChild In FOL: 1 hasChild In FOL: Closed classes (nominals} – {a} Datatypes C) R . C) ny hasChild(x, y) ny hasChild(x, y) 45
OWL 2 Syntax OWL 2 is based on the SROIQ(D) logic, which provides for the following additional expressiveness compared to OWL 1: The TBox allows also for: equivalence relationships between classes (ex. C and for a special class expression Self: n S . C New features: The ABox allows also for negated property relations (ex. The RBox may contain in addition to simple properties, inverse properties (ex. R -) and universal properties (U) and also allows for general inclusion (ex. R1 R2⊑ S), symmetry, reflexivity, irreflexivity and disjunctiveness of properties. D) S.Self. We can also state n S . C and R(a, b)) There are different syntax versions of OWL 2: Functional syntax (substitutes the abstract syntax of OWL 1) RDF/XML syntax (extends the existing OWL/RDF syntax) OWL/XML syntax (new XML serialization) Manchester syntax (machine readable intended for ontology editors) Turtle syntax (human readable) 46
OWL 2 Syntax OWL 2 is based on the SROIQ(D) logic, which provides for the following additional expressiveness compared to OWL 1: The TBox allows also for: equivalence relationships between classes (ex. C disjoint union (ex. Every person is either male or female BUT NOT both.) disjoint classes (ex. No one can be a member of any pair of classes from a given set – say, Faculty, Student, Staff) Self restriction: a special class expression Self: “Optimists are people who believe in themselves” Qualified cardinality restrictions n R . C and n R . C Examples: “Man with exactly three sons”, “Each student has at most one ID number”. The ABox allows also for negated data and object properties (ex. The RBox may contain in addition to simple properties, inverse properties (ex. R -) and universal properties (U) and also allows for general inclusion, property chains (ex. R1 R2⊑ S), symmetry, reflexivity, irreflexivity and disjunctiveness of properties. D) R.Self. Example: R(a, b)) 47
Want to build an OWL ontology yourself? Although some of OWL serializations are not very hard to use in an application-development setting, there are ontology editor applications that are easy to learn and much more efficient ontology development tools. My #1 choice is PROTÉGÉ, developed at Stanford University and freely available at http://protégé.stanford.edu. It comes is two versions: Web application Desktop application Both come with extensive documentation, including Ontology Development 101: A Guide to Creating Your First Ontology @ http://protegewiki.stanford.edu/wiki/Ontology101 48
n 49