The RDF meta model: a closer look • Basic ideas of RDF • Resource instance descriptions in the RDF format • Application-specific RDF schemas • Limitations of XML compared with RDF • The Dublin Core Standard • Metadata, ontology and information registry
Basic Ideas • It’s all about machine-understandability and automation • resource discovery when building and maintaining search engines • cataloging for particular Web sites or a digital library • sophisticated communication between intelligent software agents • Standardization at two levels • instance level: all resource instances are described in a uniform format • schema level: class, attribute and relationship definitions are also described in a uniform format • Conventional meta models, e.g. object-oriented (OO), ER or other semantic models, enforce a standardized format only at the schema level.
Resource instance descriptions • The RDF meta model contains the following three basic concepts • A resource can be anything describable using RDF, e.g. an entire web page, a whole collection of web pages, a web site, or even an object in the physical world such as a book. • A property is a specific aspect of a resource. It can be a characteristic that belongs to a resource, or a relationship that links one resource with another. • A statement is a single piece of description about a particular resource, expressed in the RDF format. • All RDF keywords for instance-level description start with “rdf:”
Resource instance descriptions (cont’d) • A statement about a resource instance has: • the resource’s identifier • one of the resource’s properties (defined in an RDF schema) • the value for that property (either a literal or another resource)
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:wc="http://www.scit.wlv.ac.uk/~ex1253/wc/schema">
      <rdf:Description rdf:about="http://www.cnn.com/2000/HEALTH/cancer/12/06/colon.cancer.ap/index.html">
        <wc:Title>Cigarette smoking linked to colorectal cancer</wc:Title>
      </rdf:Description>
    </rdf:RDF>
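The three parts of a statement can be read back out of the RDF/XML mechanically. The following is a minimal sketch using only Python's standard library, not a full RDF parser: it handles just the flat `rdf:Description` shape with literal values shown on this slide. The function name `statements` is made up for illustration, and the `xmlns:rdf` declaration and the qualified `rdf:about` attribute are assumptions added so that a namespace-aware parser accepts the snippet.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The slide's example, with the rdf namespace declared (an assumption;
# the slide omits the declaration).
DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:wc="http://www.scit.wlv.ac.uk/~ex1253/wc/schema">
  <rdf:Description rdf:about="http://www.cnn.com/2000/HEALTH/cancer/12/06/colon.cancer.ap/index.html">
    <wc:Title>Cigarette smoking linked to colorectal cancer</wc:Title>
  </rdf:Description>
</rdf:RDF>
"""


def statements(rdf_xml):
    """Yield (resource, (namespace, property name), literal value) tuples."""
    root = ET.fromstring(rdf_xml)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        resource = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells a qualified tag as "{namespace}local".
            namespace, local = prop.tag[1:].split("}", 1)
            yield resource, (namespace, local), (prop.text or "").strip()


triples = list(statements(DOC))
for resource, prop, value in triples:
    print(resource, prop, value, sep="\n  ")
```

Because every description follows the same resource/property/value pattern, one generic reader like this works for any schema, which is the instance-level standardization the earlier slide refers to.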
Resource instance descriptions (cont’d) • A property value can be a collection of elements • RDF provides three types of collection: <rdf:Bag> (unordered), <rdf:Seq> (ordered) and <rdf:Alt> (alternatives) • Collections are specified at the instance level rather than the schema level
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:wc="http://www.scit.wlv.ac.uk/~ex1253/wc/schema">
      <rdf:Description rdf:about="http://www.cnn.com/2000/HEALTH/cancer/12/06/colon.cancer.ap/index.html">
        <wc:Title>
          <rdf:Alt>
            <rdf:li xml:lang="en">Cigarette smoking … </rdf:li>
            <rdf:li xml:lang="it">……</rdf:li>
          </rdf:Alt>
        </wc:Title>
      </rdf:Description>
    </rdf:RDF>
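As a rough illustration of how an application might consume an `rdf:Alt` value, the sketch below collects the alternatives by `xml:lang` and picks one, falling back to the first member (which RDF treats as the default alternative). It uses only Python's standard library. The helper names `alternatives` and `pick` are hypothetical, and the Italian title is a placeholder because the slide elides it.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:wc="http://www.scit.wlv.ac.uk/~ex1253/wc/schema">
  <rdf:Description rdf:about="http://www.cnn.com/2000/HEALTH/cancer/12/06/colon.cancer.ap/index.html">
    <wc:Title>
      <rdf:Alt>
        <rdf:li xml:lang="en">Cigarette smoking linked to colorectal cancer</rdf:li>
        <rdf:li xml:lang="it">[Italian title placeholder]</rdf:li>
      </rdf:Alt>
    </wc:Title>
  </rdf:Description>
</rdf:RDF>
"""


def alternatives(rdf_xml):
    """Map language code -> text for the first rdf:Alt in the document."""
    root = ET.fromstring(rdf_xml)
    alt = root.find(f".//{{{RDF_NS}}}Alt")
    return {
        li.get(XML_LANG): (li.text or "").strip()
        for li in alt.findall(f"{{{RDF_NS}}}li")
    }


def pick(alts, preferred):
    """Return the preferred language if present, else the first alternative."""
    if preferred in alts:
        return alts[preferred]
    return next(iter(alts.values()))  # dicts keep document order


titles = alternatives(DOC)
print(pick(titles, "it"))
print(pick(titles, "fr"))  # no French title, so the first member is used
```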
Resource Schema Specification • A schema in RDF is comparable to a schema in the object-oriented (OO) model (a set of class definitions) or the ER model (a set of entity specifications) • Users can define classes as well as an inheritance hierarchy. • Attributes of classes are defined separately as ‘properties’, a major difference from conventional modeling methods. • Most RDF keywords for schema-level description start with “rdfs:”; the property class itself, rdf:Property, belongs to the “rdf:” namespace.
Resource Schema Specification: Classes • A class definition in RDF is more like a class declaration in OO languages. • Keywords: <rdfs:Class>, <rdfs:subClassOf>. • An example:
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
      <rdfs:Class rdf:ID="MedicalDocuments">
        <rdfs:comment>The set of all medical related documents.</rdfs:comment>
        <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
      </rdfs:Class>
      <rdfs:Class rdf:ID="PatientRecords">
        <rdfs:comment>The set of patients’ records.</rdfs:comment>
        <rdfs:subClassOf rdf:resource="#MedicalDocuments"/>
      </rdfs:Class>
    </rdf:RDF>
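The inheritance hierarchy declared with `rdfs:subClassOf` can be walked programmatically. The sketch below, using only Python's standard library, reads the two class declarations from this slide and lists every ancestor of a class. The function names `parent_map` and `ancestors` are hypothetical, and local classes are keyed by `#` plus their `rdf:ID` purely as a convention of this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

SCHEMA = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:ID="MedicalDocuments">
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="PatientRecords">
    <rdfs:subClassOf rdf:resource="#MedicalDocuments"/>
  </rdfs:Class>
</rdf:RDF>
"""


def parent_map(schema_xml):
    """Map '#ClassID' -> list of direct superclass references."""
    root = ET.fromstring(schema_xml)
    parents = {}
    for cls in root.findall(f"{{{RDFS}}}Class"):
        name = "#" + cls.get(f"{{{RDF}}}ID")
        parents[name] = [
            sub.get(f"{{{RDF}}}resource")
            for sub in cls.findall(f"{{{RDFS}}}subClassOf")
        ]
    return parents


def ancestors(name, parents):
    """Return all direct and inherited superclasses, nearest first."""
    seen = []
    queue = list(parents.get(name, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:  # also guards against accidental cycles
            seen.append(current)
            queue.extend(parents.get(current, []))
    return seen


hierarchy = parent_map(SCHEMA)
print(ancestors("#PatientRecords", hierarchy))
```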
Resource Schema Specification: Properties • Unlike conventional modeling methods, where attributes are subordinate to classes, the ‘Property’ concept in RDF is at the same level as the ‘Class’ concept. • Properties are linked with classes via the ‘domain’ construct. • Benefit: more flexibility and extensibility. • An example:
    …
    <rdf:Property rdf:ID="patientID">
      <rdfs:domain rdf:resource="#PatientRecords"/>
      <rdfs:range rdf:resource="#PatientIDClass"/>
    </rdf:Property>
    …
Comparable to: class PatientRecords : MedicalDocuments { PatientIDClass patientID; }
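Because properties are declared on their own and point at a class through `rdfs:domain`, the attributes of a class are found by scanning property declarations rather than by reading the class definition. The sketch below shows that lookup with Python's standard library. The function name `properties_of` and the second property `attendingDoctor` (with range `#Doctor`) are hypothetical additions for illustration; only `patientID` comes from the slide.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

SCHEMA = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Property rdf:ID="patientID">
    <rdfs:domain rdf:resource="#PatientRecords"/>
    <rdfs:range rdf:resource="#PatientIDClass"/>
  </rdf:Property>
  <!-- Hypothetical property, added later without touching any class. -->
  <rdf:Property rdf:ID="attendingDoctor">
    <rdfs:domain rdf:resource="#PatientRecords"/>
    <rdfs:range rdf:resource="#Doctor"/>
  </rdf:Property>
</rdf:RDF>
"""


def properties_of(class_ref, schema_xml):
    """Map property ID -> range for every property whose domain is class_ref."""
    root = ET.fromstring(schema_xml)
    found = {}
    for prop in root.findall(f"{{{RDF}}}Property"):
        domains = [
            d.get(f"{{{RDF}}}resource")
            for d in prop.findall(f"{{{RDFS}}}domain")
        ]
        if class_ref in domains:
            rng = prop.find(f"{{{RDFS}}}range")
            found[prop.get(f"{{{RDF}}}ID")] = (
                rng.get(f"{{{RDF}}}resource") if rng is not None else None
            )
    return found


patient_props = properties_of("#PatientRecords", SCHEMA)
print(patient_props)
```

Adding `attendingDoctor` required no change to the `PatientRecords` class declaration, which is one way to read the flexibility and extensibility benefit named on the slide.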
Limitations of XML compared with RDF • XML standardizes syntax for interoperability, not semantics. • an XML parser can be reused anywhere • but the XML parser cannot understand the semantics behind the syntax • for two parties to communicate, they must agree in advance on the semantic aspects of the document • [Figure: App1 → XML Parser (encoder) → <abc cd=“…”> … </abc> → XML Parser (decoder) → App2]
Limitations of XML (cont’d) • XML has no single fixed syntax for describing a fact; the same statement can be encoded under different DTDs:
  • DTD 1
    <!ELEMENT Resource (property)>
    <!ATTLIST Resource id CDATA #REQUIRED>
    <!ELEMENT property (Value)>
    <!ATTLIST property name CDATA #REQUIRED>
    <!ELEMENT Value (#PCDATA)>
  • XML Instance
    <Resource id="http://www.cnn.com/…/">
      <property name="Title">
        <Value>Cigarette … </Value>
      </property>
    </Resource>
  • DTD 2
    <!ELEMENT property (Resource, Value)>
    <!ATTLIST property name CDATA #REQUIRED>
    <!ELEMENT Resource (#PCDATA)>
    <!ELEMENT Value (#PCDATA)>
  • XML Instance
    <property name="Title">
      <Resource>http://www.cnn.com/…</Resource>
      <Value>Cigarette … </Value>
    </property>
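The practical cost of the two DTDs is that each needs its own hand-written reader before an application can tell that both instances state the same fact. The sketch below illustrates this with Python's standard library. The reader names are hypothetical, and the URL `http://example.org/doc` is a stand-in, since the slide elides the real address.

```python
import xml.etree.ElementTree as ET

# Same fact, encoded under DTD 1 (resource as the outer element) ...
INSTANCE_1 = """\
<Resource id="http://example.org/doc">
  <property name="Title">
    <Value>Cigarette smoking linked to colorectal cancer</Value>
  </property>
</Resource>
"""

# ... and under DTD 2 (property as the outer element).
INSTANCE_2 = """\
<property name="Title">
  <Resource>http://example.org/doc</Resource>
  <Value>Cigarette smoking linked to colorectal cancer</Value>
</property>
"""


def read_dtd1(xml_text):
    """Reader that only understands the DTD 1 layout."""
    resource = ET.fromstring(xml_text)
    prop = resource.find("property")
    return resource.get("id"), prop.get("name"), prop.findtext("Value").strip()


def read_dtd2(xml_text):
    """Reader that only understands the DTD 2 layout."""
    prop = ET.fromstring(xml_text)
    return (
        prop.findtext("Resource").strip(),
        prop.get("name"),
        prop.findtext("Value").strip(),
    )


fact_1 = read_dtd1(INSTANCE_1)
fact_2 = read_dtd2(INSTANCE_2)
print(fact_1 == fact_2)  # the same (resource, property, value) fact
```

The generic XML parser accepts both documents, but the knowledge of which element carries the resource, the property and the value lives in the two readers. RDF fixes that mapping once, so a single reader suffices.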
The Dublin Core Standard • The Dublin Core Metadata Initiative • A standardized conceptual schema for describing web resources • Dublin Core Elements (attributes of a web resource) • 15 pre-defined elements: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights. • Dublin Core also defines a set of keywords associated with each element, called ‘Dublin Core Qualifiers’, which make element instances more specific • E.g. a ‘Date’ element may be further refined as ‘Created Date’, ‘Valid Date’, ‘Available Date’, ‘Issued Date’ or ‘Modified Date’. Its encoding scheme can be ‘DCMI Period’ or ‘W3C-DTF’. • The Dublin Core standard can be specified as an RDF Schema
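To make the link between Dublin Core and RDF concrete, the sketch below builds an RDF description whose properties are Dublin Core elements, using Python's standard library and the Dublin Core element namespace `http://purl.org/dc/elements/1.1/`. The function name `describe` is hypothetical, and the `date` and `language` values passed in the usage lines are illustrative assumptions rather than data from the slides.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# The 15 elements listed on the slide, in the lowercase form used in RDF.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def describe(about, **fields):
    """Return RDF/XML describing `about` with Dublin Core elements."""
    unknown = set(fields) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    # Ask ElementTree to serialize with the conventional prefixes.
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": about})
    for name, value in fields.items():
        ET.SubElement(desc, f"{{{DC}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


record = describe(
    "http://www.cnn.com/2000/HEALTH/cancer/12/06/colon.cancer.ap/index.html",
    title="Cigarette smoking linked to colorectal cancer",
    date="2000-12-06",  # illustrative value in W3C-DTF form (an assumption)
    language="en",      # illustrative value (an assumption)
)
print(record)
```

Rejecting names outside the 15 elements is a design choice of this sketch; an extended schema, as discussed later in the deck, would add its own namespace rather than new `dc:` names.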
Metadata, Ontology and Information Registry • Ontology: • a graph of concepts (classes) describing real-world objects • Metadata: • conceptually, the de facto definition is ‘structured data about data’ • in practice, ‘metadata’ is used to denote both ‘data about data’ and ‘data about real-world objects’ (the ontology) • Information Registry: • to avoid the ambiguity of ‘metadata’, we use ‘information registry’ to denote ‘structured data about data’. • Both the ontology and the information registry can be represented in the RDF format
Metadata, Ontology and Information Registry (cont’d) • [Figure: diagram relating the Dublin Core Schema, the Extended Schema, the Resource Schema, the Ontology, the Information Registry, Information Resources and Real World Objects, with edges labeled ‘abstraction’ and ‘described by’; annotation: “All specified in the RDF format”]
Current Tasks • Decide which medical ontology to use, and check whether it is already specified in the RDF format • Design the information registry’s structure by extending existing schemas such as Dublin Core • Based on the medical ontology and the registry structure: • design an algorithm for automatic resource registration • design an ontology-enhanced algorithm for content correlation
Conclusion • RDF • a uniform format for describing both resource instances and schemas • Dublin Core • a conceptual schema about web resources • Ontology • data about real-world objects: classes, inheritance, attributes, relationships • Information Registry • data about data: various aspects of on-line documents • Information Resource Schema • schema for the information registry • All three (the ontology, the registry and the schema) will be specified in the RDF format