Describing Resources on the Web: The Resource Description Framework. Vassilis Christophides Dimitris Plexousakis Computer Science Department, University of Crete Institute for Computer Science - FORTH Heraklion, Crete http://www.ics.forth.gr/proj/isst/RDF. meta data. Introduction to Metadata.
Introduction to Metadata
What is the Problem? • 3.6 million Web sites • Five hundred million or more addressable pages on the Web • High consumer expectations conflicting with primitive tools and mechanisms • Uncertain quality, integrity, trust
The Information Landscape in the Web-era • The Web changes relationships among • authors • publishers • information intermediaries and distributors • users • Lower barriers to “publication” • rapid dissemination of information and ideas • less advantage to size or centralization • greatly expanded access • Manageability is reduced • resource discovery is chaotic • organization is haphazard • preservation is almost non-existent
The Web Information System vs. Traditional Libraries • Search systems are motivated by advertising • Index coverage is unpredictable and limited (1/3) • Too much recall, too little precision • Index spam abounds • Resources (and their names) are volatile • What about versions, editions, back issues? • Archiving is presently unsolved • Authority and quality of service are spotty • Managing access rights is hard
Metadata: Higher Quality Web Information Services • Traditionally: • metadata has been understood as “Data about Data” • it helps to impose order on chaos • Example(s): • a library catalogue contains information (metadata) about publications (data) • a file system maintains permissions (metadata) about files (data) • Metadata describes other data • One application’s metadata is another application’s data • Metadata can itself be described by metadata (but that doesn’t make it meta-metadata) • Example: • Price lists (metadata) have expiration dates: metadata about metadata (it is still just metadata)
Metadata takes Many Forms • resource discovery • document administration • rights management • content rating • security and authentication • archival status • products and services • database schemas • process control or description
Metadata exists for Almost Anything • People • Places • Objects • Concepts • Documents • Archives • Databases
Application: Item and Collection Cataloguing • Describing individual resources • documents, pages, images, audio files, etc. • Describing the content of collections • Web sites, databases, directories, etc. • Relationships among Resources • Tables of Content, chapters, images…. • Site Maps
Application: Resource Discovery • Search engines can better “understand” the contents of a particular page • More accurate searches • Additional information aids precision • Makes it possible to automate searches because less manual “weeding” is needed to process the search results
Application: Electronic Commerce Broker • Metadata can be used to encode information needed in all stages of electronic commerce • locating seller/buyer & product • searching “yellow pages” • agreeing on terms of sale • prices, terms of payment, contractual information • transactions • delivery mechanisms, dates, terms [Figure labels: Market place, Providers/Clients, service]
Application: Intelligent Agents • Representation and sharing of knowledge • knowledge exchange • modeling • Communication • user-to-agent, agent-to-agent, agent-to-service • Resource discovery • gives web-roaming agents the ability to “understand” their environment
Application: Content Rating • Empowering users to select which kinds of web content they wish to see • Child Protection • W3C PICS (Platform for Internet Content Selection) working group • US Communications Decency Act of 1996 • simple metadata architecture • precursor to RDF
Application: Digital Signatures • These are key to building the “Web of Trust” • Required by • agents • electronic commerce • collaboration • RDF will become the preferred way to encode digital signatures on documents and on statements about documents
Other Applications • Privacy Preferences and Policies • describing a user’s willingness or reluctance to disclose information about himself or herself • describing a site administrator’s desire to gather information about visiting users • Intellectual Property Rights • contractual terms related to usage and distribution rights to a document
(Meta)Data Transmission Methods • Embedded (e.g., META) • Associated with (in HTTP header) • Trusted third party (explicit HTTP GET)
Metadata Assertions • The Web is “machine-readable” but not “machine-understandable” • Metadata is useful • A lot could be gained from structured description of pages, servers, search services, and other resources • Accommodate multiple varieties of metadata • Metadata requirements will evolve
A Plethora of Metadata Standards • Many metadata standards have evolved at different levels, and to meet different requirements...
Interoperability Issues • Semantic Interoperability (standardisation of content): “Let’s talk English”, e.g. “cat milk sat drank mat” • Structural Interoperability (standardisation of form): “Here’s how to make a sentence”, e.g. “Cat sat on mat. Drank milk.” • Syntactic Interoperability (standardisation of expression): “These are the rules of grammar”, e.g. “The cat sat on the mat. It drank some milk.”
Metadata Challenges • Many flavours of metadata • which one do I use? • Managing change • new varieties, and evolution of existing forms • Tension between functionality and simplicity, extensibility and interoperability [Figure: “Simplicity and interoperability” set against “Functions, features, and cool stuff”]
Towards Metadata for Community Webs • Group of people sharing a domain of discourse and a set of resources (e.g., data, documents, services) and having some common interests • Commerce, Education, Health • Provide community-specific metadata functionality in order to create, administer, and access resources • common semantic, structural, and syntactic conventions for exchange of resource description information [Figure: Community Webs: Workplace, Education, Commerce, Health]
Metadata Interoperability in Community Webs • Communities of expertise (not software vendors) are responsible for: • Semantics • Registration • Administration • Access management • Authority of data • Sharing and Distribution [Figure: Community Webs: Commerce, Home Pages, Geo, Library, Scientific Data, Museums, Whatever...]
Metadata Implementation Approaches • Harvesting metadata into a repository (database) • Distributed Database Search
Harvesting Metadata into a Repository (database) [Figure labels: HTML, XML, Other types, Dynamic document creation from database, Harvester, Repository, Query, retrieve resource]
Distributed Database Search [Figure labels: Query, Z39.50 Gateway, three Z39.50 Servers, retrieve resource]
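The harvesting approach can be sketched in a few lines of Python, using only the standard library. This is an illustrative sketch, not the authors' system: it assumes the metadata is embedded in HTML META tags, and the sample page, the `DC.` field names and the in-memory `repository` dictionary are all hypothetical.

```python
from html.parser import HTMLParser


class MetaHarvester(HTMLParser):
    """Collects name/content pairs from the <meta> tags of one HTML page."""

    def __init__(self):
        super().__init__()
        self.records = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.records[name] = content


def harvest(url, html, repository):
    """Parse one page and store its metadata under its URL."""
    parser = MetaHarvester()
    parser.feed(html)
    repository[url] = parser.records


def query(repository, name, value):
    """Return the URLs whose metadata field `name` equals `value`."""
    return sorted(u for u, rec in repository.items() if rec.get(name) == value)


# Hypothetical page and repository, for illustration only.
PAGE = """<html><head>
<meta name="DC.Title" content="RDF Presentation">
<meta name="DC.Creator" content="Vassilis Christophides">
</head><body></body></html>"""

repository = {}
harvest("http://www.ics.forth.gr/proj/isst/RDF", PAGE, repository)
hits = query(repository, "DC.Creator", "Vassilis Christophides")
```

The query runs against the repository only; the resource itself is retrieved afterwards through the URL that the query returns.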
RDF origins • W3C Metadata Activity 1997-2000 • PICS (Internet content selection) • Warwick Framework / Dublin Core • XML (XML Data, Channels etc) • MCF (Apple, Netscape) • URI specification for Web identifiers
RDF Objectives • Enables resource description communities to define their own semantics • We can disagree about semantics, but share infrastructure (syntax, query, editors) • Imposes structural constraints on the expression of various application metadata • for consistent encoding, exchange and processing of metadata on the Web • Metadata vocabularies can be developed without central coordination • Fine-grained mixing of diverse metadata • Signed RDF is the basis for trust • XML used for ‘serialisation syntax’
Describing Community Resources using RDF [Figure labels: Complexity and diversity of information resources; Heterogeneous resource descriptions; Advanced Knowledge Schemas (ontologies, thesauri)]
The Basic RDF Data Model • RDF: Resource Descriptions • Data Model: Directed Labeled Graphs • Nodes: Resources (URIs) or Literals • Edges: Properties – Attributes or Relationships • Statement: assertion of the form resource, property, value • Description: set of statements concerning a resource • XML syntax
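As a rough illustration of the model above, statements can be held as (resource, property, value) tuples and a description computed as the statements about one resource. The identifiers `URI:Tutorial` and `URI:Vassilis` are placeholders standing in for real URIs, and the tuple layout is only one possible encoding.

```python
from collections import namedtuple

# A statement asserts: resource --property--> value.
Statement = namedtuple("Statement", ["resource", "property", "value"])

# Placeholder identifiers; "URI:Tutorial" stands in for a real URI.
statements = {
    Statement("URI:Tutorial", "dc:Title", "RDF Presentation"),
    Statement("URI:Tutorial", "dc:Creator", "URI:Vassilis"),
    Statement("URI:Vassilis", "bib:Name", "Vassilis Christophides"),
}


def description(resource, model):
    """Return the set of statements that concern `resource`."""
    return {s for s in model if s.resource == resource}


def nodes(model):
    """Every resource or literal found at either end of an edge."""
    return {s.resource for s in model} | {s.value for s in model}


tutorial = description("URI:Tutorial", statements)
```

Each tuple is one labeled edge of the graph: the first and last fields are nodes, the middle field is the edge label.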
The Basic RDF Data Model: Primitives [Figure: a Statement relates a Resource, a Property and a Value]
Simple Example [Figure labels: URI:Tutorial, Author, “Vassilis”, URI:Vassilis]
The notion of Resource • A resource is identified by a URI: • [absoluteURI | relativeURI] [“#” fragment-id] • The resource identified by a URI may be abstract • i.e. not network retrievable • Resource is distinct from entity resolved at any particular time • http://www.ics.forth.gr/RDF/ • From RFC 2396: Resource A resource can be anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources. The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time. Thus, a resource can remain constant even when its content---the entities to which it currently corresponds---changes over time, provided that the conceptual mapping is not changed in the process.
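The URI form above (absolute or relative reference, optional fragment identifier) can be explored with Python's standard `urllib.parse` module. The relative reference `tutorial.html#model` is a made-up example; only the base URI comes from the slide.

```python
from urllib.parse import urldefrag, urljoin

BASE = "http://www.ics.forth.gr/RDF/"

# Resolve a (hypothetical) relative reference against the base URI.
absolute = urljoin(BASE, "tutorial.html#model")

# Split off the optional fragment identifier.
uri, fragment = urldefrag(absolute)
```

Neither call touches the network: a URI identifies a resource whether or not anything can be retrieved from it.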
RDF Syntax • The RDF Model defines formal relationships among resources, properties and values • Syntax is required to... • Store instances of the model into files • Communicate files from one application to another • W3C XML (eXtensible Markup Language)
RDF Model Example: Complex Values [Figure: URI:Tutorial with dc:Title “RDF Presentation” and dc:Creator “Vassilis Christophides”; for the complex value, the creator node carries bib:Name “Vassilis Christophides”, bib:Email “christop@ics.forth.gr” and bib:Aff URI:FORTH, labelled “ICS-FORTH”]
<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/">
  <Description about="URI:Tutorial">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>Vassilis Christophides</dc:Creator>
  </Description>
</RDF>
RDF Syntax Example: Complex Values
<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/"
     xmlns:bib="http://www.bib.org/persons#">
  <Description about="URI:Tutorial">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>
      <Description>
        <bib:Name>Vassilis Christophides</bib:Name>
        <bib:Email>christop@ics.forth.gr</bib:Email>
        <bib:Aff resource="http://www.ics.forth.gr" />
      </Description>
    </dc:Creator>
  </Description>
</RDF>
Abbreviated form of the inner description, with literal-valued properties written as attributes:
<Description bib:Name="Vassilis Christophides"
             bib:Email="christop@ics.forth.gr">
  <bib:Aff resource="http://www.ics.forth.gr" />
</Description>
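To see how the XML syntax maps back onto statements, the following sketch walks the complex-value example with Python's standard `xml.etree.ElementTree` and lists one triple per property element. It is a minimal reader for this one document shape (nested `Description`, `about` and `resource` attributes), not a conforming RDF/XML parser; the namespace is the working-draft one used on the slide, and the `_:n1` style names for unnamed nodes are an assumption of this sketch.

```python
import itertools
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/TR/WD-rdf-syntax#"
DESCRIPTION = "{%s}Description" % RDF_NS

DOC = """<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/"
     xmlns:bib="http://www.bib.org/persons#">
  <Description about="URI:Tutorial">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>
      <Description>
        <bib:Name>Vassilis Christophides</bib:Name>
        <bib:Email>christop@ics.forth.gr</bib:Email>
        <bib:Aff resource="http://www.ics.forth.gr"/>
      </Description>
    </dc:Creator>
  </Description>
</RDF>"""


def walk(description, out, counter):
    """Append the triples of one Description element; return its subject."""
    subject = description.get("about") or "_:n%d" % next(counter)
    for prop in description:
        nested = prop.find(DESCRIPTION)
        if nested is not None:        # value is a nested description
            value = walk(nested, out, counter)
        elif prop.get("resource"):    # value is a reference to a resource
            value = prop.get("resource")
        else:                         # value is a literal
            value = (prop.text or "").strip()
        out.append((subject, prop.tag, value))
    return subject


def parse(document):
    """Return (subject, property, value) triples for top-level descriptions."""
    root = ET.fromstring(document)
    out = []
    counter = itertools.count(1)
    for description in root.findall(DESCRIPTION):
        walk(description, out, counter)
    return out


result = parse(DOC)
```

ElementTree expands prefixes, so each property appears as `{namespace}LocalName`, which makes the vocabulary of every property explicit in the output.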
RDF Model Example [Figure: URI:Tutorial with dc:Title “RDF Presentation” and dc:Creator, a node carrying bib:Name “Vassilis Christophides”, bib:Email “christop@ics.forth.gr” and bib:Aff URI:FORTH (“ICS-FORTH”); further labels admin:By “STEP”, admin:On “01-01-01”, admin:For “...”]
Where do you stop? • The basic RDF model & syntax provide enabling technology • Degree of metadata simplicity/complexity is a matter of: • Resource description communities’ needs, best practice and experience • Organization/Institution’s policy • Economics • Goals and requirements of implementation
The Basic RDF Data Model: In Brief • Nodes are resources connected by named properties [Figure: R1 --P1--> R2] • The degenerate case is an arc terminating in a fixed value [Figure: R1 --P1--> “foo”] • An RDF description consists of a directed graph of arbitrary complexity [Figure: graph over resources R1 to R8 with properties P1 to P7]
One Additional Concept: Container Values • Containers are collections • they allow grouping of resources (or literal values) • It is possible to make statements about the container (as a whole) or about its members individually • Different types of containers exist • Bags -- groups of things • Sequences -- ordered group of things • Alternates -- Alternate things/values • First value is the default • Must be at least one • Duplicate values are permitted • there is no mechanism to enforce unique value constraints • Syntactic shorthand provided (much like HTML lists)
Containers (continued) [Figure: URI:Tutorial --dc:Creator--> container node; container node --rdf:type--> rdf:Seq, --rdf:_1--> “Vassilis Christophides”, --rdf:_2--> “Dimitris Plexousakis”]
Containers (continued) [Figure: URI:Tutorial --dc:Creator--> “Vassilis Christophides”; URI:Tutorial --dc:Creator--> “Dimitris Plexousakis”]
The Basic RDF Data Model: Formal Aspects • Statement := (predicate,subject,object) • Predicate is a resource • Subject is a resource • Object is either a resource or a literal • Object = Predicate(Subject) • A model is a set of statements • Formal model based on triples (Universal relation) • Example {author, “http://www.ics.forth.gr/proj/isst/RDF”, node} {name, node, “Vassilis Christophides” } {email, node, “christop@ics.forth.gr” }
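The formal reading above, a model as a set of (predicate, subject, object) triples with Object = Predicate(Subject), can be tried out directly with Python sets. The function names `objects_of` and `match` are illustrative choices, and `node` is the placeholder for the unnamed resource used in the slide's example.

```python
# Each statement follows the order used above: (predicate, subject, object).
# "node" is a placeholder for an unnamed resource.
model = {
    ("author", "http://www.ics.forth.gr/proj/isst/RDF", "node"),
    ("name", "node", "Vassilis Christophides"),
    ("email", "node", "christop@ics.forth.gr"),
}


def objects_of(predicate, subject, statements):
    """Read Object = Predicate(Subject): all objects for the given pair."""
    return {o for (p, s, o) in statements if p == predicate and s == subject}


def match(statements, predicate=None, subject=None, obj=None):
    """Select statements; None acts as a wildcard for that position."""
    return {
        (p, s, o)
        for (p, s, o) in statements
        if predicate in (None, p) and subject in (None, s) and obj in (None, o)
    }


author = objects_of("author", "http://www.ics.forth.gr/proj/isst/RDF", model)
about_node = match(model, subject="node")
```

Because a model is a set of statements, the same predicate may hold several objects for one subject, which is why `objects_of` returns a set rather than a single value.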
Triples for Container Values: Example • Triples from the first example: {dc:Creator, “http://www.ics.forth.gr/proj/isst/RDF”, x} {rdf:_1, x, “Vassilis Christophides”} {rdf:_2, x, “Dimitris Plexousakis”} {rdf:type, x, rdf:Seq} • Triples from the second example: {dc:Creator, “http://www.ics.forth.gr/proj/isst/RDF”, “Vassilis Christophides”} {dc:Creator, “http://www.ics.forth.gr/proj/isst/RDF”, “Dimitris Plexousakis”}
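The two encodings can be generated side by side. The sketch below follows the (predicate, subject, object) order of the formal model; the helper names `as_sequence` and `as_repeated` and the node name `x` are illustrative, not part of RDF.

```python
RESOURCE = "http://www.ics.forth.gr/proj/isst/RDF"
AUTHORS = ["Vassilis Christophides", "Dimitris Plexousakis"]


def as_sequence(prop, subject, members, node="x"):
    """Encode members as one rdf:Seq container attached to the subject."""
    triples = [(prop, subject, node), ("rdf:type", node, "rdf:Seq")]
    for position, member in enumerate(members, start=1):
        triples.append(("rdf:_%d" % position, node, member))
    return triples


def as_repeated(prop, subject, members):
    """Encode members as independent statements with a repeated property."""
    return [(prop, subject, member) for member in members]


first = as_sequence("dc:Creator", RESOURCE, AUTHORS)
second = as_repeated("dc:Creator", RESOURCE, AUTHORS)
```

The container form records the order of the members and gives the group a node of its own, so statements can be made about the group as a whole; the repeated-property form makes one independent statement per creator.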
Edge Labeled Directed Graphs (RDF) [Figure: RDFTutorial --creator--> Vassilis --affiliation--> ICS-FORTH; ICS-FORTH --activities--> ISL; ICS-FORTH --projects--> C-Web] • (creator, RDFTutorial, Vassilis) • (affiliation, Vassilis, ICS-FORTH) • (activities, ICS-FORTH, ISL) • (projects, ICS-FORTH, C-Web)
Node labeled Directed Graph (XML) [Figure: tree rooted at the element root, with child elements foo and bar, the element baz under bar, and attribute nodes href, x, y, z]
<root>
  <foo href="…" x="1" />
  <bar x="2" y="3">
    <baz z="aaa"/>
  </bar>
</root>
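The contrast between the two graph styles can be made concrete. In the sketch below, the XML document is parsed with Python's standard `xml.etree.ElementTree` and its labels are read off the nodes, while the RDF triples from the previous slide carry their labels on the edges. The function names are illustrative, and the elided `href` value is kept as it appears in the source.

```python
import xml.etree.ElementTree as ET

# XML: the labels sit on the nodes (elements and attributes).
XML = '<root><foo href="…" x="1"/><bar x="2" y="3"><baz z="aaa"/></bar></root>'

# RDF: the labels sit on the edges, written (property, source, target).
RDF = [
    ("creator", "RDFTutorial", "Vassilis"),
    ("affiliation", "Vassilis", "ICS-FORTH"),
    ("activities", "ICS-FORTH", "ISL"),
    ("projects", "ICS-FORTH", "C-Web"),
]


def node_labels(element, depth=0):
    """List (depth, kind, label) for every element and attribute node."""
    rows = [(depth, "element", element.tag)]
    for name in element.attrib:
        rows.append((depth + 1, "attribute", name))
    for child in element:
        rows.extend(node_labels(child, depth + 1))
    return rows


def edge_labels(triples):
    """List the label carried by each edge."""
    return [prop for (prop, _, _) in triples]


xml_rows = node_labels(ET.fromstring(XML))
rdf_edges = edge_labels(RDF)
```

In the XML tree the parent-child edges are anonymous and only nesting relates two nodes; in the RDF graph every edge names the relationship it stands for.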
What can we Express in RDF? • RDF relies on an (edge labeled) directed graph model that can easily • be extended by just adding more edges • combine multiple vocabularies, distinguished by their URIs • RDF provides a standard syntax to represent these graphs in XML • The RDF Model can be thought of as a simplified XML Infoset • But RDF goes beyond XML syntactic issues • It allows semantic networks to be defined on the Web
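Extending a graph by adding edges, and mixing vocabularies told apart by their URIs, amounts to a set union over triples. In this sketch the two source graphs, `catalogue` and `directory`, are hypothetical; the namespace URIs are the ones used in the syntax examples, and triples are written (predicate, subject, object).

```python
DC = "http://purl.org/dc/elements/1.0/"
BIB = "http://www.bib.org/persons#"

# Two independently produced descriptions (hypothetical sources).
catalogue = {
    (DC + "Title", "URI:Tutorial", "RDF Presentation"),
    (DC + "Creator", "URI:Tutorial", "URI:Vassilis"),
}
directory = {
    (BIB + "Name", "URI:Vassilis", "Vassilis Christophides"),
}


def namespace(property_uri):
    """Keep everything up to the last '#' or '/' of a property URI."""
    cut = max(property_uri.rfind("#"), property_uri.rfind("/"))
    return property_uri[: cut + 1]


# Merging is just adding edges: the union of the two statement sets.
merged = catalogue | directory
vocabularies = {namespace(p) for (p, _, _) in merged}
```

No coordination between the two sources is needed beyond agreeing on the identifier `URI:Vassilis`, which is what links the two graphs after the merge.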
Semantic Networks [Figure: semantic network over Person, String, Artist, Artifact, Painter, Sculptor, Painting and Sculpture, with edges name, lives in, creates, paints, sculpts and isa] “A Person has a name and lives_in somewhere. Artists are persons; painters and sculptors are artists. An artist creates artifacts (paintings or sculptures); a painter paints paintings and a sculptor sculpts sculptures.”
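The network described in the quotation can be written down as two small tables and queried for inherited properties. This is a sketch under assumptions: `isa` is modelled as single inheritance, and `lives_in` is left out because the slide does not name its target.

```python
# isa edges: each class points to its direct superclass.
ISA = {
    "Artist": "Person",
    "Painter": "Artist",
    "Sculptor": "Artist",
    "Painting": "Artifact",
    "Sculpture": "Artifact",
}

# Property edges: label -> (source class, target class).
PROPERTIES = {
    "name": ("Person", "String"),
    "creates": ("Artist", "Artifact"),
    "paints": ("Painter", "Painting"),
    "sculpts": ("Sculptor", "Sculpture"),
}


def ancestors(cls):
    """All classes reachable from cls by following isa edges."""
    found = []
    while cls in ISA:
        cls = ISA[cls]
        found.append(cls)
    return found


def inherited_properties(cls):
    """Properties whose source class is cls or one of its ancestors."""
    lineage = [cls] + ancestors(cls)
    return sorted(p for p, (source, _) in PROPERTIES.items() if source in lineage)


painter_properties = inherited_properties("Painter")
```

A painter is an artist and therefore a person, so the painter picks up `creates` and `name` in addition to its own `paints`.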
RDF Schema Definition: RDFS • Declaration of label vocabularies for description graph nodes & edges • Enables communities to share machine readable tokens and define human readable labels • Node labels (types) are defined as classes • Literal data types as defined by XML Schemas WG • Resource may have a specific ‘type’ property • Edge labels (predicates) are defined as properties of these classes • A resource of given type may have a given property (domain constraint) • A resource of given type may be the value of a given predicate (range constraint) • RDFS vocabularies expressible in the basic RDF model and syntax • RDFS vocabularies are also Web resources (and have URIs) and therefore can be described using RDF
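Domain and range constraints can be checked mechanically once classes and properties are declared. The sketch below treats domain and range as constraints, as the slide does, and reuses the artist vocabulary from the semantic network; the resources `URI:painter1`, `URI:painting1` and `URI:sculpture1` are hypothetical, and statements are written (predicate, subject, object).

```python
SUBCLASS_OF = {
    "Painter": "Artist",
    "Artist": "Person",
    "Painting": "Artifact",
    "Sculpture": "Artifact",
}
DOMAIN = {"paints": "Painter", "creates": "Artist"}
RANGE = {"paints": "Painting", "creates": "Artifact"}

# Hypothetical typed resources (the value of each resource's type property).
TYPES = {
    "URI:painter1": "Painter",
    "URI:painting1": "Painting",
    "URI:sculpture1": "Sculpture",
}


def is_instance(resource, cls):
    """True if the resource's type is cls or a subclass of cls."""
    current = TYPES.get(resource)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def violations(statement):
    """Return the constraints ("domain", "range") a statement breaks."""
    prop, subject, value = statement
    problems = []
    if prop in DOMAIN and not is_instance(subject, DOMAIN[prop]):
        problems.append("domain")
    if prop in RANGE and not is_instance(value, RANGE[prop]):
        problems.append("range")
    return problems


ok = violations(("paints", "URI:painter1", "URI:painting1"))
bad = violations(("paints", "URI:painter1", "URI:sculpture1"))
```

The second statement fails the range check because a sculpture is an artifact but not a painting; the same pair would pass under `creates`, whose range is the more general class.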
Constructing and Using RDF schemas • RDFS Schema Vocabularies allow for • Specialization of both classes & properties (simple & multiple) • Multiple classification of resources under several classes • Unordered, optional, and multi-valued properties • Domain and range polymorphism of properties
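Two of these features, specialization of properties and multiple classification, can be sketched briefly. The example assumes, purely for illustration, that `paints` and `sculpts` specialize `creates`; the resource names are hypothetical and statements are written (predicate, subject, object).

```python
# Assumed for illustration: paints and sculpts specialize creates.
SUBPROPERTY_OF = {"paints": ["creates"], "sculpts": ["creates"]}
DOMAIN = {"paints": "Painter", "sculpts": "Sculptor"}

# Multiple classification: one resource typed under several classes.
TYPES = {"URI:artist1": {"Painter", "Sculptor"}}


def super_properties(prop):
    """All properties that prop specializes, directly or indirectly."""
    seen = []
    stack = [prop]
    while stack:
        for parent in SUBPROPERTY_OF.get(stack.pop(), []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


def entail(statements):
    """Add (q, s, o) for every super-property q of p in (p, s, o)."""
    closed = set(statements)
    for prop, subject, value in statements:
        for parent in super_properties(prop):
            closed.add((parent, subject, value))
    return closed


def allowed(prop, subject):
    """True if one of the subject's classes is the property's domain."""
    return DOMAIN[prop] in TYPES.get(subject, set())


stated = {("paints", "URI:artist1", "URI:painting1")}
closed = entail(stated)
```

Because `URI:artist1` is classified as both a painter and a sculptor, it may carry `paints` and `sculpts` alike, and every such statement also counts as a `creates` statement.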