Community Webs (C-Web): Functionality and Architecture Issues
V. Christophides
Computer Science Department, University of Crete
Institute for Computer Science - FORTH
Heraklion, Crete
What is C-Web?
• Set-up methodologies and infrastructure for fast deployment and easy management of knowledge-intensive Web applications in communities requiring:
  • effective knowledge assimilation, elicitation, ...
  • efficient query answering
[Figure labels: C-Web; workplace, education, commerce, health]
C-Web Main Idea: Virtual XML Warehouse
• Main Goal: to provide a generic platform for describing, organizing and querying various XML resources according to a concept taxonomy shared by a specific community
[Figure labels: Files, Documents, Web, Databases; Virtual XML Warehouse; Knowledge; Single Point of Access]
C-Web Objectives
• Reuse existing knowledge structures (e.g. ontologies, thesauri)
• Easily integrate heterogeneous XML resources (e.g. data, documents)
• Provide intelligent information access (i.e. conceptual querying and browsing)
• Support collaboration facilities & expertise management (e.g. annotations)
• Enable automatic generation of new information resources (e.g. e-books)
C-Web Functionality: Current Status
• Support for creating community conceptual models
  • integration of existing ontologies and thesauri
  • definition of different viewpoints
• Support for describing and integrating resources
  • resource content description metadata (CDM)
  • resource structure mapping metadata (SMM)
• Support for conceptual browsing and querying
  • high-level property-centric queries to resources
  • querying both conceptual schemata and related instances
• Support for collaborative resource annotation
• Support for intelligent information publishing
The main C-Web Requirement: Interoperability
• Heterogeneity is not a drawback, but a feature of autonomous information resources in large-scale distributed systems
• Interoperability: the ability to uniformly share, interpret and manipulate data and documents from heterogeneous resources
[Figure: interoperability levels (Semantic, Structure, Syntactic, Functional, System) annotated with: vocabularies, viewpoints, contexts, query language dialects, abstraction & aggregation details, data formats, domain & data models, transaction processing, security policies, communication protocols]
C-Web Design Principle: Repository Independence
[Figure: three levels, Conceptual, Logical and Physical]
• Conceptual: Domain Model with terms Museum, Artifact, Artist, Fine-Art, Archeological, Painting, Sculpture, Painter, Sculptor, Impressionism, Neo-Impressionism, Pointillism (linked by BT, broader-term, arcs)
• Logical: XML
• Physical: SQL, XSQL Servlet; XQL, XPath Servlet
• Source 1: XML enabled DBMS
• Source 2: XML Repository
Schema fragments shown on the slide:
    <elementType name="ArtWork">
      <sequence>
        <elementTypeRef name="Title" minOccur="1"/>
        <elementTypeRef name="Creator" minOccur="1"/>
      </sequence>….
    </elementType>

    <!ELEMENT MusArtifact (Name, Event+)>
    <!ATTLIST MusArtifact material CDATA #IMPLIED
                          size     CDATA #IMPLIED>
    <!ELEMENT Event (Person+, Place, Date)>
    <!ATTLIST Event nature (creation|acquisition)>
    <!ELEMENT Person (Name, Nation, Life?)>
    ...
C-Web & Related W3C Standards
• Semantic Interoperability: Content Description & Metadata Standards
  • ontologies (e.g. ICOM/CIDOC), thesauri (e.g. ULAN, TGN, AAT), metadata element sets (e.g. CIMI/Aquarelle Z39.50 profile)
  • Resource Description Framework (RDF) for expressing semantics
• Structural Interoperability: Schema languages for specifying the logical structure of Web resources
  • DTDs, XML Schema
• Syntactic Interoperability: Markup languages for exchanging (semi-)structured data over the Web
  • XML, XLL, ...
• Functional Interoperability: Data manipulation languages for (semi-)structured data over the Web
  • XPath, XQL, XSL, ...
From RDF Schemata to XML resources
[Figure: an RDF schema, the RDF/XML metadata that instantiates it, and the XML resource being described]
Arc legend: s: rdfs:subClassOf, t: rdf:type, d: rdfs:domain, r: rdfs:range
• RDF schema: classes artist:Artist, artist:Painter, artist:Sculptor, artist:Sculpture, artist:Material, rdfs:Literal; properties artist:sculpts, artist:material, artist:lives_in
• RDF/XML metadata (about www.artist.gr/august_rodin): #August_Rodin, artist:lives_in Paris, artist:sculpts #The Burghers of Calais, artist:sculpts #The Gates of Hell, artist:material Iron
• XML Resources:
    <ARTIST>
      <NAME>August Rodin
      <LIVES>Paris
      <WORK>
        <TITLE>The Gates of Hell
        <MATERIAL> Iron
    ……..
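The schema and metadata on this slide can be read as two sets of (subject, predicate, object) triples. The following is a minimal sketch under that reading, not the C-Web implementation: it uses plain Python tuples rather than an RDF library, the exact arcs are an assumption read off the flattened figure, and the helper names `superclasses`, `instances_of` and `values` are hypothetical.

```python
# Prefixed names are kept as plain strings; no RDF library is assumed.
SUBCLASS, TYPE, DOMAIN, RANGE = (
    "rdfs:subClassOf", "rdf:type", "rdfs:domain", "rdfs:range")

# Schema-level statements (the s, d and r arcs). The exact arcs are an
# assumption: the slide's figure survives only as a list of labels.
schema = {
    ("artist:Painter", SUBCLASS, "artist:Artist"),
    ("artist:Sculptor", SUBCLASS, "artist:Artist"),
    ("artist:sculpts", DOMAIN, "artist:Sculptor"),
    ("artist:sculpts", RANGE, "artist:Sculpture"),
    ("artist:material", DOMAIN, "artist:Sculpture"),
    ("artist:material", RANGE, "artist:Material"),
    ("artist:lives_in", DOMAIN, "artist:Artist"),
    ("artist:lives_in", RANGE, "rdfs:Literal"),
}

# Instance-level statements (the t arcs and the property values).
metadata = {
    ("#August_Rodin", TYPE, "artist:Sculptor"),
    ("#August_Rodin", "artist:lives_in", "Paris"),
    ("#August_Rodin", "artist:sculpts", "#The Gates of Hell"),
    ("#August_Rodin", "artist:sculpts", "#The Burghers of Calais"),
    ("#The Gates of Hell", TYPE, "artist:Sculpture"),
    ("#The Gates of Hell", "artist:material", "Iron"),
}


def superclasses(cls, triples):
    """Return cls plus every class reachable through rdfs:subClassOf."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(o for s, p, o in triples
                        if s == current and p == SUBCLASS)
    return seen


def instances_of(cls, schema_triples, data_triples):
    """Resources whose declared rdf:type is cls or one of its subclasses."""
    return sorted(s for s, p, o in data_triples
                  if p == TYPE and cls in superclasses(o, schema_triples))


def values(subject, prop, data_triples):
    """All objects of the given property for one subject, sorted."""
    return sorted(o for s, p, o in data_triples
                  if s == subject and p == prop)
```

Asking for `instances_of("artist:Artist", schema, metadata)` follows the subclass arcs, so a resource typed as `artist:Sculptor` is also returned for the broader class. This is the kind of conceptual query the schema layer is meant to enable.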
XML Resources and C-Web Metadata
[Figure: the XML structure of a resource mapped onto the C-Web schema through the C-Web Resource Description Interface]
• XML Structure: Artist, Name, Lives, Work, Title, Style, Material
• C-Web Schema: classes Museum Object, Artifact, Natural Object, Period, Style, Material, String; arcs isa, of period, title, has_style, consists of
• Mapping metadata (RDF/XML):
    <rdf:RDF xmlns:rdf="...#" xmlns:rdfs="...#" xmlns:s="mycweb.forth.gr/...#">
      <rdf:Description about="www.artist.gr">
        <s:mappings>
          <rdf:Bag>
            <li><Description about="Artist.Work.Title">
                  <s:map rdf:resource="s:#title"/></Description></li>
            <li><Description about="Artist.Work.Material">
                  <s:map rdf:resource="s:#Material"/></Description></li>
            …
          </rdf:Bag>
        </s:mappings>
      </rdf:Description>
    </rdf:RDF>
• XML resource being described:
    <ARTIST>
      <NAME>August Rodin
      <LIVES>Paris
      <WORK>
        <TITLE>The Gates of Hell
        <MATERIAL> Iron
    ……..
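To make the idea of structure mapping metadata concrete, here is a small sketch that applies path-to-term mappings like those on this slide to an XML resource. It is an illustration under assumptions, not the C-Web loader: the sample document is a well-formed version of the slide's fragment (closing tags added), the paths are written in upper case to match the sample tags, and the function name `describe` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the slide's sample resource (closing tags added).
SAMPLE = """
<ARTIST>
  <NAME>August Rodin</NAME>
  <LIVES>Paris</LIVES>
  <WORK>
    <TITLE>The Gates of Hell</TITLE>
    <MATERIAL>Iron</MATERIAL>
  </WORK>
</ARTIST>
"""

# Structure mapping: dotted element path -> C-Web schema term.
# The slide writes the paths as Artist.Work.Title; upper case is used
# here only so that they match the tags of the sample document.
MAPPINGS = {
    "ARTIST.WORK.TITLE": "s:#title",
    "ARTIST.WORK.MATERIAL": "s:#Material",
}


def describe(xml_text, mappings):
    """Return (schema term, text value) pairs for every mapped element."""
    root = ET.fromstring(xml_text)
    found = []
    for path, term in mappings.items():
        steps = path.split(".")
        if steps[0] != root.tag:
            continue  # this mapping targets a different document type
        # ElementTree paths are relative to the root element.
        relative = "/".join(steps[1:]) or "."
        for element in root.findall(relative):
            found.append((term, (element.text or "").strip()))
    return found
```

Calling `describe(SAMPLE, MAPPINGS)` yields one pair per mapped element, so the same query over `s:#title` can be answered from resources whose DTDs name that element differently, provided each has its own mapping table.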
The C-Web Architecture
[Figure: tiered architecture; labels recovered from the slide]
• Client Tier (over http): Query Browsing Interface, Schema Editor, Virtual Document Render, Resource Description Interface, Resource Annotation Interface; sample schema terms shown: Artist, Painting, Museum, Painter
• Logical Middle Tier (CWEB/APP Server): Middleware APIs, Session Manager, Metadata Store, RDF/XML Loader, XML/XSL Processor, Query Engine, URL Resolver; RDF/XML and XML/XSL flow between clients and server
• XML Resources (reached by URL over http, on the Intranet or on the Web): XML Wrapper, XML enabled DBMS, well-formed XML docs, other docs (e.g. mails, news, reports)
C-Web Middleware: Main Features
• Genericity: capture any XML structure (various DTDs), any form of XML semantics (DTDs, XML Schema), any XML access interface/protocol (XQL, XLL)
• Scalability: w.r.t. the volume of XML resources, the number of XML repositories, network and server load, etc.
• Extensibility: evolution of the semantics and structure of XML resources does not affect the main processing components and interfaces
• Openness: rely on standards & APIs that allow the same components & services to be plugged into various applications, domains, etc.
The C-Web Architecture: Pending Issues
• Schema Editor: standalone application or client of the Middleware?
  • Where are large thesauri/ontologies stored before their integration?
  • Who is responsible for Schema Validation (from scratch vs. integrated)?
  • What communication protocol do we need with the C-Web Middleware?
• Resource Description Interface: loosely or tightly coupled with the Middleware?
  • How is C-Web Schema browsing/querying implemented?
  • Where can we find the XML DTDs/Schemata of resources?
  • Who is responsible for Resource Description Validation?
  • What communication protocol do we need with the C-Web Middleware?
• Metadata Store: what persistence support do we need?
  • What is an efficient RDF storage model (indexing & clustering)?
  • Do we also need to support updates/versions (versioning model)?
  • What are the authentication/security policies (RDF/XML with signatures)?
  • What is the result form of a C-Web query (triples, statements or objects)?
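As one way to make the storage-model question concrete, the sketch below shows a common in-memory layout for RDF statements: the same triples indexed three ways (by subject, by predicate, by object) so that any pattern with one bound position avoids a full scan. This is only an illustrative option, not a design the slides commit to; the class name `TripleStore` and its methods are hypothetical, and persistence, clustering and versioning are not addressed.

```python
from collections import defaultdict


class TripleStore:
    """In-memory triple store with SPO, POS and OSP indexes (a sketch)."""

    def __init__(self):
        self.spo = defaultdict(lambda: defaultdict(set))  # subject first
        self.pos = defaultdict(lambda: defaultdict(set))  # predicate first
        self.osp = defaultdict(lambda: defaultdict(set))  # object first

    def add(self, s, p, o):
        """Record one statement in all three indexes."""
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def match(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        if s is not None:
            candidates = ((s, p2, o2)
                          for p2, objs in self.spo.get(s, {}).items()
                          for o2 in objs)
        elif p is not None:
            candidates = ((s2, p, o2)
                          for o2, subs in self.pos.get(p, {}).items()
                          for s2 in subs)
        elif o is not None:
            candidates = ((s2, p2, o)
                          for s2, preds in self.osp.get(o, {}).items()
                          for p2 in preds)
        else:
            candidates = ((s2, p2, o2)
                          for s2, po in self.spo.items()
                          for p2, objs in po.items()
                          for o2 in objs)
        # The chosen index fixes one position; filter the remaining ones.
        return sorted(t for t in candidates
                      if (p is None or t[1] == p)
                      and (o is None or t[2] == o))
```

The price of the three indexes is that every statement is stored three times, which is exactly the kind of trade-off the indexing and clustering question has to weigh against query speed.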
C-Web Communication Protocol: 3 Alternatives
• Synchronous: a blocking query waits for an expected reply (Client sends Query; Server sends Reply)
• Server maintains state; replies sent individually when requested (Client sends Query; Server returns a Handle; Client sends Next; Server sends Reply; Client sends Next; Server sends Reply)
• Asynchronous: a nonblocking subscribe results in replies (Client sends Subscribe; Server sends Reply, Reply, Reply)
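The three interaction styles can be contrasted with a toy, in-process server. This is a sketch under assumptions: there is no network or threading, asynchronous delivery is simulated by invoking callbacks when new replies are published, and the names `CWebServer`, `open_cursor`, `fetch_next`, `subscribe` and `publish` are hypothetical rather than part of any C-Web API.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple


class CWebServer:
    """Toy in-memory server illustrating the three protocol styles."""

    def __init__(self, answers: Dict[str, List[str]]):
        self._answers = answers
        self._cursors: Dict[int, Iterator[str]] = {}
        self._last_handle = 0
        self._subscribers: List[Tuple[str, Callable[[str], None]]] = []

    # 1. Synchronous: the caller blocks until the full reply is returned.
    def query(self, q: str) -> List[str]:
        return list(self._answers.get(q, []))

    # 2. Stateful: the server keeps a cursor per handle and returns
    #    one reply each time the client asks for the next one.
    def open_cursor(self, q: str) -> int:
        self._last_handle += 1
        self._cursors[self._last_handle] = iter(self.query(q))
        return self._last_handle

    def fetch_next(self, handle: int) -> Optional[str]:
        reply = next(self._cursors[handle], None)
        if reply is None:
            del self._cursors[handle]  # exhausted: release server state
        return reply

    # 3. Asynchronous: subscribe returns immediately; replies arrive
    #    later through the callback whenever matching data is published.
    def subscribe(self, q: str, on_reply: Callable[[str], None]) -> None:
        self._subscribers.append((q, on_reply))

    def publish(self, q: str, reply: str) -> None:
        self._answers.setdefault(q, []).append(reply)
        for subscribed_query, on_reply in self._subscribers:
            if subscribed_query == q:
                on_reply(reply)
```

The second style is the only one that forces the server to hold per-client state between requests, which matters for the Session Manager; the third moves that burden to the client, which must be ready to receive replies at any time.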
Towards a C-Web Physical Architecture
[Figure: physical deployment; labels recovered from the slide]
• Clients (over the Internet): Query Browsing Interface, Resource Description Interface, Schema Editor; exchanging RDF/XML Schema, RDF/XML Descriptions and RDF-QL
• CWEB/APP Server: Session Manager, RDF/XML Parser, Loader, Persistent Namespace Service, RDF/XML Query Engine
• Metadata Stores (over Ethernet): NM1 Ontologies, NM2 Thesauri, NM3 C-Web Schema & Instances