Web Information Systems Engineering
Flavius Frasincar
flaviusf@win.tue.nl
ISA
Contents
• What is a Web Information System (WIS)?
• WIS Features
• Problem: Data Management in WIS
• Solution: Model-Driven Methodology (with Tasks Separation)
• Methodologies for WIS:
  • Strudel Methodology
  • Hera Methodology
• Summary
World Wide Web
• 1990: Tim Berners-Lee invents the World Wide Web
• The success of the Web is based on:
  • its hypermedia (link) nature: links allow natural and flexible access to information, in line with the associative nature of the human mind
  • global availability
  • interoperability
  • simplicity
  • being free, etc.
Web Information Systems (WISs)
• 1998: Tomas Isakowitz et al. coined the term Web Information Systems for “information systems that are based on Web technology”
• WISs differ from traditional information systems in that they “have the potential of reaching a wider audience” through different platforms
• The need to integrate data is even greater, because the data sources are distributed over the Web and may be heterogeneous
Three Generations of WISs
• First generation: based on hand-crafted HTML
  • Difficult to maintain (update)
• Second generation: generates HTML on demand by automatically filling templates
  • Data is machine readable/transformable
  • Difficult to make the data machine understandable
• Third generation: Semantic Web Information Systems (SWISs), i.e. WISs based on Semantic Web technology (RDF, OWL, etc.)
  • Data is machine understandable
Present the Deep Web
• Deep Web vs. Surface Web:
  • 500 times larger
  • 1000 times better quality
WIS Features
• Data-intensive: integrate data from multiple heterogeneous sources
• Pervasive: support different platforms, e.g. network (T1, 128K, 56K) and display (PC, Palm, WAP phone)
• User adaptable: consider the user’s preferences and the user’s state of mind while interacting with the system
• Flexible: support semistructured data
• Automatic: need little or no human intervention
• User interactive: e.g. online shops (Amazon)
Problem: Data Management
• WISs are hard to specify and implement
• Methodologies exist for manual WIS design, but few of them target automation
• Difficult tasks to perform:
  • Multiplatform support
  • Automatic updates
  • Automatic site reconstruction (WIS Adaptation)
  • Optimizing WIS performance (WIS Optimization)
  • Enforcing WIS integrity constraints (WIS Analysis)
  • Achieving flexibility, extensibility, etc.
Semistructured Data
• It is characterized by:
  • Irregular structure: missing or additional attributes, multiple attributes
  • Few type constraints: attributes with different types in different objects, heterogeneous collections
  • Rapidly evolving schema or missing schema
• It is typically modeled by a DLG (Directed Labeled Graph)
• Examples: HTML, XML, RDF, LaTeX Bib, etc.
Solution: Tasks Separation
• Isolate and automate common tasks for WIS design:
  • Choose and access the data to be presented (data integration and retrieval)
  • Design the navigational structure for this data
  • Design the visual aspects of the presentation
• Use a model-driven approach for task specification. As Stefano Ceri puts it, the fairy says it brings “wisdom” [theory], “richness” [money], and “beauty” [judge it yourself].
WIS Presentation Generation Strategies
• Static (eager approach): presentations are materialized completely; each page is precomputed
• Dynamic or on-demand (lazy approach): after each link “click”, the next page to be presented is computed
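The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration: `render_page`, `materialize_all`, and `OnDemandSite` are hypothetical names, and the renderer is a stand-in for whatever actually computes a page.

```python
from typing import Callable, Dict, Iterable, List


def render_page(page_id: str) -> str:
    """Hypothetical renderer standing in for the real page computation."""
    return f"<html><body>Page {page_id}</body></html>"


def materialize_all(page_ids: Iterable[str],
                    render: Callable[[str], str]) -> Dict[str, str]:
    """Static (eager) strategy: precompute every page up front."""
    return {page_id: render(page_id) for page_id in page_ids}


class OnDemandSite:
    """Dynamic (lazy) strategy: compute a page only when its link is followed."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render
        self.computed: List[str] = []  # records which pages were actually built

    def follow_link(self, page_id: str) -> str:
        self.computed.append(page_id)
        return self._render(page_id)
```

The eager variant pays the full cost once and serves pages from the resulting dictionary; the lazy variant does no work until `follow_link` is called.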
Methodologies
• Dexter-based: HDM (Hypermedia Design Method)
• ER-based: RMM (Relationship Management Methodology)
• OMT-based: OOHDM
• UML-based: OO-H (Conallen), UWE (UML-based Web Engineering), W2000 (HDM extension)
• RDF-based: XWMF (eXtensible Web Modeling Framework), Hera
• Other: Strudel, Araneus, WebML (Web Modeling Language), Autoweb, Trellis, XAHM (XML-based Adaptive Hypermedia Model), WSDM, W3DT, etc.
Strudel Methodology
http://www.research.att.com/~mff/strudel
AT&T
Input Data
<publications>
  <pub id="pub1">
    <title>Declarative spec…</title>
    <author>Mary Fernandez</author>
    <author>Dan Suciu</author>
    <year>2000</year>
    <journal>VLDB</journal>
    <abstract>Strudel is a …</abstract>
    <category>Languages</category>
    <category>Methods</category>
    …
  </pub>
  …
  <pub id="pub2">
    <title>Catching the …</title>
    <author>Mary Fernandez</author>
    <author>Daniela Florescu</author>
    <year>1998</year>
    <booktitle>SIGMOD</booktitle>
    <abstract>The Strudel …</abstract>
    <category>WIS</category>
    …
  </pub>
</publications>
Semistructured Data Model: Directed Labeled Graph (DLG)
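As a rough illustration of the DLG view, the sketch below turns a small publications document into a list of labeled edges `(source, label, target)`. The sample is a cut-down, well-formed version of the input data with the elided titles left elided; the function name `xml_to_edges` and the node names `Root` and `r` are assumptions made for this example, not part of Strudel.

```python
import xml.etree.ElementTree as ET

# Cut-down sample of the publications data; elided titles are kept as-is.
SAMPLE = """
<publications>
  <pub id="pub1">
    <title>Declarative spec…</title>
    <author>Mary Fernandez</author>
    <author>Dan Suciu</author>
    <year>2000</year>
    <journal>VLDB</journal>
  </pub>
  <pub id="pub2">
    <title>Catching the …</title>
    <author>Mary Fernandez</author>
    <year>1998</year>
    <booktitle>SIGMOD</booktitle>
  </pub>
</publications>
"""


def xml_to_edges(xml_text):
    """Return the document as (source, label, target) edges of a labeled graph."""
    root = ET.fromstring(xml_text)
    edges = [("Root", root.tag, "r")]  # Root -> "publications" -> r
    for pub in root.findall("pub"):
        node = pub.get("id")
        edges.append(("r", "pub", node))
        for child in pub:
            # Each child element becomes an edge labeled with its tag name.
            edges.append((node, child.tag, (child.text or "").strip()))
    return edges
```

Note how the irregular structure survives the conversion: `pub1` has two `author` edges and a `journal` edge, while `pub2` has a `booktitle` edge instead.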
STRUQL (Site TRansformation Und Query Language)
where Root -> "publications" -> r,
      r -> "pub" -> x,
      x -> l -> v
{ where l = "year"
  link YearPage(v) -> "year" -> v,
       YearPage(v) -> "paperPage" -> x,
       RootPage() -> "yearPage" -> YearPage(v)
  collect RootPage{RootPage()}, YearPage{YearPage(v)}
}
…
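To make the effect of this query fragment concrete, here is a small Python sketch that applies the same where/link/collect logic to a list of data-graph edges. It is not a STRUQL interpreter; `DATA_EDGES` and `build_site_graph` are hypothetical names, and only the part of the query shown on the slide is mirrored.

```python
# Data graph as (source, label, target) edges; values follow the input data slide.
DATA_EDGES = [
    ("Root", "publications", "r"),
    ("r", "pub", "pub1"),
    ("r", "pub", "pub2"),
    ("pub1", "year", "2000"),
    ("pub1", "journal", "VLDB"),
    ("pub2", "year", "1998"),
    ("pub2", "booktitle", "SIGMOD"),
]


def build_site_graph(data_edges):
    """Mirror the where/link/collect clauses for the "year" attribute."""
    site_edges = set()
    collections = {"RootPage": {"RootPage()"}, "YearPage": set()}
    # where Root -> "publications" -> r, r -> "pub" -> x
    roots = {t for s, l, t in data_edges if s == "Root" and l == "publications"}
    pubs = {t for s, l, t in data_edges if s in roots and l == "pub"}
    # x -> l -> v { where l = "year" ... }
    for x, label, v in data_edges:
        if x in pubs and label == "year":
            year_page = f"YearPage({v})"
            site_edges.add((year_page, "year", v))
            site_edges.add((year_page, "paperPage", x))
            site_edges.add(("RootPage()", "yearPage", year_page))
            collections["YearPage"].add(year_page)
    return site_edges, collections
```

Because `YearPage(v)` is keyed by the year value, two publications from the same year would share one year page, which is the grouping behaviour the query expresses.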
Site Graph
STRUDEL Template Language
• PaperPage collection:
<i>
  <sif booktitle>
    <sfmt booktitle>
  <selse>
    <sfmt journal>
  </sif>
</i><br>
<sfor p in author>
  <sfmt @p>,
</sfor><br>
<sfmt year><br>
• RootPage collection:
<html>
  <sfor p in yearPage order=ascend key=year>
    <sfmt @p link=@p.year>
  </sfor>
</html>
• YearPage collection:
<h1><sfmt year></h1>
<ul>
  <sfor p in paperPage>
    <li><sfmt @p></li>
  </sfor>
</ul>
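The PaperPage template reads as: print the booktitle if there is one, otherwise the journal; then each author followed by a comma; then the year. A minimal Python sketch of that logic follows, assuming a paper is given as a plain dictionary; `render_paper_page` is a hypothetical helper, not part of the Strudel template engine.

```python
def render_paper_page(paper):
    """Render one paper the way the PaperPage template describes."""
    # <sif booktitle> ... <selse> ... </sif>
    venue = paper["booktitle"] if "booktitle" in paper else paper["journal"]
    # <sfor p in author> <sfmt @p>, </sfor>
    authors = "".join(f"{author}, " for author in paper["author"])
    return f"<i>{venue}</i><br>{authors}<br>{paper['year']}<br>"
```

For example, a paper with a `booktitle` of SIGMOD renders that value in italics, while a paper that only has a `journal` falls through to the `<selse>` branch.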
STRUDEL +/-
+ Tasks separation (content and presentation)
+ Declarative specifications (enable presentation content adaptation)
+ Verification of integrity constraints (e.g. “All paper pages are reachable from RootPage”)
- Intermixes schema and content definition in the data graph
- Does not separate navigation from the visual details of the presentation
- Does not use standard technologies
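The example constraint, “All paper pages are reachable from RootPage”, amounts to a graph reachability question. The sketch below checks it on one concrete site graph with a breadth-first search. This is an assumption about how such a check could be written, not a description of Strudel's own verification mechanism; the function names are hypothetical.

```python
from collections import deque


def reachable_from(start, edges):
    """Return every node reachable from `start` over (source, label, target) edges."""
    adjacency = {}
    for source, _label, target in edges:
        adjacency.setdefault(source, []).append(target)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def unreachable_pages(root, pages, edges):
    """Pages that violate the constraint; an empty set means it holds."""
    return set(pages) - reachable_from(root, edges)
```

An empty result means every listed page can be reached by following links from the root page.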
Hera Methodology
http://wwwis.win.tue.nl/~hera
TU/e
Hera Architecture
Hera Presentation Methodology
[Figure: diagram with the labels Conceptual Design, Conceptual Model, Transformation, Adaptation Design, Application Design, Application Model, Transformation, Presentation Design, Presentation Model]
Conceptual Model (CM)
• Provides a uniform semantic view over the different data sources that are integrated within a given Web application
• Consists of hierarchies of concepts relevant within the given domain
• Concept relationships are:
  • Attribute relationships: refer to literal values that characterize a concept
  • Reference relationships: refer to other concepts
Example: CM
Example: CM in RDF/XML
<rdfs:Class rdf:ID="Creator"/>
<rdfs:Class rdf:ID="Painter">
  <rdfs:subClassOf rdf:resource="#Creator"/>
</rdfs:Class>
<rdf:Property rdf:ID="creates" sys:cardinality="multiple" sys:inverse="created_by">
  <rdfs:domain rdf:resource="#Creator"/>
  <rdfs:range rdf:resource="#Artifact"/>
</rdf:Property>
<rdfs:Class rdf:ID="Artifact"/>
<rdfs:Class rdf:ID="Painting">
  <rdfs:subClassOf rdf:resource="#Artifact"/>
</rdfs:Class>
<rdf:Property rdf:ID="year">
  <rdfs:domain rdf:resource="#Artifact"/>
  <rdfs:range rdf:resource="#Integer"/>
</rdf:Property>
<rdf:Property rdf:ID="picture">
  <rdfs:domain rdf:resource="#Painting"/>
  <rdfs:range rdf:resource="#Image"/>
</rdf:Property>
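One way to read this example is that a property declared on a concept is also usable on its subconcepts, e.g. `year` on `Artifact` applies to `Painting`. The Python sketch below encodes the class hierarchy and property domains from the RDF/XML as plain dictionaries and computes which properties apply to a concept under that reading. The dictionary and function names are illustrative assumptions, not Hera code.

```python
# Concept hierarchy and property domains copied from the CM example.
SUBCLASS_OF = {"Painter": "Creator", "Painting": "Artifact"}
PROPERTY_DOMAIN = {"creates": "Creator", "year": "Artifact", "picture": "Painting"}


def ancestors(concept):
    """Return the concept followed by its superconcepts, nearest first."""
    chain = [concept]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def applicable_properties(concept):
    """Properties whose domain is the concept itself or one of its superconcepts."""
    lineage = ancestors(concept)
    return sorted(prop for prop, domain in PROPERTY_DOMAIN.items() if domain in lineage)
```

Under this reading, `Painting` gets both `picture` (declared on it directly) and `year` (declared on `Artifact`).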
Application Model (AM)
• Captures the logical (navigational) aspects of the presentation
• Based on the concept of a slice, which contains attributes and possibly other slices
  • A slice is a meaningful presentation unit
  • A slice is associated with a concept from the CM
• Slice relationships are:
  • Aggregation relationships: embed a set of slices (abstraction for index, tour, indexed guided tour, etc.)
  • Reference relationships: link abstraction with a specified anchor
Example: AM
Example: AM in RDF/XML
<rdfs:Class rdf:ID="Slice.painting.main" slice:owner="CM#Painting">
  <rdfs:subClassOf rdf:resource="#Slice"/>
</rdfs:Class>
<rdf:Property rdf:ID="slice-ref">
  <slice:prop-ref rdf:resource="CM#ex_by"/>
  <rdfs:domain rdf:resource="#S.t.main"/>
  <rdfs:range rdf:resource="#S.p.picture"/>
</rdf:Property>
<rdf:Property rdf:ID="link_1">
  <rdfs:subPropertyOf rdf:resource="#link"/>
  <rdfs:domain rdf:resource="#S.p.picture"/>
  <rdfs:range rdf:resource="#S.p.main"/>
</rdf:Property>
<rdfs:Class rdf:ID="Slice.technique.main" slice:owner="CM#Technique" slice:main="Yes">
  <rdfs:subClassOf rdf:resource="#Slice"/>
</rdfs:Class>
<rdfs:Class rdf:ID="S.painting.picture" slice:owner="CM#Painting" slice:attr-ref="CM#picture">
  <rdfs:subClassOf rdf:resource="#Slice"/>
</rdfs:Class>
<rdf:Property rdf:ID="media">
  <rdfs:domain rdf:resource="#S.p.picture"/>
  <rdfs:range rdf:resource="#Image"/>
</rdf:Property>
Adaptation
• Captures two kinds of adaptation:
  • Adaptability takes into account the device capabilities and user preferences (UAProf = User Agent Profile)
  • Adaptivity means that the presentation changes itself according to the “state of the user’s mind” while being browsed (UM = User Model)
• Adaptation is based on conditioning the appearance of slices using UAProf and/or UM
• Adaptivity uses AHAM (Adaptive Hypermedia Application Model) update rules for updating the UM
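Conditioning the appearance of slices can be pictured as attaching a visibility predicate to each slice and evaluating it against a profile. The sketch below is a simplified assumption of that idea: the `Slice` class, the `visible_slices` function, and the profile key `ImageCapable` are chosen for illustration and do not reproduce Hera's RDF-based conditions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Profile = Dict[str, object]


def always_visible(profile: Profile) -> bool:
    """Default condition: the slice is shown for every profile."""
    return True


@dataclass
class Slice:
    name: str
    condition: Callable[[Profile], bool] = always_visible
    children: List["Slice"] = field(default_factory=list)


def visible_slices(slice_: Slice, profile: Profile) -> List[str]:
    """Names of the slices shown for this profile, in depth-first order."""
    if not slice_.condition(profile):
        return []  # a hidden slice also hides everything nested inside it
    names = [slice_.name]
    for child in slice_.children:
        names.extend(visible_slices(child, profile))
    return names


# Illustrative slice tree: the picture slice is shown only on image-capable devices.
picture = Slice("painting.picture", condition=lambda p: bool(p.get("ImageCapable")))
main = Slice("painting.main", children=[picture])
```

The same mechanism would cover adaptivity if the profile dictionary also carried user-model values that change while the user browses.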
Presentation Model
• Defines the physical appearance of the presentation
• Based on the concept of a region, which contains attributes and possibly other regions:
  • Each region has an associated rectangular area
• Slices are translated to regions; one slice can be mapped to several regions
• Slice relationships are materialized with:
  • Navigational relationships
  • Spatial relationships
  • Temporal relationships
Presentation in Browsers
• WML: Wireless Markup Language
• SMIL: Synchronized Multimedia Integration Language
• HTML: HyperText Markup Language
Implementation
• Models are represented in RDF and serialized in RDF/XML
• User Agent Profile (UAProf): a Composite Capability/Preference Profiles (CC/PP) vocabulary to model device capabilities and user preferences
• XSLT processor for transforming between different model instances (stylesheet-based transformation):
  • Xalan (XSLT 1.0)
  • Saxon (XSLT 2.0): multiple output files support
Data Transformations
• Step 0: Preparation
  • Substep 0.1: Application Model Unfolding creates the skeleton of an AM instance
  • Substep 0.2: Application Model Adaptation adds slice visibility conditions to the previous skeleton
  • Substep 0.3: Main Transformation Specification Generation builds the specification for the next step
• Step 1: Main Transformation populates the AM with the input CM instance
• Step 2: Presentation Generation produces code for different browsers (HTML, WML, SMIL)
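The steps above can be sketched as a small pipeline of functions. In Hera these steps are stylesheet-based transformations over RDF/XML; the Python version below only mirrors the order of the steps. All function names, the dictionary shapes, the sample values, and the choice to evaluate visibility conditions during the main transformation are assumptions made for illustration.

```python
def unfold_application_model(am):
    """Substep 0.1: create the skeleton of an AM instance."""
    return [{"slice": name, "attrs": attrs} for name, attrs in am.items()]


def adapt(skeleton, conditions):
    """Substep 0.2: add a visibility condition (a profile key or None) to each slice."""
    return [dict(entry, condition=conditions.get(entry["slice"])) for entry in skeleton]


def build_main_transformation(adapted):
    """Substep 0.3: build the specification (here, a function) used in Step 1."""
    def main_transformation(cm_instance, profile):
        # Step 1: populate the visible slices with values from the CM instance.
        populated = []
        for entry in adapted:
            key = entry["condition"]
            if key is not None and not profile.get(key, False):
                continue
            values = {attr: cm_instance.get(attr) for attr in entry["attrs"]}
            populated.append({"slice": entry["slice"], "values": values})
        return populated
    return main_transformation


WRAPPERS = {
    "html": ("<html><body>", "</body></html>"),
    "wml": ("<wml><card>", "</card></wml>"),
}


def generate_presentation(populated, fmt):
    """Step 2: produce markup for the requested target format."""
    start, end = WRAPPERS[fmt]
    body = "".join(
        f"<p>{attr}: {value}</p>"
        for entry in populated
        for attr, value in entry["values"].items()
    )
    return start + body + end


# Illustrative input: two slices, the second shown only on image-capable devices.
AM = {"painting.main": ["year"], "painting.picture": ["picture"]}
CONDITIONS = {"painting.picture": "ImageCapable"}
CM_INSTANCE = {"year": "1642", "picture": "p1.jpg"}
```

Step 0 runs once per application model, while Steps 1 and 2 are repeated for each CM instance, profile, and target format.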
Hera +/-
+ Tasks separation (content, navigation, and presentation)
+ Model-based specifications (enable presentation content adaptation)
+ Uses standard technology: RDF, RDF/XML, XSLT
- (Future work) Specifications are semi-formal (difficult to check integrity constraints)
- (Future work) Does not (yet) support user interaction
Summary
• What is a Web Information System (WIS)
• Features of WIS: data intensive, pervasive, etc.
• Design methodologies for WIS:
  • Strudel (from industry)
  • Hera (from university)
• Model-based approach for WIS design
• WIS design tasks separation:
  • Data Selection
  • Navigation
  • Presentation