Tiddler: Customised publishing based on XML profiles and XML data sources François Paradis, Cécile Paris, Anne-Marie Vercoustre, Stephen Wan, Ross Wilkinson, MingFang Wu CSIRO Mathematical and Information Sciences
Outline • Motivation • Examples • Current approaches • Our approach • How it works • Analysis and Conclusion
Motivation: Why customised publishing? • Too much information: people want less information, but information more relevant to their needs, knowledge, or task • On different devices at different times: paper, Web, WAP • (To build customer relationships)
Examples • Customised Travel Guides • Depending on who (preferences), where to go, when to go • Depending on when/where to use it • Corporate brochures • Depending on who you are and your current interest(s)
Current techniques • Distinct versions of manually crafted documents: one for printing, one for the Web. No personalisation. Word -> HTML; LaTeX -> HTML; HTML -> WAP • Information Retrieval: personalisation through queries, synthesis of the results; not much coherence • Document generation from database queries and different stylesheets: coherence, but no high-level semantics for the resulting document. Limited types of sources • Document generation using NL techniques: relies on the availability of a knowledge base in an appropriate format
Tiddler approach • Exploits both language generation and IR-document synthesis approaches: • Coherence preserved • Wide variety of data sources (including web pages) accessible • Dynamically plan documents • Customise information using user models • Generate documents for multiple media types (Paper, Palm Pilots, Web browsers, Mobile Phones)
System Architecture [Figure: architecture diagram. Components shown: Data Sources (e.g. databases, Web); User Model (Information Need, Media Need); NORFOLK; Virtual Document Planner (Content Planner, Presentation Planner, Surface Generator); Discourse Rules; Customised Documents]
User Model • Includes: • preferences • information need • context (device) • history • Collected via a GUI • Used to: • customise information to the user • determine layout and content detail depending on the medium • encapsulate some of the user's goals • A goal is about an information need • The Virtual Document Planner resolves the goal using planning techniques
Input: User Model • Name: Zoe • Medium: Palm Pilot • Destination: Melbourne • Date: 1 June-15 June 2001 • Activities: Cycling, Opera, Major Mitchell • Travel Information: Accommodation (backpacker)
XML Representation
<usermodel xml:space="preserve" id="Zoe">
  <name>Zoe</name>
  <destination>Melbourne</destination>
  <date>
    <start>
      <day>01</day>
      <month>June</month>
      <year>2001</year>
    </start>
    <end>
      <day>15</day>
      <month>June</month>
      <year>2001</year>
    </end>
  </date>
  <wants>
    <activities>Cycling, Opera, Major Mitchell</activities>
    <accommodation>
      <accom_range>Backpacker</accom_range>
    </accommodation>
    <events/>
  </wants>
  <medium>palm-pilot</medium>
  ….
</usermodel>
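A minimal sketch of how a planner might read such a profile, using Python's standard `xml.etree.ElementTree`. The element names come from the slide; the `UserModel` dataclass, the `parse_user_model` and `text_of` helpers, and the comma-splitting of `<activities>` are assumptions made for illustration, not Tiddler's actual code. The elided part of the profile is left out.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Sample profile: element names follow the slide; the elided part is omitted.
PROFILE = """<usermodel xml:space="preserve" id="Zoe">
  <name>Zoe</name>
  <destination>Melbourne</destination>
  <date>
    <start><day>01</day><month>June</month><year>2001</year></start>
    <end><day>15</day><month>June</month><year>2001</year></end>
  </date>
  <wants>
    <activities>Cycling, Opera, Major Mitchell</activities>
    <accommodation><accom_range>Backpacker</accom_range></accommodation>
    <events/>
  </wants>
  <medium>palm-pilot</medium>
</usermodel>"""


@dataclass
class UserModel:  # hypothetical container, not part of Tiddler
    name: str
    destination: str
    medium: str
    activities: list = field(default_factory=list)
    accommodation: str = ""


def text_of(root, path):
    """Return the stripped text at `path`, or "" if the element is missing."""
    node = root.find(path)
    return (node.text or "").strip() if node is not None else ""


def parse_user_model(xml_text):
    """Build a UserModel from the XML profile text."""
    root = ET.fromstring(xml_text)
    raw = text_of(root, "wants/activities")
    activities = [a.strip() for a in raw.split(",") if a.strip()]
    return UserModel(
        name=text_of(root, "name"),
        destination=text_of(root, "destination"),
        medium=text_of(root, "medium"),
        activities=activities,
        accommodation=text_of(root, "wants/accommodation/accom_range"),
    )


zoe = parse_user_model(PROFILE)
print(zoe.medium, zoe.activities)
```

Keeping the parsed profile in one small object means later stages can consult the medium and the interests without touching the XML again.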
Output: Palm Pilot Version • General | Hotels | To Do | Contacts • Facts at a glance • Population: 3.3 million • Country: Australia • Time Zone: GMT/UTC plus 10 hours • Telephone Area Code: 03 • Events • Major Mitchell
Virtual Document Planner: Overview 1 The Virtual Document Planner: • uses Planning Techniques: • Goal achieved by finding subgoals that satisfy it • Subgoals are linked by rhetorical relations • Subgoals satisfied by: • other decomposable subgoals • primitive subgoals
Virtual Document Planner: Overview 2 The Virtual Document Planner: • produces a branching tree structure: • Node = information need goal • Nodes in branches = subgoals • Nodes linked by rhetorical relations • Subgoals and Goals represent: • content selection • presentation decisions
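The decomposition described in the two overview slides can be sketched as a recursive expansion of goals into subgoals. This is only an illustration under stated assumptions: the `RULES` table, the goal names, and the `plan` and `leaves` helpers are hypothetical, and the relation names are borrowed from the Zoe tree slide. It is not the planner's real rule format.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    goal: str
    relation: str = ""  # rhetorical relation linking this node to its parent
    children: list = field(default_factory=list)


# Hypothetical decomposition rules: goal -> list of (relation, subgoal).
# A goal with no entry is treated as primitive and becomes a leaf.
RULES = {
    "travel_guide": [
        ("Preparation", "title"),
        ("Background", "general_info"),
        ("Joint", "activities"),
    ],
    "activities": [("Joint", "opera"), ("Joint", "cycling")],
}


def plan(goal, relation=""):
    """Expand a goal into a tree by applying the rules until only primitives remain."""
    node = Node(goal, relation)
    for rel, subgoal in RULES.get(goal, []):
        node.children.append(plan(subgoal, rel))
    return node


def leaves(node):
    """Collect the primitive goals of the tree, left to right."""
    if not node.children:
        return [node.goal]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


tree = plan("travel_guide")
print(leaves(tree))
```

The leaves are the primitive subgoals; the relations stored on each node are what a later stage could realise as discourse markers.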
Tree for Zoe Example [Figure: discourse tree. Rhetorical relations shown: Enablement, Preparation, Background, Joint. Content nodes shown: Title, Source; Further Contact; General Information; Hotels; Opera; Cycling; Major Mitchell]
Virtual Document Planner: Sub-stages Three substages: • The Content Planner • The Presentation Planner • The Surface Generator
Virtual Document Planner: Sub-stage 1 The Content Planner: • uses Goal Planning • produces a tree structure • nodes = document content • Branches = rhetorical relations that may be realised with discourse markers
Virtual Document Planner: Sub-stage 2 The Presentation Planner: • Leaves of the tree = chosen content • Leaves expanded with layout mark-up of document • Mark-up depends on document organisation • Customised for particular media type.
Virtual Document Planner: Sub-stage 3 The Surface Generator: • Dependent on medium • Content and layout mark-up are mapped to: • text • XML • HTML • WML • Natural Language • graphics • pictures • tables • lists
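As a rough illustration of a medium-dependent surface generator, the sketch below maps one planned structure to two output formats through a dispatch table. The `PLANNED` intermediate form, the `heading`/`item` tags, and the renderer functions are all assumptions made for this example; the slides do not specify Tiddler's internal representation.

```python
from html import escape

# Hypothetical output of the presentation planner: (layout tag, content) pairs.
PLANNED = [("heading", "Melbourne"), ("item", "Opera"), ("item", "Cycling")]


def render_text(planned):
    """Plain-text rendering: upper-case headings, dash-prefixed items."""
    lines = []
    for tag, content in planned:
        lines.append(content.upper() if tag == "heading" else "- " + content)
    return "\n".join(lines)


def render_html(planned):
    """HTML rendering: headings first, then all items in one list (a simplification)."""
    headings = [f"<h1>{escape(c)}</h1>" for t, c in planned if t == "heading"]
    items = [f"<li>{escape(c)}</li>" for t, c in planned if t == "item"]
    return "".join(headings) + "<ul>" + "".join(items) + "</ul>"


# One renderer per medium; adding a medium means adding one entry here.
RENDERERS = {"text": render_text, "html": render_html}


def generate(planned, medium):
    """Pick the renderer for the requested medium and apply it."""
    return RENDERERS[medium](planned)


print(generate(PLANNED, "text"))
print(generate(PLANNED, "html"))
```

Because the planned structure is shared, supporting another medium (for example WML) would only require one more renderer, not a new plan.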
Data Sources • Norfolk technology: • provides an interface between: • the Virtual Document Planner • the data sources • Data sources originate from: • corporate databases • existing web pages of known layout (wrapping) • Data sources can be: • static: Norfolk retrieves content in advance -> XML • dynamic: Norfolk retrieves content as needed by the Virtual Document Planner
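The static/dynamic distinction can be sketched as two source classes with the same `get` method. This is an assumption-laden illustration: `StaticSource`, `DynamicSource`, and the `fetch_events` stand-in are hypothetical names, and nothing here reflects Norfolk's real interface.

```python
class StaticSource:
    """Content is fetched once, in advance, and kept as XML text."""

    def __init__(self, fetch):
        self._xml = fetch()  # retrieved at construction time

    def get(self):
        return self._xml


class DynamicSource:
    """Content is fetched again every time the planner asks for it."""

    def __init__(self, fetch):
        self._fetch = fetch

    def get(self):
        return self._fetch()


calls = {"n": 0}


def fetch_events():
    """Stand-in for a database query or a web-page wrapper (hypothetical)."""
    calls["n"] += 1
    return f"<events version='{calls['n']}'/>"


static = StaticSource(fetch_events)
dynamic = DynamicSource(fetch_events)

print(static.get() == static.get())    # same cached content both times
print(dynamic.get() == dynamic.get())  # fresh content on each request
```

A planner written against `get` does not need to know which kind of source it is reading from.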
Why are Dynamic Documents useful? A document can: • be composed using the most up-to-date information • customise information to the user • be tailored to a particular query • be tailored to a particular medium
What are the limitations of current dynamic pages? Dynamic pages are often: • statically planned with templates and stylesheets • templates grow exponentially in number as the document becomes more flexible • represented in programming-language code • which makes maintenance more difficult • limited to filtering at document level for customisation • required to maintain separate templates for different media
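To make the exponential-growth claim concrete, here is a small worked count under two simplifying assumptions that are not stated on the slide: each optional section can be switched on or off independently, and every medium needs its own copy of each template. The `template_count` function is hypothetical.

```python
def template_count(optional_sections, media):
    """Number of templates if every on/off combination of sections
    needs its own template for every medium (simplifying assumptions)."""
    return media * 2 ** optional_sections


# Four media, as in the Tiddler approach slide: paper, Palm Pilot, Web, mobile phone.
for n in (3, 5, 10):
    print(n, template_count(n, media=4))
```

Under these assumptions, ten optional sections across four media would already need 4096 templates, which is the maintenance burden the slide points at.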
Conclusions (1) Tiddler Advantages: • Easier to maintain because: • Documents use goal planning rather than templates • Document rules are not written in programming-language code • Customisation filters and uses relevant information from parts of documents • Information can be gathered from multiple sources • Documents for different media are generated from the same document skeleton • Only the skeleton needs to be updated
Conclusions (2) Future Work: • Reasoning about the discourse to provide feedback/explanations • Dynamic and complex user model to deal with history of information delivery • Complex user model to build customer relationships