
Marcos André Gonçalves Doctoral defense Virginia Tech, Blacksburg, VA 24061 USA

Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications.


Presentation Transcript


  1. Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications Marcos André Gonçalves Doctoral defense Virginia Tech, Blacksburg, VA 24061 USA

  2. Acknowledgments • Funding: CAPES, NSF, AOL • Collaborators Pavel Calado, Lilian Cassell, Marco Cristo, Patrick Fan, Ed Fox, Robert France, Filip Jagodzinski, Rohit Kelapure, Neill Kipp, Aaron Krowne, Alberto Laender, Claudia Medeiros, Naren Ramakrishnan, Berthier Ribeiro-Neto, Rao Shen, Hussein Suleman, Ricardo Torres, Layne Watson, Baoping Zhang, Qinwei Zhu, …

  3. Publications and Accomplishments • Book Chapters • 4 published + 1 in press • Journal/Magazine papers • 8 published + 1 under revision + 1 accepted • Conference/Workshop papers • 25 published • Other publications (poster and demo papers) • 4 published • Awards • 3 (Lewis Trustee Award, AOL-CIT Fellowship Honorable Mention, JCDL’04 Best Student Paper) • Helped supervise three Master's students

  4. Outline • Motivation: the problem • Hypotheses and research questions • Part 1: Theory • 5S: introduction, formal definitions • The formal ontology • Part 2: Tools/Applications • Language • Visualization • Generation • Logging • Part 3: Quality • Conclusions, Future Work

  5. Motivation • Digital Libraries (DLs): what are they? • No definitional consensus • Conflicting views • Makes interoperability a hard problem • DLs are not benefiting from formal theories as other CS fields are: DB, IR, PL, etc. • DL construction: difficult, ad hoc, lacking support for tailoring/customization • Conceptual modeling, requirements analysis, and methodological approaches are rarely supported in DL development. • Lack of specific DL models, formalisms, languages

  6. Hypotheses • A formal theory for DLs can be built based on 5S. • The formalization can serve as a basis for modeling and building high-quality DLs.

  7. Research Questions 1. Can we formally elaborate 5S? 2. How can we use 5S to formally describe digital libraries? 3. What are the fundamental relationships among the Ss and high-level DL concepts? 4. How can we allow digital librarians to easily express those relationships? 5. Which are the fundamental quality properties of a DL? Can we use the formalized DL framework to characterize those properties? 6. Where in the life cycle of digital libraries can key aspects of quality be measured and how?

  8. Outline • Motivation: the problem • Hypotheses and research questions • Part 1: Theory • 5S: introduction, formal definitions • The formal ontology • Part 2: Tools/Applications • Language • Visualization • Generation • Logging • Part 3: Quality • Conclusions, Future Work

  9. Informal 5S Definitions: DLs are complex systems that • help satisfy info needs of users (societies) • provide info services (scenarios) • organize info in usable ways (structures) • present info in usable ways (spaces) • communicate info with users (streams)

  10. 5Ss

  11. 5S and DL formal definitions and compositions (April 2004 TOIS)

  12. Glossary: Concepts in the Minimal DL and Representing Symbols

  13. 5S Dynamic / Active Static / Passive

  14. Outline • Motivation: the problem • Hypotheses and research questions • Part 1: Theory • 5S: introduction, formal definitions • The formal ontology • Part 2: Tools/Applications • Language • Visualization • Generation • Logging • Part 3: Quality • Conclusions, Future Work

  15. Digital Library Formal Ontology

  16. Ontology: Applications • Expand definition of minimal DL by characterizing • typical DL services • in the context of “employs” and “produces” relationships • Use characterization to: • reason about how DL services can be built from other DL components • as well as be composed with other services through extension or reuse

  17. Ontology: Applications

  18. Ontology: Taxonomy of Services
  • Infrastructure Services
    • Repository-Building
      • Creational: Acquiring, Authoring, Cataloging, Crawling (focused), Describing, Digitizing, Harvesting, Submitting
      • Preservational: Conserving, Converting, Copying/Replicating, Translating (format)
    • Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Linking, Logging, Measuring, Rating, Reviewing (peer), Surveying, Training (classifier), Translating, Visualizing
  • Information Satisfaction Services: Binding, Browsing, Customizing, Disseminating, Expanding (query), Filtering, Recommending, Requesting, Searching

  19. Composition of key infrastructure services

  20. Composition of additional services

  21. Outline • Motivation: the problem • Hypotheses and research questions • Part 1: Theory • 5S: introduction, formal definitions • The formal ontology • Part 2: Tools/Applications • Language • Visualization • Generation • Logging • Part 3: Quality • Conclusions, Future Work

  22. Approach

  23. Part 2: Tools/Applications

  24. Outline • Motivation: the problem • Hypotheses and research questions • Part 1: Theory • 5S: introduction, formal definitions • The formal ontology • Part 2: Tools/Applications • Language • Visualization • Generation • Logging • Part 3: Quality • Conclusions, Future Work

  25. 5SL: a DL Modeling language • Domain specific languages • Address a particular class of problems by offering specific abstractions and notations for the domain at hand • Advantages: domain-specific analysis, program management, visualization, testing, maintenance, modeling, and rapid prototyping. • XML-based realization of 5S • Interoperability • Use of many standard sub-languages (e.g., MIME types, XML Schemas, UML notations)

  26. 5SL – The Minimal DL Metamodel

  27. Example of Actors declaration in the Societies Model
  <Society>
    <Actor>
      <Community name='Patron'>
        <Attribute name='name' type='String'/>
        <Attribute name='ID' type='Integer'/>
      </Community>
      <Community name='Student'>
        <Service>Converting</Service>
      </Community>
      <Community name='ETDReviewer'>
        <Service>Reviewing</Service>
      </Community>
      <Community name='ETDCataloguer'>
        <Service>Cataloguing</Service>
      </Community>
    </Actor>
    ………
  Example of Service declaration in the Scenario Model
  <SERVICE name='Searching'>
    <SCENARIO name='SimpleSearching'>
      <NOTE>Simple scenario for an NDLTD site searching service</NOTE>
      <EVENT>
        <SENDER>Patron</SENDER>
        <RECEIVER>InterfaceManager</RECEIVER>
        <OPERATION name='SearchCriteria'/>
        <PARAMETER>collection</PARAMETER>
        <PARAMETER>query</PARAMETER>
      </EVENT>
      <EVENT>
        <SENDER>InterfaceManager</SENDER>
        <RECEIVER>SearchManager</RECEIVER>
        <OPERATION name='Search'/>
        <PARAMETER>collection</PARAMETER>
        <PARAMETER>query</PARAMETER>
      </EVENT>
      <EVENT>
        <SENDER>SearchManager</SENDER>
        <RECEIVER>InterfaceManager</RECEIVER>
        <PARAMETER name='Results'>WtdSet</PARAMETER>
      </EVENT>
      ….
  Example of Document declaration in the Structures Model
  <document name='ETD'>
    <stream_enumeration>
      <stream value='ETDText'/>
      <stream value='ETDAudio'/>
      ...
    </stream_enumeration>
    <structured_stream> %XMLSchema% </structured_stream>
  </document>
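
As a rough illustration of how a tool could consume the scenario declaration above, the following Python sketch parses a cleaned-up, well-formed excerpt of the SimpleSearching scenario and lists each event's sender, receiver, operation, and parameters. The closing tags that the slide elides and the helper name `list_events` are assumptions made so the snippet parses; this is not the actual 5SL tooling.

```python
import xml.etree.ElementTree as ET

# Well-formed excerpt of the slide's scenario. The closing SCENARIO/SERVICE
# tags are added here (assumption) because the slide elides them.
SCENARIO_XML = """
<SERVICE name='Searching'>
  <SCENARIO name='SimpleSearching'>
    <EVENT>
      <SENDER>Patron</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <OPERATION name='SearchCriteria'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>InterfaceManager</SENDER>
      <RECEIVER>SearchManager</RECEIVER>
      <OPERATION name='Search'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
  </SCENARIO>
</SERVICE>
"""


def list_events(xml_text):
    """Return one dict per EVENT element, in document order (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.iter("EVENT"):
        operation = event.find("OPERATION")
        events.append({
            "sender": event.findtext("SENDER"),
            "receiver": event.findtext("RECEIVER"),
            # Some events on the slide carry no OPERATION element.
            "operation": operation.get("name") if operation is not None else None,
            "parameters": [(p.text or "").strip() for p in event.findall("PARAMETER")],
        })
    return events


for ev in list_events(SCENARIO_XML):
    print(ev["sender"], "->", ev["receiver"], ev["operation"], ev["parameters"])
```

Listing events in order like this is the kind of input a scenario-to-state-machine step would need, though how 5SGen actually performs that step is not shown on the slide.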

  28. Outline • Motivation: the problem • Hypotheses and research questions • Part 1: Theory • 5S: introduction, formal definitions • The formal ontology • Part 2: Tools/Applications • Language • Visualization • Generation • Logging • Part 3: Quality • Conclusions, Future Work

  29. 5SGraph: A DL Modeling Tool • Help users model their own instances of a digital library (DL) in the 5S language (5SL). • A simple modeling process which enables rapid generation of digital libraries • Features • 5SGraph loads and displays a metamodel in a structured toolbox. • The structured editor of 5SGraph provides a top-down visual building environment for the DL designer. • 5SGraph produces syntactically correct 5SL files according to the visual model built by the designer.

  30. Overview of 5SGraph Workspace (instance model) Structured toolbox (metamodel)

  31. 5SGraph: Other Key Features • Flexible and extensible architecture • Reuse of models • Load, save, and change common (sub-)models • Synchronization of views • Enforcing of semantic constraints

  32. 5SGraph Evaluation: Usability Study

  33. Outline • Motivation: the problem • Hypotheses and research questions • Part 1: Theory • 5S: introduction, formal definitions • The formal ontology • Part 2: Tools/Applications • Language • Visualization • Generation • Logging • Part 3: Quality • Conclusions, Future Work

  34. 5SGen • Version 1 -- MARIAN as the target system • Focused on rich structures: semantic networks • Behavior attached to nodes/links • Version 2 -- Shifted for later work to componentized (ODL) approach • Focused on scenarios/societies • Structures/Spaces encapsulated within components (e.g., relational tables, indexes)

  35. 5SGen – Version 2: ODL, Services, Scenarios [Figure: generation pipeline driven by the DL Designer. Numbered boxes: 5SL Societies Model (1), XPath/JDOM Transform (2), XMI:Class Model (3), Xmi2Java (4), Java Classes Model (5), 5SL Scenario Model (6), XPath/JDOM Transform (7), StateChart Model (8), Scenario Synthesis (9), Deterministic FSM (10), SMC (11), Java Finite State Machine Class Controller (12), JSP User Interface View (13). Other labels: Component Pool (Java ODL Search, Java ODL Browse, ...), Wrapping, import, binds, Generated DL Services.]

  36. 5SGen • Proof of Concept: prototyping • CITIDEL • VIADUCT • NDLTD Union Catalog • BDBComp

  37. Outline • Motivation: the problem • Hypotheses and research questions • Part 1: Theory • 5S: introduction, formal definitions • The formal ontology • Part 2: Tools/Applications • Language • Visualization • Generation • Logging • Part 3: Quality • Conclusions, Future Work

  38. XML-based DL Log Standard • Log analysis is a source of information on: • How patrons really use DL services • How systems behave while supporting user information seeking activities • Used to: • Evaluate and enhance services • Guide allocation of resources • Common practice in the web setting • Supported by web servers, proxy caches • DL Logging can be more detailed.

  39. DL Logging Features • Captures high level user and system behaviors • Organized according to the 5S framework • Hierarchical organization (XML-based) • Centered on the notions of events • Record events related to initial user inputs and final system outputs • Help to understand user interactions and the perceived value of responses

  40. The XML Log Format [Figure: element hierarchy of the log format. Elements shown: Log, Transaction, Timestamp, Statement, SessionId, MachineInfo, Event, ErrorInfo, SessionInfo, RegisterInfo, Action, StatusInfo, Update, StoreSysInfo, Search, Browse, Collection, Catalog, SearchBy, Timeout, PresentationInfo, QueryString.]
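
To make the event-centered, hierarchical idea concrete, here is a minimal Python sketch that builds one log entry for a search event. The element names are taken from the slide, but the nesting used here (Log > Transaction > Statement > Event > Action > Search) is an assumption, since the flattened figure does not preserve the hierarchy; the function name and sample values are also hypothetical.

```python
import xml.etree.ElementTree as ET


def build_search_log_entry(session_id, timestamp, collection, query):
    """Build a single search-event log entry (assumed nesting, see lead-in)."""
    log = ET.Element("Log")
    transaction = ET.SubElement(log, "Transaction")
    ET.SubElement(transaction, "SessionId").text = session_id
    ET.SubElement(transaction, "Timestamp").text = timestamp
    statement = ET.SubElement(transaction, "Statement")
    event = ET.SubElement(statement, "Event")
    action = ET.SubElement(event, "Action")
    search = ET.SubElement(action, "Search")
    ET.SubElement(search, "Collection").text = collection
    ET.SubElement(search, "QueryString").text = query
    return log


# Sample values are made up for illustration.
entry = build_search_log_entry("s-001", "2004-11-08T10:15:00", "ETD", "digital library quality")
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

Recording the query string and collection under the event, rather than a raw URL as a web server log would, is what lets later analysis reason about user behavior at the level of DL services.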

  41. Outline • Motivation: the problem • Hypotheses and research questions • Part 1: Theory • 5S: introduction, formal definitions • The formal ontology • Part 2: Tools/Applications • Language • Visualization • Generation • Logging • Part 3: Quality • Conclusions, Future Work

  42. Describing Quality in Digital Libraries • What’s a “good” digital library? • Central Concept: Quality! • Hypotheses of this work: • Formal theory can help to define “what’s a good digital library” by: • New formalizations of quality indicators for DLs within our 5S framework • Contextualizing these indicators/measures within the Information Life Cycle

  43. Quality Dimensions

  44. Digital Objects: Accessibility • A digital object is accessible by a DL actor or patron if it • exists in the DL collections • is retrievable from the repository • is not restricted from access • by metadata on rights • for an actor or actor’s society
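
The three conditions above can be read as a simple conjunction. The sketch below is one possible encoding in Python, under stated assumptions: rights metadata is modeled as a set of actor or community names that are denied access, the repository as a dictionary keyed by handle, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    handle: str
    # Assumed form of rights metadata: names of actors/communities denied access.
    restricted_for: set = field(default_factory=set)


@dataclass
class Actor:
    name: str
    communities: set = field(default_factory=set)


def is_accessible(obj, actor, collection_handles, repository):
    """True only if all three conditions from the slide hold."""
    in_collection = obj.handle in collection_handles
    retrievable = repository.get(obj.handle) is not None
    # Restricted if rights metadata names the actor or any of the actor's communities.
    restricted = bool(obj.restricted_for & ({actor.name} | actor.communities))
    return in_collection and retrievable and not restricted


repository = {"etd-1": b"<pdf bytes>", "etd-2": b"<pdf bytes>"}
collection = {"etd-1", "etd-2"}
patron = Actor("alice", {"Patron"})
print(is_accessible(DigitalObject("etd-1"), patron, collection, repository))
print(is_accessible(DigitalObject("etd-2", {"Patron"}), patron, collection, repository))
```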

  45. Digital Objects: Pertinence • $\mathrm{Inf}(do_i)$ = information carried by a digital object $do_i$ or any of its descriptions • $\mathrm{IN}(ac_j)$ = information need of an actor $ac_j$ • $\mathrm{Context}_{jk}$ = an amalgam of societal factors that can affect the judgment of pertinence by $ac_j$ at time $k$. • Factors include time, place, the actor’s history of interaction, task, and factors implicit in the interaction and ambient environment.

  46. Digital Objects: Pertinence • The pertinence of a digital object $do_i$ to a user $ac_j$ is an indicator function $\mathrm{Pertinence}(do_i, ac_j): \mathrm{Inf}(do_i) \times \mathrm{IN}(ac_j) \times \mathrm{Context}_{jk} \to \{0, 1\}$, defined as: • 1, if $\mathrm{Inf}(do_i)$ is judged by $ac_j$ to be informative with regard to $\mathrm{IN}(ac_j)$ in context $\mathrm{Context}_{jk}$; • 0, otherwise

  47. Digital Objects: Relevance • $\mathrm{Relevance}(do_i, q)$ = 1, if $do_i$ is judged by an external judge to be relevant to $q$; 0, otherwise • Relevance Estimate • $\mathrm{Rel}(do_i, q) = \dfrac{\vec{do_i} \cdot \vec{q}}{|\vec{do_i}|\,|\vec{q}|}$ • Objective, public, social notion • Established by a general consensus in the field, not a subjective, private judgment by an actor with an information need
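
Reading the relevance estimate as the usual vector-space cosine (dot product divided by the product of the vector norms), which is an assumption about how the slide's formula is meant to be read, a small Python sketch looks like this. Term-weight dictionaries stand in for the document and query vectors; the function name and the sample weights are hypothetical.

```python
import math


def relevance_estimate(doc_vector, query_vector):
    """Cosine similarity between sparse term-weight dicts (term -> weight)."""
    shared_terms = set(doc_vector) & set(query_vector)
    dot = sum(doc_vector[t] * query_vector[t] for t in shared_terms)
    doc_norm = math.sqrt(sum(w * w for w in doc_vector.values()))
    query_norm = math.sqrt(sum(w * w for w in query_vector.values()))
    if doc_norm == 0 or query_norm == 0:
        return 0.0  # an empty vector is treated as having no estimated relevance
    return dot / (doc_norm * query_norm)


doc = {"digital": 2.0, "library": 1.0, "quality": 1.0}
query = {"digital": 1.0, "library": 1.0}
print(round(relevance_estimate(doc, query), 3))
```

Unlike the binary judgment by an external judge, this estimate is a graded score that a retrieval service can compute without human input.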

  48. Metadata Specifications and Metadata Format: Completeness • Refers to the degree to which values are present in the description, according to a metadata standard. As far as an individual property is concerned, only two situations are possible: either a value is assigned to the property in question, or not. • $\mathrm{Completeness}(ms_x) = 1 - \dfrac{\text{number of missing attributes in } ms_x}{\text{total number of attributes in the schema to which } ms_x \text{ conforms}}$
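
The completeness measure above is a direct ratio and can be sketched in a few lines of Python. The record is modeled as a dictionary and the schema as a list of attribute names; treating `None`, an empty string, or an empty list as "missing" is an assumption, as are the function name and the sample attribute names.

```python
def completeness(metadata_record, schema_attributes):
    """1 - (missing attributes / total schema attributes)."""
    if not schema_attributes:
        raise ValueError("schema must define at least one attribute")
    # An attribute counts as missing if it is absent or has an empty value (assumption).
    missing = sum(
        1 for name in schema_attributes
        if metadata_record.get(name) in (None, "", [])
    )
    return 1 - missing / len(schema_attributes)


schema = ["title", "creator", "date", "subject"]
record = {"title": "A 5S study", "creator": "M. Gonçalves", "date": "2004"}
print(completeness(record, schema))  # one of four attributes is missing
```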

  49. Metadata Specifications and Metadata Format: Completeness • OCLC NDLTD Union catalog

  50. Metadata Specifications and Metadata Format: Conformance • An attribute $att_{xy}$ of a metadata specification $ms_x$ is cardinally conformant to a metadata format/standard if: • it appears at least once, if $att_{xy}$ is marked as mandatory; • its value is from the domain defined for $att_{xy}$; • it does not appear more than once, if it is not marked as repeatable. • $\mathrm{Conformance}(ms_x) = \dfrac{\sum_{att_{xy} \in ms_x} \text{degree of conformance of } att_{xy}}{\text{total attributes}}$
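
A minimal Python sketch of the conformance measure follows, assuming the degree of conformance of an attribute is binary (1 if all three conditions hold, 0 otherwise) and that "total attributes" means the attributes defined by the schema. The rule format (a dictionary with `mandatory`, `repeatable`, and a `domain` predicate), the function names, and the sample schema are all hypothetical.

```python
def attribute_conforms(values, rule):
    """Check the three cardinal-conformance conditions for one attribute."""
    if rule["mandatory"] and len(values) == 0:
        return False  # mandatory attribute must appear at least once
    if not rule["repeatable"] and len(values) > 1:
        return False  # non-repeatable attribute must not appear more than once
    return all(rule["domain"](v) for v in values)  # every value must be in the domain


def conformance(record, schema):
    """Fraction of schema attributes that conform (binary degree assumed)."""
    if not schema:
        raise ValueError("schema must define at least one attribute")
    conformant = sum(
        1 for name, rule in schema.items()
        if attribute_conforms(record.get(name, []), rule)
    )
    return conformant / len(schema)


schema = {
    "title": {"mandatory": True, "repeatable": False, "domain": lambda v: isinstance(v, str)},
    "creator": {"mandatory": True, "repeatable": True, "domain": lambda v: isinstance(v, str)},
    "date": {"mandatory": False, "repeatable": False,
             "domain": lambda v: isinstance(v, str) and len(v) == 4 and v.isdigit()},
}
record = {"title": ["A 5S study"], "creator": [], "date": ["2004"]}
print(conformance(record, schema))  # creator is mandatory but absent
```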
