270 likes | 355 Views
Outline Introduction Types of Annotation and their Dependencies Logical Structure Conceptual Structure Referential Structure The Structures in Concert Conclusion. Searching the WWW today document retrieval keyword based search. user. document(s). query. result. search engine.
E N D
Outline • Introduction • Types of Annotation and their Dependencies • Logical Structure • Conceptual Structure • Referential Structure • The Structures in Concert • Conclusion
Searching the WWW today • document retrieval • keyword based search user document(s) query result search engine text images video audio … ? keywords + metadata documents to read search engine index document content
Metadata semantic annotation document + metadata ? Solution 1: manual annotation Problem: not efficient (expensive) Solution 2: data mining and automatic annotation Problem: domain dependent, unreliable,… both solutions alone are unsatisfying ….
There is already (often unused) Metadata semantic annotation document index TOC references index (conceptual knowledge) TOC (structural knowledge) references (referential knowledge) basis of semantic document annotation
Searching the WWW tomorrow (?) • fact retrieval (or at least extended document retrieval) • content based search document(s) user + metadata query personalsearch agent answer the answer reasoningdata mining documents with theirdependency structure
Outline • Introduction • Types of Annotation and their Dependencies • Logical Structure • Conceptual Structure • Referential Structure • The Structures in Concert • Conclusion
Documents, Tags, and Annotations <b> Lorem</b> ipsum dolor sit amet, <br/>consectetuer adipiscing elit. <br/> <a href=“……“ title=“..“/>Sed orci purus, semper eget, <br/>tristique quis, adipiscing <br/><!--<rdf:annotation user=“…“ tag=“…“…/> posuere, erat. Aenean <br/> ultricies odio id sem.Sed <br/><h1> nec felis sit ametante </h1>tempor sagittis. Vestibulum <br/>est nunc, lobortis cursus, <br/>semper vel, pulvinar sed, <br/> odio. Vestibulum blandit… document consists of annotations strings associate distinguished document parts with metadata smallest addressabledocument unit
Documents, Tags, and Annotations • Examples smallest document unit: word higher order units: sentence, paragraph, page, chapter, part, … book smallest document unit: pixel higher order units: blocks, macro blocks, slices, frames, objects, scenes, acts,… video
Outline • Introduction • Types of Annotation and their Dependencies • Logical Structure • Conceptual Structure • Referential Structure • The Structures in Concert • Conclusion
Logical Document Structure • Structural tags • can be specified • explicitely (structural information) or • implicitely (formatting information) • can be associated with names/titles • can be used for document navigation Chapter 2 Chapter 1 Paragraph 2.1 Paragraph 2.2 Paragraph 1.1 Paragraph 1.2 Paragraph 1.3 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Logical Document Structure • Table of Contents (TOC) from structural tags 2. Complexity of Reasoning 1. Basic Description Logics 2.1 Introduction 1.1 OR-Branching finding a model 1.1 Introduction 1.2 Definition of thebasic formalism 1.3 Reasoning Algorithms page 1 page 2 page 3 page 4 page 5 page 6 page 7 page 8 page 9 page 10 page 11 page 12 page 13 page 14 page 15 • Basic Description Logics 1 • Introduction 1 • Definition of the basic formalism 5 • Reasoning algirithms 7 • Complexity of Reasoning 11 • Introduction 11 • OR-Branching: finding a model 12
Outline • Introduction • Types of Annotation and their Dependencies • Logical Structure • Conceptual Structure • Referential Structure • The Structures in Concert • Conclusion
Conceptual Document Structure • Can be considered as a kind of ontological skeleton • Covers concepts of the document and their relationships • Using implicitely given conceptual structure requires understanding of document content • Explicitely given conceptual structure (only a small fraction of entire conceptual structure) can be defined by • document author (e.g., index entries, external metadata) • document users (e.g., social tagging) • The conceptual document structure can also be used for document navigation
Conceptual Document Structure • Using explicitely given conceptual document structure together with logical document structure to define the document index root SUB SUB rodent field mouse SEA SUB SUB SUB SUB SUB SUB hamster dentision beaver habitat meadow vole prairie vole SUB SUB incisor rotationof teeth
Conceptual Document Structure root SUB SUB conceptualstructure rodent field mouse SEA SUB SUB SUB SUB SUB SUB hamster dentision beaver habitat meadow vole prairie vole SUB SUB incisor rotationof teeth 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 logicalstructure Paragraph1.1 Paragrah1.2 Paragraph1.3
Conceptual Document Structure rodent, 1 beaver, 10, 11 dentision incisor, 4 rotation of teeth, 5 hamster, 2 - 4 see also meadow vole … field mouse, 13, 15 prairie vole, 16 meadow vole, 16 habitat, 15 see also rodent Document Index
Outline • Introduction • Types of Annotation and their Dependencies • Logical Structure • Conceptual Structure • Referential Structure • The Structures in Concert • Conclusion
Referential Document Structure • Internal links:References between parts of the same document e.g., see / see also, footnotes, figures, comments… • External links:References between different documentse.g., bibliographic references and citations,… • Only a fraction of the entire referentialdocument structure is given explicitely • Graph Visualization (Link Graph) • together with logical document structure table of figure, references, …
Outline • Introduction • Types of Annotation and their Dependencies • Logical Structure • Conceptual Structure • Referential Structure • The Structures in Concert • Conclusion
The Structures in Concert root SUB SUB conceptualstructure rodent field mouse SUB SUB SUB SUB SUB SUB hamster dentision beaver habitat meadow vole prairie vole SUB SUB incisor rotationof teeth referentialstructure 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 logicalstructure Paragraph1.1 Paragrah1.2 Paragraph1.3
The Structures in Concert root SUB SUB conceptualstructure rodent field mouse SUB SUB SUB SUB SUB SUB hamster dentision beaver habitat meadow vole prairie vole SUB SUB incisor rotationof teeth referentialstructure 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 logicalstructure Paragraph1.1 Paragrah1.2 Paragraph1.3
The Structures in Concert • All three structures in concert can be used for • Document reading tours (extended document retrieval) • goal oriented selections of documents (what is mandatory to understand the topic under consideration?) • with additional reading directions (which document unit to read in what order) • by also considering user annotations, personalized reading tours can be suggested (dependent on prior knowledge of the user) • Collaborative authoring(avoiding ambiguities or duplicates, support index generation and cross referencing,…) • Compute answers…(with the help of sophisticated reasoning and additional means of data mining and content understanding)
Outline • Introduction • Types of Annotation and their Dependencies • Logical Structure • Conceptual Structure • Referential Structure • The Structures in Concert • Conclusion
Conclusion • Documents have intrinsic logical, conceptual and referential characteristics • There are complex dependencies among the document structures carrying those characteristics • Logical, conceptual, and referential structures along with their interdependencies should be made explicit ( meta data) • Applications should maintain and use those meta data, e.g. for • authoring • navigation • searching
Outline • Introduction • Types of Annotation and their Dependencies • Logical Structure • Conceptual Structure • Referential Structure • The Structures in Concert • Conclusion Thank your for your attention!
Related Work Topic Maps • Topic Maps represent concepts and relationships (conceptional structure and relational structure) Topic Map rodent part of 1 type association type role whole part topic 10 Topic Maps do notinclude the logicaldocument structure beaver dentision association 11 Resources