Open Annotation: Social Bookmarking and Annotation of eBooks

Open Annotation: Social Bookmarking and Annotation of eBooks • Robert Sanderson (rsanderson@lanl.gov), Los Alamos National Laboratory • Todd Carpenter, National Information Standards Organization • Peter Brantley, Internet Archive • http://www.openannotation.org/ • This research is funded in part by the Andrew W. Mellon Foundation

Overview • Introduction • Open Annotation Model • Basics • Segments • Publish/Subscribe Model • Appendix: FAQ

Open Annotation Collaboration • Focus on interoperable sharing of annotations: • Web-centric and open, not application specific silos • Create, consume and interact in different environments • Build from a simple model for simple cases, to more detailed for complex requirements • Need for standards across platforms: • Many people will want to share annotations and highlights • Even if a reader doesn’t share her annotations with others, she will want to access them from different reading apps

Basic Model • The basic model has three resources: • Annotation (an RDF document) • Body (the ‘comment’ of the annotation) • Target (the resource the Body is ‘about’)

Basic Model Example
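The diagram for this slide did not survive in the transcript. As a rough sketch only, the three resources can be written as subject-predicate-object triples in plain Python. The namespace URI, the `hasBody` property name, and every example.org URI are assumptions made for illustration; the slides themselves name only the Annotation, Body and Target roles and the `hasTarget` relationship.

```python
# Minimal sketch of the basic model as (subject, predicate, object) triples.
# Assumptions: the OAC namespace URI and the hasBody property name.
# All example.org URIs are hypothetical.
OAC = "http://www.openannotation.org/ns/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def basic_annotation(anno_uri, body_uri, target_uri):
    """Return the triples linking an Annotation to its Body and Target."""
    return [
        (anno_uri, RDF_TYPE, OAC + "Annotation"),
        (anno_uri, OAC + "hasBody", body_uri),
        (anno_uri, OAC + "hasTarget", target_uri),
    ]


triples = basic_annotation(
    "http://example.org/annotations/1",
    "http://example.org/comments/1",
    "http://example.org/ebooks/moby-dick",
)
for s, p, o in triples:
    print(f"<{s}> <{p}> <{o}> .")
```

Each resource keeps its own URI, so the Body and Target can live on different servers from the Annotation document.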

Segments of Resources • Most annotations are about part of a resource • Different segments for different media types: • Text: paragraph, arbitrary span of words • Image: rectangular or arbitrarily shaped area • Audio: start and end time points, track name/number • Video: area and time points • Other: slice of a data set, volume in a 3D object, …

Segments of Resources • Web Architecture Segmentation: • A URI with a Fragment identifies part of the resource: • IETF MIME-type fragment identifiers, e.g. XPointer • W3C Media Fragments URI specification for simple segments of media: image, audio, video • OAC introduces a method of constraining resources: • Introduces an approach for arbitrarily complex segments • Can be applied to the Body or Target resource

Complex Constraints • Fragments are often not possible: • Introduce a Constraint that describes the segment of interest • And a ConstrainedTarget that identifies the segment of interest • Constraints are resources, so can be expressive and detailed

Constraint Example
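The diagram for this slide is missing from the transcript. The sketch below shows one way the Constraint and ConstrainedTarget resources could be linked, again as plain Python triples. The `constrains` and `constrainedBy` property names, the namespace URI, and the example.org URIs are assumptions for illustration, not taken from the slides.

```python
# Sketch: an Annotation whose Target is a ConstrainedTarget.
# Assumptions: namespace URI, hasBody, constrains, constrainedBy.
# All example.org URIs are hypothetical.
OAC = "http://www.openannotation.org/ns/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def constrained_target(anno_uri, ct_uri, full_uri, constraint_uri):
    """Return triples describing a segment of full_uri via a Constraint."""
    return [
        # The Annotation points at the ConstrainedTarget, not the full resource.
        (anno_uri, OAC + "hasTarget", ct_uri),
        (ct_uri, RDF_TYPE, OAC + "ConstrainedTarget"),
        # The ConstrainedTarget identifies the segment of the full resource...
        (ct_uri, OAC + "constrains", full_uri),
        # ...and the Constraint resource describes that segment.
        (ct_uri, OAC + "constrainedBy", constraint_uri),
        (constraint_uri, RDF_TYPE, OAC + "Constraint"),
    ]


ct_triples = constrained_target(
    "http://example.org/annotations/2",
    "http://example.org/segments/2",
    "http://example.org/ebooks/moby-dick",
    "http://example.org/constraints/2",
)
for s, p, o in ct_triples:
    print(f"<{s}> <{p}> <{o}> .")
```

Because the Constraint is a resource in its own right, its description can be as detailed as the segment requires.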

Annotation Protocols Unlike previous systems, Open Annotation does not mandate a protocol. No reliance on a client/server combination gives the client autonomy to use different services as appropriate. Instead we promote a publish/subscribe methodology, where annotations may be stored and consumed from anywhere. Protocol: publish, subscribe, consume tied together

Publish/Subscribe Method • We don’t specify how the publish transfer should occur • Nor the subscribe transfer • Nor the consume transfer

Publish/Subscribe Advantages • Client can use most appropriate method for transferring annotation to storage service • May already be mandated in different domains • Can use existing services without requiring them to change • Annotations are web resources in their own right • Can be protected for restricted access using existing technology • Have their own URIs for identity • Promotes a market-place of services, such as: • Archiving Annotations and resources for preservation • Enriching with additional metadata and information • Spam detection and filtering to provide trusted annotation feeds

OAC for eBooks: Open Questions • Need to have a robust mechanism for determining the segment of interest: • Could be part of an image • Could be part of stable layout text • Could be part of reflowable text • Distrust of quoting passages: with enough annotations, the entire text is exposed • Distrust of offsets: if the text changes, the Constraint will describe the wrong segment • Motivating public, rather than private, annotations is important • … As is filtering spam!

http://www.openannotation.org/

FAQ • Surely there's more to the model? • What about creator, modification time and so on? • I want to comment on an Annotation? • I want to annotate multiple parts at once? • How can the comment be part of the Annotation? • You mentioned URI Fragments? • How can my comment be part of another resource? • I want to use quoted passages, but still protect the quotes? • I want to use character offsets, but know if the segment has changed? • What about highlighting with no comment? • What about different colors and styles of highlight? • What about just marking a location, like a bookmark?

What about Creator, Modification Time? • Any of the resources can have additional information attached, such as creator, date of creation, title, etc.

Additional Properties Example

I Want to Comment on an Annotation? • There can be further typing of the Annotation to clarify purpose. • Example: Replies are Annotations on Annotations.

Annotation Types Example

I Want to Annotate Multiple Parts at Once? • Many use cases for multiple targets for a single Annotation: • Comparison of two or more resources • Making a statement that applies to all of the resources • Making a statement about multiple parts of a resource • Enabled by allowing more than one hasTarget relationship.

Multiple Targets Example
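The diagram for this slide is missing from the transcript. As a sketch, an Annotation with several targets simply repeats the `hasTarget` relationship, once per target. The namespace URI, the `hasBody` property name, and the example.org URIs are assumptions for illustration.

```python
# Sketch: one Annotation with more than one hasTarget relationship.
# Assumptions: namespace URI and hasBody. example.org URIs are hypothetical.
OAC = "http://www.openannotation.org/ns/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotation_with_targets(anno_uri, body_uri, target_uris):
    """Return triples for an Annotation whose Body is about every target."""
    triples = [
        (anno_uri, RDF_TYPE, OAC + "Annotation"),
        (anno_uri, OAC + "hasBody", body_uri),
    ]
    # One hasTarget triple per target resource.
    triples += [(anno_uri, OAC + "hasTarget", t) for t in target_uris]
    return triples


comparison = annotation_with_targets(
    "http://example.org/annotations/3",
    "http://example.org/comments/3",
    [
        "http://example.org/ebooks/edition-1",
        "http://example.org/ebooks/edition-2",
    ],
)
target_count = sum(1 for _, p, _ in comparison if p == OAC + "hasTarget")
print(target_count)
```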

How can the Comment be part of the Annotation? • Content may be contained within the Annotation document: • Important for client autonomy • Clients may be unable to mint new URIs for every resource • Clients may wish to transmit only a single document • Third parties can generate new URIs if the client does not • The W3C has a Content in RDF specification: • http://www.w3.org/TR/Content-in-RDF10/

Inline Body • Introduce a resource identified by a non-resolvable URI (such as a UUID URN) as the Body. • Embed the data within the Annotation document using ‘chars’ from Content in RDF.

Inline Body Example
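The diagram for this slide is missing from the transcript. The sketch below mints a UUID URN for the Body and embeds the comment text with a `chars` property, as the previous slide describes. The Content in RDF namespace URI, the `ContentAsText` class, the OAC namespace URI, and the `hasBody` property name are assumptions for illustration; check them against the specification linked above before relying on them.

```python
import uuid

# Sketch: an inline Body identified by a non-resolvable UUID URN.
# Assumptions: both namespace URIs, hasBody, and ContentAsText.
# The example.org URI is hypothetical.
OAC = "http://www.openannotation.org/ns/"
CNT = "http://www.w3.org/2011/content#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def inline_body(anno_uri, text):
    """Return (body_urn, triples) embedding text in the Annotation document."""
    body_urn = uuid.uuid4().urn  # e.g. 'urn:uuid:...'; the client needs no server
    triples = [
        (anno_uri, OAC + "hasBody", body_urn),
        (body_urn, RDF_TYPE, CNT + "ContentAsText"),
        (body_urn, CNT + "chars", text),
    ]
    return body_urn, triples


body_urn, inline_triples = inline_body(
    "http://example.org/annotations/4",
    "This passage foreshadows the ending.",
)
print(body_urn)
```

A third-party service that later stores the annotation could replace the URN with a resolvable URI of its own.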

You Mentioned URI Fragments? • URI Fragments are a syntax for creating subsidiary URIs that identify part of the main resource • The syntax is defined per media type: • X/HTML: The named anchor or identified element • XML: An XPointer to the element(s) • PDF: Many options, especially page and viewrect • Plain Text: Either by character position or line position

Segments of Resources: W3C Media Fragments • Media Fragments allow anyone to create URIs that identify part of an image, audio or video resource. • The most common case is for rectangular areas of images: • http://www.example.org/image.jpg#xywh=50,100,640,480 • Link to the full resource as well, for all Fragment URIs

Media Fragments Example
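The diagram for this slide is missing from the transcript. As a small sketch, the `xywh` fragment from the previous slide can be split into the full resource URI and the rectangular region using only the Python standard library. The function name `parse_xywh` is hypothetical, and the sketch handles only the spatial dimension, not time or track fragments.

```python
from urllib.parse import urldefrag


def parse_xywh(uri):
    """Split a Media Fragments URI into (full resource URI, unit, (x, y, w, h))."""
    base, fragment = urldefrag(uri)
    if not fragment.startswith("xywh="):
        raise ValueError("no xywh spatial fragment in: " + uri)
    value = fragment[len("xywh="):]
    unit = "pixel"  # pixel is the default unit when none is given
    if ":" in value:
        # Optional 'pixel:' or 'percent:' prefix before the numbers.
        unit, value = value.split(":", 1)
    x, y, w, h = (int(n) for n in value.split(","))
    return base, unit, (x, y, w, h)


base, unit, region = parse_xywh(
    "http://www.example.org/image.jpg#xywh=50,100,640,480"
)
print(base, unit, region)
```

The returned `base` is the full resource, which is what an annotation would link to alongside the Fragment URI.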

How can my Comment be Part of another Resource? • The Body may also be constrained in the same way as Targets. • (the most complicated OAC data model diagram)

Constrained Body Example

I Want to use Quoted Passages, but Protect the Text?

I Want to use Offsets, but Know if the Text has Changed?

What about Highlighting with No Comment?

What about Highlighting with different Colors?

What about just Bookmarking a Location?
