
XML for Libraries

XML for Libraries. Roy Tennant eScholarship California Digital Library escholarship.cdlib.org. Introduction. Goal: introduce you to XML, explain what it can do in general terms, and highlight particular uses Clarification: you will not learn enough to do it without further study.





Presentation Transcript


  1. XML for Libraries Roy Tennant eScholarship California Digital Library escholarship.cdlib.org

  2. Introduction • Goal: introduce you to XML, explain what it can do in general terms, and highlight particular uses • Clarification: you will not learn enough to do it without further study

  3. Outline • Introduction to XML • Serving XML to the Web • Case Studies • Tips & Advice • Resources

  4. Introduction to XML • Extensible Markup Language • A method of creating and using tags to identify the structure and contents of a document — not how it should be displayed • The tags used can be arbitrary or can come from a specification

  5. What it Looks Like
     <?xml version="1.0"?>
     <book>
       <author>
         <lastname>Tennant</lastname>
         <firstname>Roy</firstname>
       </author>
       <title>The Great American Novel</title>
       <chapter number="1">
         <chaptitle>It Was Dark and Stormy</chaptitle>
         <p>“I’m scared,” I said.</p>
       </chapter>
     </book>
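To illustrate how the tags identify structure rather than display, here is a minimal Python sketch that parses a copy of the sample book (typed with straight quotes and matching end tags) and reads out its parts. The `describe` function and the `SAMPLE` name are hypothetical, introduced only for this illustration; the parser is Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# A copy of the sample book document, typed so that it is well-formed.
SAMPLE = """<?xml version="1.0"?>
<book>
  <author>
    <lastname>Tennant</lastname>
    <firstname>Roy</firstname>
  </author>
  <title>The Great American Novel</title>
  <chapter number="1">
    <chaptitle>It Was Dark and Stormy</chaptitle>
    <p>"I'm scared," I said.</p>
  </chapter>
</book>"""


def describe(xml_text):
    """Return the author, title, and chapter list found in a book document."""
    book = ET.fromstring(xml_text)
    author = f"{book.findtext('author/firstname')} {book.findtext('author/lastname')}"
    chapters = [
        (chapter.get("number"), chapter.findtext("chaptitle"))
        for chapter in book.iter("chapter")
    ]
    return {"author": author, "title": book.findtext("title"), "chapters": chapters}


if __name__ == "__main__":
    print(describe(SAMPLE))
```

Nothing in the document says how the title should look on screen; the program only learns that a given piece of text is a title, an author name, or a chapter heading.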

  6. Two Types of XML • Well-Formed • Valid

  7. Well-Formed XML • Follows general tagging rules: • All tags begin and end • But can be minimized if empty: <br/> instead of <br></br> • All tags are case sensitive • All tags must be properly nested: • <author> <firstname>Mark</firstname><lastname>Twain</lastname> </author> • All attribute values are quoted: • <subject scheme="LCSH">Music</subject> • Has identification & declaration tags • Software can make sure a document follows these rules
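As a rough illustration of "software can make sure a document follows these rules", the sketch below asks Python's standard XML parser whether a fragment is well-formed. The helper name `is_well_formed` is hypothetical; any conforming XML parser would reject the same inputs.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    samples = [
        "<author><firstname>Mark</firstname><lastname>Twain</lastname></author>",
        "<p>line one<br/>line two</p>",                  # empty tag, minimized
        "<p>line one<br>line two</p>",                   # <br> never ends
        "<author><firstname>Mark</author></firstname>",  # improperly nested
        "<Title>Roughing It</title>",                    # case mismatch
        "<subject scheme=LCSH>Music</subject>",          # unquoted attribute value
    ]
    for sample in samples:
        print(is_well_formed(sample), sample)
```

The first two samples pass and the remaining four fail, one for each rule on the slide.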

  8. Valid XML • Uses only specific tags and rules as codified by one of: • A document type definition (DTD) • A schema definition • Only the tags listed by the schema or DTD can be used • Software can take a DTD or schema and verify that a document adheres to the rules • Editing software can prevent an author from using anything except allowed tags
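The idea of validity can be sketched in a few lines of Python. This is not a DTD or schema validator: the `ALLOWED_CHILDREN` table is a hypothetical, hand-written stand-in for the rules a real DTD or schema would express, and a production system would use a validating parser instead. It only shows the principle that software can compare a document against a list of permitted tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: element name -> child elements it may contain.
ALLOWED_CHILDREN = {
    "book": {"author", "title", "chapter"},
    "author": {"lastname", "firstname"},
    "lastname": set(),
    "firstname": set(),
    "title": set(),
    "chapter": {"chaptitle", "p"},
    "chaptitle": set(),
    "p": set(),
}


def find_violations(xml_text):
    """Return a list of messages for tags the rules do not permit."""
    problems = []

    def visit(element):
        if element.tag not in ALLOWED_CHILDREN:
            problems.append(f"unknown tag <{element.tag}>")
            return
        for child in element:
            known = child.tag in ALLOWED_CHILDREN
            if known and child.tag not in ALLOWED_CHILDREN[element.tag]:
                problems.append(f"<{child.tag}> not allowed inside <{element.tag}>")
            visit(child)

    visit(ET.fromstring(xml_text))
    return problems


if __name__ == "__main__":
    print(find_violations("<book><title>T</title></book>"))
    print(find_violations("<book><subtitle>x</subtitle></book>"))
    print(find_violations("<book><p>x</p></book>"))
```

Note that all three test documents are well-formed; only the first is acceptable under these rules.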

  9. Ways to Use XML • Behind the scenes as a standard and easily transformed format for information • As a transfer syntax, to exchange information in a machine-parseable form • As a method of delivery direct to the user (not recommended)

  10. Why is XML Important? • It is a standard, easily extensible way to encode loosely-structured as well as highly-structured information • Due to its easy parseability, software can transform it in countless ways, thereby allowing: • Easy migration paths • Alternative displays • On-the-fly response to user needs

  11. XML vs. Databases (a simplistic formula) • If your information is… • Tightly structured • Fixed field length • Massive numbers of individual items • You need a database • If your information is… • Loosely structured • Variable field length • Massive record size • You need XML

  12. Serving XML to the Web • Directly in native form • Transformed to static HTML • Transformed to HTML dynamically

  13. Transforming XML: XSLT • Extensible Stylesheet Language Transformations (XSLT) • A markup language and programming syntax for processing XML • Is most often used to: • Transform XML to HTML for delivery to standard web clients • Transform XML from one set of XML tags to another • Transform XML into another syntax/system
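To make the idea of a transformation concrete, here is a small Python sketch that turns book markup into HTML. It is a stand-in for what an XSLT stylesheet would do, not XSLT itself: Python's standard library has no XSLT processor, so the mapping is written by hand. The `TAG_MAP` table and the `to_html` function are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the book's structural tags to HTML tags.
TAG_MAP = {"book": "div", "title": "h1", "chapter": "div", "chaptitle": "h2", "p": "p"}


def to_html(xml_text):
    """Rebuild the XML tree with HTML element names and serialize it."""

    def convert(element):
        out = ET.Element(TAG_MAP.get(element.tag, "span"))
        if element.tag in ("book", "chapter"):
            # Keep the original tag name as a class so CSS can style it.
            out.set("class", element.tag)
        out.text = element.text
        out.tail = element.tail
        for child in element:
            out.append(convert(child))
        return out

    return ET.tostring(convert(ET.fromstring(xml_text)), encoding="unicode", method="html")


if __name__ == "__main__":
    source = (
        '<book><title>The Great American Novel</title>'
        '<chapter number="1"><chaptitle>It Was Dark and Stormy</chaptitle>'
        '<p>I was scared.</p></chapter></book>'
    )
    print(to_html(source))
```

Changing `TAG_MAP` changes the output without touching the source document, which is the property that makes migration paths and alternative displays cheap.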

  14. Required Components for Serving XML to the Web • An XML-encoded “document” • An XSLT stylesheet to… • …transform it to HTML or XHTML: • Static • Dynamic • A CSS stylesheet (optional)

  15. XML Web Publishing Software • Required to: • Apply dynamic transformations to XML content • Render HTML dynamically for standard web browsers • A couple examples, both free: • Cocoon: http://xml.apache.org/cocoon/ • AxKit: http://axkit.org/

  16. Case Study: Publishing Books @ the California Digital Library • Goals: • To create highly usable online versions of books • To create versions that will migrate easily as technology changes • To create an infrastructure that will support dynamic presentations of the same content

  17. Case Study: Publishing Books @ the California Digital Library • Strategy: • Mark up the texts in XML • Serve them dynamically using XML web publishing software (currently Cocoon) • Create different displays for different purposes, and a mechanism for allowing the user to select their preferred view • Find and apply an XML-aware search engine • Create a method by which users can create their own Adobe Acrobat versions

  18. AxKit mod_perl Web Server

  19. Cocoon Tomcat Web Server

  20. Cocoon Tomcat Web Server I want this XML doc…

  21. XSLT Stylesheet XML Doc Cocoon Tomcat Web Server

  22. XSLT Stylesheet • XML Doc • XHTML Document (no display markup)* • Cocoon • Tomcat • HTML Stylesheet (CSS) • Web Server • * Dynamic document

  23. Transformation • XSLT Stylesheet • Information • Presentation • XML Doc • XHTML Document (no display markup)* • Cocoon • Tomcat • HTML Stylesheet (CSS) • Web Server • * Dynamic document
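The request flow in these diagrams can be approximated in Python as follows. This is only a sketch of the pattern (a request names a document, a transformation is applied on the fly, and the result points at a CSS stylesheet for presentation); it is not how Cocoon or AxKit are implemented. The `DOCUMENTS` store, the two view functions, `handle_request`, and the `book.css` file name are all hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical document store; a real system would read XML files from disk.
DOCUMENTS = {
    "novel": (
        "<book><title>The Great American Novel</title>"
        '<chapter number="1"><chaptitle>It Was Dark and Stormy</chaptitle>'
        "<p>I was scared.</p></chapter></book>"
    )
}


def full_text_view(book):
    """Chapter headings followed by their paragraphs."""
    parts = []
    for chapter in book.iter("chapter"):
        parts.append(f"<h2>{escape(chapter.findtext('chaptitle', default=''))}</h2>")
        for para in chapter.iter("p"):
            parts.append(f"<p>{escape(''.join(para.itertext()))}</p>")
    return "".join(parts)


def contents_view(book):
    """A table of contents built from the same source document."""
    items = [
        f"<li>{escape(chapter.findtext('chaptitle', default=''))}</li>"
        for chapter in book.iter("chapter")
    ]
    return "<ol>" + "".join(items) + "</ol>"


# The user's preferred view selects which transformation runs.
VIEWS = {"full": full_text_view, "toc": contents_view}


def handle_request(doc_id, view="full"):
    """Transform the stored XML into a page; presentation is left to the CSS file."""
    book = ET.fromstring(DOCUMENTS[doc_id])
    title = escape(book.findtext("title", default=""))
    return (
        f"<html><head><title>{title}</title>"
        '<link rel="stylesheet" href="book.css"/></head>'
        f"<body>{VIEWS[view](book)}</body></html>"
    )


if __name__ == "__main__":
    print(handle_request("novel", "toc"))
    print(handle_request("novel", "full"))
```

Both views come from one stored document, so adding a display means adding a transformation, not re-encoding the book.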

  24. What books have these words?

  25. These

  26. Where are the words found in these books?

  27. In these parts
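The slides do not show how the search engine works, so the following is only a rough Python sketch of what "XML-aware" searching can mean: because chapters are marked up, a search can report not just which books contain the words but which parts of each book. The `LIBRARY` contents and the function names are hypothetical, and matching is by whole words only.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical two-book collection used only for this illustration.
LIBRARY = {
    "novel": (
        "<book><title>The Great American Novel</title>"
        '<chapter number="1"><chaptitle>It Was Dark and Stormy</chaptitle>'
        "<p>I was scared.</p></chapter>"
        '<chapter number="2"><chaptitle>Morning</chaptitle>'
        "<p>The storm had passed.</p></chapter></book>"
    ),
    "guide": (
        "<book><title>A Field Guide</title>"
        '<chapter number="1"><chaptitle>Weather</chaptitle>'
        "<p>A storm can be dark.</p></chapter></book>"
    ),
}


def words_in(element):
    """Lower-cased set of words in an element and all of its descendants."""
    return set(re.findall(r"\w+", "".join(element.itertext()).lower()))


def parts_with(book_id, query):
    """Where are the words found in this book? Returns (number, title) pairs."""
    wanted = set(query.lower().split())
    book = ET.fromstring(LIBRARY[book_id])
    return [
        (chapter.get("number"), chapter.findtext("chaptitle"))
        for chapter in book.iter("chapter")
        if wanted <= words_in(chapter)
    ]


def books_with(query):
    """What books have these words? A book matches if any chapter has them all."""
    return [book_id for book_id in LIBRARY if parts_with(book_id, query)]


if __name__ == "__main__":
    print(books_with("storm"))
    print(parts_with("novel", "storm"))
```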

  28. Case Study: ILL ASAP

  29. Service Tasmania Architecture

  30. Case Study: Univ. of Michigan

  31. Tips and Advice • Begin transitioning to XML now: XHTML and CSS for web files, XML for static documents with long-term worth • Do not rely on browser support of XML • DTDs? We don’t need no stinkin’ DTDs! • Get on the XML4Lib discussion list: http://sunsite.berkeley.edu/XML4Lib/ • Buy my book!

  32. Resources • Web sites • Electronic discussions • Books • Magazines and journals • Individuals
