Digitizing a Small Collection of Relatively Well-Behaved Monographs; or, Where Are NetLibrary, Questia and Ebrary When You Need Them???
Technical Planning Questions • Publishing / Presentation Requirements? • Text Indexing & Navigation Requirements? • Integration Into Broader Digital Library? • Long-term Archiving Requirements?
1. VRR Publishing / Presentation Requirements • Both HTML & XML • Preserve look & feel of original • Preserve pagination of original • Value-added navigation • Value-added text indexing • Example: Destruction of the Indies (XML and HTML)
2. VRR Text Indexing & Navigation Requirements • Full-text & cross-text indexing: XPAT • Hyperlinked TOC & footnotes, but not back-of-book index • Chapter-level navigation, but no ‘page turning’
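The hyperlinked TOC and chapter-level navigation described above can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each chapter is encoded as a `<div type="chapter">` with an `id` attribute and a `<head>` child; the slides do not specify the actual element names, and the sample titles are placeholders.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical TEI-Lite-style fragment; element names, ids and titles are
# placeholders, not taken from the actual VRR texts.
SAMPLE = """<text><body>
<div type="chapter" id="ch1"><head>First Chapter</head><p>...</p></div>
<div type="chapter" id="ch2"><head>Second Chapter</head><p>...</p></div>
</body></text>"""


def build_toc(tei_xml):
    """Return an HTML list linking to every chapter-level div."""
    root = ET.fromstring(tei_xml)
    items = []
    for div in root.iter("div"):
        if div.get("type") != "chapter":
            continue  # only chapter-level navigation, no page turning
        head = div.find("head")
        raw = "".join(head.itertext()) if head is not None else div.get("id", "")
        title = " ".join(raw.split())  # collapse line breaks inside the heading
        anchor = escape(div.get("id", ""))
        items.append('<li><a href="#%s">%s</a></li>' % (anchor, escape(title)))
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    print(build_toc(SAMPLE))
```

Generating the TOC from the markup, rather than maintaining it by hand, keeps the HTML navigation in step with the XML source.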
3. Integration of VRR Into Columbia’s Digital Library • Texts cataloged in OPAC • Listed on our LibraryWeb • Available for Course Access • Assigned ‘Permanent URL’, e.g.: Bookmark as: http://www.columbia.edu/cgi-bin/cul/resolve?AUL4333
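The ‘Permanent URL’ pattern can be sketched in Python as follows. The resolver address is the one shown in the example above; the function names and the assumption that identifiers are plain alphanumeric strings are hypothetical, since the slides do not describe how the resolver works.

```python
from urllib.parse import urlsplit

# Resolver address taken from the example URL in the slide.
RESOLVER_BASE = "http://www.columbia.edu/cgi-bin/cul/resolve"


def permanent_url(record_id):
    """Build a bookmarkable URL from a record identifier (assumed alphanumeric)."""
    if not record_id.isalnum():
        raise ValueError("unexpected identifier: %r" % record_id)
    return RESOLVER_BASE + "?" + record_id


def record_id_from_url(url):
    """Recover the identifier, which is carried as the bare query string."""
    return urlsplit(url).query


if __name__ == "__main__":
    print(permanent_url("AUL4333"))
```

The point of such an indirection is that the OPAC record, LibraryWeb listing and course pages can all cite one stable address while the text's real location changes behind it.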
4. VRR Long-Term Archiving Requirements • Options unclear • No explicit rights • Possible ‘deep archiving’ at Columbia?
Text Markup Considerations • Markup Language? • Character Sets? • Images, figures, non-roman characters? • Preservation of Original Text Layout? • Level of Encoding Detail? • Value-Added Link Creation?
5. VRR Markup Language • TEI-Lite XML • ‘TEI Best Practices’ Level 4 • HTML 4.01 (sort of)
6. VRR Character Sets • "ISO 8879:1986//ENTITIES Added Latin 1//EN" • HTML Character Entity Browser Compatibility Issues • Example: Left Single Quote
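One common way to sidestep named-entity compatibility problems is to emit numeric character references instead; the slides do not say which fix was actually used, so the Python sketch below is only an illustration of that approach. It relies on the standard library's HTML 4.01 entity table, in which `lsquo` (left single quote) maps to code point 8216. The function name and the choice to leave the four basic markup entities untouched are assumptions.

```python
import re
from html.entities import name2codepoint

# Entities every browser handles; left as-is (an assumption of this sketch).
KEEP = {"amp", "lt", "gt", "quot"}
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_refs(text):
    """Replace named HTML 4.01 entities with decimal character references."""
    def repl(match):
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            return match.group(0)  # keep basic or unknown entities unchanged
        return "&#%d;" % name2codepoint[name]
    return ENTITY.sub(repl, text)


if __name__ == "__main__":
    print(to_numeric_refs("&lsquo;Indies&rsquo;"))
```

Unknown entity names are passed through rather than dropped, so anything outside the HTML 4.01 table remains visible for manual review.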
7. VRR Images, etc. • Scan and insert locally: • Plates • In-line illustrations (incl. tables, etc.) • Special character sets (incl. Greek, etc.) • [pending]
8. Preservation of Original Layout • XML: All front matter, versos w/ content, back matter & cover; HTML: same, but blank versos omitted • Line breaks in titles, captions and structured information preserved • Line breaks in text blocks not preserved; allowed to resize
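The two line-break policies above can be sketched in Python: text blocks are reflowed so the browser can resize them, while titles and captions keep their original breaks. The function names and the use of `<br>` for preserved breaks are assumptions for illustration, not the project's documented method.

```python
from html import escape


def reflow_text_block(text):
    """Collapse original line breaks so a paragraph can reflow on resize."""
    return escape(" ".join(text.split()))


def preserve_line_breaks(text):
    """Keep each original line of a title or caption, joined with <br>."""
    lines = [escape(" ".join(line.split()))
             for line in text.splitlines() if line.strip()]
    return "<br>\n".join(lines)


if __name__ == "__main__":
    print(reflow_text_block("a paragraph set\nover several\nprinted lines"))
    print(preserve_line_breaks("A TITLE\nIN TWO LINES"))
```

This sketch does not rejoin words hyphenated across a printed line end; that would need separate handling.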