Making an Electronic Text • An exercise in preservation and applied technology
Charles Hindley’s Curiosities of Street Literature • Published in 1871 • Only 456 copies printed • A collection of broadsides, ballads, and popular stories from Dickensian London
What we are doing • Using high-quality scanned images and OCR software, we have created text documents from the page scans • Using XML, we can then mark up the documents for display on the web • We are following a defined standard for electronic texts: the TEI, or Text Encoding Initiative
Text Encoding Initiative • This standard was defined by the University of Oxford, Brown University, the University of Bergen, and the University of Virginia • The TEI Consortium formulated its guidelines to facilitate interchange between individuals and groups using different programs and computer systems over a broad range of applications
To make the TEI-defined documents as accessible as possible, a cross-platform mark-up language was chosen • A mark-up language can be as simple as HTML (HyperText Mark-up Language) • As complex as LaTeX • Or as user-definable as XML (eXtensible Mark-up Language)
XML: Why it’s good for you • eXtensible Mark-up Language • Chosen by TEI for its cross-platform, multi-application capabilities • The user defines the mark-up in XML • Users can apply custom tags and search XML documents based on those tags
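As a minimal sketch of user-defined tags and tag-based searching, the Python snippet below parses a small XML fragment with the standard library's `xml.etree.ElementTree`. The tag names (`collection`, `ballad`, `title`, `line`) and the sample text are hypothetical placeholders chosen for illustration; they are not the project's actual markup and are not TEI elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these tag names and this text are illustrative only.
SAMPLE = """
<collection>
  <ballad id="b1">
    <title>A Sample Ballad</title>
    <line>First line of verse</line>
    <line>Second line of verse</line>
  </ballad>
  <ballad id="b2">
    <title>Another Sample Ballad</title>
    <line>Only line of verse</line>
  </ballad>
</collection>
"""


def titles(root):
    """Return the text of every <title> element, at any depth."""
    return [t.text for t in root.iter("title")]


def lines_by_ballad(root):
    """Map each ballad's id attribute to the list of its <line> texts."""
    return {
        ballad.get("id"): [ln.text for ln in ballad.findall("line")]
        for ballad in root.findall("ballad")
    }


root = ET.fromstring(SAMPLE.strip())
print(titles(root))
print(lines_by_ballad(root))
```

Because the tags describe what each piece of text is, the same document can be queried by title, by line, or by any other tag the user chooses to define.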
The Images • Each scanned image is saved as a 40-megabyte uncompressed TIFF • Using OCR (optical character recognition) software, we are able to extract and preserve the text
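To give a feel for why an uncompressed scan reaches tens of megabytes, the sketch below estimates pixel-data size as width × height × bytes per pixel. The page dimensions, resolution, and colour depth are assumptions made for illustration; the slides do not state the actual scan settings, and TIFF header overhead is ignored.

```python
def uncompressed_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Approximate pixel-data size of an uncompressed scan (headers ignored)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


# Hypothetical page: 8 x 11 inches at 400 dpi in 24-bit colour (3 bytes/pixel).
size = uncompressed_bytes(8, 11, 400)
print(f"{size / 1_000_000:.1f} MB")
```

Under these assumed settings the estimate lands in the same range as the 40-megabyte files described above.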
The Text • Once the image has been OCR’ed, a text document is created • These text documents can then be marked up in XML • Markup can be done in software or manually
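Marking up in software could look roughly like the following sketch, which wraps plain OCR output in a simplified TEI-style skeleton using Python's standard library. The element names `TEI`, `teiHeader`, `text`, `body`, and `p` come from the TEI vocabulary, but the structure here is deliberately stripped down and is not a complete or valid TEI document. The function name `mark_up`, the paragraph-splitting rule, and the sample text are assumptions for illustration, not the project's actual tooling.

```python
import xml.etree.ElementTree as ET


def mark_up(ocr_text, title):
    """Wrap plain OCR output in a simplified, TEI-style XML skeleton."""
    tei = ET.Element("TEI")
    header = ET.SubElement(tei, "teiHeader")
    ET.SubElement(header, "title").text = title
    body = ET.SubElement(ET.SubElement(tei, "text"), "body")
    # Assumption: a blank line in the OCR output separates paragraphs.
    for block in ocr_text.split("\n\n"):
        words = " ".join(block.split())  # rejoin lines broken by the page layout
        if words:
            ET.SubElement(body, "p").text = words
    return tei


# Placeholder text standing in for real OCR output.
sample = "Line one\nwraps here.\n\nSecond paragraph."
tei = mark_up(sample, "Sample Broadside")
print(ET.tostring(tei, encoding="unicode"))
```

A script like this only handles the mechanical part of the job; finer distinctions such as verse lines, headings, or illustrations would still need manual markup or more elaborate rules.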