The National Digital Newspaper Program (NDNP)
An NEH/LC Collaborative Program
Enhancing access to historical newspapers
Release: September 2006
NDNP Mission
• Enhance access to all American newspapers
• Improve access to products of the United States Newspaper Program (USNP) using current technologies
• Establish standards and “best practices” for newspaper digital reformatting and access
• Use a multi-phased approach for research and scaled development
• Develop a geographically diverse program that benefits all US communities
Why Newspapers?
• Newspapers: a unique resource for understanding the fundamentals of history
• Democracy, free press, diverse geographic viewpoints at the community level
• Enormous corpus of newspapers presents an archival challenge
• Text-intensive layout is labor-intensive to search without reference tools
• Digitization of microfilmed corpus economically feasible
Why a National Effort?
• Voluminous, distributed collections
• No one institution holds the “master collection”
• Broad user-base for newspaper material
• Think nationally, select locally
• Comprehensive chronological coverage, eventually
• Need for leadership to build on past national efforts (USNP)
LC’s Historical Newspaper Activities
• 20-year NEH/LC collaboration of USNP
• Existing national network of cooperative programs
• Standards established for preservation microfilm
• Standards established for descriptive metadata/cataloging
• American Memory’s “Stars and Stripes”
  • http://memory.loc.gov/ammem/sgphtml/sashtml/sashome.html
  • Proof-of-concept for historical newspaper format and description
What will NDNP Produce?
• Web access to
  • National directory of US newspaper holdings (what, when, where) – based on USNP legacy data
  • More than 30 million page images of historical newspapers digitized primarily from microfilm, with full-text
  • Historical context of newspaper, printing tech, etc.
• Depository of duplicate digitized microfilm at LC
How?
• Multi-partner program
  • NEH: Funds the program (“We the People” initiative)
  • LC: Aggregates, preserves and serves
  • Awardees: Select and convert
• Phase I – FY04-FY06 (Test bed)
  • NEH awardees (up to 10) with existing digital collections infrastructure and master microfilm negatives
  • 100,000 pages each + 100,000 LC pages by 2007 (from 1900-1910)
  • Microfilm reel analysis for research
Phase I Timeline
2004
• July – NEH cooperative agreement guidelines issued, LC technical architecture under development
• October – Application deadline; 15 applications received
2005
• April – NEH awards announced
• May – Award conference held at LC
2006
• September – NDNP application publicly available via Web
NDNP, September 2006
• Web access - American Chronicle
  • Newspaper Title Directory, 1693-present
  • Full text of content within visual newspaper layout (page-level access)
  • Contextual historical material (Encyclopedia)
• Converted content from all awardees
• Initial time period covered: 1900-1910
Newspaper Title Directory
• Re-use of CONSER and Newspaper Union List, created under USNP (maintained by OCLC)
• 147,000 newspaper titles
• 900,000 holdings records
• Searchable Web access to all USNP-collected data, tied to digitized issues when available, as well as external newspaper Web sites
Full Text with Page-level Access
• Preserves integrity of primary historical content, text in context
• Minimal metadata required to achieve reasonable search results
• Economics of large-scale, large-format digitization
• Allows creation of substantial content-base for research and development on additional search strategies and technologies
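
A minimal sketch of what page-level full-text access could look like, assuming uncorrected OCR text keyed by a page identifier. The function names, page identifiers and sample text are hypothetical illustrations and not part of NDNP; the point is only that a search can return whole pages from text alone, with no article-level metadata.

```python
from collections import defaultdict

PUNCTUATION = ".,;:!?\"'()"

def build_page_index(pages):
    """Map each lowercased word to the set of page ids whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in text.lower().split():
            word = token.strip(PUNCTUATION)
            if word:  # skip tokens that were punctuation only
                index[word].add(page_id)
    return index

def search(index, query):
    """Return the ids of pages that contain every word in the query."""
    terms = [t.strip(PUNCTUATION) for t in query.lower().split()]
    terms = [t for t in terms if t]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical OCR text for three pages.
pages = {
    "page-0001": "Flood waters rise in Sacramento.",
    "page-0002": "Sacramento council meets",
    "page-0003": "Flood relief fund opened",
}
index = build_page_index(pages)
hits = search(index, "flood sacramento")
```

Each hit is a whole page, so the reader sees the matching text in its original layout rather than as an extracted article.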
Digital Asset Specifications
• Page Image - grayscale, 400 dpi, from microfilm
  • TIFF 6.0; JPEG 2000 (.jp2); PDF with Hidden Text
• OCR
  • XML – NDNP/ALTO Schema
  • Page-level, uncorrected, column zones with “bounding box” mapping coordinates
• Metadata
  • XML in METS/MODS for digital objects
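
To illustrate how the “bounding box” coordinates in the OCR output can be used, the sketch below reads word positions from a small ALTO-style fragment. The fragment is invented and simplified for this example: CONTENT, HPOS, VPOS, WIDTH and HEIGHT are standard ALTO attributes on String elements, but the exact NDNP profile, namespaces and units may differ. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented, simplified ALTO-style fragment (no namespace, two words on one line).
SAMPLE_ALTO = """<alto>
  <Layout>
    <Page ID="P1" WIDTH="6000" HEIGHT="8000">
      <PrintSpace>
        <TextBlock ID="TB1" HPOS="100" VPOS="200" WIDTH="900" HEIGHT="300">
          <TextLine>
            <String CONTENT="Daily" HPOS="100" VPOS="200" WIDTH="210" HEIGHT="60"/>
            <String CONTENT="Tribune" HPOS="330" VPOS="200" WIDTH="300" HEIGHT="60"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""

def local_name(tag):
    """Drop an XML namespace prefix such as '{uri}String' so any ALTO version matches."""
    return tag.rsplit("}", 1)[-1]

def word_boxes(alto_xml):
    """Return (word, hpos, vpos, width, height) for every String element."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for element in root.iter():
        if local_name(element.tag) == "String":
            boxes.append((
                element.get("CONTENT"),
                float(element.get("HPOS")),
                float(element.get("VPOS")),
                float(element.get("WIDTH")),
                float(element.get("HEIGHT")),
            ))
    return boxes

def find_word(boxes, query):
    """Return the boxes whose word matches the query, ignoring case."""
    wanted = query.lower()
    return [box for box in boxes if box[0].lower() == wanted]

boxes = word_boxes(SAMPLE_ALTO)
matches = find_word(boxes, "tribune")
```

A viewer could use the returned coordinates to highlight a search term on the page image, which is how uncorrected OCR can still support text-in-context display.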
Historical Context
An Encyclopedia of Newspaper History
• Brief essays for each title digitized
  • Publisher, geography, significant events covered, audience/community, politics
• History of each participating state and the role of newspapers in its history
• Presentations for technology developments, significant people, places, etc.
Future Phases: 2007-2024
• Addition of new partners (continuation of Phase I test bed, to represent all 54 states and territories)
• Increased efficiency in workflows, tools, technology, sustainable resources
• Additional access capabilities, improved technology
Aggregate ~ Preserve ~ Serve
For more information, contact ndnp@loc.gov
Georgia Higley
Head, Newspaper Section
Serials and Government Publications Division
Library of Congress
ghig@loc.gov
http://www.loc.gov/rr/news/