Accessing Historical and Colonial Census data through the Australian Social Science Data Archive
Dr. Steve McEachern
Deputy Director, ASSDA
Presentation Overview • About ASSDA/ADA • ADA in brief • The ADA website • ADA or ASSDA? • 1966-1991 data • What do we hold? • How can I access it? • Historical and Colonial Census Data • Introduction to HCCDA • Searching • Browsing • Future directions for census data at ADA
ASSDA/ADA in Brief • ASSDA was set up in 1981, housed in the RSSS, ANU, to collect and preserve Australian social science data on behalf of the social science research community • Now includes nodes at Uni of Melbourne, Uni of Queensland, Uni of WA and University of Technology Sydney, with infrastructure provided by the ANU Supercomputer Facility • The Archive holds some 2400 data sets; the most notable holdings are national election studies, public opinion polls, social attitudes surveys – and CENSUS materials • Data holdings are sourced from the academic, government and private sectors.
ASSDA Data Holdings ASSDA data holdings cover a wide variety of subject areas, currently housed under the following major headings: • Ageing Well • Census • Demography • Economics • Education • Employment, Labour • Environment, Conservation, Land use • Health • Housing • Industry, Management • Law, Crime, Courts • Mass media, Communication, Language • Politics • Poll • Psychology • Science, Technology • Social classes, Social order • Social welfare • Sociology, Culture • Travel, Transport • Non-Australian studies
ASSDA or ADA? • From July 2011, ASSDA will change its name to ADA, the Australian Data Archive, with a new website: • http://www.ada.edu.au
Current access • For now, access to ASSDA’s census data holdings is through our existing portal: http://www.assda.edu.au/census.html
What can you do in Nesstar? • View tables • Create new tabulations (from the existing set of table dimensions) • Generate charts • Export CSV and PDF files, and HTML or XML documentation • Bookmark your tables for future reference
Introduction to HCCDA • The Historical Census and Colonial Data Archive (HCCDA) is a searchable archive of Australian Colonial census publications and reports. • HCCDA will become a sub-archive of the new Australian Data Archive (ADA). • Note that the archive contains colonial census reports and tables, not the raw census data.
Source materials • Large corpus of potential source material – created by ABS as part of the 1988 Bicentennial program • Paper copies not available or too fragile • Fiche becoming harder to access • Fiche quality an issue (3rd generation?)
Into the digital realm: images • First the easy part: scan fiche to digital images • Actually protracted, difficult and stressful • Scanning vendor's approach was highly automated • Saved by manual Q/A by ASSDA staff • Complicated by file/page numbering • (more on this later)
Into the digital realm: content • Now the hard part: content conversion • Plain text conversion not good enough • Documents are semantically rich • Documents have rich structure • Tables are valuable in their own right • OCR conversion not good enough • Human data-entry is very good
Into the digital realm: XML • XML can capture semantics and structure • XML based workflows in the future • But which XML? TEI, DocBook, custom? • Chose DocBook V5.0 • Exit strategy: convert to another schema • Created 160+ page archive markup guide • XML created quickly and superbly by InfoCube
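To illustrate why structured markup keeps tables "valuable in their own right", here is a minimal sketch of pulling the rows out of a DocBook 5.0 table and writing them as CSV. The sample table, its district names and figures, and the helper names `table_rows` and `rows_to_csv` are all hypothetical; the archive's own markup guide and tooling are not shown in these slides and may differ (for example, by using HTML-style rather than CALS tables).

```python
import csv
import io
import xml.etree.ElementTree as ET

# DocBook 5.0 elements live in this namespace.
NS = {"db": "http://docbook.org/ns/docbook"}

# Hypothetical sample: a CALS-style DocBook table with invented values.
SAMPLE = """<table xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Hypothetical population table</title>
  <tgroup cols="3">
    <thead>
      <row><entry>District</entry><entry>Males</entry><entry>Females</entry></row>
    </thead>
    <tbody>
      <row><entry>District A</entry><entry>120</entry><entry>95</entry></row>
      <row><entry>District B</entry><entry>310</entry><entry>287</entry></row>
    </tbody>
  </tgroup>
</table>"""


def table_rows(xml_text):
    """Return every row of the table (header first) as a list of cell strings."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.iterfind(".//db:row", NS):
        cells = ["".join(entry.itertext()).strip()
                 for entry in row.findall("db:entry", NS)]
        rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(table_rows(SAMPLE)))
```

Because the header and body rows are marked up explicitly, the same traversal could feed another schema or format, which is the kind of "exit strategy" the slide refers to.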
Result: what’s available? • 3 versions of the image from fiche: • Large, medium and small, for rendering in various situations • Full text in XHTML format
Browsing HCCDA • Browsing is done on a “By document” basis: • Page by page • Table by table • Search is done on the XHTML markup
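As a rough illustration of searching over XHTML markup, the sketch below strips the tags from each page and does a case-insensitive text match. It is not the archive's actual search implementation; the page identifiers, the sample page content and the function names `page_text` and `search` are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical pages keyed by an invented page identifier.
PAGES = {
    "page-001": (
        f'<html xmlns="{XHTML}"><body><h1>Census of the Colony</h1>'
        "<p>Population of each district.</p></body></html>"
    ),
    "page-002": (
        f'<html xmlns="{XHTML}"><body><table><tr><td>Occupations</td>'
        "<td>Males</td></tr></table></body></html>"
    ),
}


def page_text(xhtml):
    """Return the visible text of a well-formed XHTML page, whitespace-normalised."""
    root = ET.fromstring(xhtml)
    return " ".join(" ".join(root.itertext()).split())


def search(pages, term):
    """Return the ids of pages whose text contains the term, ignoring case."""
    needle = term.casefold()
    return [page_id for page_id, xhtml in pages.items()
            if needle in page_text(xhtml).casefold()]


if __name__ == "__main__":
    print(search(PAGES, "population"))
```

Since XHTML is well-formed XML, a standard XML parser is enough here; a real service would more likely index the text once rather than reparse each page per query.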
What is still to come? • Migration to the new ADA website • Full export of HCCDA results (CSV, XML, other suggestions?) • Addition of census tables to ADA’s Nesstar online analysis system • Additional census years (1996 onwards) • Bridging the gap: 1911-1961 …. ???
Questions or comments? For further information • Web: http://www.assda.edu.au • From July: http://www.ada.edu.au • Email: steven.mceachern@anu.edu.au • Phone: x52200