
Presentation Overview

Accessing Historical and Colonial Census data through the Australian Social Science Data Archive. Dr. Steve McEachern, Deputy Director, ASSDA. Presentation Overview: About ASSDA/ADA; ADA in brief; The ADA website; ADA or ASSDA?; 1966-1991 data; What do we hold?; How can I access it?


Presentation Transcript


  1. Accessing Historical and Colonial Census data through the Australian Social Science Data Archive. Dr. Steve McEachern, Deputy Director, ASSDA

  2. Presentation Overview • About ASSDA/ADA • ADA in brief • The ADA website • ADA or ASSDA? • 1966-1991 data • What do we hold? • How can I access it? • Historical and Colonial Census Data • Introduction to HCCDA • Searching • Browsing • Future directions for census data at ADA

  3. 1. About ASSDA/ADA

  4. ASSDA/ADA in Brief • ASSDA was set up in 1981, housed in the RSSS, ANU, to collect and preserve Australian social science data on behalf of the social science research community • Now includes nodes at Uni of Melbourne, Uni of Queensland, Uni of WA, University of Technology Sydney, with infrastructure provided by the ANU Supercomputer Facility • The Archive holds some 2400 data sets; the most notable holdings are national election studies, public opinion polls, social attitudes surveys – and CENSUS materials • Data holdings are sourced from academic, government and private sectors.

  5. ASSDA Data Holdings ASSDA data holdings cover a wide variety of subject areas, currently housed under the following major headings: • Ageing Well • Census • Demography • Economics • Education • Employment, Labour • Environment, Conservation, Land use • Health • Housing • Industry, Management • Law, Crime, Courts • Mass media, Communication, Language • Politics • Poll • Psychology • Science, Technology • Social classes, Social order • Social welfare • Sociology, Culture • Travel, Transport • Non-Australian studies

  6. ASSDA Front page

  7. ASSDA or ADA? • From July 2011, ASSDA will be changing names to ADA - The Australian Data Archive, with a new website: • http://www.ada.edu.au

  8. ADA Historical

  9. Current access • For now, access to ASSDA’s census data holdings is through our existing portal: http://www.assda.edu.au/census.html

  10. ASSDA Census Portal

  11. 2. Census 1966 - 1991

  12. Available data

  13. Available data

  14. Census documentation

  15. ASSDA Nesstar Census Portal

  16. What can you do in Nesstar? • View tables • Create new tabulations (from the existing set of table dimensions) • Generate charts • Export CSV and PDF files, and HTML or XML documentation • Bookmark your tables for future reference
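
As a rough illustration of what can be done with a table once it has been exported as CSV, the sketch below re-tabulates an export along one dimension. The column names (`region`, `sex`, `count`) and all values are invented placeholders, not real census figures, and the layout of an actual Nesstar CSV export may well differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export: column names and values are placeholders,
# not the real Nesstar CSV layout and not real census counts.
SAMPLE_EXPORT = """region,sex,count
Region A,Male,10
Region A,Female,12
Region B,Male,7
Region B,Female,5
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def totals_by(rows, dimension, value_field="count"):
    """Sum value_field over all rows that share the same value of dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += int(row[value_field])
    return dict(totals)


rows = load_rows(SAMPLE_EXPORT)
region_totals = totals_by(rows, "region")
sex_totals = totals_by(rows, "sex")
print(region_totals)
print(sex_totals)
```

For a real export, `io.StringIO(text)` would be replaced by an open file handle, and the dimension names would be taken from the header row of the downloaded file.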

  17. Study description

  18. 3. The Historical Census and Colonial Data Archive

  19. Introduction to HCCDA • The Historical Census and Colonial Data Archive (HCCDA) is a searchable archive of Australian Colonial census publications and reports. • Will become a sub-archive of the new Australian Data Archive (ADA). • Note that the archive contains colonial census reports and tables and not the raw census data.

  20. Source materials • Large corpus of potential source material – created by ABS as part of the 1988 Bicentennial program • Paper copies not available or too fragile • Fiche becoming harder to access • Fiche quality an issue (3rd generation?)

  21. Into the digital realm: images • First the easy part: scan fiche to digital images • Actually protracted, difficult and stressful • Scanning vendor’s approach highly automated • Saved by manual Q/A by ASSDA staff • Complicated by file/page numbering • (more on this later)

  22. Into the digital realm: content • Now the hard part: content conversion • Plain text conversion not good enough • Documents are semantically rich • Documents have rich structure • Tables are valuable in their own right • OCR conversion not good enough • Human data-entry is very good

  23. Into the digital realm: XML • XML can capture semantics and structure • XML based workflows in the future • But which XML? TEI, DocBook, custom? • Chose DocBook V5.0 • Exit strategy: convert to another schema • Created 160+ page archive markup guide • XML created quickly and superbly by InfoCube
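
To make the point about XML capturing structure concrete, here is a minimal sketch of reading a DocBook 5 table back out as rows and columns. The fragment is an invented placeholder, not taken from the HCCDA markup or its markup guide; only the DocBook 5 namespace and the standard `table`/`tgroup`/`thead`/`tbody`/`row`/`entry` element names are assumed.

```python
import xml.etree.ElementTree as ET

# DocBook 5 elements live in this namespace.
NS = {"db": "http://docbook.org/ns/docbook"}

# Placeholder fragment: labels and numbers are invented for illustration.
SAMPLE = """<table xmlns="http://docbook.org/ns/docbook">
  <title>Placeholder table</title>
  <tgroup cols="2">
    <thead>
      <row><entry>District</entry><entry>Persons</entry></row>
    </thead>
    <tbody>
      <row><entry>District A</entry><entry>100</entry></row>
      <row><entry>District B</entry><entry>250</entry></row>
    </tbody>
  </tgroup>
</table>"""


def cell_text(entry):
    """Flatten an entry, including any inline child markup, to plain text."""
    return "".join(entry.itertext()).strip()


def table_to_rows(table):
    """Return (title, header, body) for one DocBook table element."""
    title = table.findtext("db:title", default="", namespaces=NS)
    header = [cell_text(e)
              for e in table.findall("db:tgroup/db:thead/db:row/db:entry", NS)]
    body = [[cell_text(e) for e in row.findall("db:entry", NS)]
            for row in table.findall("db:tgroup/db:tbody/db:row", NS)]
    return title, header, body


title, header, body = table_to_rows(ET.fromstring(SAMPLE))
print(title, header, body)
```

Because the table cells stay addressable as elements, the same tree could be rewritten to another schema later, which is the kind of exit strategy the slide mentions.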

  24. Result: what’s available? • 3 versions of the image from fiche: • Large, medium and small, for rendering in various situations • Full text in XHTML format

  25. Summary of contents

  26. Browsing HCCDA • Browsing is done on a “By document” basis: • Page by page • Table by table • Search is done on the XHTML markup
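
The idea of searching over XHTML markup can be sketched as follows. This is not the HCCDA search implementation: the page identifiers, page contents and function names are all hypothetical, and the approach shown (flatten each page to text, then do a case-insensitive substring match) is only one simple way such a search could work.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical pages keyed by an invented page identifier.
PAGES = {
    "page-001": (
        '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        "<p>Placeholder definition: the term dwelling is explained here.</p>"
        "</body></html>"
    ),
    "page-002": (
        '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        "<table><tr><td>Placeholder</td><td>Table cell</td></tr></table>"
        "</body></html>"
    ),
}


def page_text(xhtml):
    """Return the visible text of an XHTML page with whitespace normalised."""
    root = ET.fromstring(xhtml)
    body = root.find(f"{{{XHTML_NS}}}body")
    if body is None:  # fall back to the whole document
        body = root
    return " ".join(" ".join(body.itertext()).split())


def search(pages, term, context=20):
    """Case-insensitive substring search; returns (page_id, snippet) pairs."""
    hits = []
    needle = term.lower()
    for page_id, xhtml in pages.items():
        text = page_text(xhtml)
        pos = text.lower().find(needle)
        if pos != -1:
            start = max(0, pos - context)
            hits.append((page_id, text[start:pos + len(term) + context]))
    return hits


definition_hits = search(PAGES, "DWELLING")
table_hits = search(PAGES, "table cell")
print(definition_hits)
print(table_hits)
```

Searching the marked-up text rather than the page images is what makes both definition lookups and table lookups possible from a single query box.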

  27. Browse

  28. Page browse

  29. Image browse

  30. Search (and results)

  31. Why full-text? Definitions!!

  32. And table lookups!!

  33. 4. Future directions

  34. What is still to come? • Migration to the new ADA website • Full export of HCCDA results (CSV, XML, other suggestions?) • Addition of census tables to ADA’s Nesstar online analysis system • Additional census years (1996 onwards) • Bridging the gap: 1911-1961 … ???

  35. Questions or comments? For further information Web: http://www.assda.edu.au From July: http://www.ada.edu.au Email: steven.mceachern@anu.edu.au Phone x52200
