Data dissemination at the University of Toronto

Data dissemination at the University of Toronto. Presentation to National Bureau of Statistics, China by L Ruus <laine.ruus@utoronto.ca> Data Library Services, Map and Data Library, University of Toronto 2009-12-08 <http://www.chass.utoronto.ca/datalib/misc/ssb_2009.ppt>.


Presentation Transcript


  1. Data dissemination at the University of Toronto Presentation to National Bureau of Statistics, China by L Ruus <laine.ruus@utoronto.ca> Data Library Services, Map and Data Library, University of Toronto 2009-12-08 <http://www.chass.utoronto.ca/datalib/misc/ssb_2009.ppt>

  2. Outline • Introduction • Numeric data/statistics – Laine Ruus • Aggregate statistics • Time-series statistics • Microdata • Spatial data – Marcel Fortin

  3. Introduction

  4. University of Toronto

  5. Support for numeric data/statistics • Data Library Services (DLS) was created in 1988 • Objectives: • acquire, manage and preserve machine-readable data files needed to support empirical or statistical research and teaching activities of the University of Toronto, • provide access to machine-readable data files owned by the University of Toronto, • provide support for users of these machine-readable data files.

  6. UT/DLS collections include • About 5,000 numeric, spatial and textual research data files, primarily but not exclusively in the social sciences • mainly quantitative research data, including microdata, aggregate data and time-series databases

  7. Support for spatial data • GIS position established in 1999 • GIS & Map Library plus Data Library Services combined to form Map and Data Library in 2009

  8. Three major types of quantitative data • Aggregate data (statistics) • Time-series aggregate data (statistics) • Microdata/transaction-level data, from which aggregate and time-series data are created
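As a rough illustration of how aggregate and time-series statistics are created from microdata, the sketch below counts individual records by chosen variables. The records, variable names and the `aggregate` helper are all hypothetical and are not taken from any DLS file.

```python
from collections import Counter

# Hypothetical microdata: one record per respondent (illustrative values only).
microdata = [
    {"year": 2001, "sex": "F", "age_group": "15-24"},
    {"year": 2001, "sex": "M", "age_group": "15-24"},
    {"year": 2001, "sex": "F", "age_group": "25-34"},
    {"year": 2006, "sex": "F", "age_group": "15-24"},
    {"year": 2006, "sex": "M", "age_group": "25-34"},
    {"year": 2006, "sex": "M", "age_group": "25-34"},
]


def aggregate(records, *keys):
    """Count records for each combination of the given variables."""
    return Counter(tuple(r[k] for k in keys) for r in records)


# Aggregate statistics: a sex-by-age-group cross-tabulation.
by_sex_age = aggregate(microdata, "sex", "age_group")

# Time-series aggregate statistics: the same kind of count, indexed by year.
by_year = aggregate(microdata, "year")

for key, count in sorted(by_sex_age.items()):
    print(key, count)
for key, count in sorted(by_year.items()):
    print(key, count)
```

The same microdata file can yield many different aggregate tables, which is why extraction and manipulation appear among the service objectives.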

  9. Service objectives • Access to data/statistics at point of need • Extraction of required data/statistics from larger databases • Manipulation: descriptive or inferential statistics, derived variables, etc • Display or download for further analysis

  10. Aggregate statistics

  11. Aggregate statistics • Acquired from: • Government sources, eg Statistics Canada (various licences), such as census of population 1996 and later • Purchase ($$$) from various sources • Formats: • CSV, MS Excel, Beyond 20/20: can be served from the www • Run-time applications: cannot be served from the www

  12. Aggregate statistics • Access • HTML-based finding aids for those that can be served from the www (DDI compliant) eg http://www.chass.utoronto.ca/datalib/inventory/3000/3798.htm • Or download and install (zip format) eg http://www.chass.utoronto.ca/datalib/inventory/3000/3614.htm • CHASS (Computing in the Humanities and Social Sciences) is developing an OLAP-based application for about 2000 pre-1996 census files that are only available as flat ASCII fixed-field format files, eg http://r1.chass.utoronto.ca/olap/

  13. Time-series

  14. Time-series data • Acquired from • Statistics Canada (various licences) • IMF, UN, OECD, World Bank, commercial producers • Formats: • Complex hierarchical formats • Or run-time applications

  15. Time-series data • Access: • Remote access, with producer interface (eg World Development Indicators, OECD, Datastream, etc.) • CHASS purpose-written interfaces: CANSIM, Canadian company balance sheet or stock-price data, trade data, etc., eg http://dc1.chass.utoronto.ca/

  16. Microdata (figure with labels: Age, Sex)

  17. Microdata • Acquired from • Statistics Canada (Data Liberation Initiative (DLI) licence): public use microdata files • ICPSR, Roper, etc. memberships • Some are free on the www, eg ICVS, PISA, Pew surveys • Formats: • Usually flat ASCII fixed-field format • Or SPSS, SAS, or Nesstar formats
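A flat ASCII fixed-field file stores each variable at fixed column positions, so it can only be read with a record layout (codebook). A minimal sketch, assuming a made-up three-variable layout; the `LAYOUT` table, the `parse_record` helper and the sample lines are hypothetical and do not describe any actual Statistics Canada or ICPSR file:

```python
# Hypothetical record layout: (variable name, start column, end column),
# with 1-based inclusive column positions, as codebooks usually give them.
LAYOUT = [
    ("id", 1, 4),
    ("age", 5, 7),
    ("sex", 8, 8),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-field line into a dict of variable name -> string value."""
    return {name: line[start - 1:end].strip() for name, start, end in layout}


# Two illustrative fixed-field lines; there are no delimiters between fields.
lines = [
    "0001 34F",
    "0002 27M",
]

records = [parse_record(line) for line in lines]
for record in records:
    print(record)
```

Values are kept as strings here; converting them to numbers or applying missing-value codes would also depend on the codebook.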

  18. Microdata (continued) • Access: 3 major interfaces – Nesstar, SDA, VDC • Nesstar http://www.nesstar.com/ • Ontario universities: <odesi> project • Nesstar also used by many European data archives • Strong on documentation (DDI 2.0), weak on analysis

  19. Microdata (continued) • SDA http://sda.berkeley.edu/ • Used by ICPSR, Roper Center, IPUMS, etc. • Strong on statistical analysis, weaker on documentation (DDI 2.0 compliant) • Analysis: frequencies, means, ANOVA, correlations, regressions (multiple, logit and probit) • Some graphic display • Design effects • Disclosure control • http://www.chass.utoronto.ca/datalib/misc/mun09/sda_compare.htm

  20. Usage • DLS deals directly with approximately 2,300–3,300 users per year • CHASS database usage: • 70,000 to 97,000 hits per year • Up to 58 subscribing universities in Canada and USA • SDA usage: • 858,595 hits in 9,944 visits in 2008 • 14 subscribing universities in Canada
