Explore the different types of data sources available, including microdata and aggregate data, and learn how to access them for research purposes. Gain insight into public microdata, confidential microdata, and Canadian and international data sources.
The Data Search Principal Data Sources and Access Susan Mowers, Data Librarian Sarah Roach, Research Assistant
Outline Doors to Data … • Microdata • Aggregate data … • Microdata Search (hands-on: Odesi, SAS) • Public microdata • Confidential microdata: RDC and RTRA • Aggregate data (hands-on: extract) • Canadian: CANSIM & other • International: OECD, World Bank, Haver, IMF, …
Suggestion • Please log on to your computer • abcd###@uottawa.ca • yyyyddmmsin • Problems? We can help!
Doors to data • When to use microdata? • For a high degree of detail • All variables are at the individual unit of analysis, leading to … • Many choices about subject matter • A greater range of possible statistical analyses
Doors to data • When to use aggregate data? Microdata not available? e.g., business survey microdata not readily available Need macroeconomic data? • e.g., region, country, provincial or city-level Need time-series data? • e.g., comparative values already calculated across time periods
Comparing data types … Aggregate data Microdata
Outline Doors to Data • Microdata • Aggregate data • Microdata Search (hands-on: Odesi, SAS) • Public microdata • Confidential microdata: RDC and RTRA • Aggregate data (hands-on: extract) • Canadian: CANSIM & other • International: OECD, World Bank, Haver, IMF…. We are here
Public Statistics Canada Microdata Access via the Library and Odesi
Public microdata • Confidentiality/privacy problems are resolved with PUMFs (Public Use Microdata Files) • Low-risk nature of public data • 24/7 access via Odesi to Statistics Canada public data* • Contact point for help: GSG Centre/MRT *& other sources, e.g., ICPSR [Link] and World Bank [Link] …
Let’s see! • Public microdata file • Personal income variable • [LINK to Odesi] • Note: • What type of data? • Would it be specific enough?
Let’s see (cont’d) Screen 1 - What type of data?- Would it be specific enough?
Let’s see! • Public microdata file • Cultural or racial origin variable • [Link to Odesi] • Note: • Do these values reflect the actual question and the level of detail asked? • Would it be specific enough?
Let’s see (cont’d) Screen 2 • Do these values reflect the actual question and the level of detail asked? • Would they be specific enough?
Let’s see! • Public microdata file • Is there a correlation between cultural / racial origin AND income? • [LINK to example from Odesi]
Let’s see (cont’d) Screen 3
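A minimal sketch of the kind of table behind a question like this one: a weighted cross-tabulation of an origin variable against an income band, with shares within each group. The records, group names, income bands and weights below are invented for illustration; they are not taken from Odesi or from any Statistics Canada file.

```python
from collections import defaultdict

# Hypothetical records: (origin group, income band, survey weight).
# All values are invented for illustration only.
records = [
    ("Group A", "Under $30,000", 120.0),
    ("Group A", "$30,000 and over", 80.0),
    ("Group B", "Under $30,000", 50.0),
    ("Group B", "$30,000 and over", 150.0),
]

def weighted_crosstab(rows):
    """Sum survey weights for each (origin, income band) cell."""
    table = defaultdict(float)
    for origin, band, weight in rows:
        table[(origin, band)] += weight
    return dict(table)

def row_shares(table):
    """Convert weighted counts to shares within each origin group."""
    totals = defaultdict(float)
    for (origin, _band), weight in table.items():
        totals[origin] += weight
    return {cell: weight / totals[cell[0]] for cell, weight in table.items()}

table = weighted_crosstab(records)
shares = row_shares(table)
for (origin, band), share in sorted(shares.items()):
    print(f"{origin:8s} {band:18s} {share:.0%}")
```

Differences in the shares across groups suggest an association, but a cross-tabulation alone does not establish one; a formal test would be a separate step.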
Did you know? • Odesi provides both the public microdata files and codebooks • Download both (data and codebook) • Download the data as a subset or full data file • Always download the codebook • More info here, e.g., codebook [LINK] and topical index [LINK], or e-mail smowers@uottawa.ca
Practice: Download public data! • Download a subset. Note also this how-to video [LINK] • Download codebook & topical index
Outline • Doors to Data • Microdata • Aggregate data • Microdata Search • Public microdata • Confidential microdata: RDC and RTRA • Aggregate data (hands-on: download) • Canadian: CANSIM and other • International: UN, OECD, World Bank, IMF, Haver We are here
Confidential Statistics Canada Microdata Access via the RDC and RTRA
Agenda • Why use confidential microdata? • Access via Research Data Centre (RDC) • Access via Real Time Remote Access (RTRA)
Why use confidential microdata? Need more specific data Public data has limitations. It often… • (1) aggregates continuous data, like age and income, and • (2) suppresses detailed geography
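To illustrate limitation (1), here is a small sketch of how grouping a continuous variable into bands discards detail. The band edges are hypothetical; actual PUMF groupings vary by survey.

```python
# Hypothetical age bands; real public-file groupings differ by survey.
AGE_BANDS = [(15, 24), (25, 34), (35, 44), (45, 54), (55, 64)]

def age_band(age):
    """Return the band label a public file might report instead of exact age."""
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    return "65+" if age >= 65 else "under 15"

# Three different exact ages collapse into one band.
ages = [25, 29, 34, 61]
banded = [age_band(a) for a in ages]
print(banded)
```

Once only the band is released, an analysis that needs exact age (or exact income) is no longer possible from the public file.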
Let’s see! • Confidential synthetic file • Is there a correlation between cultural / racial origin AND income? • [Link to example from Odesi] Explanation: click here for information about uses for this synthetic data file.
Let’s see (cont’d) Screen 4
Why use confidential microdata? Need panel data • Panel data follow a panel of individuals over repeated cycles of a survey. • Public data limitation: • Public data files do not include longitudinal data (for reasons of confidentiality)
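As a rough illustration of what panel data make possible, the sketch below stores one row per person per cycle and derives each person's change between consecutive cycles. The field names (`person_id`, `cycle`, `employed`) and the values are hypothetical, not drawn from any survey.

```python
from collections import defaultdict

# Hypothetical long-format panel: one row per person per survey cycle.
panel = [
    {"person_id": 1, "cycle": 1, "employed": True},
    {"person_id": 1, "cycle": 2, "employed": False},
    {"person_id": 2, "cycle": 1, "employed": True},
    {"person_id": 2, "cycle": 2, "employed": True},
]

def transitions(rows, var):
    """Pair each person's value of `var` with their value in the next cycle."""
    by_person = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r["person_id"], r["cycle"])):
        by_person[row["person_id"]].append(row[var])
    return {pid: list(zip(vals, vals[1:])) for pid, vals in by_person.items()}

changes = transitions(panel, "employed")
print(changes)
```

This linking relies on a stable person identifier across cycles, which is exactly the kind of information a cross-sectional public file does not carry.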
Why use confidential microdata? No public data exists Public microdata files are not available for every survey. For example, there are no public files for … • The Uniform Crime Reporting Survey • The Canadian Cancer Registry • The Canadian Forces Mental Health Survey
Agenda • Why use confidential microdata? • Access via RDC • Access via RTRA We are here
What is the RDC? • The Research Data Centre (RDC) provides researchers access to confidential microdata. • Access is provided in a secure university setting.
Where is the RDC and how is it used? • The COOL RDC is on the uOttawa campus, on the 3rd floor of the Morisset Library! • All work with the data must be done inside the RDC. • Output can be released to researchers on request, once it has been vetted for disclosure risk
Application Process & Survey Availability To access the RDC there are 3 steps to follow: • Apply online on the SSHRC website • Complete a security screening • Sign a microdata research contract A list of the surveys available in the RDC can be found here: http://www.rdc-cdr.ca/datasets-and-surveys
Want more information? Zacharie Tsala Dimbuene RDC Analyst Office: Morisset Library 322 Email: coolrdc@uottawa.ca Web site: [Link]
Agenda • Why use confidential microdata? • Access via RDC • Access via RTRA We are here
What is RTRA? • RTRA (Real Time Remote Access) allows remote access to output from confidential microdata • Provides descriptive statistics • Can be particularly useful during the proposal stage of a research project
How does RTRA work? • Submit code to Stats Can (online) indicating the statistics you want and receive output within the hour. • Code is written in SAS. • Training sessions are available for new RTRA researchers!
Availability of SAS and help • SAS is available in the Vanier Labs • A free browser version is also available online • New to SAS? Training sessions are available.
RTRA Surveys *PUMF=Public Use Microdata File
How do I apply to RTRA? • Fill out and sign an application form [Link | Info] indicating which survey(s) you would like access to and email it to me at sarah.roach@uottawa.ca You should have access within two weeks!
More information? • Compare regular SAS code versus RTRA SAS code – CCHS 2012 example [Link]
More information? • RTRA code [Link]
uOttawa RTRA Web site [Link]
Outline • Doors to Data • Microdata • Aggregate data • Microdata Search (hands-on: Odesi, SAS) • Public microdata • Confidential microdata: RDC and RTRA • Aggregate data (hands-on: extract) • Canada: CANSIM & other • International: OECD, World Bank, Haver, IMF … We are here
Aggregate Data Canadian and International Sources
About aggregate data … • Unit of analysis is the economy level, e.g., Canada, U.S., U.K., province/state … • Often consists of repeated time-series (aggregate) data
Time series example: Unemployment rate (SA, %)*, Labour Force Surveys, 1995-2014, for U.S., U.K., Canada • U.S.: Civilian Unemployment Rate (SA, %) • U.K.: Unemployment Rate: Aged 16 and Over [3-Mo Moving Avg] (SA, %) • [Chart: lfs-g10-unemployment.EMF (G10), series S111ELUR / S112ELUR, 951-1144] *Calculated from Labour force status = unemployed, from repeated cycles of Labour Force Surveys
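The footnote above describes how an aggregate series is derived from microdata. A minimal sketch of that step, assuming the usual definition (unemployed as a percentage of the labour force, i.e. employed plus unemployed) and using invented respondent records and weights; it does not attempt seasonal adjustment.

```python
from collections import defaultdict

# Hypothetical respondent records: (period, labour force status, survey weight).
records = [
    ("2014-01", "employed", 900.0),
    ("2014-01", "unemployed", 100.0),
    ("2014-01", "not in labour force", 500.0),
    ("2014-02", "employed", 930.0),
    ("2014-02", "unemployed", 70.0),
]

def unemployment_rates(rows):
    """Weighted unemployed / (employed + unemployed), as a percentage, per period."""
    unemployed = defaultdict(float)
    labour_force = defaultdict(float)
    for period, status, weight in rows:
        if status in ("employed", "unemployed"):
            labour_force[period] += weight
        if status == "unemployed":
            unemployed[period] += weight
    return {p: 100.0 * unemployed[p] / labour_force[p] for p in sorted(labour_force)}

rates = unemployment_rates(records)
for period, rate in rates.items():
    print(period, f"{rate:.1f}%")
```

Each survey cycle contributes one point; repeating the calculation over many cycles yields the time series that aggregate databases publish ready-made.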
Canadian aggregate data New database • CANSIM tables [Link] • Statistics Canada DLI data server! [Link] • Odesi, various • Conference Board of Canada e-Data (forecast data, metropolitan-level, confidence indices) [Link]
CANSIM • Official government data from numerous sources, includes business surveys • Parts of a CANSIM table: • Title: Revenue, expenditure and budgetary balance - Provincial administration, education and health quarterly (dollars x 1,000,000) • Table #: 380-0081 • Dimensions: Geography (1 item: Canada); Seasonal adjustment (adjusted, unadjusted); Sub-sector accounts (3 items); Estimates (120 items) • Time frame: Q1, 1980 – Current • Vector: Each possible combination of categories and options in a table. Also called a series. • Time series: A series (vector), measured over a number of years • Footnotes • Data definitions Source: Adapted from Kwantlen Polytechnic University. (2015). Statistics: CANSIM (Guide). http://libguides.kpu.ca/c.php?g=183875&p=1212158
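To make the "vector" idea concrete, the sketch below enumerates every combination of the dimensions listed for table 380-0081. The sub-sector and estimate labels are placeholders, since the slide gives only their counts, and the result is an upper bound: a published table need not contain a series for every combination.

```python
from itertools import product

# Dimension sizes follow the slide; item labels for the last two are placeholders.
dimensions = {
    "Geography": ["Canada"],
    "Seasonal adjustment": ["Adjusted", "Unadjusted"],
    "Sub-sector accounts": [f"Sub-sector {i}" for i in range(1, 4)],
    "Estimates": [f"Estimate {i}" for i in range(1, 121)],
}

# Each tuple is one candidate vector (series): one item chosen per dimension.
combos = list(product(*dimensions.values()))

print(len(combos))   # 1 * 2 * 3 * 120 combinations
print(combos[0])
```

Selecting one item from each dimension when extracting a table therefore picks out a single vector, which is then reported across the table's time frame.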