2016 Census Public Use Microdata File (PUMF): Individuals File
David Price, Census Operations Division
May 8, 2019
2016 Census Long form PUMFs: Introduction
- As Canada's central statistical office, Statistics Canada is required under the Statistics Act to "collect, compile, analyze, abstract and publish statistical information relating to the commercial, industrial, financial, social, economic and general activities and conditions of the people of Canada."
- In doing so, Statistics Canada provides many avenues of data access.
- For the Census Program, the Census landing page provides various data tabulations and data visualizations.
- As part of Census dissemination, we produce microdata files containing responses to the Census questions, called Public Use Microdata Files (PUMFs).
- PUMF microdata files have been processed so that no individual can be identified.
- The Census Program disseminates 2 standard PUMFs:
  - Individuals PUMF: released February 5, 2019
  - Hierarchical PUMF: to be released Summer 2019
2016 Census Individuals PUMF content: Individuals File PUMF
- The 2011 Individuals PUMF was used as the base for the development of the 2016 PUMF.
- Same levels of geography included as in 2011:
  - Provinces (legal jurisdiction for education, health, etc.)
  - Census Metropolitan Areas (CMAs) for diversity studies (15 CMAs and 8 CMA groupings)
- Sample size: 2.7% (same as 2011)
2016 Census Individuals PUMF content
- 930,421 individual microdata records
- Sample drawn from the Census long-form respondents
- 123 variables:
  - 92 variables from the individuals/persons universe
  - 31 variables from the family, household and dwelling universes
- 16 replicate weights
- Population covered is persons in private households. It does not include:
  - people living in institutions
  - Canadian citizens living temporarily in other countries
  - full-time members of the Canadian Forces outside Canada
  - persons living in institutional collective dwellings (hospitals, nursing homes and penitentiaries)
  - persons in non-institutional collectives such as work camps, hotels, motels and student residences
2016 Census Individuals PUMF variables
- New variables:
  - Immigration: Admission category and applicant type (IMMCAT5)
  - Housing core need indicator (HCORENEED_IND)
  - Structural type of dwelling (DTYPE)
  - Education: Major field of study, STEM and BHASE (non-STEM) groupings (CIP2011_STEM_SUM)
  - Shelter cost (SHELCO), which replaces Owner's major payment (OMP) and Gross rent (GROSRT)
- More details added to:
  - Other country of citizenship (CITOTH)
  - Place of birth (POB)
  - Home language (HLANO)
  - Mother tongue (MTNNO)
2016 Census Individuals PUMF variables
- Samples are selected to provide proper representation at the provincial and CMA levels.
- Each record represents a number of other individuals in the target population of the Census long form who are not in the sample.
- To estimate the population, the weighting factor (WEIGHT) must be used.
- Selected individuals are self-weighted: the WEIGHT value is the same for all units within a province, but may differ from one province to another.
- Unlike 2011, where self-weighting was not possible and some units had very large weights, self-weighting has been achieved for 2016.
- The base weight is 1 / 2.7% = 37.04.
- Exact weights vary slightly from province to province (from 37.01 to 37.12) because of a small adjustment that makes the sum of the weights of the selected records correspond to the published number of individuals in the target universe.
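
The weighting described on this slide can be illustrated with a minimal Python sketch. It assumes each PUMF record has been loaded as a dictionary with a WEIGHT field. The province labels, the `employed` flag and the four records are made up for illustration and are not taken from the file.

```python
SAMPLING_FRACTION = 0.027              # 2.7% sample, as stated on the slide
BASE_WEIGHT = 1 / SAMPLING_FRACTION    # about 37.04 before provincial adjustment

# Made-up records: province labels and the "employed" flag are hypothetical.
records = [
    {"province": "A", "WEIGHT": 37.01, "employed": True},
    {"province": "A", "WEIGHT": 37.01, "employed": False},
    {"province": "B", "WEIGHT": 37.12, "employed": True},
    {"province": "B", "WEIGHT": 37.12, "employed": True},
]


def weighted_total(rows, condition=lambda row: True):
    """Estimate a population count by summing WEIGHT over matching records."""
    return sum(row["WEIGHT"] for row in rows if condition(row))


# Each record stands for roughly 37 people, so counts are sums of weights,
# not numbers of records.
population = weighted_total(records)
employed = weighted_total(records, lambda row: row["employed"])
employed_share = employed / population

print(f"Base weight: {BASE_WEIGHT:.2f}")
print(f"Estimated population: {population:.2f}")
print(f"Estimated share employed: {employed_share:.3f}")
```

Because the weights are nearly identical across records, weighted and unweighted proportions will be close, but totals still have to be built from WEIGHT.
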
2016 Individuals PUMF variables: comparison of 2011 and 2016 PUMFs
2016 Census Individuals File (Geography): Canada, Provinces and Territories Grouped
2016 Census Individuals PUMF variables
- Geography (34): Provinces (11) and CMAs (23: 15 individual and 8 grouped)
- Demography (3): Age, sex and marital status
- Mobility (4): 1 year ago and 5 years ago
- Aboriginal population (3): Aboriginal identity, band membership and Registered or Treaty Indian status
- Ethnicity (4): Ethnic origin, visible minority and population group
- Language (18): Home language, mother tongue, language at work
- Place of birth/citizenship (10): Age at immigration, citizenship, generation status and place of birth
- Education (7): Attainment, major field of study, highest degree, location of study, secondary school
2016 Census Individuals PUMF variables
- Labour market (9): Class of worker, labour force status, NAICS, NOCS, work activity, weeks worked
- Journey to work (7): Distance, mode, place of work status, duration, time leaving for work, vehicle occupancy, place of work province
- Income (32): Capital gains/losses, total income, low income, government transfer payments, employment income, after-tax income and income tax
- Family composition (9): Census family, economic family and presence of children
- Households (3): Household size, household type, primary household maintainer
- Dwelling (12): Bedrooms, suitability, structural type, housing core need indicator, presence of mortgage and shelter cost
- Weighting (17): 1 weight + 16 replicate weights
2016 Census Individuals PUMF variables
- Protect respondents and ensure confidentiality
- This is an anonymized microdata file
- Certain respondent details must be anonymized to protect respondents; suppressed data are coded as "Not available"
- Limit the number of variables and limit the detailed categories
- Variables are analyzed, collapsed and grouped to ensure confidentiality
- Restrictions on the detail of qualitative variables:
  - Ethnicity, language, place of birth: limited detail
  - Occupation and industry aggregated
- Restrictions on the detail of quantitative variables:
  - Income and shelter cost rounded, and top and bottom coded
  - Number of rooms and bedrooms capped
  - Household size capped
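
The restrictions on quantitative variables listed above (rounding, top and bottom coding, capping) can be sketched in Python as follows. This is only an illustration of the general techniques: the thresholds, the rounding base, the household-size cap and the order of operations are all assumptions made for the example, not the values or procedure actually used for the PUMF.

```python
# All constants below are hypothetical, chosen only to make the example concrete.
INCOME_FLOOR = -50_000     # assumed bottom-coding threshold
INCOME_CEILING = 200_000   # assumed top-coding threshold
ROUNDING_BASE = 1_000      # assumed rounding base
MAX_HOUSEHOLD_SIZE = 7     # assumed cap on household size


def top_bottom_code(value, lower, upper):
    """Replace values outside [lower, upper] with the nearest bound."""
    return max(lower, min(value, upper))


def round_to_base(value, base):
    """Round an integer to the nearest multiple of base (ties round up)."""
    return ((value + base // 2) // base) * base


def anonymize_income(income):
    """Top/bottom code first, then round (the order is an assumption)."""
    coded = top_bottom_code(income, INCOME_FLOOR, INCOME_CEILING)
    return round_to_base(coded, ROUNDING_BASE)


def cap_household_size(size):
    """Group all households above the cap into the top category."""
    return min(size, MAX_HOUSEHOLD_SIZE)


for raw in (41_499, 41_500, 253_499, -75_000):
    print(raw, "->", anonymize_income(raw))
print("household of 9 ->", cap_household_size(9))
```

The effect for analysts is that extreme values cluster at the coding bounds, so means and upper quantiles computed from the file are approximations of the underlying distribution.
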
Data Quality and Response Rates
- 2016 was the "Best Census Ever" in Canada
- 2016 long form:
  - Sampling: 25% for all areas, except 100% for reserves and remote areas (canvasser)
  - Response: response rate of 96.9% (2011 NHS: 68.6%)
  - Responding population: 24%
Some information to note: Not available vs Not applicable
- Not available (data suppression): to ensure the confidentiality of selected individuals, some data are suppressed and coded to the "Not available" category, so users are aware of the total number of records (and their weight) that are missing.
  - Users can, if they so choose, attempt to model the "Not available" category to try to improve their estimates.
- Not applicable: some respondents' characteristics fall outside the concept a variable measures, and these are coded as "Not applicable". For example, if you live in a house, you do not pay condo fees, so this concept applies to a condo owner but not to a house owner.
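
The distinction above matters when computing estimates: "Not applicable" records are outside the variable's universe, while "Not available" records are inside it but suppressed. A minimal Python sketch follows, assuming records are dictionaries with a WEIGHT field. The code values 88 and 99 and the `condo_fee_band` variable are hypothetical placeholders; the real codes are defined per variable in the PUMF documentation.

```python
from collections import defaultdict

NOT_AVAILABLE = 88    # hypothetical code for suppressed values
NOT_APPLICABLE = 99   # hypothetical code for out-of-universe records


def summarize(rows, variable):
    """Return the universe weight, suppressed share and shares of reported values."""
    in_universe = [r for r in rows if r[variable] != NOT_APPLICABLE]
    universe_weight = sum(r["WEIGHT"] for r in in_universe)

    by_category = defaultdict(float)
    for r in in_universe:
        by_category[r[variable]] += r["WEIGHT"]

    suppressed = by_category.pop(NOT_AVAILABLE, 0.0)
    reported = universe_weight - suppressed
    shares = {cat: w / reported for cat, w in by_category.items()}
    return {
        "universe_weight": universe_weight,
        "not_available_share": suppressed / universe_weight,
        "shares_among_reported": shares,
    }


# Made-up records: condo_fee_band is an illustrative variable name.
records = [
    {"WEIGHT": 37.0, "condo_fee_band": 1},
    {"WEIGHT": 37.0, "condo_fee_band": 2},
    {"WEIGHT": 37.0, "condo_fee_band": 1},
    {"WEIGHT": 37.0, "condo_fee_band": NOT_AVAILABLE},
    {"WEIGHT": 37.0, "condo_fee_band": NOT_APPLICABLE},   # e.g. a house owner
]

summary = summarize(records, "condo_fee_band")
print(summary)
```

Reporting the suppressed share alongside the estimates lets readers judge how much the "Not available" category could affect them; computing shares among reported values implicitly assumes suppressed records resemble reported ones, which is itself a modelling choice.
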
Questions? Comments?
For 2021 Census PUMFs: suggestions for improvements or additions?
Sri Kanagarajah, Chief, Census Client Services, Census Operations Division
Email: sri.kanagarajah@canada.ca