Misinterpretation of data, the importance of metadata and STC math. DLI Atlantic Training April 2005. Data Misinterpretation: Crime Rates. Ebert & Roeper review of Michael Wilson movie “Michael Moore hates America” Ebert doubted claim that Canadian crime rate 2X the USA rate
Misinterpretation of data, the importance of metadata and STC math. DLI Atlantic Training, April 2005
Data Misinterpretation: Crime Rates • Ebert & Roeper review of Michael Wilson's movie “Michael Moore Hates America”: Ebert doubted the claim that the Canadian crime rate is 2X the USA rate • Moorelies.com | News: Whoa; Stuart Didn't See That One Coming • Ebert conceded that the statistics supported the claim - the figures were right • BUT - a comparison of the STC and US Bureau of Justice websites shows how the statistics were misinterpreted
Comparative Crime Rates: Simplistic comparison • Similar category titles on violent and property crimes but different definitions • Violent crime 2-3 times higher in the US, property crimes close • Bureau of Justice Statistics Crime & Justice Data Online • Canadian Statistics - Crimes by type of offence
Data Misinterpretation: Drinking Habits of Canadians • Initial analysis of the 1990 Health Promotion Survey indicated that Canadians enjoyed an average of 60 drinks per day…
Data Misinterpretation: Importance of Metadata
In the 1990 Health Promotion Survey there was a series of questions about alcohol consumption. Respondents were first asked if they EVER drank alcohol; if YES, whether they drank within the last 12 months; and if YES, the number of drinks on each day of the past 7 days. The code book showed number of drinks per day as:

81 F4MON 2 0096‑0097 HOW MANY DRINKS DID YOU HAVE ON: MONDAY
   Code    Label                  Raw     Weighted
   00      NONE                   4651    7334907
   01:40   NUMBER OF DRINKS       403     2585080
   41      MORE THAN 40 DRINKS    1       106
   98      QUESTION NOT ASKED     7648    0567910
   99      NOT STATED             89      155377

82 F4TUE 2 0098‑0099 HOW MANY DRINKS DID YOU HAVE ON: TUESDAY
   Code    Label                  Raw     Weighted
   00      NONE                   4608    7306101
   01:40   NUMBER OF DRINKS       1447    2613991
   98      QUESTION NOT ASKED     7648    10567910
   99      NOT STATED             89      155377
Metadata for PUMFs • With Public Use Microdata Files, the code book is very important • Gives the questions asked and the codes used for responses • Numeric codes are often assigned to “missing values”, “refusals”, “don’t know” and “not applicable” • The numeric codes used are not consistent • To most software, these numeric codes look like valid responses
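The effect of treating “question not asked” and “not stated” codes as real answers can be sketched with the Tuesday raw counts from the code book. This is only an illustration: the code book does not give the distribution inside the 01:40 range, so the sketch assumes (hypothetically) that every respondent in that range reported 2 drinks. The function name `mean_from_counts` and the constant `ASSUMED_DRINKS` are invented for this example.

```python
# Raw (unweighted) Tuesday counts by response code, from the code book.
# Codes 98 and 99 are not drink counts; they mark missing responses.
MISSING_CODES = frozenset({98, 99})

# ASSUMPTION: the code book only reports 1447 respondents in the 01:40
# range, so we pretend each of them reported exactly 2 drinks.
ASSUMED_DRINKS = 2

tuesday_counts = {
    0: 4608,               # 00 NONE
    ASSUMED_DRINKS: 1447,  # 01:40 NUMBER OF DRINKS (collapsed, see assumption)
    98: 7648,              # 98 QUESTION NOT ASKED
    99: 89,                # 99 NOT STATED
}


def mean_from_counts(counts, exclude=frozenset()):
    """Mean of the coded values, weighted by how many respondents gave each.

    Codes listed in `exclude` are dropped from numerator and denominator.
    """
    kept = {code: n for code, n in counts.items() if code not in exclude}
    total_respondents = sum(kept.values())
    total_drinks = sum(code * n for code, n in kept.items())
    return total_drinks / total_respondents


# What software does when it sees 98 and 99 as ordinary numbers.
naive_mean = mean_from_counts(tuesday_counts)

# What the code book says should happen: missing codes are excluded.
clean_mean = mean_from_counts(tuesday_counts, exclude=MISSING_CODES)

print(f"naive mean drinks on Tuesday: {naive_mean:.1f}")
print(f"mean after excluding codes 98/99: {clean_mean:.2f}")
```

Under that assumption the naive mean comes out in the mid-50s per day, the same order of magnitude as the “60 drinks per day” result, while the mean over respondents who were actually asked is below one drink.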
Metadata: STC Policy on Informing Users of Data Quality • In place since 1978 • Tightened up in 2000 in response to the 1999 AG report • Recognition that “All statistics are to some extent estimates” • Statistics to be used with awareness of strengths and weaknesses – “fitness for use” • Key tool is the Integrated Meta Database (Definitions, data sources and methods)
Metadata • Important to find STC metadata and use it • Definitions, Data Sources and Methods • Questionnaire and reporting guides • Survey Description • Data sources and methodology • Data Accuracy • Documentation • Contact us
Online Catalogue: Canadian Community Health Survey: public use microdata file: Product main page
DLI Website: DLI - Canadian Community Health Survey Cycle 1.1 • DLI listserv: Ask and we will find out from the Division!
Use metadata to avoid key pitfalls • Collection methodology • Questionnaire • Data quality: sample size, response rates • Definitions • Conceptual changes • Survey coverage • Reweighting/rebasing
STC Math • Random rounding • Percentages and percentage points • Central tendencies (mean, median and mode) • Current vs constant dollars • Raw vs seasonally adjusted
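A few of the items in this list can be illustrated with a short sketch. All numbers below are made up for illustration, and the function names are hypothetical. In particular, `random_round` assumes an unbiased scheme in which the chance of rounding up is proportional to the remainder; the slides do not describe the exact procedure STC uses.

```python
import random
from statistics import mean, median, mode


def random_round(count, base=5, rng=random):
    """Randomly round a count to an adjacent multiple of `base`.

    ASSUMPTION: the probability of rounding up equals remainder / base,
    which keeps the expected value equal to the original count.
    """
    remainder = count % base
    if remainder == 0:
        return count
    lower = count - remainder
    return lower + base if rng.random() < remainder / base else lower


def percentage_point_change(old_rate, new_rate):
    """Difference between two rates that are already percentages."""
    return new_rate - old_rate


def percent_change(old, new):
    """Relative change of `new` against `old`, expressed in percent."""
    return (new - old) / old * 100


def to_constant_dollars(current_dollars, price_index, base_index=100.0):
    """Deflate a current-dollar amount using a price index."""
    return current_dollars * base_index / price_index


# Random rounding: each cell moves to a nearby multiple of 5, so the
# rounded cells need not add up to the rounded total.
rng = random.Random(0)
cells = [13, 22, 7]  # hypothetical counts
print([random_round(c, rng=rng) for c in cells], random_round(sum(cells), rng=rng))

# A rate moving from 5% to 6% rises by 1 percentage point,
# which is a rise of about 20 percent.
print(percentage_point_change(5.0, 6.0), round(percent_change(5.0, 6.0), 1))

# Central tendencies can differ sharply when one value is extreme.
incomes = [20000, 30000, 30000, 40000, 200000]  # hypothetical incomes
print(mean(incomes), median(incomes), mode(incomes))

# $110 in a year where the price index is 110 is worth $100 in base-year dollars.
print(to_constant_dollars(110, 110))
```

The random rounding example also shows why published cells may not sum to their published total: each cell and the total are rounded independently.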