Explore the importance of meta data in data analysis and quality assurance. Learn how meta data helps interpret, analyze, and transform statistical data, ensuring data quality and usability. Discover general principles, documentation guidelines, and proposed content requirements for effective meta data management.
Compilation of Meta Data
Presentation to OG6, Canberra, Australia, May 2011
What is meta data?
• Information used to describe other data
• Everything you need to know about a particular set of data in order to understand and use it
• Information about concepts, definitions, collection, processing, methodology, quality, etc.
What is meta data used for?
• To help the user:
    • To interpret, understand, analyse the data
    • To judge the quality of the data & the “fitness for use”
    • To transform statistical data into information
    • To facilitate comparability of data
• To support data producers:
    • To retain and transfer knowledge
    • To promote harmonization between data sets
    • To improve collection
Meta data is an integral part of quality assurance
• Elements of data quality:
    • Relevance
    • Accuracy
    • Timeliness
    • Accessibility
    • Coherence
    • Interpretability
General principles for documentation
• Provide users with the information necessary to understand both strengths and weaknesses
• Allow users to determine whether the data meet their needs
• Should be clear, organized, accessible
• Should be integrated wherever necessary to support the user’s understanding
• Should be standardized, mandatory, updated as required
Defining meta data content
• See IRES chapter 9 for a template
• Handout: Excerpt of the Statistics Canada “Policy on informing users of data quality and methodology”
• Handout: Example of meta data documentation for Canada’s “Industrial Consumption of Energy” survey
• What are the minimum requirements?
Proposed meta data content (1)
• Survey/Product name
• Objectives of survey:
    • Why are the data collected?
    • Who are the intended users?
• Timeframe
    • Frequency of collection?
    • Reference period?
    • Collection period?
Proposed meta data content (2)
• Concepts and definitions
• Target population
• Survey universe/sampling frame
• Classifications used
• Collection method
    • Direct survey (sample/census; mandatory/voluntary)
    • Administrative data sources
Proposed meta data content (3)
• For sample surveys:
    • Sample size, sampling error
    • Response rates
    • Imputation rates
• For administrative data:
    • Sources
    • Purpose of original collection
    • Merits/shortcomings of data (coverage, conceptual)
    • Processing, correction, reliability, caveats
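The sample-survey indicators listed on this slide could be derived along the following lines. This is only a minimal sketch: the function names and the unweighted definitions are assumptions made for illustration, and the standard error formula assumes simple random sampling without replacement. Statistical agencies commonly publish weighted rates and document their own exact definitions, which should take precedence.

```python
import math
from typing import Optional


def response_rate(responding_units: int, in_scope_units: int) -> float:
    """Unweighted response rate: responding units / in-scope sampled units."""
    if in_scope_units <= 0:
        raise ValueError("in_scope_units must be positive")
    return responding_units / in_scope_units


def imputation_rate(imputed_values: int, total_values: int) -> float:
    """Unweighted share of values that were imputed rather than reported."""
    if total_values <= 0:
        raise ValueError("total_values must be positive")
    return imputed_values / total_values


def standard_error_srs(sample_std: float, n: int,
                       population_size: Optional[int] = None) -> float:
    """Standard error of a sample mean under simple random sampling.

    If population_size is given, the finite population correction
    sqrt(1 - n/N) is applied (sampling without replacement).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    se = sample_std / math.sqrt(n)
    if population_size is not None:
        if population_size < n:
            raise ValueError("population_size must be at least n")
        se *= math.sqrt(1 - n / population_size)
    return se
```

Reporting these figures next to the estimates gives users a direct way to judge accuracy and "fitness for use".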
Proposed meta data content (4)
• Error detection
    • Missing data, entry errors, validity problems, edits, reconciliation
• Imputation of missing data
• Disclosure control
    • Rules of confidentiality, confidentiality analysis
• Revisions
    • Policy, explanation of changes
Proposed meta data content (5)
• Description of analytical methods used
    • Seasonal adjustment, rounding
• Other explanatory notes
    • Breaks in time series
• Other supporting documents
    • Questionnaires, reporting guides, procedures manuals
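One way to make the proposed content items (1) to (5) standardized and checkable is to hold them in a structured record. The sketch below is illustrative only: the class name `SurveyMetadata`, the field names, and especially the `ASSUMED_MINIMUM` set are hypothetical. The presentation raises the question of minimum requirements without answering it, so the required set here is a placeholder, not a recommendation from the source.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class SurveyMetadata:
    # (1) Identification, objectives, timeframe
    survey_name: str = ""
    objectives: str = ""
    intended_users: str = ""
    frequency: str = ""
    reference_period: str = ""
    collection_period: str = ""
    # (2) Concepts and collection
    concepts_and_definitions: str = ""
    target_population: str = ""
    sampling_frame: str = ""
    classifications: List[str] = field(default_factory=list)
    collection_method: str = ""
    # (3) Accuracy indicators or administrative source notes
    accuracy_notes: str = ""
    # (4) Error detection, imputation, disclosure control, revisions
    processing_notes: str = ""
    # (5) Analytical methods, explanatory notes, supporting documents
    analytical_methods: str = ""
    supporting_documents: List[str] = field(default_factory=list)


# Hypothetical minimum set, chosen only to demonstrate the check.
ASSUMED_MINIMUM = (
    "survey_name",
    "objectives",
    "reference_period",
    "target_population",
    "collection_method",
)


def missing_fields(record: SurveyMetadata,
                   required: Sequence[str] = ASSUMED_MINIMUM) -> List[str]:
    """Return the required field names that are still empty."""
    return [name for name in required if not getattr(record, name)]


# Usage with placeholder values (not taken from any real survey).
draft = SurveyMetadata(survey_name="Example survey")
print(missing_fields(draft))
```

A check of this kind can run whenever a data set is released, so that documentation gaps surface before publication rather than after.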
Concluding comments
• Documentation has often been the last work done and the first work to be dropped
• But it is important on many levels
• Needs to be maintained & updated; standards and templates help
• In the future, new surveys or changes may be meta data driven – a growing role and importance
    • To support planning, development
    • To encourage harmonization, integration
For more information…
Andy Kohut
Director, Manufacturing & Energy Division
Statistics Canada
11th Floor, Jean Talon Building, section B-8
Ottawa, Ontario, CANADA K1A 0T6
613-951-5858
Andy.Kohut@statcan.gc.ca