Quality framework for the evaluation of administrative data (to be used for statistics)

Piet J.H. Daas, Judit Arends-Tóth, Barry Schouten and Léander Kuivenhoven, Statistics Netherlands

Presentation Transcript


  1. Quality framework for the evaluation of administrative data (to be used for statistics) • Piet J.H. Daas, Judit Arends-Tóth, Barry Schouten and Léander Kuivenhoven • Statistics Netherlands

  2. Overview • Reason for work • View on Quality • Starting point • Combined results • The quality framework • Application • Future work

  3. Reason for work • Statistics Netherlands is increasing its use of data (sources) collected and maintained by others • To decrease response burden and costs • As a result: • More dependent on administrative data sources • Must be able to monitor the quality of such data sources • How?

  4. View on quality • Statistics Netherlands definition of the quality of administrative data sources: “Usability for the production of statistics” • Differs from quality as used by the data source maintainer • The maintainer often does not have statistical use in mind • The quality report of the data source maintainer (if available) cannot simply be reused

  5. Starting point • Stat. Netherlands paper by Daas & Fonville • Register seminar 2007, Helsinki, Stat. Finland • Hands-on approach, limited scope (Dutch view) • Should include the experiences of others • Papers and books that studied the quality of administrative data sources and registers • Excluded papers that focused only on the quality of surveys • Ended up with quite a limited list of important papers: • Book by the Wallgrens (S) • Eurostat paper on quality of administrative data • Work performed at ONS and by Thomas (UK) • UNECE paper of the Nordic countries

  6. Combined results • Conclusions: • A general level of mutuality • The papers identified many similar quality aspects (quality indicators) • None of the ‘views’ on quality were exactly alike • How to combine all these views? • Something higher than a dimension was needed • Karr et al. (2006)* used the term hyperdimension to distinguish different views on quality • Combine all quality aspects identified in all studies, plus new aspects, in a single framework! • *Karr et al. (2006) Stat. Methodol. 3, pp. 137-173

  7. Quality framework • Framework has 4 hyperdimensions • Four views on the quality of the external data source • The hyperdimensions identified are: • Source → Data source as a whole • Metadata → Conceptual metadata of data in source • Data → Facts (values) in data source • Process → Processing-related quality aspects

  8. Levels distinguished: Quality framework levels

  9. 1) Source hyperdimension • Here the data source is viewed as a file delivered by the data source maintainer to the NSI (national statistical institute) • Dimensions (5): • Supplier, Relevance, Privacy and security, Delivery, and Procedures

  10. Source hyperdimension

      Hyperdimension | Dimension            | Indicator       | Measurement method
      ---------------+----------------------+-----------------+-----------------------------------------------------
      Source         | Supplier             | Contact         | Name, contact information
                     | Relevance            | Adm. burden     | Effect of use on adm. burden of NSI (time and money)
                     | Privacy and security | Legal provision | Check if Personal Data Protection Act applies
                     | Delivery             | Costs           | Costs of use for NSI

  11. 2) Metadata hyperdimension • Focuses on the conceptual metadata quality aspects of the data source • Other metadata aspects (such as process metadata) are not included • Dimensions (4): • Clarity, Comparability, Unique keys, and Data treatment by data source maintainer

  12. Metadata hyperdimension

      Hyperdimension | Dimension                                | Indicator             | Measurement method
      ---------------+------------------------------------------+-----------------------+---------------------------------------------------
      Metadata       | Clarity                                  | Population definition | Description of the population used in data source
                     | Unique keys                              | Identification keys   | Presence of unique keys (which keys are present)
                     | Data treatment by data source maintainer | Checks                | Variable value checks performed
                     |                                          | Modifications         | Familiarity with data modifications

  13. 3) Data hyperdimension • Aspects related to data in the data source • All aspects are accuracy related • Actively being discussed at our office • Future changes are very likely • Dimensions (9): • Over coverage, Under coverage, Linkability, Unit non-response, Item non-response, Measurement, Processing, Precision, and Sensitivity

  14. Data hyperdimension

      Hyperdimension | Dimension     | Indicator            | Measurement method
      ---------------+---------------+----------------------+--------------------------------------------------------
      Data           | Over coverage | Non-pop. units       | Percentage of units not belonging to population of NSI
                     | Linkability   | Linkable units       | Percentage of units linked
                     | Measurement   | Incompatible records | Fraction of fields with violated edit rules
                     | Processing    | Adjustment           | Fraction of fields adjusted
                     |               | Imputation           | Fraction of fields imputed

      R-index: Representative index; RMSE: Root Mean Square Error; MSE: Mean Square Error
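
The measurement methods listed on slide 14 are simple ratios, so they lend themselves to a small script. The following Python sketch is only an illustration and not the method used at Statistics Netherlands: the function names, the use of unit identifiers to decide population membership and linkage, and the toy data are all assumptions made for this example.

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage; 0.0 when whole is 0."""
    return 100.0 * part / whole if whole else 0.0


def over_coverage_pct(source_ids, population_ids) -> float:
    """Percentage of source units that do not belong to the NSI population."""
    source = set(source_ids)
    return percentage(len(source - set(population_ids)), len(source))


def linkable_pct(source_ids, reference_ids) -> float:
    """Percentage of source units that can be linked to a reference file."""
    source = set(source_ids)
    return percentage(len(source & set(reference_ids)), len(source))


def flagged_fraction(flags) -> float:
    """Fraction of fields carrying a flag (adjusted, imputed, edit-rule violation)."""
    flags = list(flags)
    return sum(1 for f in flags if f) / len(flags) if flags else 0.0


# Hypothetical toy data, invented for illustration only.
source_ids = ["a", "b", "c", "d", "e"]
population_ids = ["a", "b", "c", "d", "x"]  # unit "e" is outside the population
reference_ids = ["a", "b", "c"]             # units found in a reference register
imputed_flags = [False, True, False, False]  # one flag per field

print(over_coverage_pct(source_ids, population_ids))  # 20.0
print(linkable_pct(source_ids, reference_ids))        # 60.0
print(flagged_fraction(imputed_flags))                # 0.25
```

Matching on exact identifiers is the simplest possible assumption; real linkage and population checks would depend on the keys available in the source.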

  15. 4) Process hyperdimension • Focuses on the processing of the data source • by the data source maintainer • by the NSI • Not discussed here; future work • Framework was developed without specifically focusing on process-related quality aspects • Main focus was product related
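
The hierarchy described on slides 7 to 15 (hyperdimensions, each with a list of dimensions) can be written down as a plain data structure. The sketch below is one possible encoding, assuming a Python dictionary named `FRAMEWORK`; the name and the helper function are hypothetical, while the dimension names are taken from slides 9, 11 and 13. The Process hyperdimension is left empty because its dimensions are described as future work.

```python
# Hyperdimension -> list of dimensions, as listed on slides 9, 11 and 13.
FRAMEWORK = {
    "Source": [
        "Supplier", "Relevance", "Privacy and security", "Delivery", "Procedures",
    ],
    "Metadata": [
        "Clarity", "Comparability", "Unique keys",
        "Data treatment by data source maintainer",
    ],
    "Data": [
        "Over coverage", "Under coverage", "Linkability", "Unit non-response",
        "Item non-response", "Measurement", "Processing", "Precision", "Sensitivity",
    ],
    "Process": [],  # dimensions not yet defined in the presentation (future work)
}


def dimension_counts(framework: dict) -> dict:
    """Return the number of dimensions per hyperdimension."""
    return {hyper: len(dims) for hyper, dims in framework.items()}


print(dimension_counts(FRAMEWORK))
# {'Source': 5, 'Metadata': 4, 'Data': 9, 'Process': 0}
```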

  16. Scope of the framework • Developed for administrative data • Registers and other secondary data sources • It could also be used for surveys when: • the data are collected by an organization other than Statistics Netherlands • Do not use the whole framework for statistical data sources • Too much; use only parts of it

  17. Application of the framework • How to apply? • Source and Metadata hyperdimension • Checklists have been developed • Data hyperdimension • Methods of calculation have been proposed • Currently looking at a practical means to apply these • Process hyperdimension • Under investigation
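
The presentation does not show the checklists themselves. As a rough idea of how a Source checklist could be handled in a script, the sketch below stores each item with its dimension, indicator and measurement method (taken from slide 10) and reports which indicators still lack an answer. The `ChecklistItem` class, the `unanswered` helper and the example answers are hypothetical and are not part of the original framework.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChecklistItem:
    dimension: str
    indicator: str
    method: str
    answer: Optional[str] = None  # filled in by the evaluator


def unanswered(items: List[ChecklistItem]) -> List[str]:
    """Return the indicators that have no (non-blank) answer yet."""
    return [i.indicator for i in items if not (i.answer and i.answer.strip())]


# Items follow slide 10; the answers are invented for illustration.
source_checklist = [
    ChecklistItem("Supplier", "Contact", "Name, contact information",
                  answer="Example contact person"),
    ChecklistItem("Relevance", "Adm. burden",
                  "Effect of use on adm. burden of NSI (time and money)"),
    ChecklistItem("Privacy and security", "Legal provision",
                  "Check if Personal Data Protection Act applies", answer="Yes"),
    ChecklistItem("Delivery", "Costs", "Costs of use for NSI", answer="  "),
]

print(unanswered(source_checklist))  # ['Adm. burden', 'Costs']
```

Keeping the measurement method next to each indicator means the same structure could later hold Metadata items as well.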

  18. Future work • Evaluate various administrative data sources • Is the framework generally applicable to all external data sources? • Thoroughly test Source and Metadata checklists • Feedback on usability by users • Calculation methods for Data • Determine the quality indicators for various sources • Study how to efficiently evaluate Data • E.g. scripts or a computer program • Study the quality aspects in the Process hyperdimension

  19. Questions?
