ESSnet on microdata linking and data warehousing in statistical production: Metadata Quality in the Statistical Data Warehouse
What are the types of Metadata?
• Each item of metadata will normally fit into these categories:
  • Active (e.g. SQL scripts) OR Passive (e.g. a quality report)
  • Formalised (e.g. classification) OR Free-form (e.g. process documentation)
  • Structural (e.g. classification codes) OR Reference (e.g. survey methodology text description)
• All of these categories of metadata could be subject to quality measurement.
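One way to read the three category pairs is as independent axes, with every metadata item taking one value on each. The Python sketch below illustrates that reading only. The class and field names (`MetadataItem`, `Activity`, `Form`, `Role`) are hypothetical, and where the slide gives an example for just one axis, the values chosen for the other axes are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Activity(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"


class Form(Enum):
    FORMALISED = "formalised"
    FREE_FORM = "free-form"


class Role(Enum):
    STRUCTURAL = "structural"
    REFERENCE = "reference"


@dataclass(frozen=True)
class MetadataItem:
    """A metadata item tagged with one value on each of the three axes."""
    name: str
    activity: Activity
    form: Form
    role: Role

    def categories(self) -> tuple:
        # Return the three category labels in a fixed order.
        return (self.activity.value, self.form.value, self.role.value)


# Illustrative items. The slide states that classification codes are
# structural and that a classification is formalised; treating them as
# passive is an assumption made for this sketch.
classification_codes = MetadataItem(
    "classification codes", Activity.PASSIVE, Form.FORMALISED, Role.STRUCTURAL
)

# The slide states that a survey methodology text description is reference
# metadata; passive and free-form are assumptions made for this sketch.
methodology_text = MetadataItem(
    "survey methodology text description",
    Activity.PASSIVE, Form.FREE_FORM, Role.REFERENCE,
)

print(classification_codes.categories())
print(methodology_text.categories())
```

Keeping the axes as separate fields means a quality check can be targeted at, say, all formalised structural items without re-deriving the category from the item's name.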
What is Quality?
• General definition: ‘fitness for use, or purpose’.
• ISO 9000:2005 definition: the ‘degree to which a set of inherent characteristics fulfils requirements’.
International Standard for Metadata
• The ISO 11179 standard has the data element as the fundamental concept within the context of a Metadata Registry (MDR).
• The purpose of the MDR is to maintain a semantically precise structure of data elements.
• ISO 11179 states that the main purposes of monitoring metadata quality are:
  • Monitoring adherence to rules for providing metadata for each data item
  • Monitoring adherence to conventions for forming definitions, creating names, and performing classification
  • Determining whether an administered item still has relevance
  • Determining the similarity of related administered items and harmonizing their differences
  • Determining whether it is possible to ever get higher quality metadata for some administered items
• But how do we measure quality?
Measurement of Quality
• Many quality frameworks exist, from a variety of organisations around the world, and they are usually quite similar.
• The ESS Quality Framework generally forms the basis of most member states’ individual quality frameworks.
• Quality frameworks use the term ‘Dimensions’ to represent the quality characteristics.
• Dimensions for the ESS framework:
  • Relevance
  • Accuracy
  • Timeliness and Punctuality
  • Accessibility and Clarity
  • Coherence
  • Comparability
• However, these frameworks generally apply to statistical outputs from a data perspective, rather than to the metadata within the system.
Quality Dimensions for Metadata in the SDWH
• Process to establish quality dimensions for metadata:
  • The quality dimensions in use for outputs were examined to see whether any could be appropriately applied to the metadata in the different layers of the Statistical Data Warehouse.
  • A list was compiled based on this examination:
Quality Dimensions for Metadata in the SDWH
• Relevance – the degree to which statistical metadata meet current and potential user needs.
• Accuracy – the degree of closeness of descriptive metadata to the true value of the metadata.
• Accessibility – a measure of the ease with which users are able to access metadata in the SDWH.
• Comparability – the degree to which metadata can be compared over time and domain.
• Coherence – the degree to which statistical metadata enables the bringing together of statistical information from different sources within a broad analytical framework and over time.
• Uniqueness – the degree to which a metadata item can be uniquely identified, named and defined.
• Stability – the degree to which metadata remains stable over time, where appropriate.
• Completeness – the degree to which metadata items are present for statistical data.
• Interpretability – a measure of the availability of the supplementary information necessary to interpret and use the metadata appropriately.
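Some of these dimensions lend themselves to simple numeric indicators. The Python sketch below shows one possible way to score Completeness and Uniqueness over a small set of metadata records. The slides do not define any formulas, so the ratio definitions, the record layout, the `REQUIRED_FIELDS` list and the sample values are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical set of fields every metadata record is expected to carry.
REQUIRED_FIELDS = ("name", "definition", "classification")


def completeness(items, required=REQUIRED_FIELDS):
    """Share of required fields that are filled, across all items."""
    total = len(items) * len(required)
    if total == 0:
        # Assumption: with nothing to check, nothing is missing.
        return 1.0
    filled = sum(1 for item in items for field in required if item.get(field))
    return filled / total


def uniqueness(items, key="name"):
    """Share of items whose key value occurs exactly once."""
    if not items:
        # Assumption: an empty set contains no duplicates.
        return 1.0
    counts = Counter(item.get(key) for item in items)
    unique = sum(1 for item in items if counts[item.get(key)] == 1)
    return unique / len(items)


# Made-up records: one has an empty definition, one lacks a classification,
# and two share the same name.
items = [
    {"name": "turnover", "definition": "Total sales", "classification": "C1"},
    {"name": "employees", "definition": "", "classification": "C2"},
    {"name": "turnover", "definition": "Net sales", "classification": None},
]

print(f"completeness: {completeness(items):.2f}")
print(f"uniqueness:   {uniqueness(items):.2f}")
```

Dimensions such as Relevance or Interpretability depend on user judgement and would likely need survey-style or review-based assessment rather than a ratio of this kind.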
Work session – Metadata Quality Dimensions
• What we would like you to do next:
  • In your groups, examine and discuss the Quality Dimensions list in the context of the following questions:
    • Are these dimensions appropriate for metadata quality in the Statistical Data Warehouse?
    • Are any dimensions not really applicable in this context?
    • Have we missed any dimensions which are relevant to the quality of metadata in the Statistical Data Warehouse?
• After lunch we will discuss feedback from each of the groups.