
Meta-data Resources in Europe: within NSIs and from Dosis Projects

Explore the approaches and structures of meta-data in Europe, focusing on statistical systems, data warehouses, and the process-oriented model. Learn about the OECD and IMF templates, data quality criteria, and classification server requirements.





Presentation Transcript


  1. Meta-data Resources in Europe: within NSIs and from Dosis Projects. Wilfried Grossmann, Department of Statistics and Decision Support Systems, University of Vienna

  2. Contents • Introduction • Contents of Meta-data • IT Structures for Meta-data • Processing Meta-data • Conclusions Metadata Resources in Europe

  3. Introduction: Continuing hot topics in the meta-data discussion • Content-orientation versus IT-orientation: there is a lack of communication between these two groups

  4. Introduction • Meta-data providers versus meta-data users: who provides which type of information for whom?

  5. Contents of Meta-data: What kind of objects should be documented? • Basic statistical structures • Variables • Values • Data sets. Beyond these: • Statistical output • Statistical systems • Statistical processing

  6. Contents of Meta-data: Approaches towards meta-data content • The template oriented approach • The data warehouse approach • The process oriented approach

  7. Contents of Meta-data: The template oriented approach. Templates defined by a number of working groups • For micro data and data sets: DDI, Dublin Core • For (economic) macrodata: OECD, IMF, ECE (Internet)

  8. Contents of Meta-data: The template oriented approach. The OECD Template: • Concepts and sources • Data collection • Data manipulation by national source • Data quality • Data transmission • International standards • Data storage and manipulation by OECD • Output preparation and delivery by OECD

  9. Contents of Meta-data: The template oriented approach. The IMF Template: • Coverage • Periodicity • Timeliness • Quality of disseminated data • Integrity of disseminated data • Access by the public
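
A template in this sense can be read as a fixed list of headings that every documentation record has to fill in. The minimal Python sketch below takes the IMF headings from the slide above as the required fields and reports which ones a record leaves empty. The function name `missing_sections` and the sample record are hypothetical, made up for illustration; they are not part of any IMF or OECD specification.

```python
# Headings of the IMF template as listed on the slide.
IMF_TEMPLATE = [
    "Coverage",
    "Periodicity",
    "Timeliness",
    "Quality of disseminated data",
    "Integrity of disseminated data",
    "Access by the public",
]


def missing_sections(record, template=IMF_TEMPLATE):
    """Return the template headings that are absent or blank in `record`."""
    return [h for h in template if not str(record.get(h, "")).strip()]


# Hypothetical, partially filled documentation record.
record = {
    "Coverage": "All resident households",
    "Periodicity": "Monthly",
    "Timeliness": "",
}

print(missing_sections(record))
```

The same check would work for the OECD headings by passing a different list as `template`.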

  10. Contents of Meta-data: The template oriented approach. Although the OECD approach seems more reliable from a statistical point of view, the IMF template is currently favoured by international organisations (EUROSTAT)

  11. Contents of Meta-data: The warehouse approach • Integration of the data inside the NSIs in a data warehouse • Output and dissemination as first step • Meta-data are oriented towards the needs of the data warehouse

  12. Contents of Meta-data: The warehouse approach. Projects in this direction in many NSIs. Best documentation: Australian Office • Definitional meta-data • Procedural meta-data • Operational meta-data • Systems meta-data • Datasets meta-data

  13. Contents of Meta-data: The process oriented approach • Combines statistical and IT considerations • Statistical data are considered not as final products but as the result of a process chain • More detailed consideration of statistical terminology

  14. Contents of Meta-data: The process oriented approach. Starting point was the SCB-DOC model (Rosen and Sundgren, 1991) • A sequence of templates accompanying the statistical production process • Ongoing activities at Statistics Sweden • A number of NSIs want to adopt the model

  15. Contents of Meta-data: The process oriented approach. The IDARESA model: object oriented representation based on SCB-DOC, with emphasis on possible semi-automatic processing

  16. Contents of Meta-data: The process oriented approach. The US Bureau of the Census model (Gillman, Appel et al., running project): a statistical system is defined as an identifiable process .... to produce one or more deliverables

  17. Contents of Meta-data: Summary. The process oriented approach seems favourable for a number of reasons. Two examples: • Classification servers • Data quality

  18. Contents of Meta-data: Summary: Classification server. A classification server should • Support unified use of terminology inside NSIs or international organisations • Support harmonisation between (international) standard classifications and locally defined (adapted) classifications

  19. Contents of Meta-data: Summary: Classification server. Requirements for a classification server • A data base supporting easy and user friendly manipulation of hierarchy trees • A mapping tool supporting the definition of correspondence tables between classifications • A management strategy for implementation
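
The first two requirements can be made concrete with a small Python sketch: a hierarchy tree stored as child-to-parent links, and a correspondence table that maps locally defined codes onto a standard classification. All codes, names and functions below are hypothetical and chosen only for illustration; real classification servers are far richer than this.

```python
# Hypothetical two-level standard classification, stored as child -> parent.
PARENT = {
    "A": None,   # top-level node
    "A1": "A",
    "A2": "A",
    "B": None,
    "B1": "B",
}

# Hypothetical correspondence table: local code -> standard code.
LOCAL_TO_STANDARD = {
    "L-10": "A1",
    "L-11": "A1",
    "L-20": "A2",
    "L-30": "B1",
}


def ancestors(code, parent=PARENT):
    """Return the path from `code` up to its top-level node."""
    path = [code]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def to_standard(local_code, top_level=False):
    """Map a local code to the standard classification.

    With top_level=True the result is rolled up to the top of the tree.
    """
    std = LOCAL_TO_STANDARD[local_code]
    return ancestors(std)[-1] if top_level else std


def unmapped(local_codes):
    """List local codes that the correspondence table does not cover."""
    return [c for c in local_codes if c not in LOCAL_TO_STANDARD]


print(to_standard("L-30", top_level=True))
print(unmapped(["L-10", "L-99"]))
```

A check such as `unmapped` is the kind of routine support a mapping tool would need, since gaps in a correspondence table are easy to miss by hand.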

  20. Contents of Meta-data: Summary: Classification server. Up to now there are only a few successful implementations, and only of partial solutions: EUROSTAT (SIMONE-Server), New Zealand

  21. Contents of Meta-data: Summary: Data Quality • Criteria for quality of statistics are well known (relevance, accuracy, timeliness, accessibility, comparability, coherence, completeness) • The problem: • Achieve quality in the production process • Document quality by appropriate meta-data
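
One way to picture "documenting quality by appropriate meta-data" is a record with one slot per criterion. The Python sketch below uses the seven criteria named on the slide; the `QualityReport` class, its methods and the sample note are assumptions made for illustration and do not describe any NSI's actual system.

```python
from dataclasses import dataclass, field

# The seven quality criteria named on the slide.
QUALITY_CRITERIA = (
    "relevance",
    "accuracy",
    "timeliness",
    "accessibility",
    "comparability",
    "coherence",
    "completeness",
)


@dataclass
class QualityReport:
    """Hypothetical quality meta-data attached to one data set."""

    dataset: str
    notes: dict = field(default_factory=dict)

    def document(self, criterion, note):
        """Record a quality note; reject criteria outside the agreed list."""
        if criterion not in QUALITY_CRITERIA:
            raise ValueError(f"unknown quality criterion: {criterion}")
        self.notes[criterion] = note

    def undocumented(self):
        """Criteria for which no note has been recorded yet."""
        return [c for c in QUALITY_CRITERIA if c not in self.notes]


report = QualityReport("hypothetical labour force survey")
report.document("timeliness", "published 45 days after the reference period")
print(report.undocumented())
```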

  22. Contents of Meta-data: Summary: Data Quality. Experience shows that documentation quality is rather poor as soon as documentation is separated from the production process. Example of an integration project: the SIDI approach by ISTAT

  23. IT Structures for Meta-data: Internet and data warehouse offer new opportunities for • Meta-data and data repositories • Meta-data access and exchange. These lead towards a more open policy in data dissemination

  24. IT Structures for Meta-data: Meta-data repositories. Approaches towards repositories • The thesaurus approach • The template oriented approach • The data warehouse oriented approach

  25. IT Structures for Meta-data: Meta-data repositories. Example of a thesaurus oriented approach: EUROSTAT servers for concepts and definitions • Advantage: available on the Internet • Problem: navigation is not easy

  26. IT Structures for Meta-data: Meta-data repositories • Contents • Descriptions (dictionaries) • Semantic (coverage, standard classifications, coherence of information) • Administration (responsible persons) • Selection (keywords, search facilities)

  27. IT Structures for Meta-data: Meta-data repositories. Example of the template oriented approach: StatBase, supporting access to meta-data as well as data and reports • Meets the requirements of the OECD data template quite well • No direct connection between data and meta-data

  28. IT Structures for Meta-data: Meta-data repositories. Example of the warehouse oriented approach: StatLine (CBS), based on data access from multidimensional tables (cubes) • Accompanying meta-information is only in Dutch • Extraction of special meta-data items is not as easy as in StatBase

  29. IT Structures for Meta-data: Meta-data access and exchange. Ongoing work in access and exchange • New standards for access and exchange • Accessing distributed sources • Combination of information

  30. IT Structures for Meta-data: Meta-data access and exchange. Current trends in standardization • Traditional standards for data and meta-data exchange like GESMES or CLASET will probably switch to an XML platform • New standards from the Object Management Group (OMG)
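
To give a feel for what exchanging meta-data on an XML platform involves, the Python sketch below writes a tiny classification to XML and reads it back with the standard library's `xml.etree.ElementTree`. The element and attribute names (`classification`, `item`, `code`, `name`) are invented for this example and are not taken from GESMES, CLASET or any OMG standard.

```python
import xml.etree.ElementTree as ET


def classification_to_xml(name, items):
    """Serialise (code, label) pairs into a small XML document."""
    root = ET.Element("classification", {"name": name})
    for code, label in items:
        item = ET.SubElement(root, "item", {"code": code})
        item.text = label
    return ET.tostring(root, encoding="unicode")


def xml_to_classification(text):
    """Parse the XML back into (name, [(code, label), ...])."""
    root = ET.fromstring(text)
    items = [(el.get("code"), el.text) for el in root.findall("item")]
    return root.get("name"), items


# Hypothetical two-item classification used for a round trip.
xml_text = classification_to_xml("regions", [("N", "North"), ("S", "South")])
print(xml_text)
print(xml_to_classification(xml_text))
```

The point of the round trip is that sender and receiver only need to agree on the element names, not on a shared data base.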

  31. IT Structures for Meta-data: Meta-data access and exchange. Example: MOF (Meta Object Facility) • Extensible framework for meta-data model definition • Programming interface for storage and access of meta-data • Integration facilities across domains. But note: this is a general approach for warehouses, not necessarily tied to statistics

  32. IT Structures for Meta-data: Meta-data access and exchange. Example of accessing and processing distributed sources: ADDSIA (accessing and processing distributed sources for analysis purposes) • Minimum requirements for standardisation in advance • Orientation towards statistical problems

  33. Processing Meta-data: Goal • Data and meta-data are processed together: <OldDataSets, OldMetadataSets> → <NewData, NewMetadata>

  34. Processing Meta-data: Advantages • Reduction of documentation effort • More consistency in meta-data. Requirements • Software tools supporting this view • Operational models for meta-data

  35. Processing Meta-data: Up to now there are only prototypes, each emphasising a different aspect of processing • The planning approach • The throughput approach • The transformation approach

  36. Processing Meta-data: The planning approach • Develop software tools (workbench) for setting up meta-data documentation. BRIDGE/IMIM: • A desktop for planning surveys and statistical production • Meta-data generated in the planning phase are managed by the system • No data are processed

  37. Processing Meta-data: The planning approach • Improvement and adaptation of meta-data models for new tasks like quality and use of administrative sources. SIDI (Statistics Italy): • Integration of quality in the statistical production process • Standardization of the production process

  38. Processing Meta-data: The throughput approach. Use as much meta-data as possible from the old meta-data to obtain the new meta-data. CBS (ongoing work): • Use BLAISE meta-data as input • Produce StatLine meta-data as output

  39. Processing Meta-data: The transformation approach. Define meta-data algorithms for all types of data algorithms • Throughput meta-data • Modified meta-data • New meta-data • Meta-data summarization
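
The four kinds of meta-data algorithm can be illustrated on a single aggregation step in which data and meta-data are transformed by the same function. The Python sketch below is an interpretation, not the method of any project named in this talk: the field names, the sample survey and the reading of "summarization" as per-group record counts are all assumptions.

```python
from collections import defaultdict


def aggregate(rows, meta, by, measure):
    """Sum `measure` per value of `by` and derive the meta-data alongside."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[by]] += row[measure]
        counts[row[by]] += 1
    new_rows = [{by: key, measure: totals[key]} for key in sorted(totals)]

    new_meta = {
        # Throughput: copied unchanged from the old meta-data.
        "source": meta["source"],
        "unit": meta["unit"],
        # Modified: old entries rewritten to fit the new data set.
        "variables": [by, measure],
        "statistical_unit": by,
        # New: exists only because of this operation.
        "operation": f"sum of {measure} by {by}",
        # Summarization (assumed reading): detail condensed per group.
        "records_per_group": dict(sorted(counts.items())),
    }
    return new_rows, new_meta


# Hypothetical micro data and its meta-data.
old_rows = [
    {"region": "N", "person": 1, "income": 10.0},
    {"region": "N", "person": 2, "income": 20.0},
    {"region": "S", "person": 3, "income": 5.0},
]
old_meta = {
    "source": "hypothetical income survey",
    "unit": "EUR",
    "variables": ["region", "person", "income"],
    "statistical_unit": "person",
}

new_rows, new_meta = aggregate(old_rows, old_meta, by="region", measure="income")
print(new_rows)
print(new_meta)
```

Because the meta-data are produced inside the same call as the data, they cannot drift apart from what was actually computed, which is the consistency advantage claimed for processing both together.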

  40. Processing Meta-data: The transformation approach. IDARESA project: • Meta-data algorithms for elementary data base operations. ISMIS: • Identification of added value in meta-data (new meta-data) • Pursuit of the production process inside EUROSTAT

  41. Processing Meta-data: The transformation approach

  42. Conclusions: Is there progress in meta-data research and development? Yes, but it is rather slow because • There is a lack of co-ordination in research (probably to be improved by a forthcoming meta-data working group) • There is an information gap between meta-data research groups and NSIs • NSIs seem to prefer their own solutions
