Join the panel discussion on the implementation of DDI-L in statistical production and learn about the tools and applications used by statistical offices. Gain insights from experts in the field and explore the benefits of using DDI-L in data management processes.
Panellists • Dan Gilman (BLS) • BLS are an associate member of the DDI Alliance • Eric Rodriguez (INEGI) • Achim Wackerow (DDI Alliance) • Arofan Gregory (Invited Expert) • Metadata Technology, Open Data Foundation • Jeremy Iverson & Dan Smith (Invited Experts) • Algenta Technologies (developed Colectica) • Plus…all METIS participants!
Starting points • Focus • Primarily DDI-L rather than DDI-C • Implementations • Very early • Statistics New Zealand (SNZ) • Australian Bureau of Statistics (ABS) • Project / team established with DDI in scope • French National Institute for Statistics and Economic Studies (INSEE) • Considering • Many - including Statistics Sweden, Statistics Norway, Statistics Canada, ONS
Q1: To what extent is DDI really implemented within the statistical production process? • Early days for NSIs in terms of DDI-L • SNZ have used DDI-C for archiving since 2006 • Data & metadata disseminated to statistical output areas & researchers via microdata access facilities • Positive experience in this regard led to decision to pursue “all of lifecycle” rather than “end of lifecycle” approach via DDI-L • Replace & enhance existing metadata management processes • RFT (Request for Tender) process (completed 2011) included definition of business needs & strategy, testing of market etc
continued (1.2) • ABS • REEM (Remote Execution Environment for Microdata) in production use on restricted basis • Microdata described using DDI-L for input to environment • Aggregate tabulations can be returned using SDMX • Next phase of development (analytical capabilities beyond simple tabulation) is underway • Proof of Concept (PoC) stage of Metadata Registry / Repository (MRR) development implemented elements of DDI model (delivered June 2011) • Extensive mapping between DDI-L and ABS Questionnaire Development Tool
continued (1.3) • Question may also warrant reference to implementation of DDI-L to support statistical production by organisations other than NSIs • Several are further along the path than NSIs, eg • University of Michigan Survey Research Center • Several applications for different phases of life cycle • Canadian RDC Network
Q2: What are the DDI-based tools used in practice by the statistical offices (Colectica, others...)? • SNZ plan to implement Colectica next year to support documentation & data management processes • Work underway to extract content from existing systems as DDI-L (eg household survey platform) via custom development • Investigating StatTransfer application which now supports DDI-L • Interested in DDI/Blaise interoperability
continued (2.2) • ABS • REEM • Developed by ABS in partnership with vendor • MRR PoC • Developed by ABS harnessing design expertise from consultants • Evaluating Colectica • Customised utilities/applications • eg extract data and metadata for REEM, in accordance with DDI-L specification, from existing repositories
Q3: Are there applications or tools which communicate with DDI (Blaise, others...)? • Early Draft V0.2 of diagram (requires further quality assurance)
continued (3.2) • Plus • MQDS (Michigan Questionnaire Documentation System) • extract comprehensive metadata from Blaise survey instruments & render as DDI • Assorted applications internal to agencies that developed them • OpenDDI (beta) • global catalog of DDI documented surveys • Inclusion of support for DDI-L in StatTransfer is seen by other vendors as a signal of its ongoing prominence
Q4a: Is there a repository of variables and questions? • DDI Alliance has a major focus on re-use of metadata • a cornerstone for the design of DDI-L • practical work underway on business practices & processes (eg via case studies) to support re-use • Technical design of standard & availability of repositories & applications are “necessary but not sufficient” to achieve re-use in practice • Case studies will influence further technical support • Colectica Repository can be used for this role • eg, SNZ’s current reference metadata library to be replaced by Colectica • Various banks for variables, questions etc have been built by various agencies
Q4b: To what extent is a variable repository restrictive for the user? For example, if a user takes a variable name from the repository, are they obliged to use the code list associated with it in the repository? • Very modular model to support reuse • Strong support for relating different objects • eg Variable X is the same as Variable Y in terms of concept, universe and response categories but the codes used to denote the response categories are different • Applications working with DDI-L can harness this modularity, eg • User: I need a variable identical to that one except I need different codes • System: New Variable created which • reuses majority of “building blocks” for Existing Variable • has explicitly defined & recorded relationship to Existing Variable
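The User/System exchange above can be sketched in code. This is only an illustrative sketch: the `Variable` class, its fields and the `derive_with_new_codes` helper are hypothetical simplifications, not the DDI-L schema or the API of any DDI tool. It shows a new variable that reuses the concept, universe and response categories of an existing one, swaps in different codes, and records its relationship to the original.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Variable:
    """Hypothetical, heavily simplified stand-in for a DDI-L variable."""
    id: str
    concept: str                     # shared "building block"
    universe: str                    # shared "building block"
    categories: Tuple[str, ...]      # response categories (shared)
    codes: Dict[str, str]            # code value -> response category
    based_on: Optional[str] = None   # recorded link to the existing variable


def derive_with_new_codes(existing: Variable, new_id: str,
                          new_codes: Dict[str, str]) -> Variable:
    """Create a variable identical to `existing` except for its codes."""
    # The new codes must denote exactly the same response categories.
    if set(new_codes.values()) != set(existing.categories):
        raise ValueError("new codes must cover the existing categories")
    return Variable(
        id=new_id,
        concept=existing.concept,        # reused, not copied by hand
        universe=existing.universe,      # reused
        categories=existing.categories,  # reused
        codes=dict(new_codes),           # the only part that differs
        based_on=existing.id,            # explicit, recorded relationship
    )


# Illustrative data only: Variable X and a derived Variable Y.
var_x = Variable("VarX", "Sex", "All persons", ("Male", "Female"),
                 {"1": "Male", "2": "Female"})
var_y = derive_with_new_codes(var_x, "VarY", {"M": "Male", "F": "Female"})
```

The design point is that the user is not forced to adopt the repository's code list: only the codes are replaced, while the shared components and the link back to the existing variable are kept.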
Q5: What is the strategy to implement statistical standards with a view to covering the life cycle of statistical operations from an end-to-end perspective? • SNZ • implement improved metadata management in a staged manner across the end-to-end lifecycle • Full support for DDI across all of the statistical business process is not planned to be complete until at least 2020. • have focused on the documentary needs of the organisation • ensure reference metadata available to identify the studies SNZ undertake and to provide staff with key contextual information.
continued (5.2) • now expanding focus to include data-centric needs • including management of classifications and variables with DDI. • Strategy is to begin to apply DDI across a wide range of information objects across the statistical business process in order to better understand needs and the coverage of DDI.
continued (5.3) • In the next 12 months plan to start using DDI to describe Concepts, Variables, Classifications, Questions. • Ten year roadmap for metadata indicates systems implementations of classification, question and variable libraries, building upon initial implementation, to expand and enhance functionality and meet long term needs. • It is intended DDI-L will be the primary metadata standard within SNZ • Supplemented by others where required • (e.g. SDMX codelists for classifications) • Externally use a mix of DDI and SDMX depending on which is best suited to a particular use case.
continued (5.4) • ABS similar to SNZ • Aiming to achieve transformation by 2017 • Relatively greater early emphasis on “machine actionable” aspects for metadata driven processes • Planning to apply DDI and SDMX internally in accordance with “industry standard” practice (once that emerges) • Emphasis on facilitating, in practice, international collaboration & sharing regarding new methods and IT components • Currently expect GSIM (Generic Statistical Information Model) to be operationalised via DDI & SDMX in future. • ABS Transitional Model in meantime spans DDI & SDMX
Q5a: Is the strategy to implement only DDI while awaiting the results of the SDMX/DDI dialogue in progress? • No • SNZ is using SDMX as part of dissemination platform and now starting to implement DDI across the business process. • During the next 12 months will map the metadata captured throughout the statistical business process with dissemination metadata needs. • Data dissemination uses an SDMX based system (OECD.stat) • By default will map DDI to SDMX for a particular use case. • Within 18 months expect metadata flows between DDI & SDMX based systems regardless of outcome of SDMX/DDI dialogue. • ABS: Similar • Recognise a range of community benefits if agencies flow metadata between DDI & SDMX on a consistent basis
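As a rough illustration of what mapping DDI to SDMX for one use case might involve, the sketch below converts a code list held in a DDI-style structure into an SDMX-style codelist. Everything here is an assumption for illustration: the dictionary layouts, the `ddi_codelist_to_sdmx` function, the `CL_` naming prefix and the agency identifier are hypothetical and do not reproduce the actual DDI-L or SDMX schemas, nor any SNZ or ABS system.

```python
from typing import Dict, List


def ddi_codelist_to_sdmx(ddi_code_list: Dict, agency_id: str) -> Dict:
    """Map a simplified DDI-style code list to a simplified SDMX-style codelist.

    Both layouts are hypothetical simplifications used only for this sketch.
    """
    codes: List[Dict[str, str]] = [
        {"id": code["value"], "name": code["label"]}
        for code in ddi_code_list["codes"]
    ]
    return {
        "id": "CL_" + ddi_code_list["name"].upper(),  # assumed naming rule
        "agencyID": agency_id,
        "codes": codes,
    }


# Illustrative input only.
ddi_sex = {
    "name": "sex",
    "codes": [
        {"value": "1", "label": "Male"},
        {"value": "2", "label": "Female"},
    ],
}
sdmx_sex = ddi_codelist_to_sdmx(ddi_sex, agency_id="EXAMPLE_AGENCY")
```

In practice a mapping would be defined per use case and would need to handle identifiers, versions and multilingual labels, which this sketch leaves out.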
Q5b: Is the strategy to implement DDI and SDMX from the very beginning, with the objective of covering the life cycle of statistical operations end to end? • Yes • Q5c: Other strategy? • 5b is the key strategy
Q6: What are the perspectives in terms of versioning for DDI: frequency of important changes? • Good balance between • responsiveness to identified additional requirements and bug fixes, and • stability • Similar to SDMX • when there is a new release agencies do not need to upgrade unless & until they have a business driver • emphasis on testing, backwards compatibility • Advantages • Dual product line (C vs L) helps limit conflict of interest between users with simple needs & users with advanced needs • DDI 3.0, 3.1, 3.2 releases more responsive than the gap between SDMX 2.0 and 2.1 • New release framework is even more responsive
Q7: Which institutes have to deliver metadata in DDI 2? DDI 3? Which institutes have to deliver metadata to data archives working with DDI 2? DDI 3? • SNZ does not have any requirement to disseminate data and metadata in DDI • Aim to encourage other government agencies to use DDI-L where possible to describe their data. • DDI Usage Map • SDMX/DDI combined usage map
Q8: Is there an implementation of metadata using Blaise and DDI? • SNZ • No use of Blaise and DDI together. • Intention is that Blaise instruments should be able to be generated from DDI based metadata repositories. • Cf. • Q3 • Colectica • MQDS
Q9 (new): Are there any evaluations of the DDI at statistical offices that show how well the DDI addresses their statistical metadata needs? • SNZ • Didn’t see question in advance • May have more information than ABS (eg to support RFT) • ABS • High level information written for ABS specific audience (jargon!) • Detailed examination in particular areas (eg questionnaires, microdata record relationships) positive, sometimes requiring minor extensions • Excellent idea to share general information on evaluations. • Aim for well balanced factual information • not written specifically to advocate a business case • Seek examples from other NSIs & other agencies