Statistics New Zealand’s End-to-End Metadata Life-Cycle
“Creating a New Business Model for a National Statistical Office of the 21st Century”
Gary Dunnet, Manager, Business Solutions
gary.dunnet@stats.govt.nz
BmTS Scope
• A number of standard, generic end-to-end processes for the collection, analysis and dissemination of statistical data and information
  • Includes statistical methods
  • Covers the business process life-cycle
  • Enables statisticians to focus on data quality and implement best-practice methods, with greater coordination and effective resource utilisation
• A disciplined approach to data and metadata management, using a standard information life-cycle
• An agreed enterprise-wide technical architecture
BmTS Success Criteria - Financial
• A reduction of 10–20% in the operating cost of producing a statistical output (for outputs currently operating on a separate subject-matter system) after moving to the new business model
• A reduction of 50% in the investment (of time and money) required to implement the end-to-end processes and systems for a new statistical output
Generic Business Process Model
[Figure: “From” and “To” diagrams of the process phases Need, Design/Build, Collect, Process, Analyse, Disseminate]
[Figure: component diagram. Numbered components: 1. Input Data Store; 2. Output Data Store; 3. Metadata Store; 4. Analytical Environment; 5. Information Portal; 6. Transformations; 7. Respondent Management; 8. Customer Management; 9. Reference Data Stores; 10. Workflow. Other labels on the diagram: Multi-Modal Collection, Web, CAI, E-Form, Imaging, Admin. Data, Raw Data, Clean Data, Aggregate Data, ‘UR’ Data, Summary Data, Output Channels, CURFS, INFOS, RADL, Official Statistics System & Data Archive, Statistical Process Knowledge Base]
Existing Metadata Issues
• metadata is not kept up to date
• metadata maintenance is considered a low priority
• metadata is not held in a consistent way
• relevant information is unavailable
• there is confusion about what metadata needs to be stored
• the existing metadata infrastructure is under-utilised
• the metadata needs of advanced data users are not met
• it is difficult to find information unless you have some expertise or already know it exists
• classifications and terminology are used inconsistently
• in some instances there is little information about the data: where it came from, which processes it has undergone, or even the question to which it relates
Target Metadata Principles
• metadata is centrally accessible
• metadata structure is strongly linked to data
• metadata is shared between data sets
• content structure conforms to standards
• metadata is managed end-to-end through the data life-cycle
• there is a registration process (workflow) associated with each metadata element
• metadata is captured at source, automatically
• the cost to producers is justified by the benefit to users
• metadata is considered active
• metadata is managed at as high a level as possible
• metadata is readily available and usable in the context of clients' information needs (internal or external)
• the use of some types of metadata (e.g. classifications) is tracked
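A few of these principles (central access, sharing between data sets, a registration workflow per element, and usage tracking) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `MetadataRegistry` class, the workflow states and the data set names are hypothetical and do not describe the Statistics New Zealand metadata store.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataElement:
    """One centrally held metadata element (hypothetical structure)."""
    name: str
    definition: str
    status: str = "draft"                      # registration workflow state
    used_by: set = field(default_factory=set)  # data sets sharing the element


class MetadataRegistry:
    """Hypothetical central store: register once, share across data sets."""

    # Assumed, simplified registration workflow.
    TRANSITIONS = {
        "draft": {"submitted"},
        "submitted": {"registered", "draft"},
        "registered": {"retired"},
    }

    def __init__(self):
        self._elements = {}

    def add(self, name, definition):
        if name in self._elements:
            raise ValueError(f"{name!r} already exists; reuse it instead")
        self._elements[name] = MetadataElement(name, definition)

    def advance(self, name, new_status):
        element = self._elements[name]
        if new_status not in self.TRANSITIONS.get(element.status, set()):
            raise ValueError(
                f"cannot move {name!r} from {element.status} to {new_status}"
            )
        element.status = new_status

    def status(self, name):
        return self._elements[name].status

    def link(self, name, dataset):
        """Record that a data set uses a registered element."""
        element = self._elements[name]
        if element.status != "registered":
            raise ValueError(f"{name!r} is not registered yet")
        element.used_by.add(dataset)

    def usage(self, name):
        """Which data sets share this element (usage tracking)."""
        return sorted(self._elements[name].used_by)


registry = MetadataRegistry()
registry.add("employment_status", "Whether the respondent is employed")
registry.advance("employment_status", "submitted")
registry.advance("employment_status", "registered")
registry.link("employment_status", "survey_b")
registry.link("employment_status", "survey_a")
```

Because both hypothetical surveys link to the same registered element, a change to its definition has one place to happen and a known list of affected data sets.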
Metadata: End-to-End
• Need
  • capture requirements, e.g. usage of data, quality requirements
  • access existing data element concept definitions to clarify requirements
• Design
  • capture constraints and basic dissemination plans, e.g. products
  • capture design parameters that could be used to drive automated processes, e.g. stratification
  • capture descriptive metadata about the collection, such as the methodologies used
  • reuse or create required data definitions, questions, classifications
• Build
  • capture operational metadata about the selection process, e.g. number in each stratum
  • access design metadata to drive the selection process
• Collect
  • capture metadata about the process
  • access procedural metadata about rules used to drive processes
  • capture metadata, e.g. quality metrics
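The Design and Build bullets above describe design parameters driving the selection process and operational metadata being captured as it runs. A minimal sketch of that idea follows, assuming a stratified simple random sample. The `design` dictionary, the sampling frame and the `select_sample` function are hypothetical and are not taken from the BmTS systems.

```python
import random
from collections import defaultdict

# Design metadata captured at the Design phase (hypothetical values).
design = {
    "stratify_by": "region",
    "sample_sizes": {"north": 2, "south": 1},
    "seed": 42,
}

# Hypothetical sampling frame.
frame = [
    {"id": 1, "region": "north"},
    {"id": 2, "region": "north"},
    {"id": 3, "region": "north"},
    {"id": 4, "region": "south"},
    {"id": 5, "region": "south"},
]


def select_sample(frame, design):
    """Draw a stratified simple random sample driven by design metadata."""
    rng = random.Random(design["seed"])

    # Group frame units by the stratification variable named in the metadata.
    strata = defaultdict(list)
    for unit in frame:
        strata[unit[design["stratify_by"]]].append(unit)

    sample, operational = [], {}
    for stratum, units in sorted(strata.items()):
        # Never ask for more units than the stratum holds.
        n = min(design["sample_sizes"].get(stratum, 0), len(units))
        sample.extend(rng.sample(units, n))
        # Operational metadata: what actually happened in this stratum.
        operational[stratum] = {"population": len(units), "selected": n}
    return sample, operational


sample, operational = select_sample(frame, design)
```

Changing the stratification variable or the sample sizes means editing `design`, not the selection code, and `operational` records the number in each stratum as a by-product of the run.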
Metadata: End-to-End (2)
• Process
  • capture metadata about operation of processes
  • access procedural metadata, e.g. edit parameters
  • create and/or reuse derivation definitions and imputation parameters
• Analyse
  • capture metadata, e.g. quality measures
  • access design parameters to drive estimation processes
  • capture information about quality assurance and sign-off of products
  • access definitional metadata to be used in creation of products
• Disseminate
  • capture operational metadata
  • access procedural metadata about customers
  • needed to support Search, Acquire, Analyse (incl. integrate), Report
  • capture re-use requirements, including importance of data (fitness for purpose)
  • Archive or Destruction: detail on length of data life-cycle
Metadata: End-to-End - Worked Example
Question Text: “Are you employed?”
• Need
  • Concept discussed with users
  • Check international standards
  • Assess existing collections & questions
• Design
  • Design question text, answers & methodologies
  • Align with output variables (e.g. ILO classifications)
  • Data model, supported through meta-model
  • Develop Business Process Model: process & data / metadata flows
• Build
  • Concept Library: questions, answers & methods
  • ‘Plug & Play’ methods, with parameters (metadata) the key
  • System of linkages (no hard-coding)
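The Build bullets (a concept library, ‘plug & play’ methods driven by parameters, and linkages instead of hard-coding) might be sketched as below. This is an illustration under stated assumptions: the `METHODS` registry, the `CONCEPT_LIBRARY` layout, the answer codes and the output variable name are all hypothetical, and the recode categories are not the ILO classification.

```python
# Registry of reusable methods, looked up by name (hypothetical).
METHODS = {}


def method(name):
    """Decorator that plugs a function into the method registry."""
    def register(func):
        METHODS[name] = func
        return func
    return register


@method("recode")
def recode(value, mapping, default=None):
    """Map a collected code to an output category."""
    return mapping.get(value, default)


@method("clip")
def clip(value, low, high):
    """Constrain a numeric value to the range [low, high]."""
    return max(low, min(high, value))


# Concept library entry for the worked example. Codes, labels and the
# output variable name are assumptions for illustration only.
CONCEPT_LIBRARY = {
    "employed": {
        "question_text": "Are you employed?",
        "answers": {"1": "Yes", "2": "No"},
        "output_variable": "labour_force_status",
        "method": {
            "name": "recode",
            "params": {
                "mapping": {"1": "employed", "2": "not employed"},
                "default": "unknown",
            },
        },
    },
}


def apply_concept(concept_id, raw_value):
    """Look up the method and its parameters via metadata, then run it."""
    concept = CONCEPT_LIBRARY[concept_id]
    spec = concept["method"]
    func = METHODS[spec["name"]]
    return concept["output_variable"], func(raw_value, **spec["params"])
```

Under these assumptions, `apply_concept("employed", "1")` returns `("labour_force_status", "employed")`. Swapping the method or its mapping is a metadata change. The processing code names no question, code or category.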
Metadata: End-to-End - Worked Example
Question Text: “Are you employed?”
• Collect
  • Question, answers & methods rendered to questionnaire
  • Deliver question to respondents
  • Confirm quality of concept
• Process
  • Draw questions, answers & methods from meta-store
  • Business logic drawn from ‘rules engine’
• Analyse
  • Deliver question text, answers & methods to analyst
  • Search & discover data through metadata
  • Access knowledge base (metadata)
• Disseminate
  • Deliver question text, answers & methods to user
  • Archive question text, answers & methods
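The point running through these phases is that one stored copy of the question text and answers serves the respondent, the analyst and the end user. A minimal sketch, assuming a hypothetical metadata-store entry (`QUESTION`) and two hypothetical rendering functions:

```python
# Hypothetical metadata-store entry; the id, answer codes and labels
# are assumptions for illustration.
QUESTION = {
    "id": "Q1",
    "text": "Are you employed?",
    "answers": [("1", "Yes"), ("2", "No")],
}


def render_questionnaire(q):
    """Collect phase: render the question as it appears to a respondent."""
    lines = [f"{q['id']}. {q['text']}"]
    lines += [f"  [{code}] {label}" for code, label in q["answers"]]
    return "\n".join(lines)


def render_data_dictionary(q):
    """Analyse/Disseminate phases: describe the variable to a data user."""
    codes = "; ".join(f"{code} = {label}" for code, label in q["answers"])
    return f"{q['id']}: \"{q['text']}\" ({codes})"


print(render_questionnaire(QUESTION))
print(render_data_dictionary(QUESTION))
```

Both renderings read the same entry, so the wording an analyst sees in the data dictionary cannot drift from what respondents were asked.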
Metadata: Recent Practical Experiences
• Generic data model – federated cluster design
  • Metadata the key
  • Corporately agreed dimensions
  • Data is integratable, rather than integrated
• Blaise to Input Data Environment
  • Exporting Blaise metadata
• ‘Rules Engine’
  • Based around a spreadsheet
  • Working with a workflow engine to improve it (BPM based)
• Audience Model
  • Public, professional, technical – added system
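The slide says only that the ‘Rules Engine’ is based around a spreadsheet. One way such an arrangement could work is sketched below, with a CSV string standing in for the spreadsheet. The column names, rules, thresholds and the `check` function are hypothetical, not the actual rules engine.

```python
import csv
import io
import operator

# Stand-in for a spreadsheet exported as CSV; columns and rules are
# hypothetical examples, not actual edit rules.
RULES_CSV = """variable,op,threshold,message
age,>=,15,age below assumed cut-off
hours_worked,<=,168,more hours than exist in a week
"""

# Comparison operators a rule row may name.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


def load_rules(text):
    """Read rule rows from CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def check(record, rules):
    """Return the message of every rule the record fails."""
    failures = []
    for rule in rules:
        value = record.get(rule["variable"])
        if value is None:
            continue  # variable not collected for this record
        passed = OPS[rule["op"]](float(value), float(rule["threshold"]))
        if not passed:
            failures.append(rule["message"])
    return failures


rules = load_rules(RULES_CSV)
clean = check({"age": 30, "hours_worked": 40}, rules)
flagged = check({"age": 12, "hours_worked": 200}, rules)
```

Here `clean` is an empty list and `flagged` holds both messages. Subject-matter staff could add or adjust a rule by editing a row, with no change to the checking code.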