Microdata Dissemination Architectures and Systems
Rochelle Thorne – Assistant Statistician, Technology Applications Branch, ABS
For the Meeting on the Management of Statistical Information Systems, May 2011
Presentation Outline
• Microdata Dissemination – the business problem
• Drivers and Strategy
• ABS systems and architecture
• Learnings and Challenges
1. Microdata Dissemination – The Business Problem
The Business Problem
• User demand for access to microdata is increasing rapidly
• Users want timely, automated access to data, including rich metadata
• Issues
  • Legislation
  • Privacy and confidentiality
  • Security
  • Legacy systems
  • Lack of internal standards
  • Technology choices
  • Strategic partnerships with vendors
  • Performance
  • ……
Brief History of Microdata Dissemination in the ABS
• Releasing Confidentialised Unit Record Files (CURFs) since 1985
• Remote Access Data Library (RADL) in 2003
• ABS Data Laboratory (ABSDL) in 2003
• TableBuilder in use since 2009
• Micro (administrative tool) now in production
• Remote Execution Environment for Microdata (REEM) to be released in July 2011
2. Drivers and Strategy
The Journey
User Demand
• Access to:
  • A wider range of microdata
  • More detailed unit record data
  • Linked and longitudinal datasets
  • Rich metadata
  • Real-time access to outputs
  • More flexible analytical tools
  • Automation (system-to-system interfaces)
Information Management Transformation Program (IMTP)
• Major organisational change program – outcomes include:
  • Increased quality/reliability of ABS products and services
  • Increased granularity of data
  • Increased discoverability of ABS data / information
  • Increased access to ABS products / data
  • Decreased time to market of statistical products / data
  • Increased coherence of ABS / other data sources
  • Increased levels of service to developing countries within the region
  • International collaboration to develop a statistical industry
Information Management Transformation Program (IMTP)
• Major organisational change program – outputs include:
  • Metadata infrastructure based on DDI and SDMX
  • Business Process Management System (BPMS)
  • Registries and repositories for data and metadata artefacts
  • Collaboration with other NSOs to produce common statistical infrastructure
GSIM and GSBPM
• Generic Statistical Information Model (GSIM)
• Generic Statistical Business Process Model (GSBPM)
• Key standards to “industrialise” the production of statistics
• GSBPM = the reference model for statistical business processes
• GSIM = information flows between the GSBPM components
• Common terminology and definitions
Service-Oriented Architecture (SOA)
• Reuse of well-designed components
• Loose coupling of systems, both internally and externally
• Agility to meet client demands (extensible)
• “Plug in” or “pull out” components
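The “plug in” / “pull out” idea can be illustrated with a minimal sketch. Everything named here (`ServiceRegistry`, `drop_identifiers`, `add_age_band`) is hypothetical and invented for illustration; it is not drawn from any ABS system. It only shows how callers can depend on service names rather than on concrete implementations.

```python
from typing import Callable, Dict, List

# Assumption: a "service" is any callable that takes and returns a list of records.
Record = Dict[str, object]
Service = Callable[[List[Record]], List[Record]]


class ServiceRegistry:
    """Hypothetical registry: callers refer to services by name only."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def plug_in(self, name: str, service: Service) -> None:
        # Register (or replace) a component without touching its callers.
        self._services[name] = service

    def pull_out(self, name: str) -> None:
        # Remove a component; pipelines that name it simply skip it.
        self._services.pop(name, None)

    def run(self, names: List[str], records: List[Record]) -> List[Record]:
        # Apply each named service in order, skipping any that are not registered.
        for name in names:
            service = self._services.get(name)
            if service is not None:
                records = service(records)
        return records


def drop_identifiers(records: List[Record]) -> List[Record]:
    # Illustrative component: strip a direct identifier field.
    return [{k: v for k, v in r.items() if k != "name"} for r in records]


def add_age_band(records: List[Record]) -> List[Record]:
    # Illustrative component: derive a coarse age band from "age".
    return [{**r, "age_band": "under 40" if r["age"] < 40 else "40+"} for r in records]
```

Because the pipeline is just a list of names, a component can be swapped or removed without changing the code that calls `run`, which is the loose coupling the slide refers to.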
3. ABS Systems and Architecture
RADL
• Remote job submission
• SAS, SPSS and STATA
• Clients are academics and policy researchers
• Multi-tier, multi-server application with a Notes/Domino front end and a Windows application server back end
• + Allows client access to more detailed data than CURFs on CD-ROM
• + ABS can intervene where privacy or confidentiality issues are identified
• - Limited analytical functionality allowed
• - Limited access to metadata
• - Turnaround times can be slow
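The remote job submission pattern described above, with a point at which staff can intervene before output is returned, can be sketched roughly as follows. This is not RADL's actual design: the `JobQueue` class, its method names, the status values and the `needs_review` rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Status(Enum):
    QUEUED = "queued"
    HELD_FOR_REVIEW = "held_for_review"
    RELEASED = "released"


@dataclass
class Job:
    job_id: int
    language: str  # "SAS", "SPSS" or "STATA", as listed on the slide
    script: str
    status: Status = Status.QUEUED
    output: Optional[str] = None


class JobQueue:
    """Hypothetical queue: clients submit scripts and never touch the data directly."""

    SUPPORTED = {"SAS", "SPSS", "STATA"}

    def __init__(self, needs_review: Callable[[str], bool]) -> None:
        self._jobs: List[Job] = []
        self._needs_review = needs_review  # assumed rule that flags risky output

    def submit(self, language: str, script: str) -> Job:
        if language not in self.SUPPORTED:
            raise ValueError(f"unsupported language: {language}")
        job = Job(job_id=len(self._jobs) + 1, language=language, script=script)
        self._jobs.append(job)
        return job

    def process(self, job: Job, run: Callable[[Job], str]) -> None:
        # Run the job server-side, then hold the output if the rule flags it.
        job.output = run(job)
        if self._needs_review(job.output):
            job.status = Status.HELD_FOR_REVIEW
        else:
            job.status = Status.RELEASED

    def release(self, job: Job) -> None:
        # Manual step: a reviewer clears a held job for return to the client.
        if job.status is Status.HELD_FOR_REVIEW:
            job.status = Status.RELEASED
```

A manual hold like this is one plausible reason turnaround can be slow in a batch model: the client waits for both the run and any review.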
ABSDL
• On-site facility
• Remote desktop over a segregated VLAN
• + Clients have access to richer CURFs
• + Real-time access to analysis outputs
• + ABS can intervene where privacy or confidentiality issues are identified
• - Rarely used due to accessibility and cost
REEM
• Remote, real-time analysis
• + Access to richer datasets
• + Confidentiality processes performed on outputs
• + Complex analysis services
• + Metadata discovery
• + Linked and longitudinal datasets
• + Geospatial mapping
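As a rough illustration of what “confidentiality processes performed on outputs” can mean, the sketch below applies small-cell suppression to a frequency table before it is returned. This is a generic disclosure-control technique chosen for illustration, not the method REEM uses; the function names and the threshold of 3 are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def frequency_table(values: Iterable[str]) -> Dict[str, int]:
    # Count how many unit records fall into each category.
    return dict(Counter(values))


def suppress_small_cells(
    table: Dict[str, int], threshold: int = 3
) -> Dict[str, Optional[int]]:
    # Replace any count below the (assumed) threshold with None so that
    # very small groups are not revealed in the released output.
    return {
        category: (count if count >= threshold else None)
        for category, count in table.items()
    }


# Example: the analysis runs on the unit records, but only the
# protected table leaves the system.
states = ["NSW"] * 5 + ["ACT"] * 2
released = suppress_small_cells(frequency_table(states))
```

Suppressing individual cells is not sufficient on its own when totals are also published, since a suppressed value may be recoverable by subtraction. Production systems generally combine several protections.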
4. Learnings and Challenges
Learnings and Challenges
• Legacy systems and business processes
• Metadata content
• Standards
• Confidentiality and privacy
• Integration
• International collaboration
What are other organisations doing?
• Focus on DDI and SDMX
• Documentation of principles and guidelines
• Looking to share statistical infrastructure