This presentation addresses user demand for microdata access, the challenges of legislation, privacy and security, and the ABS's strategic partnerships and technology choices. It covers the history of microdata dissemination at the ABS, the drivers for change, the Information Management Transformation Program (IMTP), the GSIM and GSBPM standards, Services Oriented Architecture (SOA), and the RADL, ABSDL and REEM systems. It concludes with learnings on legacy systems, metadata, standards, confidentiality, privacy and integration, and on international collaboration in statistical information systems.
Microdata Dissemination Architectures and Systems Rochelle Thorne – Assistant Statistician, Technology Applications Branch, ABS for Meeting on the Management of Statistical Information Systems, May 2011
Presentation Outline • Microdata Dissemination – the business problem • Drivers and Strategy • ABS systems and architecture • Learnings and Challenges
1. Microdata Dissemination – The Business Problem The Business Problem • User demand for access to microdata is increasing at a rapid rate • Timely, automated access to data, including rich metadata • Issues • Legislation • Privacy and confidentiality • Security • Legacy systems • Lack of internal standards • Technology choices • Strategic partnerships with vendors • Performance • ……
1. Microdata Dissemination – The Business Problem Brief History of Microdata Dissemination in the ABS • Releasing Confidentialised Unit Record Files (CURFs) since 1985 • Remote Access Data Library (RADL) in 2003 • ABS Data Laboratory (ABSDL) in 2003 • TableBuilder in use since 2009 • Micro (administrative tool) now in production • Remote Execution Environment for Microdata (REEM) to be released in July 2011
2. Drivers and Strategy The Journey
2. Drivers and Strategy User Demand • Access to: • A wider range of microdata • More detailed unit record data • Linked and longitudinal datasets • Rich metadata • Real-time access to outputs • More flexible analytical tools • Automation (system to system interfaces)
2. Drivers and Strategy Information Management Transformation Program (IMTP) • Major organisational change program – outcomes include: • Increased quality/reliability of ABS products and services • Increased granularity of data • Increased discoverability of ABS data / information • Increased access to ABS products / data • Decreased time to market of statistical products / data • Increased coherence of ABS / other data sources • Increased levels of service to developing countries within the region • International collaboration to develop a statistical industry
2. Drivers and Strategy Information Management Transformation Program (IMTP) • Major organisational change program – outputs include: • Metadata infrastructure based on DDI and SDMX • Business Process Management System (BPMS) • Registries and repositories for data and metadata artefacts • Collaboration with other NSOs to produce common statistical infrastructure
2. Drivers and Strategy GSIM and GSBPM • Generic Statistical Information Model (GSIM) • Generic Statistical Business Process Model (GSBPM) • Key standards to “industrialise” the production of statistics • GSBPM = the reference model for statistical business processes • GSIM = information flows between the GSBPM components • Common terminology and definitions
2. Drivers and Strategy Services Oriented Architecture (SOA) • Reuse of well designed components • Loose coupling of systems, both internally and externally • Agility to meet client demands - extensible • “Plug in” or “pull out” components
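The “plug in” or “pull out” idea above can be illustrated with a minimal Python sketch. It is not ABS code: the `AnalysisService` contract, the `ServiceRegistry` class and the two example services are hypothetical names chosen for illustration. The point is only that callers depend on a shared contract and a service name, not on a concrete implementation, so components can be added or removed without changing the caller.

```python
from typing import Protocol


class AnalysisService(Protocol):
    """Contract every pluggable component agrees to (hypothetical)."""

    def run(self, values: list[float]) -> float: ...


class MeanService:
    """Example component: arithmetic mean of the supplied values."""

    def run(self, values: list[float]) -> float:
        return sum(values) / len(values)


class MaxService:
    """Example component: largest of the supplied values."""

    def run(self, values: list[float]) -> float:
        return max(values)


class ServiceRegistry:
    """Callers look services up by name, never by concrete class."""

    def __init__(self) -> None:
        self._services: dict[str, AnalysisService] = {}

    def plug_in(self, name: str, service: AnalysisService) -> None:
        self._services[name] = service

    def pull_out(self, name: str) -> None:
        # Removing an unknown name is a no-op.
        self._services.pop(name, None)

    def call(self, name: str, values: list[float]) -> float:
        if name not in self._services:
            raise LookupError(f"no service registered as {name!r}")
        return self._services[name].run(values)


registry = ServiceRegistry()
registry.plug_in("mean", MeanService())
registry.plug_in("max", MaxService())
print(registry.call("mean", [1.0, 2.0, 3.0]))
```

In a real SOA the registry lookup would typically be a network call to a service endpoint rather than an in-process dictionary; the in-process version is used here only to keep the sketch self-contained.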
3. ABS Systems and Architecture RADL • Remote job submission • SAS, SPSS and Stata • Clients are academics and policy researchers • Multi-tier, multi-server application with a Notes/Domino front end and a Windows application server back end • + Allows client access to more detailed data than CURFs on CD-ROM • + ABS can intervene where privacy or confidentiality issues are identified • - Limited analytical functionality allowed • - Limited access to metadata • - Turnaround times can be slow
3. ABS Systems and Architecture ABSDL • On-site facility • Remote desktop over a segregated VLAN • + Clients have access to richer CURFs • + Real-time access to analysis outputs • + ABS can intervene where privacy or confidentiality issues are identified • - Rarely used due to accessibility and cost
3. ABS Systems and Architecture REEM • Remote, real-time analysis • + Access to richer datasets • + Confidentiality processes performed on outputs • + Complex analysis services • + Metadata discovery • + Linked and longitudinal datasets • + Geospatial mapping
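The slide above does not say which confidentiality processes REEM applies to outputs. As a generic illustration only, the Python sketch below uses small-cell suppression, one common output-confidentiality technique: unit records are tabulated and any cell with fewer contributors than a threshold is withheld before the table is returned. The function name `confidentialised_counts`, the threshold `MIN_CELL_COUNT = 5` and the sample records are all assumptions, not ABS rules or data.

```python
from collections import Counter

# Hypothetical threshold for illustration; not an ABS rule.
MIN_CELL_COUNT = 5


def confidentialised_counts(records, key, min_count=MIN_CELL_COUNT):
    """Tabulate records by `key`, suppressing cells below `min_count`.

    Suppressed cells are returned as None so the category stays visible
    but its count is not disclosed.
    """
    counts = Counter(record[key] for record in records)
    return {
        category: (n if n >= min_count else None)
        for category, n in counts.items()
    }


# Made-up unit records: seven in region A, two in region B.
records = [{"region": "A"}] * 7 + [{"region": "B"}] * 2
result = confidentialised_counts(records, "region")
print(result)
```

The design point is that the check runs on the output table rather than on the input microdata, which is what allows analysis to run against richer datasets while only confidentialised results leave the environment.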
4. Learnings and Challenges Learnings and Challenges • Legacy systems and business processes • Metadata Content • Standards • Confidentiality and Privacy • Integration • International Collaboration
4. Learnings and Challenges What are other organisations doing? • Focus on DDI and SDMX • Documentation of principles and guidelines • Looking to share statistical infrastructure