
Development of UK Virtual Microdata Laboratory

Development of UK Virtual Microdata Laboratory. Felix Ritchie, Shanghai, March 2010. Plan of presentation: starting principles; what we did, and the impact; new things we had to develop (security model, researcher management, SDC); what we’ve learnt.


Presentation Transcript


  1. Development of UK Virtual Microdata Laboratory • Felix Ritchie • Shanghai, March 2010

  2. Plan of presentation • Starting principles • What we did, and the impact • New things we had to develop • security model, researcher management, SDC • What we’ve learnt • what matters, what doesn’t, what we’d do differently • Future directions

  3. Starting principles • Designed by researchers for research • maximum access, limited by law • Expandable • Secure at reasonable cost • Manageable at reasonable cost • Distribute access, not data

  4. Distributed access • Why is this good? • Data always under ONS control • Live monitoring • Simpler, but safer, disclosure control • How does this work in practice? • VML accessible from all ONS computers • Access points in govt. offices in Glasgow and Belfast • Plan to roll out to more govt. offices in 2010 • VML-duplicate set up on academic network • VML set to become exception rather than default data store

  5. What we did • Central data repository and processors • Access via secured thin clients • Work space partitioned by dataset, not usage • researchers get access to dataset, not variables • No access to internet or rest of network • Same system for internal and external users

  6. What we did - outcomes • 30%-50% growth every year • Massive increase in microeconomic analysis • From almost no firm-level studies to European leaders • Keystone of ONS Administrative Data Project • Total cost ~£350,000 per year • strategy 17%, fixed ops 65%, variable ops 18% • income ~£50,000
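As a rough worked example of the cost figures on this slide, the percentage split can be turned into approximate annual amounts. This is only a sketch: the names `cost_breakdown`, `SHARES_PCT` and `net_cost` are illustrative, the inputs are the rounded figures quoted above, and treating income as a straight offset against total cost is an assumption.

```python
# Approximate annual figures quoted on the slide (rounded in the source).
TOTAL_COST_GBP = 350_000
INCOME_GBP = 50_000
SHARES_PCT = {"strategy": 17, "fixed ops": 65, "variable ops": 18}


def cost_breakdown(total, shares_pct):
    """Split a total into components using whole-number percentage shares."""
    return {name: total * pct // 100 for name, pct in shares_pct.items()}


breakdown = cost_breakdown(TOTAL_COST_GBP, SHARES_PCT)

# Assumption: income simply offsets total cost.
net_cost = TOTAL_COST_GBP - INCOME_GBP

for name, amount in breakdown.items():
    print(f"{name}: ~£{amount:,}")
print(f"net of income: ~£{net_cost:,}")
```

On these rounded inputs, fixed operations account for roughly £227,500 of the total, and income covers about one seventh of the annual cost.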

  7. New things developed (1): The VML Security Model • valid statistical purpose • trusted researchers • anonymisation of data • technical controls around data • disclosure control of results • safe projects + safe people + safe data + safe setting + safe outputs → safe use
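One way to read the security model on this slide is as a checklist in which every element must hold before use counts as safe. The sketch below assumes that strict reading; the slide only writes the elements joined by "+", so the all-or-nothing rule, the `AccessRequest` type and the `is_safe_use` function are illustrative assumptions, not a description of how the VML is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical record of the five elements named on the slide."""
    safe_project: bool   # valid statistical purpose
    safe_people: bool    # trusted researchers
    safe_data: bool      # anonymisation of data
    safe_setting: bool   # technical controls around data
    safe_outputs: bool   # disclosure control of results


def is_safe_use(request: AccessRequest) -> bool:
    """Assumed rule: use is safe only if all five elements are satisfied."""
    return all((
        request.safe_project,
        request.safe_people,
        request.safe_data,
        request.safe_setting,
        request.safe_outputs,
    ))


approved = AccessRequest(True, True, True, True, True)
unchecked_output = AccessRequest(True, True, True, True, False)

print(is_safe_use(approved))
print(is_safe_use(unchecked_output))
```

In this reading, a project with trusted researchers and a secure setting still fails if its results have not been through disclosure control.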

  8. New things developed (2): Output statistical disclosure control • ‘Standard’ SDC not appropriate • traditional rules not appropriate for research environments • SDC on data or methods pointless • Principles-based output SDC • SDC at the point of release • trained researchers • trained staff • agreement on principles and purpose • safe vs unsafe outputs, based on functional form

  9. New things developed (3): Active researcher management • Need to develop shared objectives with researchers • Principles-based SDC needs buy-in from researchers • Reduced management costs • Compulsory training • SDC • VML objectives and constraints • legal and procedural background

  10. What we’ve learnt (1): Things that matter • attitude to researchers • model of SDC • broad scale of operations • including future plans • scale of coherent networks • (for remote access) • e.g. ONS internal network, Government Secure Intranet, University Intranet, VPN?

  11. What we’ve learnt (2): Things that don’t matter • Location of servers and users • Type of users • Type of data • IT • Metadata • Specific legal/procedural framework?

  12. What we’ve learnt (3): Things we would do differently • Prepare ONS for expansion • senior buy-in • IT planning • better data management • better user management • better metadata

  13. Future directions • Expansion across the government network • Supporting academic equivalent • VML facing massive internal increase in use • Developing international standards • Better communication • wikis, FAQs, common metadata system • metadata • Not being considered • remote job systems • synthetic data

  14. Questions? Felix Ritchie felix.ritchie@ons.gsi.gov.uk Microdata Analysis and User Support maus@ons.gsi.gov.uk

  15. Old stuff – if necessary

  16. The data model (1) • ‘Spectrum’ of access points balancing • value of data • ease of use • disclosure risk • for a given level of confidentiality, maximise data use and convenience • no ‘one-size-fits-all’ solution • no absolute prohibitions • trade-off is made explicit • users determine appropriate level of access

  17. Use of confidential data: the access spectrum RDCs
