
Feel the Feed! InFuse and Dimensional Data for the UK Census and Beyond

Justin Hayes, ESRC Census Dissemination Unit, Mimas. Census 2011: Impact and Potential: Exploring the Research Potential of the 2011 Census, The University of Manchester, 8 July 2011.


Presentation Transcript


  1. Feel the Feed! InFuse and Dimensional Data for the UK Census and Beyond • Justin Hayes, ESRC Census Dissemination Unit, Mimas • Census 2011: Impact and Potential: Exploring the Research Potential of the 2011 Census • The University of Manchester, 8 July 2011

  2. Overview • Delivering the research potential • Straightening out the 2001 Census • Collaboration with ONS for 2011 • Benefits all round • What’s next? • Visioning a future data explorer • How to get on board • InFuse demonstration

  3. Key Stakeholder Research • Data producers/providers • Data intermediaries/developers • End users

  4. Delivering the research potential • Creating high quality content involves enormous effort and expense • Delivery is the last 100 yards of the census marathon • Potential remains just potential until the census is used, transforming it into impact

  5. 46,145 yards • Image credits: Wolfgang Kumm / European Pressphoto Agency

  6. 3% of 2001 Census budget on dissemination

  7. Delivery requirements • Understand • Find • Use

  8. Who to deliver to? • Everyone! • Censuses a key national resource • Make use easier for all to deliver best value • Encourage mass innovation • The coolest thing to do with your data will be thought of by someone else (Rufus Pollock) • Design with secondary use as a primary aim

  9. Barriers to use of 2001 Census • Fragmented data • Inconsistent structures • Unnecessary complexity • Poor integration of metadata/meaning • Confusing disclosure control • Difficult to deliver • Difficult to understand, find, use

  10. Age Banding in 2001 • 99 age bandings • 76 unique to a single table

  11. 223 Separate Age Categories
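
The counts on slides 10 and 11 are the kind of figures an audit of table structures produces. A minimal sketch of such an audit follows, assuming each table's age banding is recorded as a tuple of (low, high) ranges. The table names and ranges are invented toy data, not the actual 2001 Census tables, and `summarise_bandings` is a hypothetical helper.

```python
from collections import Counter

# Invented toy data: each table carries its own banding, a tuple of (low, high) ages.
table_bandings = {
    "table_a": ((0, 15), (16, 64), (65, 120)),
    "table_b": ((0, 15), (16, 64), (65, 120)),
    "table_c": ((0, 4), (5, 15), (16, 24), (25, 64), (65, 120)),
}


def summarise_bandings(bandings_by_table):
    """Return (distinct bandings, bandings used by one table only, distinct categories)."""
    # Count how many tables use each distinct banding.
    banding_use = Counter(bandings_by_table.values())
    distinct_bandings = len(banding_use)
    single_table_bandings = sum(1 for n in banding_use.values() if n == 1)
    # A category is one (low, high) range; collect them across all bandings.
    distinct_categories = {cat for banding in banding_use for cat in banding}
    return distinct_bandings, single_table_bandings, len(distinct_categories)


summary = summarise_bandings(table_bandings)
print(summary)
```

Run against the real table definitions, the same three numbers would correspond to the bandings, single-table bandings, and separate categories quoted on the slides.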

  12. Straightening out the 2001 Census • Born of nine years of frustration followed by three years of hard work • Logical dimensional model based on SDMX Open Standard • Dissect and rationalise structures of original dataset to create new library • Integrate data and metadata using new structures
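
To make the idea of a dimensional model concrete, here is a minimal sketch in which every observation is a single value identified by one code per dimension, so any slice can be selected without reference to a predefined table. The dimension names, codes, and counts are invented for illustration. `Observation`, `make_obs`, and `select` are hypothetical names, not part of InFuse or SDMX.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    dimensions: tuple  # sorted (dimension_id, code) pairs
    value: int


def make_obs(value, **dims):
    """Build an observation from a count and one code per dimension."""
    return Observation(tuple(sorted(dims.items())), value)


def select(observations, **wanted):
    """Return observations whose dimension codes match every requested code."""
    return [
        obs for obs in observations
        if all(dict(obs.dimensions).get(dim) == code for dim, code in wanted.items())
    ]


# Invented toy data: counts of people by area, age band, and sex.
observations = [
    make_obs(120, area="A1", age="0-15", sex="F"),
    make_obs(110, area="A1", age="0-15", sex="M"),
    make_obs(300, area="A1", age="16-64", sex="F"),
    make_obs(90, area="B2", age="0-15", sex="F"),
]

# Slice across the whole dataset rather than looking up a fixed table.
children_a1 = sum(obs.value for obs in select(observations, area="A1", age="0-15"))
print(children_a1)
```

Because metadata such as code labels can hang off the same dimension and code identifiers, data and meaning stay integrated in one structure.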

  13. Delivery via a 2001 Census Data Feed • Structured, dimensional dataset • Logical model based on SDMX Open Standard • Dataset Description • SDMX and RDF Open Standards • Communication and transfer • RESTful Web service with API(s) • Publication for internal and external (soon) use • Suite of appropriate operations
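
As a rough illustration of what a request to a RESTful data feed might look like, the sketch below builds a query URL from dimension selections. Everything specific here is an assumption: the base URL, the `/data/` path, the dimension order, and the dot-and-plus key syntax (loosely in the style of SDMX REST data queries) are hypothetical and do not describe the actual 2001 Census Data Feed API.

```python
from urllib.parse import quote

BASE_URL = "https://example.org/census-feed"  # hypothetical endpoint
DIMENSION_ORDER = ("area", "age", "sex")      # hypothetical dimension order


def build_data_url(dataset_id, selections):
    """Build a data query URL; an empty key part means 'all codes' for that dimension."""
    parts = []
    for dim in DIMENSION_ORDER:
        codes = selections.get(dim, [])
        # Multiple codes for one dimension are joined with '+'.
        parts.append("+".join(quote(code, safe="") for code in codes))
    key = ".".join(parts)  # one key part per dimension, in a fixed order
    return f"{BASE_URL}/data/{quote(dataset_id, safe='')}/{key}"


url = build_data_url("CENSUS_2001", {"area": ["A1", "B2"], "age": ["0-15"]})
print(url)
```

A client application would fetch such a URL and parse the structured response, which is what allows generic interfaces to operate on the entire dataset.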

  14. [Diagram] Dataset Descriptions • Web Service & APIs • Apps & Interfaces • Developer Users • End Users

  15. The InFuse interface • http://infuse.mimas.ac.uk • In service from beginning of May • Iterative design approach driven by user requirements • Table-free, lightweight, generic, modular • Operations on entire dataset • Currently academic use only

  16. Collaboration with ONS for 2011 • 2001 Data Feed as feasibility study • Co-funding from ESRC and ONS to facilitate knowledge exchange • Assist with development of ONS 2011 API • Test data • Interface development • Richer dataset • Linked Data and development of 2001-2011 comparability

  17. Benefits: Data Producers/Providers • Data Production • Management, control and authority • Integration of metadata/paradata(?) • Integration with other datasets • Exploit mass innovation • Contact with user communities • Achieve strategic priorities • More efficient, effective and cheaper!

  18. Benefits: Intermediaries/Developers • Lots of nicely structured and described data to mash • Easy automated/machine-to-machine operation • Generic, re-usable applications • Rapid development

  19. Benefits: End Users • Easier to understand, find, use • Purpose-specific, user-centred interfaces • More time for bigger, better, faster research

  20. What’s next? • More and better data • Adoption of data feed approaches • Linked Data • Wider integration of datasets • Definitional • Geographical (GeoConvert web service) • Essential for B2011 in whatever form • Intelligent interfaces • Unforeseen innovation!

  21. How to get on board: Producers/Providers • Produce structured data to Open Standards • Publish via APIs • Design for secondary use • Collaborate to promote and develop Open Standards • Cultivate developer interest

  22. How to get on board: Intermediaries/Developers • Get tooled up! • Use Open Standards • Re-use and contribute code • Join and develop communities • Innovate to satisfy end user requirements • Lobby data producers/providers

  23. How to get on board: End Users • Understand, find, use • Generate more impacts • Make your requirements known • Tinker
