Structural analysis of the aggregate outputs from the 2011 Census to develop alternative integrated multidimensional conceptual models of data and geographies for easier management and dissemination. Justin Hayes UK Data Service. What the census tells us workshop Manchester 23 July 2014.
Making it easier for everyone to find, understand and use the bits of the census they’re interested in
Overview • Traditional and integrated approaches • Work with 2011 outputs • Integrated descriptive model • Integrated model of geographies • Ongoing work with data producers
Our job • Find • Understand • Use • Automated systems with online interfaces • Online and interactive support • Main services now freely available to everyone
Traditional tabular aggregate outputs • Outputs conceived and specified as tables • Details of individual tables defined through consultation with different user groups • Per-table categorisations and descriptions • Complex table universes and footnotes • Visual layout an important consideration • Extended metadata unattached • Complex process! • Number of tables limited by resource available • Numerous inconsistencies between tables • Effectively separate datasets
Traditional tabular dissemination • No global search • Data fragmented in complex and inconsistent tables • Limited table-level search • Difficult to find specific categories • ‘No data’ difficult to prove • Poor understanding • Universe and footnote information often overlooked • Extended metadata unattached • Heavyweight and inflexible applications • Dumb data • All intelligence built in application • Applications tied to specific data
Integrated aggregate outputs • Deconstruct tables • Assemble and rationalise all variables and categories in tables • Variable-ise table universes and footnotes • Create a standardised library of variables to describe all data • Define integrated models of characteristics (What?) and geographies (Where?) • Enables global operations/queries • Framework for attachment of extended metadata • Facilitates description and transfer using standards • Provide access via Web service API • Data becomes self-describing
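The deconstruction step above can be pictured with a minimal sketch. It assumes a toy input format in which each table carries a universe and a list of labelled cells; the table names, labels, counts and the `deconstruct` function are all made up for illustration and are not the InFuse implementation. The table universe is turned into an ordinary variable ("variable-ised"), so every value ends up described only by variable/category pairs.

```python
from collections import defaultdict

# Hypothetical toy tables with invented counts, for illustration only.
TABLES = {
    "TABLE_A": {
        "universe": "All usual residents",
        "cells": [
            ({"Sex": "Males", "Age": "0 to 15"}, 120),
            ({"Sex": "Females", "Age": "0 to 15"}, 115),
        ],
    },
    "TABLE_B": {
        "universe": "All usual residents aged 16 to 74",
        "cells": [
            ({"Sex": "Males", "Economic activity": "Employed"}, 300),
            ({"Sex": "Females", "Economic activity": "Employed"}, 280),
        ],
    },
}


def deconstruct(tables):
    """Split tables into a shared variable library and table-free observations."""
    library = defaultdict(set)   # variable name -> set of categories
    observations = []            # (frozenset of (variable, category) pairs, value)
    for table in tables.values():
        for labels, value in table["cells"]:
            described = dict(labels)
            # Variable-ise the universe: store it as just another variable.
            described["Universe"] = table["universe"]
            for variable, category in described.items():
                library[variable].add(category)
            observations.append((frozenset(described.items()), value))
    return dict(library), observations


library, observations = deconstruct(TABLES)
```

After this step the two tables share one "Sex" variable in the library, and each value carries its own full description rather than depending on the table it came from.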
InFuse • Open access • Aggregate data from 2011 census across the UK • Makes data easy to • Find • Understand • Use • Global query using variable combinations • No tables! • “No data” fast!
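A global query by variable combination, and the reason "no data" can be answered quickly, can be sketched as a lookup against an index of the combinations that exist. This is only an illustration under assumed names: the `AVAILABLE` index, its contents, and the two functions are hypothetical and do not describe the InFuse API.

```python
# Hypothetical index of the variable combinations for which data exist.
AVAILABLE = {
    frozenset({"Age", "Sex"}),
    frozenset({"Sex", "Economic activity"}),
    frozenset({"Age", "Sex", "General health"}),
}


def find_combinations(wanted):
    """Return every available combination that contains all wanted variables."""
    wanted = frozenset(wanted)
    return [combo for combo in AVAILABLE if wanted <= combo]


def has_data(wanted):
    """True if at least one available combination covers the wanted variables."""
    return bool(find_combinations(wanted))
```

With an index like this, a request for "Age" by "Economic activity" can be rejected at once because no stored combination contains both, without searching through individual tables.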
Under the bonnet • Integrated multidimensional descriptive model • Integrated model of geographies • The really important bits!
InFuse 2011 release 2: Raw data • England and Wales Local and Detailed Characteristics to output area level • UK harmonised data to local authority level • 422 tables, mainly multivariate • 31 geography types • 241,334 areas • 11,311 files • 15 GB volume
Integrated descriptive model • Processing of raw metadata • Deconstruction, rationalisation and re-integration • Library of variables and categories • Re-insertion of data values • Attachment of associated metadata • Global description using standards • Global operations via Web service API • Data is self-describing • Enables lightweight, generic applications
Benefits of this work • Data producers • Efficient data management • Flexible output production • Best value • Application developers • Easy access to self-describing web services • Lightweight generic applications • End users • Quick and easy global search • Context along with data
InFuse 2011 release 2: Processed data • 97 variables • 2,501 categories • 281 variable combinations • 140 thousand category combinations • 4.6 billion values • A 460 km high stack of sticky notes! • Anticipating approximately 10 billion values in all
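The sticky-note comparison can be checked with a little arithmetic. The note thickness below is an assumption: 0.1 mm (100 micrometres) is the value implied by the slide's own figures, not one stated in the source.

```python
# Figures from the slide.
VALUES_NOW = 4_600_000_000        # 4.6 billion values
VALUES_EXPECTED = 10_000_000_000  # approximately 10 billion anticipated

# Assumption: one value per sticky note, each note 100 micrometres (0.1 mm) thick.
NOTE_THICKNESS_UM = 100
UM_PER_KM = 1_000_000_000


def stack_height_km(notes, thickness_um=NOTE_THICKNESS_UM):
    """Height in kilometres of a stack of `notes` sticky notes."""
    return notes * thickness_um / UM_PER_KM


height_now = stack_height_km(VALUES_NOW)            # 460.0 km
height_expected = stack_height_km(VALUES_EXPECTED)  # 1000.0 km
```

Under the same assumption, the anticipated 10 billion values would make a stack about 1,000 km high.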
Integrated model of UK census geographies • Assembly of raw information on geographies • 31 geography types • 241,334 areas (anticipating ~2 million including postcodes) • Direct and indirect hierarchies • Simplified presentational model • 11 composite geography layers • Simplification of merged geographies in England and Wales • Calculation of ‘missing’ data • Linkage between descriptive and geography models • Partial availability of data for geographies and extents
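One possible reading of "direct and indirect hierarchies" and "calculation of 'missing' data" is sketched below: direct links are stored parent/child pairs, indirect relationships are found by following the chain, and a value missing for a larger area is summed from its children. The area codes, counts and function names are hypothetical, and real outputs may not sum exactly in this way (for example where disclosure control has been applied).

```python
from collections import defaultdict

# Hypothetical direct links: child area -> parent area.
PARENT = {
    "OA_1": "LSOA_1",
    "OA_2": "LSOA_1",
    "LSOA_1": "MSOA_1",
    "MSOA_1": "LA_1",
}

# Invert the links so each parent lists its direct children.
CHILDREN = defaultdict(list)
for child, parent in PARENT.items():
    CHILDREN[parent].append(child)


def ancestors(area):
    """Follow direct links upwards to list the direct and indirect parents."""
    chain = []
    while area in PARENT:
        area = PARENT[area]
        chain.append(area)
    return chain


def value_of(area, values):
    """Return the published value, or the sum of child values if it is missing."""
    if area in values:
        return values[area]
    kids = CHILDREN.get(area, [])
    if not kids:
        return None  # nothing published and nothing to aggregate from
    parts = [value_of(kid, values) for kid in kids]
    if any(part is None for part in parts):
        return None  # incomplete coverage, so no total is calculated
    return sum(parts)


# Invented counts published only at the lowest level.
published = {"OA_1": 10, "OA_2": 15}
```

Here `ancestors("OA_1")` walks up through the LSOA, MSOA and local authority codes, and `value_of("LA_1", published)` fills the missing local authority figure from the two output areas.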
Admin and statistical geography layers infuse.mimas.ac.uk/help/definitions/2011geographies
What’s next for InFuse • Interface improvements • Geography-first option • Fine-tune interface features • Select categories from more than one category combination • ‘Select all’ categories • Back button • Geography tree improvements (multiple hierarchies) • User testing
What’s next? • More data • More comparable data • Different data • Boundary and flow data • More functionality • Personalisation, analysis and visualisation • Public InFuse API • Work with statistical agencies? • Machine-friendly data from source • Flexible generation with automated disclosure control? • Information on usage and contact with users
What is the UK Data Service? • a comprehensive resource funded by the ESRC • a single point of access to a wide range of secondary social science data • support, training and guidance
UK Data Service Census Support • Specialist function of UK Data Service • Access and support services for outputs from recent UK censuses • Add value by making census outputs easy to find, understand and use • Engagement with UK census agencies • Long history of technological innovation in service development • census.ukdataservice.ac.uk
Census Support at Manchester • Aggregate component of census outputs • Justin Hayes • Richard Wiseman • Rob Dymond-Green • Jamey Hart
Give InFuse a go! infuse.mimas.ac.uk • Comments, questions and ideas welcome • help@ukdataservice.ac.uk