geode.stir.ac.uk

Presentation Transcript


  1. GEODE: Grid Enabled Occupational Data Environment
     Paul Lambert and Larry Tan, University of Stirling
     www.geode.stir.ac.uk
     GEODE - NeSC workshop, Oct 2006

  2. ‘The Grid’ and New Technologies of Data Collection
     ‘The Grid’ and ‘eScience’:
     • Online coordination of electronic resources and collaborations
     • (Distributed computing)
     • Large scale
     • Collaborative
     • Heterogeneous
     • Standard protocols / information management systems
     UK eSocial Science:
     • Investment in assessing / implementing technology
     • Computationally demanding data analysis
     • Qualitative and quantitative data collection technologies
     • **Data sharing, processing and access**

  3. GEODE: Survey records’ occupational data
     The importance of occupational micro-data
     Collecting occupational data:
     • Initial occupational records (textual description)
     Processing occupational records:
     • Text descriptions
     • → (1) Standardised occupational index (e.g. unit group: OUG)
     • → (2) Substantive occupational summary (e.g. social class code)
     Good practice:
     • Preservation of original, OUG and substantive variables
     • NSIs favour transparent occupational data coding (1) and translation systems (2)

  4. Occupational data collection and processing
     (1) Text records → OUG data
     Currently:
     • Text coding software (e.g. CASCOT)
     • Manual look-up
     GEODE:
     • Linkage to existing resources
     • Further facilities possible but not planned (users typically have adequate resources)
     (2) OUG data → summary indicators
     Currently:
     • Numerous aggregate occupational information resources
     • Bespoke data programming requirements
     GEODE:
     • Core provision: management and access of these data resources
     • Service to large volumes of users
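
The two processing stages on this slide can be sketched as a pair of look-ups that keep the original text, the OUG code and the summary indicator side by side. This is only an illustration in Python, not the prototype's code: the table contents, the codes and the function name `process_record` are all invented.

```python
# Hypothetical look-up tables; the codes and class labels are invented.
TEXT_TO_OUG = {
    "primary school teacher": "1001",
    "bus driver": "2002",
}
OUG_TO_CLASS = {
    "1001": "class A",
    "2002": "class B",
}


def process_record(job_text):
    """Return the original text, its OUG code and its summary class together."""
    key = " ".join(job_text.lower().split())  # normalise case and spacing
    oug = TEXT_TO_OUG.get(key)                # stage (1): text -> OUG
    summary = OUG_TO_CLASS.get(oug)           # stage (2): OUG -> summary indicator
    # All three variables are preserved, so either stage can be redone later.
    return {"original": job_text, "oug": oug, "summary": summary}


records = [
    process_record(text)
    for text in ["Bus  Driver", "Primary school teacher", "Astronaut"]
]
```

A description with no entry in the first table ("Astronaut" here) comes back with `None` for both derived fields, so uncoded records stay visible in the output.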

  5. Some illustrative occupational information resources

  6. What’s the problem?
     Indexed mainly by Occupational Unit Group (OUG). But…
     • Numerous alternative occupational data files (time; country; format)
     • Alternative OUG schemes; other index factors (‘employment status’)
     • Inconsistent translations to social classifications – ‘by file or by fiat’
     • Dynamic updates to occupational data resources
     • Low uptake of existing occupational information resources
     • Strict security constraints on users’ micro-social survey data
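
To make the indexing problem concrete, the sketch below keys a translation table on the OUG scheme, the OUG code and employment status together, so the same code can translate differently depending on the other two factors. It is a Python illustration under invented assumptions: the scheme names, codes, class labels and the function `translate` are hypothetical.

```python
# Hypothetical translation table keyed by (scheme, OUG, employment status).
# Scheme names, codes and class labels are invented for illustration.
TRANSLATIONS = {
    ("scheme_a", "1001", "employee"): "class 3",
    ("scheme_a", "1001", "self-employed"): "class 4",
    ("scheme_b", "1001", "employee"): "class 6",
}


def translate(scheme, oug, status):
    """Look up a social class; fail loudly if the full key has no entry."""
    try:
        return TRANSLATIONS[(scheme, oug, status)]
    except KeyError:
        raise LookupError(f"no translation for {scheme}/{oug}/{status}") from None


# The same OUG code gives a different result for each scheme/status pair.
same_code_results = {
    f"{scheme}/{status}": value
    for (scheme, oug, status), value in TRANSLATIONS.items()
    if oug == "1001"
}
```

Raising an error for a missing key, rather than falling back to a default class, is one way to avoid translations made ‘by fiat’.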

  7. GEODE: Grid Enabled Occupational Data Environment
     Strategy:
     1) Occupational data index service (depository)
     • Semantic data curation (DDI)
     • Data storage (OGSA-DAI)
     • Data indexing / access (OGSA-DAI)
     2) User-friendly ‘portal’ access
     • Entry to an international virtual organisation for data depositors and users (GridSphere, GT4, OGSA-DAI)
     • Facilitate linking occupational information to users’ datasets (OGSA-DAI) (initial focus on CAMSIS resources)

  8. Occupational information depository
     1.1) Semantic curation of occupational information
     • Establish a ‘GEODE-M’ meta-data subset (.xml)
     • Founded on Michigan Data Documentation Initiative
     • Minimise curation requirements
     • Web proforma entry
     • [via Portal using GridSphere]
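
As a rough idea of what a small XML metadata record produced from a web proforma might look like, the Python sketch below builds one with the standard library. The slides do not list the GEODE-M fields, so the element names (`resource`, `ougScheme` and so on) and the function `build_metadata` are placeholders and are not taken from the DDI specification.

```python
import xml.etree.ElementTree as ET


def build_metadata(title, oug_scheme, country, year):
    """Build a small XML metadata record for one occupational data file."""
    # Element names are placeholders, not real DDI or GEODE-M element names.
    root = ET.Element("resource")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "ougScheme").text = oug_scheme
    ET.SubElement(root, "country").text = country
    ET.SubElement(root, "year").text = str(year)
    return root


# Values a depositor might type into the proforma (invented).
record = build_metadata("Example scale file", "scheme_a", "GB", 2006)
xml_text = ET.tostring(record, encoding="unicode")
```

Keeping the record to a handful of fields reflects the slide's aim of minimising curation requirements for depositors.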

  9. Technical Objectives
     • Create a virtual community of occupational information researchers
     • Gateway for occupational information
     • Data abstraction
     • Uniform access to resources
     • Accessible via a portal
     • Occupational data curation
     • Annotation of data using DDI
     • Occupational matching services
     • e.g. Linking surveyed data to CAMSIS scores

  11. GEODE - Architecture
      • VO members can deploy own data services, also occupational matching services
      • Scalable
      • Distributed
      • Possible application for other types of social science data
      • Annotation with DDI
      • Custom services can be deployed

  12. GEODE - Prototype
      • Simple occupational matching services
      • VO of Occupational Data Resources
      • Portal for searching external resources

  13. GEODE - Prototype

  14. GEODE - Prototype
      • Windows environment
      • Java
      • GridSphere Portal Framework
      • Globus Toolkit 4
      • Index Service (Virtual Organization)
      • OGSA-DAI WSRF (Data Access Middleware)
      • Custom OGSA-DAI resources and activities
      • Accesses CSV, Relational data resources
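
The idea of uniform access to CSV and relational resources can be illustrated without any grid middleware. The Python sketch below is not OGSA-DAI code and does not mimic its API; it only shows two readers that return rows in the same shape, so that a downstream matching step need not know where the data came from. The table name, column names and values are invented.

```python
import csv
import io
import sqlite3


def rows_from_csv(text):
    """Read CSV text into a list of dicts keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_sql(conn, table):
    """Read every row of a table into a list of dicts keyed by column name."""
    # The table name is interpolated directly, so pass trusted names only.
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    names = [col[0] for col in cursor.description]
    return [dict(zip(names, row)) for row in cursor]


# Hypothetical data: the same two-column resource held in both forms.
csv_text = "oug,score\n1001,55.2\n2002,41.7\n"
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (oug TEXT, score TEXT)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("1001", "55.2"), ("2002", "41.7")],
)

csv_rows = rows_from_csv(csv_text)
sql_rows = rows_from_sql(conn, "scores")
```

Both readers return a list of dictionaries with identical keys, which is the property a data abstraction layer would need to provide to the matching services.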

  15. GEODE - Prototype
      • Data Documentation Initiative
      • Annotate the data resources
      • Occupational Matching Grid Services
      • Checks if DDI of target resource is compatible (e.g. category specified matches requirement)
      • Map occupational unit group to data
      • Returns mapped/matched results
      • Demonstration of prototype
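
The three matching steps listed on this slide (compatibility check, mapping by OUG, returning results) can be sketched in a few lines of Python. This is an assumption-laden illustration, not the prototype's grid service: the metadata fields, the scheme name, the scores and the functions `is_compatible` and `match` are all hypothetical.

```python
# Hypothetical target resource; the metadata fields stand in for whatever
# the DDI annotation would actually record, and the scores are invented.
resource = {
    "metadata": {"oug_scheme": "scheme_a", "measure": "scale score"},
    "data": {"1001": 55.2, "2002": 41.7},
}


def is_compatible(metadata, required_scheme):
    """Step 1: check the resource is indexed by the scheme the user needs."""
    return metadata.get("oug_scheme") == required_scheme


def match(user_records, target, required_scheme):
    """Steps 2 and 3: map each record's OUG to the resource, return results."""
    if not is_compatible(target["metadata"], required_scheme):
        raise ValueError("resource metadata does not match the required scheme")
    matched, unmatched = [], []
    for rec in user_records:
        value = target["data"].get(rec["oug"])
        if value is None:
            unmatched.append(rec)
        else:
            matched.append({**rec, "value": value})
    return matched, unmatched


# Invented survey records: one OUG code is present in the resource, one is not.
survey = [{"id": 1, "oug": "1001"}, {"id": 2, "oug": "9999"}]
matched, unmatched = match(survey, resource, "scheme_a")
```

Returning unmatched records separately, rather than dropping them, lets the user see which occupational codes the chosen resource does not cover.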

  16. Future Work
      • Possible extension of VO to other social science related datasets
      • With services
      • Variety of occupational data analysis services