DAMES: Data Management through e-Social Science
e-Science approaches in DAMES
Simon Jones
Department of Computing Science and Mathematics
University of Stirling
DAMES - Data Management through e-Social Science
Rationale
We aim to investigate and develop:
• ‘e-Infrastructure’ services targeted to data management requirements across a rich range of social science data resources
• An internet ‘portal’ that will make available a variety of specific data resources as Grid services
• augmented with portfolios of tools for supporting the processes involved in data management
Approaches (on-going!)
• Social science data resources are distributed, disaggregated and heterogeneous
• Metadata description
• Semantically-based discovery
• Data abstraction/virtual fusion
• Easy but secure access is required
• "Virtualization"/"fusion", workflow support for SS
• Fine grained authorisation infrastructures
Meta-data support
• Existing metadata standards have been assessed
• Data Documentation Initiative, DDI
• Statistical Data and Metadata eXchange, SDMX
• UK Data Archive
• Nesstar
• Focussing continuing work on exploiting DDI3
• DAMES must engineer compatibility with currently used metadata approaches
Semantically based data discovery
• To extend data discovery through metadata, DAMES will develop techniques for data discovery through semantic queries
• OWL ontology framework: to give meaning to data resources
• OWL-S: for developing semantic grid services
• DAMES will support a Grid service for registration and discovery of data resources using semantic grid techniques
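The idea of discovery through semantic queries can be illustrated with a minimal sketch. It does not use OWL or OWL-S; it assumes a toy in-memory set of triples, and every concept name, dataset name and function below is hypothetical. It only shows how a query for a broad concept can also find resources registered under narrower ones.

```python
# Toy semantic discovery: find registered data resources by concept,
# following subclass links. All names and triples here are hypothetical.

SUBCLASS_OF = "subClassOf"
ABOUT = "about"

triples = {
    ("Nurse", SUBCLASS_OF, "HealthOccupation"),
    ("HealthOccupation", SUBCLASS_OF, "Occupation"),
    ("dataset:nurse_survey", ABOUT, "Nurse"),
    ("dataset:census_sample", ABOUT, "Occupation"),
    ("dataset:housing", ABOUT, "Housing"),
}


def subconcepts(concept, triples):
    """Return the concept plus everything transitively declared a subclass of it."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if predicate == SUBCLASS_OF and obj == current and subject not in found:
                found.add(subject)
                frontier.append(subject)
    return found


def discover(concept, triples):
    """Return sorted resources annotated with the concept or any subconcept."""
    wanted = subconcepts(concept, triples)
    return sorted(s for s, p, o in triples if p == ABOUT and o in wanted)


print(discover("Occupation", triples))
```

A plain keyword match on "Occupation" would miss the nurse survey; following the subclass links is what makes the query semantic in this sketch.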
Data from heterogeneous sources
• Data abstraction can help with heterogeneity:
• Support for accessing data content without regard to detailed representation
• Metadata support is essential
• Extending current work using OGSA-DAI to deal with a wider variety of SS formats: e.g. SPSS, Stata
• A Grid service to give uniform access to underlying data
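Uniform access without regard to detailed representation can be sketched as follows. This is not OGSA-DAI; it is a plain Python illustration using only the standard library, and the class names, table name and sample rows are made up for the example. Both resources expose the same `rows()` method, so calling code does not need to know whether the data is CSV text or a relational table.

```python
import csv
import io
import sqlite3
from abc import ABC, abstractmethod


class DataResource(ABC):
    """Hypothetical uniform interface: every resource yields rows as dicts."""

    @abstractmethod
    def rows(self):
        ...


class CsvResource(DataResource):
    def __init__(self, text):
        self._text = text

    def rows(self):
        for record in csv.DictReader(io.StringIO(self._text)):
            yield dict(record)


class SqlResource(DataResource):
    def __init__(self, connection, table):
        self._connection = connection
        self._table = table  # assumed to be a trusted table name, not user input

    def rows(self):
        cursor = self._connection.execute(f"SELECT * FROM {self._table}")
        columns = [column[0] for column in cursor.description]
        for values in cursor:
            yield dict(zip(columns, values))


# Made-up sample data in two different representations.
csv_source = CsvResource("person_id,job_title\n1,teacher\n2,nurse\n")
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE people (person_id TEXT, job_title TEXT)")
connection.execute("INSERT INTO people VALUES ('3', 'engineer')")
sql_source = SqlResource(connection, "people")

# The consuming code is identical for both representations.
for resource in (csv_source, sql_source):
    for row in resource.rows():
        print(row["person_id"], row["job_title"])
```

Supporting a further format such as SPSS or Stata would, under this design, mean adding one more subclass rather than changing the consumers.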
Data from multiple sources
• Related data sources may need processing as if combined
• The sources may be distributed and heterogeneous
• DAMES will investigate "virtual fusion" techniques
• Leveraging data abstraction and effective metadata support
• Uniform query processing Grid services will be developed
• Related to DQP
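One way to picture "virtual fusion" is a single query evaluated over several sources without first building a merged copy. The sketch below is an illustration only, not DQP or any DAMES service; the source functions, field names and rows are hypothetical.

```python
from itertools import chain


# Hypothetical sources: each is a callable returning an iterable of dict rows.
def survey_a():
    yield {"person_id": "a1", "region": "Scotland", "age": 34}
    yield {"person_id": "a2", "region": "Wales", "age": 51}


def survey_b():
    return [{"person_id": "b1", "region": "Scotland", "age": 27}]


def fused_query(sources, predicate, columns):
    """Evaluate one query over several sources as if they were one table.

    Rows are streamed from each source in turn, so no combined copy
    of the data is materialised.
    """
    for row in chain.from_iterable(source() for source in sources):
        if predicate(row):
            yield {column: row[column] for column in columns}


result = list(
    fused_query(
        [survey_a, survey_b],
        predicate=lambda row: row["region"] == "Scotland",
        columns=["person_id", "age"],
    )
)
print(result)
```

The sketch assumes the sources already share field names; in practice that alignment is where data abstraction and metadata support would come in.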
Support for e-Social Science: Workflows
• This research will focus on adapting and extending workflow modelling approaches
• BPEL, ebXML, Taverna, WHIP
• Typical social science applications will be supported by workflows, e.g.
• Occupational analysis, census analysis, social care
• A visual design tool will be developed for defining new workflows in e-Social Science
• Integrated into the DAMES Portal
• With execution support
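A workflow in this sense can be thought of as named steps with dependencies, executed in an order that respects those dependencies. The sketch below is not BPEL, ebXML, Taverna or WHIP; it is a minimal Python illustration using the standard library's `graphlib` (Python 3.9+), with hypothetical step names and made-up input data.

```python
from collections import Counter
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, dependencies, initial):
    """Run each step once all the steps it depends on have finished.

    steps: step name -> function taking the shared results dict
    dependencies: step name -> set of step names that must run first
    """
    results = dict(initial)
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        results[name] = steps[name](results)
    return results, order


# Hypothetical three-step occupational analysis: load, recode, tabulate.
steps = {
    "load": lambda r: r["raw"],
    "recode": lambda r: [title.strip().lower() for title in r["load"]],
    "tabulate": lambda r: Counter(r["recode"]),
}
dependencies = {"load": set(), "recode": {"load"}, "tabulate": {"recode"}}

results, order = run_workflow(steps, dependencies, {"raw": [" Nurse", "teacher ", "nurse"]})
print(order)
print(results["tabulate"])
```

A visual design tool would, in effect, let the researcher draw the `dependencies` graph instead of writing it by hand.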
GEODE: Grid Enabled Occupational Data Environment
• Previous SS/CS collaboration at Stirling
• Occupational scheme linking is a common practice for researchers
• GEODE enables a virtual community of occupational information researchers
• Portal gateway for occupational information
• Data abstraction
• Uniform access to resources
• Occupational matching services
• Demonstrates viability of the DAMES approach
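Occupational scheme linking can be illustrated as a lookup from codes in one classification to codes in another. The sketch below is not GEODE's matching service; the function, field names and all codes are made up for illustration and do not come from any real occupational scheme.

```python
def link_occupations(records, crosswalk, source_field="occ_code", target_field="linked_code"):
    """Attach a target-scheme code to each record; collect codes with no match."""
    linked, unmatched = [], []
    for record in records:
        code = record[source_field]
        if code in crosswalk:
            linked.append({**record, target_field: crosswalk[code]})
        else:
            unmatched.append(code)
    return linked, unmatched


# Made-up crosswalk from a hypothetical source scheme to a hypothetical target scheme.
crosswalk = {"A01": "X1", "A02": "X1", "B10": "Y7"}
records = [
    {"person_id": 1, "occ_code": "A02"},
    {"person_id": 2, "occ_code": "C99"},
]

linked, unmatched = link_occupations(records, crosswalk)
print(linked)
print(unmatched)
```

Returning the unmatched codes separately, rather than dropping them silently, lets the researcher see how much of the data the linking actually covers.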
GEODE prototype
• Windows environment
• Java
• GridSphere Portal Framework
• Globus Toolkit 4
• Index Service (Virtual Organization)
• OGSA-DAI WSRF (Data Access Middleware)
• Custom OGSA-DAI resources and activities
• Accesses CSV, Relational data resources
Example: Grid Enabled Occupational Data Environment (GEODE)
Summary
• Distributed, disaggregated, heterogeneous data sources need:
• Metadata
• Semantically-based discovery
• Data abstraction/virtual fusion
• Specialised SS workflows
• Security (later in workshop)
• GEODE gives a springboard for GE*DE