e-Infrastructure for Social Science data: Obesity e-Lab & MethodBox
Ian Dunlop, 15/03/11, ian.dunlop@manchester.ac.uk
Terminology
• Obesity e-Lab is the ESRC project: www.obesityelab.org.uk
• MethodBox is the product: www.methodbox.org
Obesity e-Lab Aims
• Enable socially networked research between the social sciences, health sciences and public health
• Add value to archived datasets by developing technologies to help on-line users
• Seed an “open source” approach to social research publication
Project Objectives
• Engagement (‘More with less’)
  • Research communities (Obesity/Cancer, Education)
  • Public health researchers (Academic, NHS, LA)
  • Key data providers (ESDS/UKDA)
• Reduce barriers
  • For survey datasets
  • Formation of research communities (cross-disciplinary)
• Develop tools
  • On-line digital laboratory, an ‘e-Lab’, known as MethodBox
  • Data * Methods * People
e-Lab Research Object
[Diagram: a Research Object that users can Find, Share and Reuse. Components: Research protocol, Statistical analysis scripts, Data-preparation scripts, Data-sources, Working datasets, Analysis-logs & notes, Figures/Graphics, Manuscripts, References, Slides. Caption: Socially-stimulating science, in-silico]
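The Research Object on this slide is essentially a bundle of typed components that can be found, shared and reused. A minimal sketch of that idea follows; the class names, field names and method names are hypothetical and are not taken from MethodBox itself, only the component kinds come from the slide.

```python
from dataclasses import dataclass, field

# Component kinds as listed on the slide (singular forms chosen for this sketch).
KINDS = {
    "research protocol", "statistical analysis script",
    "data-preparation script", "data source", "working dataset",
    "analysis log", "figure", "manuscript", "reference", "slides",
}


@dataclass
class Component:
    kind: str
    title: str

    def __post_init__(self):
        # Reject kinds that are not part of the Research Object on the slide.
        if self.kind not in KINDS:
            raise ValueError(f"unknown component kind: {self.kind}")


@dataclass
class ResearchObject:
    """Hypothetical bundle of research artefacts (not the MethodBox schema)."""
    title: str
    owner: str
    components: list = field(default_factory=list)
    shared_with: set = field(default_factory=set)

    def add(self, kind, title):
        self.components.append(Component(kind, title))

    def find(self, kind):
        """Find: return all components of one kind."""
        return [c for c in self.components if c.kind == kind]

    def share(self, user):
        """Share: record that another user may see this bundle."""
        self.shared_with.add(user)

    def reuse(self, new_owner):
        """Reuse: copy the bundle for another researcher; the copy starts unshared."""
        copied = [Component(c.kind, c.title) for c in self.components]
        return ResearchObject(self.title, new_owner, copied)
```

Copying on reuse, rather than handing out a reference, keeps the original bundle unchanged when a colleague adapts it.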
Where we are up to
• MethodBox launched at ESDS government event, April 2010 (scored 5.7/7 from 15 responses)
• 80 registered users, 45 scripts and 58 data extracts
• 21 public health researchers trained using a combination of social science and health science approaches
• Methodological approach adopted by the North West e-Health (www.nweh.org.uk) project (which is 20x bigger than us)
Context, Features, Architecture
• Context
  • Investigation Cycle
  • Survey (Meta) Data overload
  • How MethodBox fits in
• MethodBox
  • Architecture
  • Screenshots
• E-Infrastructure
• Future Directions
Investigation Cycle
[Diagram labels: Results, Tooling, Community, Data, Analysis, Models, Publications, Reports or Decisions, Questions]
• Our Tooling focus is (survey) Data and Analysis
• Our main Community focus is Expertise via Methods/Analysis/Scripts
Examples: HSE 2006
@1800 Variables (x 17 for all HSE)
[Figure: survey documentation. Labels: Questionnaire, Instructions, Survey Description, Questions used to set variables, Variable Definitions, Variable Categories, Variable SPSS code, Variable Value Domains, Data and Variable Codebook. Sizes shown: 208 pages, 13 pages, 148 pages, 224 pages, 9 pages, 351 pages, 46 MB data files]
How MethodBox fits in
[Diagram, not to scale. Labels: MethodBox, Survey Navigation, Economic and Social Data Service (ESDS), Survey Mapping, Improving Access & Use, UK Data Archive (UKDA), Survey Curation, Survey Commissioning & Collection, etc…]
[Architecture diagram. Labels: Ruby delayed job; User Dataset import; File system; Ruby on Rails; User data and metadata import; Request ‘catalog’ information; Metadata import; Data providers; MySQL; Provide metadata]
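The architecture slide shows imports handled by a background job, with metadata ending up in a database. The sketch below illustrates that general pattern only. It is written in Python rather than the Ruby on Rails stack named on the slide, an in-memory SQLite database stands in for MySQL, and the table layout, function names and example variables are all assumptions made for illustration.

```python
import queue
import sqlite3
import threading


def make_store():
    # In-memory SQLite stands in for the MySQL database on the slide.
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute("CREATE TABLE variables (dataset TEXT, name TEXT, label TEXT)")
    return db


def import_worker(jobs, db):
    """Process queued import jobs; each job is (dataset, [(name, label), ...])."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work
            jobs.task_done()
            break
        dataset, variables = job
        with db:  # one transaction per job, committed on success
            db.executemany(
                "INSERT INTO variables VALUES (?, ?, ?)",
                [(dataset, name, label) for name, label in variables],
            )
        jobs.task_done()


def run_imports(job_list):
    """Queue the jobs, let one background worker drain them, return the store."""
    db = make_store()
    jobs = queue.Queue()
    worker = threading.Thread(target=import_worker, args=(jobs, db))
    worker.start()
    for job in job_list:
        jobs.put(job)
    jobs.put(None)
    worker.join()
    return db
```

Queuing the import means the web request that triggers it can return immediately, which matters when a single survey carries tens of megabytes of data files.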
Sharing and visibility
• Linking a data extract with a script for deriving variables…
• Making the data extract visible…
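The two actions on this slide, linking a script to a data extract and making the extract visible, can be pictured with a small model. This is only a sketch under assumptions: the class names, the two visibility levels and the access rule are hypothetical and do not describe how MethodBox stores or enforces sharing.

```python
from dataclasses import dataclass, field


@dataclass
class Script:
    title: str
    owner: str


@dataclass
class DataExtract:
    title: str
    owner: str
    visibility: str = "private"  # hypothetical levels: "private" or "public"
    scripts: list = field(default_factory=list)

    def link_script(self, script):
        """Attach a script (e.g. one deriving variables) once only."""
        if script not in self.scripts:
            self.scripts.append(script)

    def make_visible(self):
        """Publish the extract so that other users can see it."""
        self.visibility = "public"

    def can_view(self, user):
        # Owners always see their own extracts; others only see public ones.
        return self.visibility == "public" or user == self.owner
```

For example, an extract created by one user stays hidden from a colleague until `make_visible()` is called, while the linked scripts travel with the extract.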
MethodBox as e-Infrastructure
• Data Providers
  • Existing infrastructure (NESSTAR/NESSTAR Server)
  • Cautious
    • Adopt only ‘proven’ technologies
    • Willing to ‘try’ things if risk/work is low
• MethodBox offers
  • Social Layer, sharing, data tooling
• Integration
  • Existing data provider infrastructure – NESSTAR Server
  • Security infrastructure (Shibboleth)
  • Automated running of scripts for new datasets (using institutional/national compute)
• Deployment
  • ESDS/CCSR first instance (exit strategy)
  • Obesity e-Lab project ends 31/03/12
Future work
• MethodBox as e-Infrastructure
  • Target deployment as part of ESDS/CCSR
  • Integration with NESSTAR system
• Focus on communities
  • Greater Manchester Public Health Inequalities Research Network
  • University of Manchester School of Education
  • North West e-Health and Arthritis Research UK
• Ability to ‘run’ methods
  • Part funded by Obesity e-Lab work in JISC ‘National e-Infrastructure for Social Simulation’ project
Video at http://bit.ly/methodbox11