Methodbox: From open-data to open-insight

Methodbox: From open-data to open-insight. MethodBox Team, Jul 2011. Presentation outline: Problem (data tsunami + puddles of insight); Solution (collective efficient science); Deployment (sense-making networks on open-data).

Presentation Transcript


  1. Methodbox: From open-data to open-insight. MethodBox Team, Jul 2011

  2. Presentation • Problem: Data tsunami + puddles of insight • Solution: Collective efficient science • Deployment: Sense-making networks on open-data

  3. Quote: “…you call it Epidemiology and we call it quantitative Social Science” (a leading researcher, Jul 2011) • Open data • Common methods • Potentially complementary expertise

  4. Obesity Example: Fragmented understanding of public health problems such as obesity... data, methods/models and expertise split across disciplines (e.g. social vs. biomedical) and settings (e.g. academia vs. healthcare)

  5. Puddles of research around the organising principle … but policies need the big picture

  6. Data Example • Time series data from Health Visitors in Wirral • Data deposited with UKDA but not used for 16 years • Children measured at the time the obesity epidemic took hold…

  7. Material deprivation affecting children (households with children: % on benefits in 2001-3). Wirral (0.3M), UK. [Map: fifths of IDAC 2004; red (light) = most deprived, red (dark), purple, blue (dark), blue (light) = most affluent]

  8. BMI of 3 yr olds, 1988 - 1989. [Map: fifths of BMI SDS; red (light) = fattest, red (dark), purple, blue (dark), blue (light) = thinnest]

  9. BMI of 3 yr olds, 1990 - 1991. [Map: fifths of BMI SDS; same colour key as slide 8]

  10. BMI of 3 yr olds, 1992 - 1993. [Map: fifths of BMI SDS; same colour key as slide 8]

  11. BMI of 3 yr olds, 1994 - 1995. [Map: fifths of BMI SDS; same colour key as slide 8]

  12. BMI of 3 yr olds, 1996 - 1997. [Map: fifths of BMI SDS; same colour key as slide 8]

  13. BMI of 3 yr olds, 1998 - 1999. [Map: fifths of BMI SDS; same colour key as slide 8]

  14. BMI of 3 yr olds, 2000 - 2001. [Map: fifths of BMI SDS; same colour key as slide 8]

  15. BMI of 3 yr olds, 2002 - 2003. [Map: fifths of BMI SDS; same colour key as slide 8]
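The maps in slides 8 to 15 colour areas by "fifths of BMI SDS". A minimal sketch of how values could be binned into fifths is given here; the area names and SDS values are made up for illustration, and the cut-point method (Python's default `statistics.quantiles`) is an assumption, not necessarily what the MethodBox team used.

```python
from bisect import bisect_right
from statistics import quantiles


def assign_fifths(values):
    """Return a fifth (1 = lowest values, 5 = highest) for each value."""
    # Four cut points split the distribution into five groups.
    cuts = quantiles(values, n=5)
    return [bisect_right(cuts, v) + 1 for v in values]


# Hypothetical area-level mean BMI SDS values (not from the slides).
area_sds = {
    "area_a": -0.31, "area_b": -0.12, "area_c": 0.02, "area_d": 0.15,
    "area_e": 0.27, "area_f": -0.05, "area_g": 0.40, "area_h": 0.09,
    "area_i": -0.22, "area_j": 0.33,
}

fifths = dict(zip(area_sds, assign_fifths(list(area_sds.values()))))
for area, fifth in sorted(fifths.items()):
    print(area, fifth)
```

On a map, fifth 5 would correspond to the "fattest" colour and fifth 1 to the "thinnest".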

  16. Child Obesity: Action 6 years after signal in the data. Body Mass Index (BMI) trend in Wirral 3-year-olds from 1988 to 2003. [Chart: three-monthly rolling average BMI SDS (y-axis, -0.4 to 0.5) by month of measurement by Health Visitor (x-axis, Mar-88 to Aug-04), annotated with "Clues" and "Actions"] SDS = standard deviation score from 1990 British Growth Reference charts; adjusts for age and sex of the child
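Slide 16 plots a three-monthly rolling average of BMI SDS. As a rough sketch of the two calculations involved: growth references of this kind are commonly expressed with the LMS method, where an age- and sex-specific L (skewness), M (median) and S (coefficient of variation) convert a measurement into a standard deviation score. Whether the slide authors computed SDS exactly this way is an assumption, the L, M, S values below are made up, and the trailing three-point window is one possible reading of "three-monthly rolling average".

```python
import math


def lms_sds(value, L, M, S):
    """Standard deviation score of `value` under the LMS method."""
    if L == 0:
        return math.log(value / M) / S
    return ((value / M) ** L - 1) / (L * S)


def rolling_mean(series, window=3):
    """Trailing rolling mean; the first window-1 points yield no output."""
    out = []
    for i in range(window - 1, len(series)):
        chunk = series[i - window + 1 : i + 1]
        out.append(sum(chunk) / window)
    return out


# Hypothetical reference parameters for one age/sex group (not real values).
L_REF, M_REF, S_REF = -1.5, 16.0, 0.08

# Hypothetical monthly mean BMI values for measured 3-year-olds.
monthly_bmi = [15.9, 16.0, 16.1, 16.3, 16.2, 16.4]

monthly_sds = [lms_sds(b, L_REF, M_REF, S_REF) for b in monthly_bmi]
smoothed = rolling_mean(monthly_sds, window=3)
print([round(s, 3) for s in smoothed])
```

A measurement equal to the reference median gives an SDS of zero, so a smoothed series drifting above zero indicates children heavier than the 1990 reference population.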

  17. Similar Data in 2011 • National Child Measurement Programme • Anonymised national database • Could be opened (like national pupil database) → extend to other policy-relevant, timely research

  18. Data Already in UK Data Archive • Example: Health Surveys for England (annual) • Analyses feed national policies • Does evidence need to be localised?...

  19. Women and not men from low-income households are fatter in England. [Chart: BMI (y-axis, 25 to 27.5) by income fifth (1 to 5, low to high), plotted separately for women and men] Data from Health Survey for England

  20. Women from low-income households and men from high-income households are fatter in Greater Manchester. [Chart: BMI (y-axis, 25 to 27.5) by income fifth (1 to 5, low to high), plotted separately for women and men] Data from Health Survey for England
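The comparisons in slides 19 and 20 amount to averaging BMI within each combination of sex and income fifth. A minimal sketch of that grouping follows; the record layout and field names (`sex`, `income_fifth`, `bmi`) are hypothetical and do not reflect actual Health Survey for England variable names, and the values are invented.

```python
from collections import defaultdict


def mean_bmi_by_group(records):
    """Mean BMI for each (sex, income_fifth) combination."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of BMI, count]
    for r in records:
        key = (r["sex"], r["income_fifth"])
        totals[key][0] += r["bmi"]
        totals[key][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


# Hypothetical respondents (not survey data).
records = [
    {"sex": "F", "income_fifth": 1, "bmi": 28.0},
    {"sex": "F", "income_fifth": 1, "bmi": 26.0},
    {"sex": "F", "income_fifth": 5, "bmi": 25.0},
    {"sex": "M", "income_fifth": 1, "bmi": 26.5},
    {"sex": "M", "income_fifth": 5, "bmi": 26.5},
]

means = mean_bmi_by_group(records)
for key in sorted(means):
    print(key, means[key])
```

Running the same grouping on a national sample and on a regional subset is what would reveal a localised pattern like the Greater Manchester one.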

  21. Linked-data ≠ Linked: data, methods & investigators • Social Research: data, methods & investigators • Biomedical Research: data, methods & investigators • Previous slides show social-biomedical signals about obesity from under-used datasets

  22. MethodBox Aim: ...to increase the sharing and reuse of data sources & extracts and data processing methods in one in-silico environment (‘e-Lab’) shared by social and health researchers

  23. e-Lab Research Object: Research protocol • Statistical analysis scripts • Data-sources • Analysis-logs & notes • Data-preparation scripts • Figures/Graphics • Working datasets • Manuscripts • References • Slides. Find • Share • Reuse. Socially-stimulating science, in-silico

  24. National Dataset Example • Health Surveys for England • Large-scale (participants × variables) • Annual since early 90s • Under-used by NHS who fund it • Key barrier: extracting a research-ready subset of data • Data archive → playground = e-Lab

  25. Supporting and Developing Interdisciplinary Understanding • Sharing resources (tools, methods, data): first step is sharing of resources • Sharing expertise (discussions and reuse around shared resources): shared resources provide the basis for discussion • Developing interdisciplinary understanding (language, tacit assumptions, methods): discussions lead to deeper interdisciplinary understanding • Promoting interdisciplinary working: understanding of other domains promotes more effective interdisciplinary working

  26. Facilitating a social network of data archive users… …toward a reward environment for sharing data, methods, and expertise

  27. Browsing for data extracts made by a social network of data archive users…

  28. Shopping for variables from across different years of survey collections…
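"Shopping for variables" across survey years can be pictured as picking a set of variable names and pulling them from each year's dataset into one research-ready extract. The sketch below is purely illustrative: the in-memory layout, the function name `extract_variables` and the variable names are hypothetical and are not MethodBox's actual data model or API.

```python
def extract_variables(surveys, variables):
    """Build one extract from several survey years.

    `surveys` maps a year to a list of respondent records (dicts).
    Variables missing from a given year are filled with None.
    """
    rows = []
    for year in sorted(surveys):
        for record in surveys[year]:
            row = {"year": year}
            for name in variables:
                row[name] = record.get(name)
            rows.append(row)
    return rows


# Hypothetical survey years with slightly different variable sets.
surveys = {
    2003: [{"bmi": 26.1, "age": 41, "income_fifth": 2}],
    2002: [{"bmi": 27.3, "age": 55}],  # no income_fifth collected this year
}

extract = extract_variables(surveys, ["bmi", "income_fifth"])
for row in extract:
    print(row)
```

Filling gaps with `None` rather than dropping the year keeps it visible that a variable was not collected, which matters when the extract is shared and reused by others.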

  29. Instant access to relevant parts of survey documentation…

  30. Sharing and visibility • Linking a data extract with a script for deriving variables… • Making the data extract visible…

  31. Enabling user-visibility for data extraction or derivation contributions…

  32. Current MethodBox Video link

  33. Training Course Apr ’10 • Trained a mixture of NHS, academic and industry users of HSE in the use of MethodBox • Course run in conjunction with CCSR • Feedback forms completed by 15 of 16 attendees, who were asked to rate MethodBox from 1 (negative) to 7 (positive) on the following scales • I thought MethodBox was (mean ratings): • Terrible - Wonderful: 5.57 • Difficult to understand - Easy: 5.57 • Frustrating to use - Satisfying: 5.79 • Dull - Stimulating: 5.29 • Rigid - Flexible: 5.71 • Difficult to navigate - Easy to navigate: 6

  34. Attitudes to Sharing

  35. MethodBox Evolution • Amazon-like user-prompting for other variables that may be relevant to the set being extracted • More surveys/datasets incorporated • User-contributed & community-curated datasets • … • Feature request list exceeds resources

  36. Building on Successful E-Science • Most widely used scientific workflow sharing systems: myGrid, Taverna, myExperiment • Over a decade of programme funding sustained → world leading • E-Infrastructure R&D ready to leverage more outputs from open-linked data

  37. Toward Open Insight • Researcher A is expert in deprivation • Researcher B is expert in obesity • Both use a common data archive but don’t usually meet • MethodBox shares the expertise of A and B to create a more complete model of deprivation in obesity

  38. Conclusion • Open-data alone is not enough • Social e-infrastructure for science is needed • Sharing insights and methods is key, and can be achieved through systems like MethodBox + ESDS
