
e-Science, Data Management and Frontiers in Survey Research

Paul Lambert, 24-25 November 2010. Talk to the ‘Documentation and Workflows for Social Survey Research’ training workshop, part of the Data Management through e-Social Science ESRC research Node, www.dames.org.uk.


Presentation Transcript


  1. Paul Lambert, 24-25 November 2010. Talk to the ‘Documentation and Workflows for Social Survey Research’ training workshop, part of the Data Management through e-Social Science ESRC research Node, www.dames.org.uk. e-Science, Data Management and Frontiers in Survey Research. DAMES, 24-5/NOV/2010

  2. Part 1: E-Social Science / Digital Social Research. ESRC & JISC initiatives: a major UK investment in ‘e-social science’ technology (see www.digitalsocialresearch.net) • Handling and displaying large volumes of complex data, e.g. GeoVue; LifeGuide; DReSS; Obesity e-lab • Resources for computationally demanding analyses: CQeSS; MoSeS; eStat; NeISS • Standards setting in collaboration, data preparation, data and research support: DRS; MeRC; OeSS; DAMES • US: ‘Cyberinfrastructures’; EU: ‘EUGrid’

  3. Example: Data on occupations, educational qualifications and ethnicity (www.dames.org.uk) • Linking complex data • Metadata • Security • Workflows

  4. Example: Understanding New Forms of Digital Records (DReSS) • transcribed talk • audio • video • digital records • system logs • location [Figure panels: video; code tree; transcript; system log]

  5. ..more examples.. National e-Infrastructure for Social Simulation (www.neiss.org.uk): Expert led simulation demonstrations; Combining data resources; Workflows for the simulation analysis; Modify and re-specify existing simulation templates. E-Stat (www.cmm.bristol.ac.uk/research/NCESS-EStat/): • Design a tool to specify complex statistical models in generic / visual terms • Multilevel models • Multiple data permutations and analytical alternatives • Ready access to a suite of complex modelling tools

  6. Other selected e-Science projects (concerned with accessing/handling complex data)

  7. E-Science as ‘dealing with data’ • Collect/access vast quantities of data • Complex surveys & comparisons • Admin. & other data resources • Guarding secure microdata (e.g. health records; intv. transcripts) • Data management and Data Analysis • Standards setting • Exciting new facilities/analyses • {Global} communication and collaboration …amongst researchers and data resources…

  8. The relevance of e-Science to data management • ‘Data management through e-Social Science’ • ‘E-Science’ refers to applying a number of particular approaches and standards from computing science to applied research areas • These approaches include ‘the Grid’; distributed computing; data and computing standardisation; metadata; security; research infrastructures • UK investment in capitalising on these developments • DAMES (2008-11) – developing services / resources using e-Science approaches which will help social scientists in undertaking data management tasks

  9. E-Science and Data Management. E-Science isn’t essential to good DM, but it has capacity to improve and support conduct of DM… • Concern with standards setting in communication and enhancement of data • Linking distributed/heterogeneous/dynamic data: coordinating disparate resources; interrogating live resources • Contribution of metadata tools/standards for variable harmonisation and standardisation • Linking data subject to different security levels • The workflow nature of many DM tasks

  10. GESDE – Search and browse supplementary data on occupations; educational qualifications; ethnicity

  11. The contribution of DAMES: 8 project themes

  12. Tapping in: Portals & e-Infrastructural overviews. Slide from Peter Halfpenny (2009), see www.merc.ac.uk. Seamless integration of data, analytic tools and compute resources. [Diagram labels: Social scientist; Simple interface; Single sign on; Grid Middleware; e-Infrastructure; Data; Storage; HPC; Experiment; Computing; Analysis]

  13. Tapping into the e-Infrastructure. Long, arduous road from innovation to seamless service delivery.. • ‘Working together’: computer- & social- science collaborations • The ‘social shaping’ of e-Science – www.oii.ox.ac.uk/microsites/oess/ • Teamwork, ‘divisions of knowledge’, separation of data and analysis, are all routine [cf. Mauthner & Doucet, 2008] • Ability to engage with advanced information • e.g. social simulation; network locations [cf. Prior 2008] • classic sociology – class, ethnicity, social structures • new technological opportunities – e.g. public health projects • Requirements of existing tools and services …advanced quantitative methods [cf. Williams et al 2008; 2004] …patience, & some O.S. facility..! …selective access to technologies [by researchers – cf. Murthy, 2008]

  14. The researched: Ethics, security, anxiety • Fair ethical scrutiny of e-Science research: e.g. secure access to health data; Oxford e-Social Science Node on e-research ethics • Residual anxieties: Is e-Science data effectively covert? • Informed consent & overt/covert continuum [cf. Calvey, 2008]. The voice of the researched? E-Infrastructure overheads as gatekeepers {at present}. Managing mass engagement – e.g. of Lifeguide as prescriptive?

  15. Part 2: Frontiers in social survey research? • The changing terrain of social survey research and four exciting developments/frontiers: • Data access • Data management • Data analysis • Log books

  16. 1) Access to data.. Example: Accessing surveys via UK Data Archive. Shibboleth authentication. Download and analyse in Stata, SPSS, etc.

  17. Complex data example: British Household Panel Survey dataset [SN 5151] • This example shows BHPS being analysed in Stata. BHPS re-contacts subjects annually (since 1991) • 4294 interviewed as adults every year for 17 years. • Analysis methods, and measurement issues over time, are challenging.
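The subset mentioned in slide 17 (respondents interviewed at every wave) is what is often called a balanced panel. A minimal sketch of how such a subset might be identified is given below, assuming the data can be reduced to (person id, wave number) pairs. The function name and the toy records are hypothetical and are not taken from the BHPS files.

```python
from collections import defaultdict


def balanced_panel_ids(records, n_waves):
    """Return sorted ids of respondents observed in every wave 1..n_waves."""
    required = set(range(1, n_waves + 1))
    waves_seen = defaultdict(set)
    for person_id, wave in records:
        waves_seen[person_id].add(wave)
    # Keep only respondents whose observed waves cover all required waves.
    return sorted(pid for pid, waves in waves_seen.items() if required <= waves)


# Hypothetical toy data: (person id, wave number); person 2 misses wave 2.
toy_records = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2), (3, 3)]
balanced = balanced_panel_ids(toy_records, n_waves=3)
```

With real panel data the same idea would normally be expressed in the analyst's statistical package; the sketch only illustrates the selection logic.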

  18. Large and complex social surveys • several thousand variables • tens of thousands of cases (micro-data) • additional complex survey data features (e.g. household clustering)

  19. Supplementary (digital) data • E.g. ‘Occupational information resources’ (OIRs) = data files with information on occupations, which can be usefully linked to micro-data about occupations, e.g. GEODE acts as a library of OIRs, www.geode.stir.ac.uk. Such resources are often not widely known about, but have the ability to enhance analysis
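Linking an occupational information resource to micro-data, as described in slide 19, amounts to a lookup on a shared occupation code. The sketch below illustrates that idea only: the occupation codes, scores, field names and the `link_oir` function are all hypothetical, and do not reproduce any resource held by GEODE.

```python
# Hypothetical occupational information resource: occupation code -> score.
oir = {"1110": 71.2, "2310": 64.5, "5220": 38.9}

# Hypothetical survey micro-data: one record per respondent.
micro = [
    {"pid": 1, "occ": "2310"},
    {"pid": 2, "occ": "5220"},
    {"pid": 3, "occ": "9999"},  # code absent from the resource
]


def link_oir(micro_rows, oir_lookup, code_field="occ", new_field="occ_score"):
    """Return copies of the micro-data rows with the linked score attached.

    Rows whose occupation code is not in the resource get None, so that
    unmatched cases stay visible instead of being silently dropped.
    """
    linked = []
    for row in micro_rows:
        enriched = dict(row)
        enriched[new_field] = oir_lookup.get(row[code_field])
        linked.append(enriched)
    return linked


linked = link_oir(micro, oir)
unmatched = [row["pid"] for row in linked if row["occ_score"] is None]
```

Keeping the unmatched cases, rather than discarding them, makes it easier to document how complete the linkage was.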

  20. Steady accumulation of options / permutations / approaches in… • Data Management • Pre-analysis (and re-analysis) routines • Sensitivity analysis • Standardisation, harmonisation • Data Analysis • Descriptive tools • Ongoing development of complex analytical models • GLMMs for structural data features, multi-process systems, etc

  21. 4) Log books • Software tools for logging work are increasingly well developed (see our ‘software session 1’ description) • Other initiatives in sharing records of work • E-Stat: Electronic workbooks for the data and model building process • MyExperiment: Depository for project files. These haven’t yet been extensively exploited in survey research – but they should be!
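As a rough illustration of what a log book of data management work might record, the sketch below wraps each processing step so that a description and the case counts before and after are appended to a running log. It is an assumption-laden toy, not a description of the E-Stat or MyExperiment tools: the `logged_step` helper, the log fields and the example data are all hypothetical.

```python
import datetime


def logged_step(log, description, func, data):
    """Apply func to data and append a record of the step to the log."""
    result = func(data)
    log.append({
        "time": datetime.datetime.now().isoformat(timespec="seconds"),
        "step": description,
        "cases_in": len(data),
        "cases_out": len(result),
    })
    return result


log_book = []
ages = [17, 25, 40, 15, 63]  # hypothetical respondent ages

# Step 1: restrict the sample to adults.
adults = logged_step(log_book, "keep respondents aged 18+",
                     lambda xs: [x for x in xs if x >= 18], ages)

# Step 2: recode age into two bands.
banded = logged_step(log_book, "recode age into bands",
                     lambda xs: ["18-39" if x < 40 else "40+" for x in xs],
                     adults)

for entry in log_book:
    print(entry["step"], entry["cases_in"], "->", entry["cases_out"])
```

A record of this kind shows where cases were dropped or recoded, which is the information a later replication would need.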

  22. Well-known challenges in survey research • We’re data rich, but analyst poor • UK Data Forum (2007); Wiles et al (2009) • Under-use of suitably complex statistical models • Coordination and communication on data processing • Recodes / Standardisation / harmonisation / documentation • Lack of generic/accessible representation of tasks • Limited disciplinary/project/researcher cross-over when dealing with data • Specific software orientations. These are not generally problems of scale, but of organisation

  23. ‘Managed’ solutions? • Data handling/analysis capacity-building: ESRC programmes (NCRM, RDI, RMP); training workshops/materials; P/G funds; strategic research grant investment • Documentation/replication policies: Dale (2006) • Software for data access and analysis: NESSTAR – UK Data Archive data/metadata browser; Long (2009) on the Stata software; Remote access to data (e.g. SDS)

  24. ..train and/or constrain the analysts.. Train them ->

  25. ..constrain the analysis..

  26. Summary • E-Science is often seen as being about enabling effective research in conditions of abundant resources • In practical terms, for survey researchers, this means navigating through the vast array of data and analytical resources, and undertaking defensible and replicable work..

  27. A preposterous conclusion… e-Science adoption and the Industrial revolution…? • Landes (1969) The Unbound Prometheus • Knowledge-based revolution • Importance of standardising technology for cooperation (not just creating it) • Importance of having access to underlying materials – coal, cotton, etc. • Uneven development (nationally) Landes, D. S. (1969). The Unbound Prometheus: Technological Change and Industrial Development in Western Europe from 1750 to the Present. Cambridge: Cambridge University Press.

  28. Cardiff’s two transformations Images from: www.lovemywales.com/history.php Cardiff docks c1850 Cardiff docks c2005

  29. References
      Acknowledgements: The ESRC has funded research into e-Social Science via the NCeSS, www.ncess.ac.uk and Digital Social Research http://www.digitalsocialresearch.net/ groups and their related Nodes and grant projects.
      • Calvey, D. (2008). The Art and Politics of covert research: Doing 'situated ethics' in the field. Sociology, 42(5), 905-918.
      • Dale, A. (2006). Quality Issues with Survey Research. International Journal of Social Research Methodology, 9(2), 143-158.
      • Freese, J. (2007). Replication Standards for Quantitative Social Science: Why Not Sociology? Sociological Methods and Research, 36(2), 153-171.
      • Halfpenny, P. (2008, 30 June - 3 July). What is.. e-Social Science. Paper presented at the ESRC NCRM Research Methods Festival, St Catherine's College, University of Oxford.
      • Lambert, P. S., & Gayle, V. (2009). Data management and standardisation: A methodological comment on using results from the UK Research Assessment Exercise 2008. Stirling: University of Stirling, Technical paper 2008-3 of the Data Management through e-Social Science research Node (www.dames.org.uk).
      • Long, J. S. (2009). The Workflow of Data Analysis Using Stata. Boca Raton: CRC Press.
      • Mauthner, N. S., & Doucet, A. (2008). 'Knowledge once divided can be hard to put together again'. Sociology, 42(5), 971-985.
      • Murthy, D. (2008). Digital Ethnography: An examination of the use of new technologies in social research. Sociology, 42(5), 837-855.
      • Prior, L. (2008). Repositioning documents in social research. Sociology, 42(5), 821-836.
      • Savage, M., & Burrows, R. (2007). The coming crisis of empirical sociology. Sociology, 41(5), 885-899.
      • UK Data Forum. (2007). The National Strategy for Data Resources for Research in the Social Sciences. Warwick: University of Warwick, http://www2.warwick.ac.uk/fac/soc/nds/ (Accessed 18 June 2007).
      • Wiles, R., Bardsley, N., & Powell, J. L. (2009). Consultation on research needs in research methods in the UK social sciences. Southampton: University of Southampton / ESRC National Centre for Research Methods, and http://eprints.ncrm.ac.uk/810/
      • Williams, M., Collett, T., & Rice, R. (2004). Baseline Study of Quantitative Methods in British Sociology. University of Plymouth: C-SAP Project report to the British Sociological Association.
      • Williams, M., Payne, G., Hodgkinson, L., & Poade, D. (2008). Does British Sociology Count. Sociology, 42(5), 1003-1021.
