
What is ESDS?

Using ESDS data in Linguistics and NLP. Dr. Kakia Chatsiou, ESDS/UK Data Archive. Language and Computation Group Day, 07 Oct 2011. http://lac.essex.ac.uk/lacday2011. What is ESDS? Economic and Social Data Service: national data archiving and dissemination service (since January 2003).


Presentation Transcript


  1. Using ESDS data in Linguistics and NLP • Dr. Kakia Chatsiou, ESDS/UK Data Archive • Language and Computation Group Day, 07 Oct 2011 • http://lac.essex.ac.uk/lacday2011

  2. What is ESDS? • Economic and Social Data Service • national data archiving and dissemination service (since January 2003) • access and specialist support for key economic and social data resources to UK Higher and Further Education users • brings together centres of expertise in data creation, dissemination, preservation and use in Manchester and Essex • managed by the UK Data Archive (established in 1967); jointly supported by Economic and Social Research Council (ESRC) & Joint Information Systems Committee (JISC)

  3. http://www.esds.ac.uk

  4. ESDS in numbers • 6,000 datasets in the collection • 230 new datasets added each year • over 22,000 registered users • approximately 60,000 downloads worldwide p.a. • 3,000+ user support queries

  5. Data collections we hold Through our dedicated services we provide access to: • surveys • government data • aggregate statistics • censuses • international data • longitudinal data • qualitative data - multimedia data sources • historical data

  6. ESDS Linguistics data offers • From ESRC grants • 19 accepted • rest unable to accept (due to confidentiality or size reasons) or referred to more suitable archives (e.g. Oxford Text Archive, CHILDES/Talkbank database) • increase in depositing after researcher self-archive (UKDA-Store) launch

  7. ESDS data holdings on linguistics & related fields • 40 main catalogue data collections with the language and linguistics subject category, accessible from the main ESDS Data Catalogue (14 qualitative, 18 quantitative, 8 historical) • all qualitative studies comprising in-depth interview transcripts or audio recordings can be used as corpus material or data sources for secondary analysis, e.g. Family Life And Work Experience Before 1918 (Edwardians) (SN 2000), Pioneers interview collections • 13 UKDA-Store data collections with ‘linguistics’ as the primary discipline.

  8. Examples of ESDS data collections with subject term “Language and Linguistics”

  9. Examples of linguistics data holdings in UKDA-Store

  10. Linguist users of ESDS data • 51 self-reported linguists (out of around 22,000) • about 30 of these downloaded ESDS data, the majority of them being survey data, then qualitative interviews and a few historical data downloads • the rest might well have accessed documentation, study methods and instruments about studies (but since these do not require registration, we cannot report usage)

  11. How linguists have used ESDS data • a researcher and their team based at the University of Sheffield used 2 audio collections for analysis of speech patterns (SN 2000 - Edwardians, SN 5407 - Health And Social Consequences Of The Foot And Mouth Disease Epidemic In North Cumbria) • an ESRC joint project between the UK Data Archive and the Language Processing team at the University of Edinburgh used three classic social science collections to test natural language processing tools. They looked at named entity recognition on typical social science interview data. Person-based identification enabled the testing of an anonymisation tool.
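As a rough illustration of what person-based anonymisation of interview text involves, the sketch below replaces known person names with stable placeholders. It is not the tool tested in the project above and it performs no named entity recognition: the names are assumed to be supplied by hand, and the function name `pseudonymise` and the `[PERSON_n]` placeholder format are invented for this example.

```python
import re

def pseudonymise(text, names):
    """Replace each listed name with a stable placeholder such as [PERSON_1].

    `names` is a hand-supplied list; a real pipeline would obtain it from a
    named entity recogniser, which this sketch does not include.
    """
    if not names:
        return text, {}
    mapping = {}
    # Longest names first so "Mary Smith" is matched before "Mary".
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in ordered) + r")\b")

    def replace(match):
        name = match.group(1)
        if name not in mapping:
            mapping[name] = f"[PERSON_{len(mapping) + 1}]"
        return mapping[name]

    return pattern.sub(replace, text), mapping

# Invented example sentence, not taken from any ESDS collection.
sample = "Mary Smith met John. Later John phoned Mary Smith."
anonymised, key = pseudonymise(sample, ["John", "Mary Smith"])
print(anonymised)
```

The same name always maps to the same placeholder, so references to a speaker stay traceable within a transcript after the names are removed.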

  12. ESDS data uses by Linguists • a JISC project between EDINA and the UK Data Archive using the HISTPOP collection at the UK Data Archive to augment resource search and discovery methods. • data and metadata were fed to GeoDigRef and LTG GeoParser • the enriched data were embedded in an experimental geographical service by EDINA • allows users to search resource collections via a map-based interface, which provides links back to the reference of the place-name in the original resource

  13. That sounds interesting! Where to look for relevant data? ESDS data catalogue (homepage) Some of these options can be used to find data: • search the ESDS Catalogue (simple or advanced search) • search variables • browse Major Studies list • browse the latest releases

  14. Finding data: Searching the Data Catalogue

  15. Finding data: Sample data catalogue record

  16. Finding Data: Sample Documentation

  17. Where to find more data

  18. Finding Data: our researcher self-archiving UKDA-Store

  19. Accessing data • Documentation is freely available to anyone • Users must be registered with ESDS to download/access data • You can use your university username & password to register • Access to some data is limited to users at UK Higher or Further Education Institutions • ESDS currently has approx. 22,000 registered users

  20. How to access data • register with ESDS • agree to the terms & conditions of the End User Licence • select the dataset from the Data Catalogue and click ‘Download/Order’ • specify a usage/project for which the data are to be used • then: • download data selecting your preferred format (SPSS, Stata, TAB etc.) or • place an online order for the data • for more see http://www.esds.ac.uk/support/e2.asp
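Once a dataset has been downloaded in TAB (tab-delimited text) format, it can be read with standard tools. The following is a minimal sketch using only the Python standard library; the column names and values are invented stand-ins for a downloaded file, not real ESDS data, and `read_tab_file` is a hypothetical helper name.

```python
import csv
import io

def read_tab_file(handle):
    """Read tab-delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)

# Invented sample standing in for a downloaded TAB file; not real ESDS data.
SAMPLE = "respondent\tage\tregion\n1\t34\tEssex\n2\t51\tCumbria\n"

rows = read_tab_file(io.StringIO(SAMPLE))
ages = [int(row["age"]) for row in rows]  # every value arrives as a string
print(len(rows), sum(ages) / len(ages))
```

For a real download, the same function can be passed a file opened with `open(path, newline="")`, where `path` is wherever the TAB file was saved. SPSS and Stata downloads need format-specific readers instead.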

  21. How to access data

  22. Teaching resources • ESDS can help provide support in many areas of teaching and research methods • teaching datasets • thematic guides, e.g. on health and crime • guides on: • data collection and use • data sharing and data management • confidentiality, consent and ethics issues • survey and research design and analysis • software for analysing data • case studies of re-use • training events and workshops • recently involved in creating formal assessments based on Qualitative data collections (TALIF grant with Dept of Sociology, Essex)

  23. Workshops and training • Thematic data resources events • Help with using data: • specific datasets • data handling skills • methodological issues • analytical skills - introductory and advanced level • We are pro-active and re-active, so ask us if you want to have a workshop! • Forthcoming events: http://www.esds.ac.uk/news/esdsforthevents.asp

  24. Other UK Data Archive services

  25. Thank you! Questions?

  26. References • Corti, Louise. (2011, 11 Jan). Report on Linguists’ use of ESDS. ESDS/UK Data Archive.
