Social Science Datasets and Digital Resources http://www.slideshare.net/johnkayebl
Overview • British Library Datasets Strategy • UK Data Service • Census Resources • Spatial Data • Open Data • UK Web Archive • Other Data and Resources • Tools, Software and Visualisation • Identifying, Citing and Sharing Data
What is a dataset? • Seismic measurements taken by a geologist. • Genetic data collected by a medical researcher. • A survey of public opinions collected by a sociologist. • A collection of tweets about events
The Foundation for Research • Data is a crucial component of the scholarly record. • Re-acquisition may be impossible • Datasets are essential to the British Library’s mission to advance the World’s knowledge.
The British Library Datasets Strategy We envision a future where researchers can: • Discover, access, reuse, and reference datasets. • Track the impact of the data that they generate and receive appropriate credit. Our approach is to: • Provide a focus for the community to establish needs, requirements and agreement. • Explore novel technology and creative solutions.
UK Data Service http://ukdataservice.ac.uk/ • Data search and download • Research method guides • Thematic guides • Online analysis • Secure Data Service http://securedata.data-archive.ac.uk/ • Administrative Data Service
UK Data Service • Government • large-scale government surveys, such as the Labour Force Survey and the General Household Survey • International • multi-nation databanks, such as World Bank's World Development Indicators, and survey data including Eurobarometer • Longitudinal • major UK surveys following individuals over time, such as the British Household Panel Survey and Birth Cohort Studies • Qualidata • a range of multimedia qualitative data sources • new portal (UK Quali Bank) to be launched Dec 2013
2011 Census • Data available on www.ons.gov.uk - latest release is output area key statistics • Academic releases 1971 - 2011 are made available via http://census.ukdataservice.ac.uk/ • Experian Geodemographic Data http://cdu.mimas.ac.uk/experian/index.htm
Previous Censuses • Data available for 1981-2011 on http://www.nomisweb.co.uk/ • Academic data release from 1971 to 2011 on Casweb (also contains geographic boundary data) http://census.ukdataservice.ac.uk/ • Histpop – Online Historical Population Reports (OHPR) provides online access to population reports for Britain and Ireland from 1801 to 1937 http://www.histpop.org/ • Look at changes between census questions, structures and geographies
BL Official Publications Collection – Census Reports • UK Census Reports • BL holds statistical reports relating to each census. • Reports for 1921-1991 in the reading room on open shelves • National and county aggregate reports for England and Wales, Scotland, Northern Ireland and Great Britain • Aggregate statistical information at each level for all census questions • Complements Histpop, which has digitised reports between 1801 – 1937, and Casweb: 1971 – 2001 • Some older reports can be found in parliamentary papers
Maps • The library holds a number of maps generated with census and population data from the UK and all over the world Augustus Petermann, Map of the British Isles, elucidating the distribution of the population based on the 1841 census. London, 1861. Ireland map for railways
Spatial Data Edina Digimap and UK Borders http://edina.ac.uk/digimap/ http://edina.ac.uk/ukborders/ Go Geo! Search http://www.gogeo.ac.uk/cgi-bin/index.cgi
Spatial Data Ordnance Survey Open Data http://www.ordnancesurvey.co.uk/oswebsite/products/os-opendata.html Landmap http://landmap.mimas.ac.uk/
UK Government Open Data • http://data.gov.uk/ • Admin and Statistical data portal • Office for National Statistics • http://www.statistics.gov.uk/default.asp • http://www.neighbourhood.statistics.gov.uk/dissemination/ • https://www.nomisweb.co.uk/Default.asp • National Digital Archive of Datasets • http://www.ndad.nationalarchives.gov.uk/ • Regional • http://data.london.gov.uk/ • http://datagm.org.uk/
International open data • United Nations • http://data.un.org/ • European Union • http://epp.eurostat.ec.europa.eu/portal/page/portal/eurostat/home/ • OECD • http://www.oecd.org/statsportal/ • World Bank • http://data.worldbank.org/ • IMF • http://www.imf.org/external/data.htm • Public Data EU • http://publicdata.eu/
UK Web Archive http://www.webarchive.org.uk • Selective Web Archive • over 11,000 websites collected since 2004 • over 50,000 instances • Over 16TB of compressed data • British Library, National Library of Wales, JISC • Also National Library of Scotland, the National Archives, Wellcome Library • Many collaborators • e.g. Women’s Library, Live Arts Development Agency, Quakers in Britain
UK Web Archive - event-based special collections Collect, preserve, and make accessible websites of cultural and scholarly importance from the UK domain
JISC UK Web Domain Dataset (1996-2010) • Funded by JISC to create a research collection of UK websites • Collaboration between the Internet Archive, JISC and the British Library • Copy of the subset of the Internet Archive’s web collection that relates to the UK • 470466 files, mostly arc.gz, with 4494 warc.gz. • Total size: 32TB • No local access – access is possible through the Internet Archive • Can be used to generate secondary datasets and make these available • Analytical access is the main route
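As a rough illustration of what a secondary dataset derived from a web archive might look like, the sketch below counts the distinct hosts captured in each crawl year. The capture records are invented placeholders rather than content from the JISC dataset, the 14-digit timestamp layout (YYYYMMDDhhmmss) is an assumption about how captures are indexed, and the function name hosts_per_year is hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical capture records: (14-digit timestamp, URL). Not real data.
CAPTURES = [
    ("19961231235959", "http://www.example.co.uk/index.html"),
    ("19970105120000", "http://www.example.co.uk/about.html"),
    ("19970212093000", "http://news.example.org.uk/"),
    ("20100630180000", "http://www.example.co.uk/"),
]


def hosts_per_year(captures):
    """Return {year: number of distinct hosts captured in that year}."""
    hosts = defaultdict(set)
    for timestamp, url in captures:
        year = int(timestamp[:4])          # first four digits are the year
        host = urlsplit(url).hostname      # None if the URL has no host part
        if host:
            hosts[year].add(host.lower())
    return {year: len(names) for year, names in sorted(hosts.items())}


if __name__ == "__main__":
    for year, count in hosts_per_year(CAPTURES).items():
        print(year, count)
```

A table like this is small enough to share openly even when the underlying archive files cannot be distributed, which is the point of the analytical-access route.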
Other Data and Resources • Guardian Data Store • http://www.guardian.co.uk/data-store • Financial Times • http://www.ft.com/home/uk • Economist Intelligence Unit • http://www.eiu.com/Default.aspx • UK Government Web Archive • http://www.nationalarchives.gov.uk/webarchive/
Other Data and Resources • The Mass Observation Archive • Specialises in material about everyday life in Britain. It contains papers generated by the original Mass Observation social research organisation (1937 to early 1950s), and newer material collected continuously since 1981 • http://www.massobs.org.uk/index.htm • A Vision of Britain through Time • Contains historical Maps, Census Reports, Election reports and other historical material, searchable by local area. • http://www.visionofbritain.org.uk/ • Charles Booth Online Archive • Gives access to archive material from the Booth collections of the London School of Economics and Political Science and the Senate House Library • http://booth.lse.ac.uk/ Images from The Mass Observation Archive
Analysis Tools and Software • Statistical - SPSS, STATA, R (open source) • GIS - ArcGIS, MapInfo, Quantum GIS (open source) • Excel • Online Tools
Examples of Online Analysis Tools • UK Data Service NESSTAR • http://nesstar.esds.ac.uk • ESDS Spatial Tools • http://www.ccsr.ac.uk/esds/gis/ • Economists Online Dataverse • http://dvn.iq.harvard.edu/dvn/dv/NEEO • United Nations • http://data.un.org/Explorer.aspx • London Profiler • http://www.londonprofiler.org/ • London Heat Map • http://www.londonheatmap.org.uk/Mapping/
Online Mapping Tools using Google Maps • MapTube • http://www.maptube.org/ • Google Drive, KML and Google Earth • https://drive.google.com • Gmap Creator • http://www.casa.ucl.ac.uk/software/gmapcreator.asp • Other, more advanced online mapping (requires coding): • Open Layers http://openlayers.org/ • OS Openspace http://www.ordnancesurvey.co.uk/oswebsite/web-services/os-openspace/index.html
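For the KML route mentioned above, a minimal sketch of writing point data as KML with Python's standard library might look like the following. The function name build_kml and the sample point are assumptions for illustration, and the coordinates are approximate; a real project would take its points from a geocoded dataset.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(places):
    """Build a KML document string from (name, longitude, latitude) tuples."""
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat in places:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Illustrative point only; the coordinates are approximate (an assumption).
PLACES = [("British Library (approximate)", -0.127, 51.530)]

if __name__ == "__main__":
    print(build_kml(PLACES))
```

Saving the printed output with a .kml extension should give a file that Google Earth and similar viewers can open.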
Large and Big Data • Traditional Tools don’t work! • University Resources? • Cloud Services (Amazon or other) • Coding languages http://www.dominoup.com/
Data Visualization • Presenting data in a useful and interesting manner • Allowing concepts to be easily understood • Lots of examples online e.g: • http://flowingdata.com/ • http://datavisualization.ch/ • http://www.guardian.co.uk/news/datablog
DataCite • DataCite is an international consortium which aims to: • Establish easier access to research data on the Internet • Increase acceptance of research data as legitimate, citable contributions to the scholarly record • Support data archiving that will permit results to be verified and re-purposed for future study • http://datacite.org/
Connecting an Article with the Underlying Data • URLs are not persistent • (e.g. Wren JD: URL decay in MEDLINE - a 4-year follow-up study. Bioinformatics. 2008, Jun 1;24(11):1381-5). Digital Object Identifiers (DOIs) offer a solution • Most widely used identifier for scientific articles • Researchers, authors, publishers know how to use them • Put datasets on the same playing field as articles • Dataset • Yancheva et al (2007). Analyses on sediment of Lake Maar. PANGAEA. • doi:10.1594/PANGAEA.587840
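To show how a DOI turns into a persistent link, here is a minimal sketch that normalises a DOI string and builds a resolver URL and a citation line in the layout used on this slide. The helper names (normalise_doi, doi_url, cite_dataset) are hypothetical, and the pattern check is a deliberate simplification rather than a full validation of DOI syntax.

```python
import re

# Simplified shape check: "10.", a registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

PREFIXES = ("https://doi.org/", "http://doi.org/", "http://dx.doi.org/", "doi:")


def normalise_doi(raw):
    """Strip common prefixes and return the bare DOI, or raise ValueError."""
    doi = raw.strip()
    for prefix in PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {raw!r}")
    return doi


def doi_url(raw):
    """Return the resolver URL for a DOI."""
    return "https://doi.org/" + normalise_doi(raw)


def cite_dataset(creators, year, title, publisher, raw_doi):
    """Format a dataset citation: Creators (Year). Title. Publisher. URL"""
    return f"{creators} ({year}). {title}. {publisher}. {doi_url(raw_doi)}"


if __name__ == "__main__":
    print(cite_dataset("Yancheva et al", 2007,
                       "Analyses on sediment of Lake Maar",
                       "PANGAEA", "doi:10.1594/PANGAEA.587840"))
```

Because the resolver URL is built from the identifier rather than from the hosting site's address, the citation should keep working if the data centre reorganises its web pages.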
Open Researcher and Contributor ID (ORCID) http://about.orcid.org/ • Infrastructure is being created for researchers to build up an open portfolio of research objects
Open Researcher and Contributor ID (ORCID) Register an ORCID ID www.orcid.org and link published papers and data (and anything!) using ORCID’s tools
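When ORCID iDs are collected alongside papers and datasets, typing errors can be caught before linking. The sketch below checks the final character of an iD, assuming the ISO 7064 MOD 11-2 checksum scheme that ORCID describes for its identifiers. The function names are hypothetical, and the sample iD is, to the best of our knowledge, the fictitious example used in ORCID's own documentation.

```python
def orcid_check_digit(base_digits):
    """Compute the check character for the first 15 digits of an ORCID iD."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the iD has 16 characters and a matching check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


if __name__ == "__main__":
    print(is_valid_orcid("0000-0002-1825-0097"))
```

A passing check only shows that the iD is well formed; it does not confirm that the iD is registered or belongs to a particular researcher.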
Sharing Data - Figshare Non-published outputs (working papers, datasets) can be deposited in figshare http://figshare.com/, given a DataCite DOI, linked back, and added to an ORCID profile • ODIN wants to expand on this principle and engage with data centres and institutional repositories to allow easier, more open discovery of non-traditional research outputs.
Impact of Data • View the impact of your work using traditional citation metrics and social citations http://www.impactstory.org/
Impact and Discovery http://odin-discover.eu/
Depositing and Archiving Data • Why Archive? • Institutional Repositories • UK Data Archive/ESDS • Metadata and Code!
Contact Details John Kaye Lead Curator – Digital Social Science Social Sciences The British Library 96 Euston Road London NW1 2DB Telephone: 020 7412 7450 Email: john.kaye@bl.uk Twitter: @johnkayebl http://britishlibrary.typepad.co.uk/socialscience/ Slides - http://www.slideshare.net/johnkayebl