Adding Value to Data and Information: Moving towards a Science Commons? Dr Liz Lyon, Director, UKOLN. Science Commons Workshop, Brussels, September 2006. This work is licensed under a Creative Commons Attribution-ShareAlike 2.0 Licence. Scholarship today? OA landscape.
Architecture of Participation? http://www.flickr.com/photos/dmclean/239158788/in/photostream/ 15 September 2006
Reference datasets as infrastructure? Data-centric 2020 vision
(Very simple) e-Research Cycle • Formulate hypothesis / ideas, test, experiment, observe: data creation, collection & capture • Data management, storage & validation: description, deposit, self-archiving, preservation, certification • Scholarly communications: data disclosure, publication, citation, discovery, re-use • Adding value: data linking, annotation, visualisation, simulation • (New) knowledge extraction: data mining, modelling, analysis, synthesis • Data processing (between stages) • e-Infrastructure, open access, collaboration
Understanding the research process: workflows • UK JISC-funded activity • Project StORe: Source-to-Output Repositories (Edinburgh) • RepoMMan: Repository Metadata and Management (Hull) • Primary data : research publications • Survey questionnaire, activity diagrams e-Scientist desktop? Slide: Carole Goble
Deposit scenario (…part of….) (diagram labels: raw data, derived data, results data) • Produce strategy for synthesis (=idea) • Submit plan to SmartTea system (incl. identifiers) • Retrieve and follow instructions (sub-workflow?) • Experimental synthesis metadata automatically recorded on instruments (Smart Lab) • Create record for synthesised sample (+ proposed chemical identifier) in R4L laboratory data management system • Run spectral analyses on sample, capturing further analysis metadata (incl. time-stamp, analysis software version, researcher details etc.) • Save spectrum in native and common formats • Invoke R4L data capture service and deposit files + metadata in laboratory repository…
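The last three steps of the scenario (capture analysis metadata, save the spectrum in two formats, deposit files + metadata) can be sketched roughly as below. This is only an illustration: the `AnalysisRecord` class, the `build_deposit` function, the JSON layout and the file names are all hypothetical and are not the R4L data capture service or its actual interface.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json


@dataclass
class AnalysisRecord:
    """Hypothetical metadata for one spectral analysis run on a sample."""
    sample_id: str
    researcher: str
    software_version: str
    files: list[str]
    # Time-stamp is recorded automatically unless one is supplied.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def build_deposit(record: AnalysisRecord) -> str:
    """Package the file list and its metadata as one JSON deposit payload."""
    if not record.files:
        raise ValueError("a deposit needs at least one data file")
    metadata = asdict(record)
    files = metadata.pop("files")
    return json.dumps({"files": files, "metadata": metadata}, indent=2)


# Example: one spectrum saved in a native and a common format
# (both file names are made up for illustration).
example = AnalysisRecord(
    sample_id="sample-001",
    researcher="A. Researcher",
    software_version="1.0",
    files=["spectrum.raw", "spectrum.jdx"],
)
payload = build_deposit(example)
```

Keeping the metadata next to the file list in a single payload mirrors the "files + metadata" wording of the slide; a real service would define its own packaging.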
The R4L Repository • Create new compound • Add experiment data and metadata • Deposit • Search / Browse Slide: Simon Coles
eBank UK Project http://www.ukoln.ac.uk/projects/ebank-uk/ • Promoting open access data in an institutional repository • Adding value through linking from data to derived publication • Embedding data service in learning workflows: pedagogy • UKOLN (lead), University of Southampton, University of Manchester
Federation model (diagram labels) • Data creation & capture in “Smart lab” • e-Research workflows • Laboratory repository • Institutional data repositories • e-Crystals • Deposit • Validation • Harvest • Search, harvest • Aggregator services • Presentation services: portals • Data discovery, linking, citation • Data analysis, transformation, mining, modelling • Publication (Chemistry Central) • Publishers: peer-review journals, conference proceedings • Data curation & preservation: databases & databanks • Linking, citation
Digital repositories, OA & preservation • Long-term access: trust, responsibility, policy • Trusted DR Audit Checklist for Certification (draft), Research Libraries Group-NARA Taskforce, 2005 • Self-certification: DINI-Zertifikat • UK Digital Curation Centre: advice, tools & services http://www.dcc.ac.uk/ • RepInfo Registry • EU CASPAR Integrated Project http://www.casparpreserves.info/pages/1/index.htm • Task Force on the Permanent Access to the Records of Science http://tfpa.kb.nl/
Data, metadata and interdisciplinary discovery • Validation, publication & discovery of data models & schema • Metadata packaging standards • METS, MPEG 21 DIDL • Complex object model? • Semantic descriptions • Formal high-level and domain ontologies • ePrints DC Application Profile http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_Application_Profile • eBank Application Profile crystallography data http://www.ukoln.ac.uk/projects/ebank-uk/schemas/ • UK Intute IR search service (eprints) • Informal social network approaches “folksonomies”
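As a rough illustration of what a semantic description of a dataset might look like, the sketch below serialises a few simple Dublin Core elements as XML. It is not the ePrints DC Application Profile or the eBank Application Profile; the `dublin_core_record` function, the wrapping `metadata` element and all field values are assumptions made for the example. Only the Dublin Core element namespace is standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict[str, str]) -> str:
    """Serialise a flat mapping of DC element names to values as XML."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only; they do not describe a real deposited dataset.
record_xml = dublin_core_record(
    {
        "title": "Crystal structure data for an example compound",
        "creator": "A. Researcher",
        "type": "Dataset",
        "subject": "crystallography",
    }
)
```

An application profile would go further than this, constraining which elements are required and which vocabularies their values come from.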
Persistent identifiers for data citation • How will they be used? We need use cases: depositor, author, service provider, researcher, publisher? • Schemes: DOI, Handle, ARK, PURL • Publication & citation of scientific primary data project: National Library for Science & Technology (TIB), University of Hanover, Germany. STD-DOI Project: DOI registry for datasets http://www.std-doi.de • eBank exemplar • DOIs from TIB http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145 • Data citation policy http://ecrystals.chem.soton.ac.uk/rights.html
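To make the DOI example concrete, the sketch below splits a DOI into its registrant prefix and item suffix and builds the resolver URL in the form shown on the slide. The function names `split_doi` and `doi_to_url` are illustrative, and the validation is deliberately minimal (it only checks for a `10.` prefix and a non-empty suffix).

```python
from urllib.parse import quote

# Resolver address as used in the slide's example URL.
RESOLVER = "http://dx.doi.org/"


def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI at the first '/' into (prefix, suffix)."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def doi_to_url(doi: str) -> str:
    """Build a resolver URL, percent-encoding unusual suffix characters."""
    prefix, suffix = split_doi(doi)
    return f"{RESOLVER}{prefix}/{quote(suffix, safe='/')}"


# The eBank dataset DOI from the slide.
ebank_doi = "10.1594/ecrystals.chem.soton.ac.uk/145"
ebank_url = doi_to_url(ebank_doi)
```

Because the suffix may itself contain slashes, as in the eBank example, the split is made at the first `/` only.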
Discovering data • Domain identifier: International Chemical Identifier (INChI) code • Google molecule using INChI • Reference: Coles, S.J., Day, N.E., Murray-Rust, P., Rzepa, H.S., Zhang, Y., Org. Biomol. Chem., 2005, (10),1832-1834. DOI: 10.1039/b502828k • Slide from Simon Coles
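A domain identifier of this kind works for discovery because it is a plain text string that a search engine can index. As a simplified sketch, the code below splits an identifier in the present-day InChI string layout into its layers. The `inchi_layers` function is hypothetical, and it ignores complications such as a missing formula layer or layer prefixes that repeat, so it should not be read as a full parser.

```python
def inchi_layers(inchi: str) -> dict[str, str]:
    """Split an InChI string into version, formula and prefixed layers."""
    marker = "InChI="
    if not inchi.startswith(marker):
        raise ValueError(f"not an InChI string: {inchi!r}")
    parts = inchi[len(marker):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        # Simplification: assume the second segment is the formula.
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # Each later layer starts with a one-letter prefix,
            # e.g. 'c' (connections) or 'h' (hydrogens).
            layers[part[0]] = part[1:]
    return layers


# Ethanol, used purely as a familiar example molecule.
ethanol = inchi_layers("InChI=1S/C2H6O/c1-2-3/h3H,1-2H3")
```

Searching for the exact string then amounts to a quoted-phrase query on the identifier.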
Adding value: repository services • Tools: for deposit, normalisation, manipulation, transformation….. • Linking, annotation, visualisation • Aggregators: generic, (sub-) disciplinary • Knowledge extraction: • Mining (data, text, structures) • Modelling (economic, climate, mathematical, biological…) • Analysis (statistical, lexical, gene….)
Linking research to learning - embedding eBank aggregator service in a science portal for student learners • MChem course • Assess role in Undergraduate Chemical Informatics courses • Pedagogic evaluation • Report to be published.
NaCTeM http://www.nactem.ac.uk/ Emerging tools: TerMine, GENIA, Cafetiere Nature 23 March 2006 OTMI: Open Text Mining Interface
Avian flu outbreaks mashup - Nature January 2006 Data from FAO, WHO… +Google Earth
Thank you. UKOLN receives core funding from the Joint Information Systems Committee (JISC) and the Museums, Libraries & Archives Council (MLA) and is based at the University of Bath, UK.