WWW2006 repositories workshop Linking research papers and research data: possibilities for a generic solution
Outputs & benefits
• The identification of workflows, norms and perceived problems in the use of source and output repositories, and of common attributes across disciplines
• A generic technical specification for functional enhancements to source and output repositories, identified from a survey of active researchers
• A pilot system that demonstrates the linking of holdings in a source repository (the UK Data Archive) to research papers stored in output repositories
• The ability to track more conclusively the use and influence of one's published research
• A structured means of surveying research publications and their associated source data across an entire discipline or within a specific research theme
• An environment with added value: output repositories that link to their sources and source repositories that link to their outputs will expand the opportunities for dissemination of research and scholarship
WWW2006RepoWorkshop
Constituency
Seven scientific disciplines surveyed:
• Archaeology, Astronomy, Biochemistry, Biosciences, Chemistry, Physics, Social Policy
• Academic researchers (staff & PGs), independent researchers, government
• 3,700 e-mail invitations despatched
• 377 online respondents (10%)
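As a quick check on the headline figure, the response rate follows directly from the two counts quoted on the slide. This is only a small arithmetic sketch; the variable names are illustrative and not part of the survey material.

```python
# Counts taken from the Constituency slide.
invitations_sent = 3700    # e-mail invitations despatched
online_respondents = 377   # online respondents

# Response rate as a fraction of invitations sent.
response_rate = online_respondents / invitations_sent

print(f"Response rate: {response_rate:.1%}")
```

The result is consistent with the rounded 10% shown on the slide.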
Response rates
Endorsement - 1
Endorsement - 2
Who uses source repositories?
Frequency of submission
Who uses the Other source repositories and what are they?
• Archaeology: English Heritage, Portable Antiquities Database
• Astronomy (40% of total ‘Other’): NED (NASA-IPAC Extragalactic database), CDS (Centre de Donnees Stellaires), ADS (Harvard), SIMBAD, NASA data archives, Advanced Camera for Surveys Science Archive, Very Large Array Archive, European Southern Observatory Archive, etc.
• Biosciences: BioMagResBank
Source data formats
The 76 significant others?
LaTeX, .cc source code, .cif (crystallographic data), .pdb, .mtz, .pool, .root, .raw, .swf, .fla, .mpg, binary files, ChemDraw .cdx, XWIN-NMR files, .ps files, MassLynx files, Mathematica, derived data in PAW-format ntuples, raw mass spectrometry data, Kanga, X-ray diffraction data, KaleidaGraph files, ATLAS.ti hermeneutic unit files, C++/shell scripts, Fourier induction decay files, spectra, TeX source (math), etc.
Who assigns metadata?
Archaeology respondents refer in some cases to the use of thesauri, Dublin Core, etc.
Key metadata
The Other metadata
Some examples:
• Archaeological period, artefact material, artefact type, conservation method
• Celestial object, position and observation date
• Chemical entity, chemical identifier (InChI)
• Description of the instrument operating mode
• Description of GIS processes applied, min/max co-ordinates, cell resolution for raster data
• Description of experimental conditions under which the data was generated
• Experimental method used
• Protein sequence
Output repositories
Evolving strategy
These early indications from the StORe questionnaire confirm a strategy in which
• the pilot middleware could provide a broad core generic solution;
• the middleware must be capable of accepting a limited number of discipline-specific add-ons;
• a standard platform for metadata can be established to reflect a large proportion of practices and needs.
In addition, further analysis is determining that
• cross-discipline data requirements must be met for output and source data;
• a range of different attitudes to data sharing will have to be supported by effective validation if repositories are to be accepted and effective;
• improved online support is expected to be the most appropriate and economical means of meeting expectations for help;
• there are indications of a considerable lack of awareness of repositories amongst academic staff and postgraduates.
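One way to picture the strategy above (a generic core, a limited number of discipline-specific add-ons, and links running in both directions between source and output repositories) is the minimal sketch below. It is an illustration only and does not describe the actual StORe middleware: the `RepositoryRecord` class, the `link` function, the identifiers and the field names are all hypothetical, with the add-on fields loosely modelled on the "Other metadata" examples reported by respondents.

```python
from dataclasses import dataclass, field


@dataclass
class RepositoryRecord:
    """A holding in either a source (data) or an output (paper) repository."""
    identifier: str
    title: str
    discipline: str
    # Discipline-specific add-on fields, kept apart from the generic core.
    extensions: dict = field(default_factory=dict)
    # Identifiers of linked holdings in the other kind of repository.
    links: set = field(default_factory=set)


def link(source: RepositoryRecord, output: RepositoryRecord) -> None:
    """Record the relationship in both directions."""
    source.links.add(output.identifier)
    output.links.add(source.identifier)


# Hypothetical example records; none of these values come from the survey.
dataset = RepositoryRecord(
    identifier="source:example-001",
    title="Example observation set",
    discipline="Astronomy",
    extensions={"celestial_object": "example object",
                "observation_date": "2006-01-01"},
)
paper = RepositoryRecord(
    identifier="output:example-042",
    title="Example paper",
    discipline="Astronomy",
)

link(dataset, paper)
print(dataset.links)
print(paper.links)
```

Keeping the add-on fields in a separate mapping means the core record stays the same for every discipline, while each discipline can carry its own descriptors without changing the shared structure.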
Questions?