Managing Data in Difficult Times
Repositories Update (UK)
Peter Burnhill, Director, EDINA National Data Centre, University of Edinburgh, Scotland UK
JISC/CNI Conference, Edinburgh, 1 & 2 July 2010
Overview
policies/strategies/technologies/infrastructure to manage research/teaching
• Scope
  • Digital repositories at the level of the institution (for itself), and at a level above the campus: for institutions, for the UK, for much, much more
  • within the European and wider international context
  • in support of research, learning & teaching … and management
• Having voice as …
  • a provider of common services and national infrastructure [EDINA]
  • a user of repository software [Eprints, DSpace, IntraLibrary]
  • a member of SONEX and indirectly of COAR and UK-CORR
• and focus on repository-related progress in the UK since the last JISC/CNI: where is the value, and how is this assessed/expressed?
  • Size of investment in recent times
  • Cost-effectiveness and ‘impact’ of provision
  • Effort at institutional & inter/national level and the ‘shared services’ agenda?
• Wondering what Dorothea said next …
Managing Data in Difficult Times
Nostalgia for interesting but not difficult times?
• JISC Repositories & Preservation Programme, April 2006 to March 2009: “£14m investment in H.E. repository and digital content infrastructure”
• This included the JISC RepositoryNet, as four ‘support services’:
  • Repository Support Project
  • Repository Research Project
  • Intute Repository Search
  • ‘interim repository’ | Prospero | the Depot | OpenDepot
• Checking the JISC website today:
  • under the heading of ‘key digital repository activities’ are 21 funding programmes and 216 funded projects, including some that are just being awarded
• … & then there is:
  • OR10: Open Repositories Conference, 6-9 July 2010, Madrid
  • RepoFringe2010: Repository Fringe, 2-3 September, Edinburgh
  • and several others
R is for Repository
• What are Repositories?
• Facility/technology to support at least three basic types of service:
  • PUT: a service interface that allows one or more use communities to deposit/issue digital content (+ metadata on that content)
  • KEEP: a service that ensures the integrity of that content, for the life of the repository
  • GET: a service interface that allows one or more use communities to search/extract that content
• Use community: persons or machines/software; appropriate interface
• Digital Repositories Review (R. Heery and S. Anderson, 2005)
• Digital repository differs from other digital collections in that:
  • "content is deposited, whether by content creator, owner or third party
  • architecture manages content as well as metadata;
  • repository offers a minimum set of basic services [put, get, search, access control]
  • must be sustainable & trusted, well-supported & well-managed."
• "a university-based institutional repository is a set of services … for the management and dissemination of digital materials created by the institution and its community members. … an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access ..." (C. Lynch, 2003)
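The PUT / KEEP / GET split above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `MiniRepository`, `Item`, the checksum-derived identifier and the exact-match search are all hypothetical names and choices, not taken from Eprints, DSpace or IntraLibrary. KEEP is reduced here to a fixity (checksum) check, which is one common way to test content integrity.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Item:
    content: bytes
    metadata: dict
    checksum: str  # SHA-256 of the content, recorded at deposit time


@dataclass
class MiniRepository:
    """Hypothetical in-memory repository; not modelled on any real product."""
    _items: dict = field(default_factory=dict)

    def put(self, content: bytes, metadata: dict) -> str:
        """PUT: deposit content plus metadata and return an identifier."""
        checksum = hashlib.sha256(content).hexdigest()
        item_id = checksum[:12]  # assumption: identifier derived from the checksum
        self._items[item_id] = Item(content, dict(metadata), checksum)
        return item_id

    def keep(self) -> list:
        """KEEP: fixity check; return ids whose content no longer matches."""
        return [
            item_id
            for item_id, item in self._items.items()
            if hashlib.sha256(item.content).hexdigest() != item.checksum
        ]

    def get(self, **criteria) -> list:
        """GET: search by exact metadata match and return (id, content) pairs."""
        return [
            (item_id, item.content)
            for item_id, item in self._items.items()
            if all(item.metadata.get(k) == v for k, v in criteria.items())
        ]
```

A "use community" in this sketch is any caller, person-facing or machine: `repo.put(b"...", {"type": "dataset"})` deposits, `repo.get(type="dataset")` searches, and a scheduled call to `repo.keep()` would flag items whose stored bytes have changed.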
R is for Repository
Who has Repositories and why?
R is for Repository
• What are Repositories and what are they for?
• Allowing deposit of and holding all sorts of digital things/stuff
  • Metadata + Objects; Metadata + pointers; Metadata only
  • All sorts of objects: images, datasets, theses, articles, etc.
• Special interest in serving our central task:
  • ease & continuity of access to scholarly resources
Ensuring researchers, students and their teachers have ease and continuing access to online scholarly resources
[Diagram: ‘continuing’ and ‘ease’ of access to content & services. Labels: projects; post-cancellation; usability; preservation; licence to use; restricted; anytime/place convenience; authorisation; functionality; open; well-seamed interoperability; reliability; who/WAYF authentication]
Use case: article-length work published in e-journals, but other use cases apply
P. Burnhill, Edinburgh 2009
National Data Centres
research, learning & teaching in UK universities & colleges
acting as platform for network-level services & helping to build the JISC Integrated Information Environment
[Diagram labels: JISC Sub-Committees; JISC Collections; UK funding councils; Research Councils UK]
1 & 2. Provider of services & user of software
• EDINA-run repositories, with and without JISC
• DataShare: for research data (institutional, U of Ed)
  • Open Data; using DSpace
• Jorum: for learning materials [with Mimas]
  • OER and turnstile (UK); using DSpace & IntraLibrary
• OpenDepot (the Depot): for research papers
  • OA (world); using Eprints
• ShareGeo: for geo-spatial data
  • Open Data and turnstile (UK); using DSpace
• OA Repository Junction as shared service tool
  • using own code and Eprints as an 'escrow' repository during the transfer process
• & maybe others … depending on definition of repository
Jorum: for learning materials [with Mimas]
• OER and turnstile (UK)
• using DSpace & IntraLibrary
OpenDepot (the Depot): for research papers
• OA (world); using Eprints
ShareGeo: for geo-spatial data
• Turnstile (UK) Data & Open Data; using DSpace
3. SONEX
• four individuals in JISC-sponsored mini think-tank
  • from Denmark, Spain & UK
  • Mogens Sandfaer, Pablo de Castro (Chair) & Jim Downing (Richard Jones) and Peter Burnhill
• came out of international workshop, Amsterdam, March 2009
  • charged with looking at how repositories should inter-operate
  • the focus group given name of ‘repository handshake’
  • 3 other focus groups on citation, identifiers and ‘organisation’
  • the latter an exit strategy for EU-funded DRIVER project?
• focus switched to ‘deposit opportunities’
  • semi-automatic issue/deposit, under terms of Open Access
  • concern about risk of ‘hollow ring of repositories’
  • avoid diktat about standards and technobabble
  • looking to interoperability via SWORD
3. SONEX
• focus switched to ‘deposit opportunities’
  • Initial categorisation of repositories into which authors deposit
  • Looking to onward interworking/interoperability (SWORD)
  • Not just technical interoperability but workflow
  • Role of repository managers
• But also recognition of other network-attached ‘systems’:
  • Authoring tools
  • Desktop software
  • Bibliography tools
  • Non-Author-based workflows
  • CRIS
  • REF
SONEX: Scholarly Output Notification & EXchange
• Re-branded ourselves as SONEX, to signal …
  • ‘scholarly output’, not just research publications
  • ‘notification’ using metadata only
  • ‘exchange’ as two-way interoperability/negotiation
  • push metadata; pull content; exploit always-on Internet
• SONEX use case: multi-person & multi-institutional
• SONEX activities:
  • Identify/analyse deposit opportunities (use cases) for ingest into the repository space
  • Identify/promote projects tackling deposit use cases
  • Gap analysis
  • machine (third party systems) as user (PUT & GET)
http://sonexworkgroup.blogspot.com
SONEX Use Case Actors
• Use case Actor 1: Individual author/researcher [person]
  • author of multi-authored article, other author(s) at other institution(s)
  • sole author with entire career at a single institution [exception]
  • Variant: author making deposit is the PI of funded research project (compliance with mandate from funder to deposit)
  • Variant: author making deposit is not the PI of funded research project but work is associated with one or more funded research projects (PI)
• Use case Actors 2 & 3: Depositor is not author (mediated deposit)
  • Variant: support staff in research group
  • Variant: Library’s own resources and document collections
  • Variant: Institutional Research Support Systems (CRIS systems) [machine]
• Use case Actor 4: Repository Manager (RM) of an IR
  • wishing to be notified & obtain copy from a subject repository (SR) or another IR
• Use case Actor 5: Publisher (with which the work is published) [machine]
  • deposit under OA of the author's final copy (OA-RJ & PEER projects)
  • OA of published copy
  • Pointer supply to published copy
• Other Actors: Vendor of authoring or repository software
SONEX Use Case Scenarios
Given opportunity, and motivation, to deposit content into the ‘repository space’, for onward notification and exchange:
• PI(s) as co-author
  • with felt obligation to notify grant funders of OA deposit
  • via web-based or desktop environment
• Publisher(s)
  • assisting their author(s) in supply of full-text into appropriate repositories
• CRIS, a campus research information system
  • managed support for researchers, including note of publications for the Project/Grant
• ‘Bibliography’
  • web-based publications lists
  • as maintained by individual researchers, Research Groups, Departments, etc.
  • including RAE/REF-driven institutional actions
OA Repository Junction Project
• m2m broker supports:
  • Discovery of user & content type
  • Get / ingest package of data (metadata + digital object)
  • Deduce / parse data object & deduce target repository(s)
  • Pass / deposit package into repository targets
  • Notify / send alert to appropriate 3rd party(s), e.g. repository managers
• Working with ‘Publisher’ and ‘Subject Repository’ via Broker Service
• Theo Andrew & Ian Stuart (EDINA)
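The Get / Deduce / Pass / Notify steps listed for the broker can be illustrated with a small Python sketch. It is not the OA Repository Junction code: the class and function names are hypothetical, the deduction rule (match author affiliation to a repository's institution) is an assumption made for illustration, and `deposit` simply stores the package where a real broker would make a transfer call, for instance over SWORD.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    metadata: dict   # e.g. title and a list of authors with affiliations
    payload: bytes   # the digital object itself


@dataclass
class TargetRepository:
    name: str
    institution: str
    manager_email: str
    deposits: list = field(default_factory=list)

    def deposit(self, package: Package) -> None:
        # Stand-in for a real deposit call; here the package is just stored.
        self.deposits.append(package)


def deduce_targets(package: Package, repositories: list) -> list:
    """Deduce: pick repositories whose institution matches an author affiliation."""
    affiliations = {a["institution"] for a in package.metadata.get("authors", [])}
    return [r for r in repositories if r.institution in affiliations]


def broker(package: Package, repositories: list) -> list:
    """Get a package, deduce its targets, pass it on, and collect notifications."""
    notifications = []
    for repo in deduce_targets(package, repositories):
        repo.deposit(package)  # Pass
        notifications.append((repo.manager_email, f"New deposit in {repo.name}"))  # Notify
    return notifications
```

With a two-author paper whose authors sit at different institutions, `broker` deposits one copy into each matching repository and returns one alert per repository manager, which mirrors the multi-person, multi-institutional SONEX use case.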
O is for Open
• OA (for publications) is not the only ‘open’ policy:
  • OER: Open Educational Resources
    • UKOER: Jorum and other subject/institutional repositories
    • Open CourseWare – as open webpages
  • Open Data
    • Both repository and open databases; Linked Open Data
  • Open Source Software
• Open Access
  • the regime used for Subject Repositories
  • seemed to be the motive for creation of Institutional Repositories
  • ‘Green OA’ self-archiving by authors: Creative Commons
  • Is this how we should judge the success of Repositories?
• OA now becoming mainstream, including uptake by publishers
  • "One fifth of 2008's research papers now open access" (The Great Beyond, Nature blog, June 25, 2010)
• Are Repositories the only way to support OA?
  • Repositories need to align themselves with, and support, funder mandates for open access if they are to be successful
Informal discussion with JISC programme managers
“Dealing with institutional processes now, rather than repository technology. Depending on type of content, the projects would fit much more closely in:
• managing research data programme
• research information programme
• open educational resources programme
as they have much more in common with those projects than they do with each other.”
“repositories have found their core business proposition via the REF and making sure Universities list research outputs to obtain research ratings
• have not succeeded in making the business case that IRs should be doing the job of archiving, a core library platform, or the job of an institutional demonstrator/poster space.
• Repositories fit in the ‘University Enterprise Stack’ by virtue of being a system that delivers a business solution to a real financial problem.”
UK-CORR: UK Council of Research Repositories
individual rather than institutional; UKCORR-discussion@jiscmail.ac.uk
UK has ‘rich heterogeneous repository landscape’ (C. Awre); lurk following comment from Dorothea Salo: US mainly about OA full texts; UK mainly about … serving research assessment!
• Is there more to IRs than the REF: lots of bibliographic records & little full text?
• Should IRs only accept full text, not metadata only?
• in absence of a CRIS, our IR had to do REF (Lancaster & Northampton)
• was OA but then RAE2008, but should aim to include all (OU)
• motive for IR was digital preservation, with different REF system; funder mandate compliance for OA; visibility via OA (Oxford/Bodleian)
• RAE/REF is opportunity to engage institution-wide (Warwick)
• Advent of CRIS (which don’t manage outputs well) may be opportunity for IRs to have role, including use of ‘metadata only’ as lever to obtain full text (Hull)
• REF & research management information allows IRs to be embedded as platform for OA (Southampton)
• RAE/REF has different goals to OA, and IRs with low % of full text may undermine OA movement (Nottingham)
COAR: Confederation of Open Access Repositories
• New: 1st General Assembly in Madrid in March 2010
• 48 members drawn largely from Europe, but including both JISC & CNI, and also EDINA (University of Edinburgh)
• Work Plan for 2010/12, including:
  • Advocacy on behalf of OA and repositories (Rs) [both together?]
  • Populating (OA) Rs
  • Best practice documents
  • Facilitate and ensure data interoperability of (across?) Rs
  • Interoperability with other systems (such as CRIS systems)
  • Support national helpdesks
  • Guidance on how Rs will form essential elements for global e-infrastructure
  • Promote R manager profession
  • Provide advice & guidance on suitable R infrastructure technologies
  • Global (meta)data store
  • Strategic partner to other infrastructure-related initiatives worldwide
Managing Data in Difficult [Interesting] Times
End of an era? End of the R word? Embedded in domain-specific processes?
• Moving from technology to policy & practice: some domain-specific, some common to repositories
• Collection management: active curation & Linked relationships
  • versions, data|article|learning material
  • Collections, ‘see also’
• First point of public issue (availability); Take-down regimes
• Institutional stewardship responsibility for its born-digital [and digitised] content
  • "a university-based institutional repository [supports] a set of services … for the management and dissemination of digital materials created by the institution and its community members. … an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access ..." (C. Lynch, 2003)
• What of the (new) shared services imperative?
  • Who does what, at what level/scale?
Theoretical basis for digital library?
• Mix of document tradition & computation tradition
“considerable simplification, … helpful to think … of two traditions, or mentalities, even cultures, co-exist in area of Information Science”
• “Approaches based on a concern with documents, with signifying records: archives, bibliography, documentation, librarianship, records management, and the like”
• “approaches based on uses for formal techniques, whether mechanical (such as punch cards and data-processing equipment) or mathematical (as in algorithmic procedures).”
Michael Buckland, UC Berkeley, 1998
http://people.ischool.berkeley.edu/~buckland/asis62.html
Time for me to stop
Hoping that I have left some space/place for questions
• Thank you
Acknowledgements: Theo Andrew, Pablo de Castro & Robin Rice, Dave Flanders & Andy McGregor
Multimedia resources: candidate for repository?
• platform for search and download of film, video and audio
• wide range of subject coverage, including documentary film
• Licensed for use in learning, teaching and research
• Being re-worked as the Digital Media Hub, combining:
  • Film & Sound Online
    • initial 600 hours of film, digitised for downloading
  • NewsFilm Online
    • 3000 hours of material from ITN & Reuters
    • Over 4 TB of clips to download
    • Release of product from JISC Digitisation programmes
  • Plus Education Image Gallery of still photography
• Visual and Sound Materials Portal project
  • Discovering all sorts of audio-visual material
• Special interest for social science as a non-print record of the 20th Century: the first A-V century
  • With new forms of research material to use and to master