OARJ Project @ #jiscDEPO programme meeting, 1st March 2011. Theo Andrew, Project Manager, EDINA
Talk outline
• Aims
• Background
• Discovery
• Delivery
• Proof-of-concept
• Demonstrator service
• Issues & Next steps
Aims: assist deposit into multiple existing repository services by developing middleware that will aid both discovery of repository targets and delivery of the content
Background
• Depot (2007/09): unmediated EPrints repository
• EDINA added a referral service, called Repository Junction, to redirect users to existing IR services.
• Succeeded by the OpenDepot.org service run by EDINA.
• OA-RJ (2009/11): to expand on the concept of the Repository Junction
• Initial focus was on the discovery aspect; however,
• the concept of data mining to identify target repositories led to the broker service.
Discovery: The Junction
[Diagram labels: SOURCES; INPUTS; openDOAR; ROAR; UKAMF; WhoIS; ORCID; other AMFs; known org ID; known IP location; article XML; named entity recognition; funding codes; Junction db (org IDs matched to IRs); API; matched repositories]
The Junction API
Suite of three APIs for interacting with the data:
• /api [primary point of interaction]
• /cgi/list/ [lists known values: type/content/country/lang/org/net]
• /cgi/get [used for internal AJAX functions: orgs, repos, net]
/api can be given a specific locus (an IP address or an ID code) that identifies the organisation whose repositories should be returned; if none is given, it deduces a locus from the calling client. The returned list can be restricted by repository type (institutional/learning/..) or accepted content (pre-prints/data/thesis/...). Data is returned in JSON, text, or XML format.
http://oarepojunction.wordpress.com/junction-api/
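As an illustration of the /api behaviour described above, here is a minimal Python sketch that builds a query URL. The host name and every parameter name (`locus`, `type`, `content`, `format`) are assumptions made for this example; the slides do not give them, so the real names should be taken from the Junction API page.

```python
from urllib.parse import urlencode

# Hypothetical host: the slides name the /api path but not the server.
JUNCTION_API = "http://junction.example.org/api"


def build_junction_query(locus=None, repo_type=None, content=None, fmt="json"):
    """Build a query URL for the /api endpoint.

    Parameter names (locus, type, content, format) are illustrative
    assumptions, not the documented Junction parameter names.
    """
    params = {}
    if locus is not None:
        # An IP address or organisation ID code. When omitted, the
        # service is described as deducing the locus from the caller.
        params["locus"] = locus
    if repo_type is not None:
        params["type"] = repo_type      # e.g. institutional, learning
    if content is not None:
        params["content"] = content     # e.g. pre-prints, data, thesis
    params["format"] = fmt              # json, text, or xml
    return JUNCTION_API + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_junction_query("192.0.2.10", "institutional", "pre-prints"))
```

Leaving `locus` unset mirrors the case where the service works out the organisation from the calling client.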
Consider a complete bipartite graph between two sets, where Set A (3 nodes) passes information to Set B (5 nodes). Total number of edges = 3 × 5 = 15. Each data provider needs to broker an agreement with every target repository, and each target repository needs to authenticate each data provider. This does not scale.
Consider adding a central node to connect the sets: Set A (3 nodes) passes information to the central node, and the central node passes information to Set B (5 nodes). Number of edges = 3 + 5 = 8. In this structure, each party maintains just one relationship with a trusted operator.
Nodes:
• 185 repositories listed in openDOAR for the UK
• 200+ publishers listed in SHERPA
Edges: 37,000 (185 × 200, direct agreements) or 385 (185 + 200, via a broker) ... what are the global figures? Researchers are not confined to UK borders.
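The edge counts can be checked with a short Python sketch. The function names are made up for this example, and 200 is used as a lower bound for the "200+" publishers.

```python
def direct_edges(providers, repositories):
    """Complete bipartite graph: every provider links to every repository."""
    return providers * repositories


def brokered_edges(providers, repositories):
    """Star through a central broker: one link per participant."""
    return providers + repositories


if __name__ == "__main__":
    # Toy example: 3 data providers, 5 target repositories.
    print(direct_edges(3, 5), brokered_edges(3, 5))
    # UK figures: 200 publishers (lower bound), 185 repositories.
    print(direct_edges(200, 185), brokered_edges(200, 185))
```

Direct agreements grow with the product of the two set sizes, while the brokered model grows with their sum, which is why the gap widens so quickly at national or global scale.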
Proof-of-concept • http://oarepojunction.wordpress.com/2011/02/25/proof-of-concept-demonstrator/
Demonstrator service
How a broker model could simplify things:
• one consistent deposit process
• single sign-up for content providers and receivers
• building a network of trust
[Diagram labels: Broker; Institutional Repository 1; Institutional Repository 2; Institutional Repository 3]
Case study 1: multi-authored paper
[Diagram labels: Journal Y; Copy A3; Repository 1; Repository 2; Repository 3]
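One way to picture the routing step in this case study is the Python sketch below. It assumes the broker resolves each author's organisation ID to a repository, in the spirit of the Junction's org-to-IR matching. The `Paper` class, the `ORG_TO_REPO` table, and the organisation IDs are all hypothetical, and the actual deposit step (for example via SWORD) is not shown.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    author_orgs: list  # organisation IDs of the authors' affiliations


# Hypothetical lookup table: organisation ID -> repository.
ORG_TO_REPO = {
    "org-1": "Repository 1",
    "org-2": "Repository 2",
    "org-3": "Repository 3",
}


def route_copies(paper, org_to_repo=ORG_TO_REPO):
    """Return the distinct repositories that should receive a copy.

    Authors whose organisation has no known repository are skipped,
    and each repository is listed once even if several authors share it.
    """
    targets = []
    for org in paper.author_orgs:
        repo = org_to_repo.get(org)
        if repo is not None and repo not in targets:
            targets.append(repo)
    return targets


if __name__ == "__main__":
    paper = Paper("Paper A", ["org-1", "org-2", "org-3", "org-1"])
    print(route_copies(paper))
```

The publisher hands over the paper once, and the broker works out which repositories are entitled to a copy.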
Case study 2: mandated open access
[Diagram labels: £000s; Journal Y; Paper A; Researcher 1; Researchers 2 & 3; Copy A1]
Estimate of the number of broker-transferred items during a six-month demonstrator service. Data is based on the number of papers published in journals from the participating NPG portfolio during Jan - June 2010. Data retrieved from PubMed Central and ISI Web of Knowledge. (*Figure rounded down, **Still to be confirmed as a participating institution.)
Issues and dependencies
• Common deposit package for SWORD
• Missing data: provenance / embargo details / author affiliations
• Licensing: content providers and repos
• Institutional sign-up: federation model?
Project blog: http://oarepojunction.wordpress.com/
Thank you for listening. Questions?