Digitometric Services for Open Archives Environments
Tim Brody, Simon Kampa, Stevan Harnad, Les Carr, Steve Hitchcock
{tdb01r,srk,harnad,lac,sh94r}@ecs.soton.ac.uk
University of Southampton, Intelligence, Agents, Multimedia Group
ECDL 2003, Trondheim, Norway
Open Archives Initiative: Promoting interoperability
• The protocol is openly documented, and metadata is “exposed” to at least some peer group (note: rights management can still apply!)
• “Archive” is defined as a “collection of stuff”, not the archivist’s definition of “archive”. “Repository” is used in most OAI documents.
OAI Data Model: Resources/Items/Records
• resource
• item: all available (meta)data about the resource; item = OAI identifier
• records: Dublin Core metadata, MARC metadata, ??? XML
• record = metadata + identifier + datestamp
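The item/record relationship above can be sketched in a few lines of Python. This is only an illustration of the data model as the slide states it: the class names, field names, and the `oai:example.org:1234` identifier are assumptions made for the example, not part of any OAI library.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Record:
    """record = metadata + identifier + datestamp (one metadata format)."""
    identifier: str        # OAI identifier of the item this record belongs to
    datestamp: str         # e.g. "2003-04-02"
    metadata_prefix: str   # e.g. "oai_dc" or "marc" (illustrative values)
    metadata_xml: str      # the metadata itself, serialised as XML


@dataclass
class Item:
    """An item is addressed by one OAI identifier and may expose several records."""
    identifier: str
    records: Dict[str, Record] = field(default_factory=dict)

    def add_record(self, metadata_prefix: str, datestamp: str, metadata_xml: str) -> Record:
        record = Record(self.identifier, datestamp, metadata_prefix, metadata_xml)
        self.records[metadata_prefix] = record
        return record

    def get_record(self, metadata_prefix: str) -> Optional[Record]:
        # Returns None when the item is not available in the requested format.
        return self.records.get(metadata_prefix)


# Hypothetical item with a single Dublin Core record.
item = Item("oai:example.org:1234")
item.add_record("oai_dc", "2003-04-02", "<dc>...</dc>")
```

Keying the records by metadata prefix mirrors the slide's picture of one item carrying Dublin Core, MARC, and other XML records side by side.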
Protocol Responses
Protocol
• The service provider sends HTTP URL requests to the data provider; the data provider returns XML responses
1. Identify: collection-level description
2. ListRecords?metadataPrefix=xyz: all repository xyz records
3. ListRecords?from=2003-04-02&…: all repository xyz records since 2003-04-02
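The request/response exchange above can be sketched with the Python standard library alone. The base URL and the sample response below are made up for illustration (assumptions, not a real repository), and the sketch does no network access: it only builds request URLs and reads the record headers out of an XML response in the OAI-PMH 2.0 namespace.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical data provider; replace with a real repository base URL.
BASE_URL = "http://repository.example.org/oai"


def build_request(base_url, verb, arguments=None):
    """Build the HTTP URL a service provider would request."""
    query = {"verb": verb}
    query.update(arguments or {})
    return base_url + "?" + urlencode(query)


def parse_headers(xml_text):
    """Return (identifier, datestamp) pairs from a ListRecords-style response."""
    root = ET.fromstring(xml_text)
    return [
        (
            header.findtext("oai:identifier", namespaces=OAI_NS),
            header.findtext("oai:datestamp", namespaces=OAI_NS),
        )
        for header in root.iterfind(".//oai:header", OAI_NS)
    ]


# Made-up response body, trimmed to the parts the sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2003-08-18T12:00:00Z</responseDate>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2003-04-02</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""

identify_url = build_request(BASE_URL, "Identify")
incremental_url = build_request(
    BASE_URL, "ListRecords", {"metadataPrefix": "oai_dc", "from": "2003-04-02"}
)
headers = parse_headers(SAMPLE_RESPONSE)
```

Passing the arguments as a dictionary sidesteps the fact that `from` is a reserved word in Python and cannot be used as a keyword argument.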
Other Commands
• ListIdentifiers
  • Return only the identifier/datestamp/set membership
• ListMetadataFormats
  • Return the available data formats
• ListSets
  • Return the set structure (if there is one)
• GetRecord
  • Return a record given by OAI identifier
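A harvester might check a request before sending it. The sketch below lists the six verbs with the arguments that, to my understanding of OAI-PMH 2.0, each one requires; treat the table as an assumption to verify against the specification. The `missing_arguments` helper is hypothetical, and continuation via `resumptionToken` is deliberately ignored.

```python
# Required arguments per verb (assumed from OAI-PMH 2.0; optional ones omitted).
REQUIRED_ARGUMENTS = {
    "Identify": set(),
    "ListMetadataFormats": set(),
    "ListSets": set(),
    "ListIdentifiers": {"metadataPrefix"},
    "ListRecords": {"metadataPrefix"},
    "GetRecord": {"identifier", "metadataPrefix"},
}


def missing_arguments(verb, arguments):
    """Return the set of required arguments that the request does not supply."""
    if verb not in REQUIRED_ARGUMENTS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    return REQUIRED_ARGUMENTS[verb] - set(arguments)


# GetRecord needs both the OAI identifier and a metadata format.
incomplete = missing_arguments("GetRecord", {"identifier": "oai:example.org:1"})
complete = missing_arguments("ListSets", {})
```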
Interest in OAI
• 111 registered OAI repositories
• Many unregistered (e.g. all GNU EPrints.org and DSpace archives)
• 4,500,000 public records
• http://arc.cs.odu.edu/
• NSDL project, UK’s JISC Information Environment
• OLAC (language community built on OAI)
Why OAI?
1. Mandated Dublin Core allows the quick establishment of basic services and tools
2. Simple and metadata-neutral protocol allows more interesting possibilities (without breaking 1.) and extensions …
Adding Caching to OAI-PMH
Celestial (OAI Cache)
• Developed to maintain a local metadata copy
• Avoids repeated, large harvests during development
• Provides an abstraction over multiple OAI versions (hence acts as a gateway to older implementations)
• Useful for testing OAI implementations and improving performance
• Provides a Web interface to OAI using XSLT
• Provides redundancy
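The idea of keeping a local metadata copy so that later harvests only fetch what changed can be sketched as follows. This is not Celestial's implementation: the table layout, function names, and the `example` repository label are assumptions chosen for a small, self-contained illustration using SQLite from the Python standard library.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a local cache of harvested records."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS records (
               repository TEXT,
               identifier TEXT,
               metadata_prefix TEXT,
               datestamp TEXT,
               xml TEXT,
               PRIMARY KEY (repository, identifier, metadata_prefix))"""
    )
    return db


def store(db, repository, identifier, metadata_prefix, datestamp, xml):
    """Insert a harvested record, replacing any older cached copy."""
    db.execute(
        "INSERT OR REPLACE INTO records VALUES (?, ?, ?, ?, ?)",
        (repository, identifier, metadata_prefix, datestamp, xml),
    )
    db.commit()


def next_from(db, repository, metadata_prefix):
    """Latest cached datestamp, usable as the 'from' argument of the next harvest.

    Returns None when nothing is cached yet, meaning a full harvest is needed.
    ISO 8601 datestamps sort correctly as plain strings.
    """
    row = db.execute(
        "SELECT MAX(datestamp) FROM records WHERE repository = ? AND metadata_prefix = ?",
        (repository, metadata_prefix),
    ).fetchone()
    return row[0]


db = open_cache()
store(db, "example", "oai:example.org:1", "oai_dc", "2003-04-02", "<record/>")
store(db, "example", "oai:example.org:2", "oai_dc", "2003-05-10", "<record/>")
resume_from = next_from(db, "example", "oai_dc")
```

A service built on such a cache can re-serve the stored XML to local clients instead of hitting the source repository again, which is the "avoid repeated, large harvests" point above.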
Citebase Search – Data Model e-Services
Content
• 250,000 full-text resources
• 240,000 of which are from arXiv.org
• 6 million references
• Mean of 29 refs/paper (therefore reference extraction failed for 18% of papers)
• (n.b. the modal number of refs is 19)
• 1 million references linked internally to the full text (15%)
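The step from "29 mean refs/paper" to a failure rate can be reconstructed roughly as below. The sketch assumes the mean is taken over papers whose references were successfully extracted, which the slide does not state explicitly, and it uses the rounded figures from the slide, so it lands near the quoted percentages rather than exactly on them.

```python
# Rounded figures from the slide.
total_papers = 250_000
total_references = 6_000_000
mean_refs_per_parsed_paper = 29   # assumed: mean over successfully parsed papers
linked_references = 1_000_000

# If every parsed paper contributes 29 references on average, this many
# papers account for the 6 million extracted references.
papers_with_references = total_references / mean_refs_per_parsed_paper

# Share of papers for which no references were extracted.
failed_fraction = 1 - papers_with_references / total_papers

# Share of references resolved to a full text held in the collection.
linked_fraction = linked_references / total_references

print(f"papers with extracted references: {papers_with_references:,.0f}")
print(f"extraction failed for about {failed_fraction:.0%} of papers")
print(f"about {linked_fraction:.0%} of references linked internally")
```

With these rounded inputs both shares come out at about 17%, in the same range as the 18% and 15% on the slide; the gap presumably reflects rounding of the headline numbers.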
Citebase Search
Citebase Search: Navigation by Citation Links
(Figure labels: article with reference list; reference link; current article; past; future; related; co-cited)
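The three directions of navigation can be illustrated on a toy citation graph. The reading used here is an assumption drawn from the figure labels: "past" is what the current article cites, "future" is what cites it, and "related" is what is co-cited with it. The article names and function names are invented for the example and say nothing about how Citebase stores its links.

```python
# Toy citation graph: each article maps to the set of articles it cites.
cites = {
    "A": {"X", "Y"},
    "B": {"X", "Y", "Z"},
    "C": {"Y"},
    "X": set(),
    "Y": set(),
    "Z": set(),
}


def references(article):
    """Past: articles cited by the current article."""
    return set(cites.get(article, set()))


def cited_by(article):
    """Future: articles that cite the current article."""
    return {other for other, refs in cites.items() if article in refs}


def co_cited_with(article):
    """Related: articles that appear alongside this one in some reference list."""
    related = set()
    for citing in cited_by(article):
        related |= cites[citing]
    related.discard(article)
    return related


past = references("B")
future = cited_by("X")
related = co_cited_with("X")
```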
Citebase Search (figure: “cites” links)
Citebase Search “Co-cited”
Read/Cite Cycle
Digitometric Services for OAI
• Tools for visualising research metadata
• Builds an analysis service on Citebase
• Knowledge mapping (co-authors, co-citation, etc.)
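As one example of the knowledge mapping mentioned above, co-citation strength between two articles can be counted as the number of reference lists in which both appear. The sketch below is a generic illustration of that counting, with made-up data and a hypothetical function name; it is not a description of how the digitometric services compute their maps.

```python
from collections import Counter
from itertools import combinations


def co_citation_counts(reference_lists):
    """Count, for each pair of articles, how many papers cite both of them."""
    counts = Counter()
    for refs in reference_lists.values():
        # Sort so each unordered pair is always stored under the same key.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Made-up reference lists: citing paper -> cited articles.
reference_lists = {
    "A": ["X", "Y"],
    "B": ["X", "Y", "Z"],
    "C": ["Y"],
}

edges = co_citation_counts(reference_lists)
strongest_pair, strongest_weight = edges.most_common(1)[0]
```

The resulting pairs and weights are the edges of a co-citation network; the same counting applied to author lists instead of reference lists would give co-author links.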
Co-Citation Network
Full Co-Citation Map
Digitometric Services for Open Archives Environments
• http://www.openarchives.org/
• http://opcit.eprints.org/
• http://citebase.eprints.org/
• http://www.eprints.org/
• http://www.hyphen.info/
• AKT Project (knowledge)
Thank you for listening!
Tim Brody