Dive into the world of persistent identifiers as experts debate assignment stages, system management, version control, and international views on assigning identifiers across various repositories. Discover the trade-offs between semantic and opaque identifiers, the challenges of non-online repositories, and the importance of defining identifier structures. Learn best practices and explore the impact of workflow integration on assignment processes.
DCC Workshop on Persistent Identifiers • A Layered Model Decision Making Concerning Persistent Identifiers • Stuart Weibel, Senior Research Scientist • July 1, 2005
Open Univ • How do you establish that you need a persistent identifier (as opposed to a non-persistent one)? • E.g. OU is creating a new course, with a very complex collection of courseware, and an equally complex workflow which manages all the parts • At what stage do we assign persistent identifiers to all the bits? • [SW:] Is this a granularity issue? • In part, but there are ownership issues
CETIS • Need to be careful about advice, and who we give it to • Where’s the line between what you do and don’t need to care about • At what point does persistence become important • [PB: ‘publish’ is the crucial word here]
LSE • One or many systems? • Digitising a photo archive, use a separate system for that? • Who assigns identifiers at the individual school level • Managing identifier incrementation • What about managing versions in a repository? • Separate identifiers for distinct manifestations of the same article?
Nat Lib Scot • Anyone with experience of Nat Lib of Aus scheme?
Strathclyde Univ CDLR • Confession: I assign identifiers • And I’ve changed them • And changed their URLs • But, I’ve left redirects • Is this ok?
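The "changed them, but left redirects" practice can be sketched as a lookup table that is followed until it reaches a URL with no further redirect. This is only an illustration: the `resolve` function, the `max_hops` limit and the `repo.example.org` URLs are hypothetical and do not describe any system mentioned at the workshop.

```python
def resolve(url, redirects, max_hops=10):
    """Follow recorded redirects from an old URL to its current location."""
    seen = {url}
    for _ in range(max_hops):
        if url not in redirects:
            # No further redirect recorded: this is the current location.
            return url
        url = redirects[url]
        if url in seen:
            raise ValueError(f"redirect loop at {url}")
        seen.add(url)
    raise ValueError("too many redirects")


# Hypothetical history: the same item has been moved twice.
redirects = {
    "http://repo.example.org/old/1": "http://repo.example.org/v2/1",
    "http://repo.example.org/v2/1": "http://repo.example.org/items/1",
}

current = resolve("http://repo.example.org/old/1", redirects)
```

Under this reading, the old identifiers remain usable only for as long as every link in the chain is maintained, which is one way to frame the "Is this ok?" question.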
Edin Univ Library • How relate identifiers for a physical object to digital manifestations of that object • Preexisting vs. internal institutional identifiers • Public vs. private • How implement identifiers within a METS document • [Enders: Not a METS problem, an XML problem – SICI is not an XML ID]
Edin Univ Lib • If http URIs are a given, should they be opaque (e.g. ISBNs) or semantic (e.g. http://www.w3.org/REC-xml/) • “Do you know your own barcode?” • Semantic • Human readable (within a designated community, broad (native speakers of English) to narrow (well-trained biomed librarians)) • vs • Rule-based, so decodable given some additional info • Opaque • E.g. ARK, random string, with (by design) no structure • E.g. LibCon(?) concatenation of year and accession number • … • What am I trading off wrt one choice or another? • Are http: URIs intrinsically semantic?
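The trade-off between rule-based and opaque identifiers can be illustrated with a minimal sketch. The `YYYY-NNNNNN` layout below is an assumed format for a "year plus accession number" identifier, not the scheme of any particular library, and the random string is a generic opaque token rather than an ARK.

```python
import secrets


def semantic_id(year, accession):
    """Rule-based identifier: year and accession number, zero-padded (assumed layout)."""
    return f"{year:04d}-{accession:06d}"


def decode_semantic(identifier):
    """Recover the parts; only possible if you know the rule used to build it."""
    year, accession = identifier.split("-")
    return int(year), int(accession)


def opaque_id(nbytes=8):
    """Opaque identifier: a random hex string with, by design, no structure."""
    return secrets.token_hex(nbytes)


sid = semantic_id(2005, 42)   # "2005-000042"
oid = opaque_id()             # 16 hex characters, different on every call
```

The rule-based form can be decoded by anyone who knows the rule, but it bakes in facts (here the year) that may later turn out to be wrong or to change; the opaque form carries no such commitments and tells a reader nothing.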
Enders • If we’ve assigned identifiers, in a non-online repository, how can I get a persistent identifier? • E.g. DSPACE • [AD: DSPACE uses Handle] • [Cetis: Fedora uses Handle (?)] • [Strathclyde: Eprint.org does not offer any]
CETIS • Authoring and content tools should help out here • [SW: Where in workflow should identifiers be assigned?]
EDINA • To what extent should we have an international non-territorial view of assigning identifiers? • Or should we stay within the bounds of the nation state?
Glasgow Univ Lib • If from my repository I want to refer to some other repository’s identifier, in a persistent way • Can I do so responsibly independently of how they’ve done this • [JK: If they don’t publish a name they take responsibility for, don’t go there!] • Image archive example – you can email a filename and get a temporary URI (ftp:) back – how can I write down into a formal workflow a reference to “BigBear file name so-and-so” • So it’s a permanent identifier for a static digital resource, but I can’t refer to it formally • [RR: instance of general case of offline resources] • [SW: They need to be pressured to recognise the value of e.g. prepending “http:…”] • But they won’t do that today, so I’m going to do it, but that feels like breaking the rules • [??: What about names for resources in GRID computing?] • [??: One option is Handles, recently released module for Globus] • [??: Surely allocating persistent http: URIs would be overkill with 5M large images] • [SW: Publishing the naming structure is distinct from delivering all of them instantly] • [tutti: design-on-the-fly]
SunCat • Working on a catalogue of serials etc., heard about FRBR, etc., but inadequate for my needs, do I use the same identification scheme up and down the hierarchy • [AP: important to distinguish between application-level questions and in-principle general questions] • [PB: Application-level are not ipso facto out of scope] • And what’s an expression? • [SW: Out of scope]
UKOLN • Candidate functional requirements for persistent identifiers • Compare for equality • Retrieve from it • Do we get more specific, e.g. ARK? • Ding an sich [JK: or ‘suitable surrogate’] • [HST: Demurs] • Description • Cover the case where the identifier no longer identifies anything? • Change history • [PB: There’s a wealth of relevant experience] • Persistence commitment • [SW: Demur – not always desirable, sometimes not desirable] • Encode and distribute
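For http: URIs, the "compare for equality" requirement is less trivial than string comparison, because scheme and host are case-insensitive and default ports are optional. The sketch below is one possible normalisation, using Python's standard `urllib.parse`; the helper names `normalise` and `same_identifier` are hypothetical, and the code deliberately ignores userinfo, IPv6 hosts and percent-encoding.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalise(uri):
    """Lowercase scheme and host, drop a default port, use "/" for an empty path."""
    parts = urlsplit(uri.strip())
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # urlsplit already lowercases the hostname
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    # Path, query and fragment are kept as written: they are case-sensitive.
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


def same_identifier(a, b):
    """Two URIs are treated as equal if their normalised forms match."""
    return normalise(a) == normalise(b)
```

With these assumptions, `HTTP://WWW.W3.ORG:80/REC-xml/` and `http://www.w3.org/REC-xml/` compare equal, while `http://www.w3.org/rec-xml/` does not, since the path is case-sensitive.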
NLScotland • Where would we go in ISO to do this • [PB: There is an ISO technical c’ttee…] • [RR: Wrt NISO, I can feed back on where do we go: Standards, recommendations, Best Practices notes]
Open Univ • Can we clarify what’s been done wrt Premis data dictionary? • Just published a new dictionary • [JK: Wrt identifiers, says nothing, judged too controversial/out of scope]
DCC • Is there a relationship between the identifier and the findability of a resource? • Given pervasive use of search engines • [SW, PB: What are the roles in discovery and location of service]
British Library • If we do use http: URIs for identifiers, does the HTTP protocol provide any particular actions which make sense, e.g. PUT or POST? • Should resolution be a determiner of existence • [SW: redirection comes in here] • In theory, any 10/13 digit number can be an ISBN • What does ‘404 not found’ mean?
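The remark about 10/13-digit numbers can be made slightly more precise: the last character of an ISBN is a check digit, so not every digit string is well-formed, and a well-formed string still need not have been assigned to any book. The sketch below checks well-formedness only; the function name `is_valid_isbn` is hypothetical, and it says nothing about whether the ISBN exists or resolves.

```python
def is_valid_isbn(text):
    """Return True if text is a syntactically valid ISBN-10 or ISBN-13."""
    s = text.replace("-", "").replace(" ", "").upper()
    if not s.isascii():
        return False
    if len(s) == 10:
        # ISBN-10: weights 10..1, total divisible by 11; final "X" stands for 10.
        if not s[:9].isdigit() or not (s[9].isdigit() or s[9] == "X"):
            return False
        values = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
        total = sum(v * w for v, w in zip(values, range(10, 0, -1)))
        return total % 11 == 0
    if len(s) == 13:
        # ISBN-13: alternating weights 1 and 3, total divisible by 10.
        if not s.isdigit():
            return False
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(s))
        return total % 10 == 0
    return False
```

This mirrors the slide's question about resolution and existence: passing the check shows only that the string has the right shape, much as a well-formed http: URI can still return ‘404 not found’.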
Strathclyde • Impact of search engines – people aren’t using identifiers any more • Should we question the assumption that identifiers are the entry point into access to resources?
DCC/Hunter • What’s the impact of the fact that the community is split into two parts • Those with detailed knowledge of the identity system • Those who just have requirements which identity contributes to solving • Bookshop example • [PB: Ironmonger example]
Discussion candidates • Semantic vs. opaque • What should libraries do? • Is ‘retrievable’ a necessary functional requirement? • What are we identifying? • What is the connection between the identifier and the thing itself?
Seamus/DCC • Can we do automatic assignment of identifiers to extracts from (multiple) databases • [RR: About attaching metadata as you go] • As an author, a private person, how do I get an identifier for my work • Overwriting is OK, always? • Historical printing example feeds into thinking about multiple versions in the digital world • [PB: Relevant term is ‘provenance’ and the role of identifiers]
UKOLN • Are the functional requirements on identifiers across domains or classes of resources? • Conceptual • Digital • Different domains? • Physical • Work and instance… FRBR-like… Guardian… NLM persistence statements
EDINA • Is the distinction between persistence for ‘record’ and persistence for ‘reuse’ relevant to the design of identifiers? • Archival vs. access? • [SW: Difference in behaviour based on difference in object identified] • [Seamus: BNFL vs. BBC example]