Erpanet Symposium on Persistent Identifiers: PURLs • Stuart Weibel, Senior Research Scientist • June 17, 2004
What do we want from Identifiers • Authority • Reliability • Appropriate Functionality (resolution and other services) • Persistence – throughout the life cycle of the information object • What are the business models to support identifiers? • Not just a matter of money, but costs are part of the equation
PURL: Persistent Uniform Resource Locators • PURLs look like URLs… they ARE URLs • PURLs emerged from OCLC’s participation in the IETF URN activity • A tool for managing names and namespaces since 1996
The 404 Problem • Resources disappear… • Some are actually gone • Disk reorganizations take place • Changes in responsibility for resources occur… bought, sold, abandoned, removed • URLs serve double duty as names and locators • Making URLs symbolic names will improve their usefulness
What is a PURL? • PURL: Persistent Uniform Resource Locators • They look like URLs… they ARE URLs • No new technology, no new protocols • A toolset for managing names and namespaces
How PURLs Work • PURLs take advantage of the redirection facility inherent in HTTP • PURLs provide an additional level of indirection that maps a symbolic identifier to a network location • PURLs work without plug-ins or other special code in browsers… they are ‘just’ URLs • No new technology added… a feature, not a bug
What does Persistent mean? • Not a guarantee of perpetual access • Not a magic solution to the 404 problem • Not persistence of resources, but rather of the names • PURLs are a toolset that can be used to manage resource names and locations with greater reliability
Persistence derives from… • The social or contractual commitments of organizations responsible for managing information resources. • Technology can help, but the problem is, at its heart, a social one.
Logical Components of a PURL • Example: http://purl.oclc.org/OCLC/PURL/FAQ • protocol: http • resolver address: purl.oclc.org • asset name: /OCLC/PURL/FAQ
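Because a PURL is an ordinary URL, its three logical components can be read off with any standard URL parser. The following is a minimal sketch using Python's standard library. The function name `purl_components` and the dictionary keys are illustrative choices, not part of any PURL software.

```python
from urllib.parse import urlsplit


def purl_components(purl: str) -> dict:
    """Split a PURL into the three logical parts named on the slide.

    The key names are illustrative labels, not an official PURL API.
    """
    parts = urlsplit(purl)
    return {
        "protocol": parts.scheme,           # e.g. "http"
        "resolver_address": parts.netloc,   # host that runs the PURL server
        "asset_name": parts.path,           # symbolic name chosen by the registrant
    }


components = purl_components("http://purl.oclc.org/OCLC/PURL/FAQ")
print(components)
```

The point of the sketch is that no PURL-specific parsing is required: the structure is exactly that of a URL.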
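PURL Server as a Redirection Server • Client sends an http GET to the PURL Server • PURL Server answers with an http redirect • Client sends an http GET to the Resource Server • Resource Server returns the resource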
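The redirection step can be sketched in a few lines of Python. This is an illustration of the idea only, not the OCLC PURL server code. The table contents, the target URL on example.org, the names `PURL_TABLE`, `resolve`, and `PurlHandler`, and the choice of status code 302 are all assumptions; the slides say only "http redirect".

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical lookup table: asset name -> current network location.
PURL_TABLE = {
    "/OCLC/PURL/FAQ": "http://example.org/docs/purl-faq.html",
}


def resolve(asset_name, table=PURL_TABLE):
    """Return (status, headers) for a GET on the given asset name."""
    target = table.get(asset_name)
    if target is None:
        return 404, {}                      # name was never registered
    return 302, {"Location": target}        # assumed redirect status


class PurlHandler(BaseHTTPRequestHandler):
    """Answers every GET with the redirect (or 404) computed by resolve()."""

    def do_GET(self):
        # self.path may carry a query string; this sketch does not strip it.
        status, headers = resolve(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
```

To try it locally, one could pass `PurlHandler` to `http.server.HTTPServer` and call `serve_forever()`. A browser that requests a registered name then follows the `Location` header to the resource server on its own, which is why no plug-in is needed.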
PURL Server as a Resource Server • Client sends an http GET to the PURL Server • PURL Server returns the resource itself
Do I have to run my own PURL Server? • OCLC’s PURL Server is open to all, including the ability to request domains • As of Monday, May 24, 2004: • PURLs Created = 571 427 • PURLs Resolved = 86 010 679 • Unique Client Systems = 5 763 071 • The PURL server software is available at the purl.org site for anyone to download and use without cost or restriction.
PURLs and The Identifier Layer Cake • Social • Business • Policy • Technology • Functionality • The Web: http…TCP/IP…
Functional Layer: Operational characteristics of Identifiers • Is it globally unique? No problem – it’s a URL • Matching persistence with the need? Organizational commitment • Can a given identifier be reassigned? No • Is it resolvable? Yes: To that which is assigned by the registrant • How does it ‘behave’? Exactly like a URL, but managed • Is the ‘name’ portion of the identifier opaque, or can it carry ‘semantics’? Determined by the registrant • Do humans need to read and transcribe them? Probably • Do identifiers need to be matched to the characteristics of the assets they identify? Determined by the registrant
PURL Technical Layer • What dependencies are assumed? • HTTP • What is the nature of the system? • Open Source, public domain • Are servers centralized? federated? peer to peer? • Distributed and stand-alone, but could be federated (see POIs, as an example) • How is uniqueness assured? • Inherent in the character of URLs
PURL Policy Layer • Who has the ‘right’ to assign or distribute Identifiers? • Anyone can register without cost • Who has the ‘right’ to resolve them or offer services? • Unspecified • What are appropriate assets? • Determined by the registrant • Can identifiers be recycled? • No • Can ID-Asset bindings be changed? • Yes, at the discretion of the registrant • Is there supporting metadata? • No intrinsic PURL metadata • Is there a governance model? • What you do in the privacy of your own PURL server is your own business
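Two of the policy answers above (identifiers are not recycled; bindings may be changed at the registrant's discretion) can be illustrated with a small in-memory registry. This is a hypothetical sketch, not how any actual PURL server stores its data. The class `PurlRegistry`, its method names, the registrant names, and the example.org targets are all invented for illustration.

```python
class PurlRegistry:
    """Toy registry: names are never reassigned, but targets may change."""

    def __init__(self):
        self._bindings = {}  # asset name -> (registrant, current target)

    def register(self, name, registrant, target):
        # "Can identifiers be recycled? No": a name is assigned only once.
        if name in self._bindings:
            raise ValueError(f"{name!r} is already assigned")
        self._bindings[name] = (registrant, target)

    def rebind(self, name, registrant, new_target):
        # "Can ID-Asset bindings be changed? Yes, at the discretion of
        # the registrant": only the original registrant may update.
        owner, _ = self._bindings[name]
        if owner != registrant:
            raise PermissionError(f"{registrant!r} does not own {name!r}")
        self._bindings[name] = (owner, new_target)

    def resolve(self, name):
        return self._bindings[name][1]


registry = PurlRegistry()
registry.register("/demo/report", "alice", "http://example.org/old/report.html")
registry.rebind("/demo/report", "alice", "http://example.org/new/report.html")
print(registry.resolve("/demo/report"))
```

The name `/demo/report` stays fixed while its location moves, which is the sense in which the name, not the resource, is persistent.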
PURL Business model layer • Who pays the cost? • PURL.ORG is maintained by OCLC as a free service • Anyone can run their own PURL server (and pay for it) • How, and how much? • Negligible costs • Who decides? • The server host • The problem with identifier business models… • Those who accrue the value are often not the same as those who bear the costs. Libraries are in the business, however, of aggregating costs and making them look free. • You can’t collect revenue on resolution
PURL Social Layer: Who do you trust? • Governments? • Cultural heritage institutions? • Commercial entities? • Non-profit consortia? • It depends on the context, the service, and the motivations for the service.
In Summary • PURLs offer a methodology and tool set for managing resource names and namespaces • Neither PURLs nor any other technology is a replacement for policies or commitments to manage resource names • PURLs represent a community-based solution founded in freely available, widely deployed technology. http://purl.org