LSIDs in a Nutshell Jun Zhao University of Manchester 1st December, 2005
THE LSID Any idea? • 30350027 • 30350027 • gi:30350027
Outline • What is an LSID • Why do we need LSIDs • How does it work • What is available from your LSID comrades • How is it working in myGrid • Questions
LSID: Life Science Identifier • A URN (Uniform Resource Name) • A standard from the OMG LSR group • A detailed specification: http://www.omg.org/cgi-bin/doc?lifesci/2003-12-02 • Clark T., Martin S., Liefeld T. Globally Distributed Object Identification for Biological Knowledgebases. Briefings in Bioinformatics 5.1:59-70, March 1, 2004. • http://lsid.sourceforge.net/
URN • URI: Uniform Resource Identifier • Can be further classified as URL and URN • URL: Uniform Resource Locator • identifies a place where a resource may reside • a representation of a primary access mechanism • URN: Uniform Resource Name • required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable. Tim Berners-Lee. Uniform Resource Identifiers (URI): Generic Syntax. http://www.ietf.org/rfc/rfc2396.txt
Five part schema • A five-part format: urn:lsid:Authority:Namespace:Object_ID[:Revision-ID] For example: • urn:lsid:ncbi.nlm.nih.gov:pubmed:12571434 refers to a PubMed article • urn:lsid:ncbi.nlm.nih.gov:genbank:T48601:2 refers to the second version of an entry in GenBank
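The format above can be illustrated with a small parser. This is a minimal sketch, not code from the LSID toolkit: the names LSID and parse_lsid are hypothetical, and the validation is deliberately simpler than what the full specification requires.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """The variable parts of urn:lsid:Authority:Namespace:Object_ID[:Revision-ID]."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into its parts; raise ValueError if it is malformed."""
    parts = text.split(":")
    # "urn" and "lsid" plus three mandatory fields, plus an optional revision.
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated fields: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID (must start with urn:lsid:): {text!r}")
    if any(part == "" for part in parts[2:]):
        raise ValueError(f"empty field in LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    revision = parts[5] if len(parts) == 6 else None
    return LSID(authority, namespace, object_id, revision)


# The two examples from the slide.
for example in (
    "urn:lsid:ncbi.nlm.nih.gov:pubmed:12571434",
    "urn:lsid:ncbi.nlm.nih.gov:genbank:T48601:2",
):
    print(parse_lsid(example))
```

A bare identifier such as gi:30350027 is rejected, which is the point of the scheme: the authority and namespace say whose number it is.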
Motivation • Making your local publications globally available • Persistent • Open source • Anyone can become an LSID registration agency • No central third-party registration agency is required, and there are no fees to pay • Linking with other database sources: NCBI protein/nucleotide DBs, PubMed, UniProt/SwissProt, GO terms ……
How does it work • [Diagram: Client, LSID Authority, Data Store, Metadata Store. Labels: urn:lsid:www.mygrid.org.uk; http://hostname:80/authority; WSDL script; operation calls over http, ftp and soap; returned results]
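The labels on this slide suggest a flow in which the client maps the authority name to an endpoint, asks the authority which services exist for an LSID, calls them, and receives results. The following is a rough in-memory sketch of that flow under stated assumptions: no network, WSDL, or SOAP is involved, the Authority class and resolve function are hypothetical, the "example" namespace is made up, and only the operation names getData and getMetadata are modelled on the LSID resolution interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Authority:
    """Hypothetical stand-in for an LSID Authority with its two stores."""
    endpoint: str  # where a real client would send its requests
    data_store: Dict[str, bytes] = field(default_factory=dict)
    metadata_store: Dict[str, dict] = field(default_factory=dict)

    def get_data(self, lsid: str) -> bytes:
        return self.data_store[lsid]

    def get_metadata(self, lsid: str) -> dict:
        return self.metadata_store[lsid]

    def available_services(self, lsid: str) -> Dict[str, Callable[[str], object]]:
        """Stand-in for the WSDL description: operation name -> callable."""
        services: Dict[str, Callable[[str], object]] = {}
        if lsid in self.data_store:
            services["getData"] = self.get_data
        if lsid in self.metadata_store:
            services["getMetadata"] = self.get_metadata
        return services


def resolve(lsid: str, registry: Dict[str, Authority]) -> Dict[str, object]:
    """Resolve an LSID against a registry of authorities keyed by authority name."""
    # Step 1: find the authority from the third field of the LSID
    # (treated as case-insensitive in this sketch).
    authority = registry[lsid.split(":")[2].lower()]
    # Step 2: ask which operations are offered for this LSID.
    services = authority.available_services(lsid)
    # Steps 3 and 4: call each operation and collect the returned results.
    return {name: call(lsid) for name, call in services.items()}


EXAMPLE = "urn:lsid:www.mygrid.org.uk:example:1"  # hypothetical LSID
registry = {
    "www.mygrid.org.uk": Authority(
        endpoint="http://hostname:80/authority",
        data_store={EXAMPLE: b"ACGT"},
        metadata_store={EXAMPLE: {"type": "DNA_sequence"}},
    )
}
result = resolve(EXAMPLE, registry)
print(result)
```

Keeping data and metadata behind separate operations mirrors the separate Data Store and Metadata Store boxes in the diagram.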
LSID resources • http://lsid.sourceforge.net/ • http://lsid.biopathways.org/ • http://cvs.sourceforge.net/viewcvs.py/lsid/ • Who is using them • BioMOBY (www.biomoby.org) • Aventis • BioImage (www.bioimage.org) • Haystack, the first Semantic Web browser, based on Eclipse (haystack.lcs.mit.edu)
myGrid • An e-Science project for bioinformaticians and biologists http://www.mygrid.org • A set of middleware services • Based on 3 molecular scenarios • A successful workflow workbench Taverna http://taverna.sourceforge.net • Hosting 1,800 bio-services • We finished but we will continue
LSIDs in myGrid • Motivation • Uniquely and persistently identifying myGrid internal resources • Separating data and metadata • Applying a compatible standard • Integrating with resources in the open world • LSIDs and RDF (Resource Description Framework) • urn:lsid:taverna.sf.net:datathing:45fg6 • urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank_gi:5851672 • [Figure: RDF graph. Labels: report; sequence; http://www.mygrid.org.uk/ontology#contains_similar_sequence_to; http://www.mygrid.org.uk/ontology#DNA_sequence]
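Because LSIDs are URNs, they can be used directly as resource names in RDF statements. The sketch below shows one plausible reading of the labels on this slide, written out as N-Triples with plain string formatting. Which LSID is the report, which is the sequence, and the rdf:type statement are all assumptions; the slide text alone does not confirm how the graph was wired.

```python
ONTOLOGY = "http://www.mygrid.org.uk/ontology#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumption: the myGrid data item is the report, the GenBank entry the sequence.
REPORT = "urn:lsid:taverna.sf.net:datathing:45fg6"
SEQUENCE = "urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank_gi:5851672"

triples = [
    (REPORT, ONTOLOGY + "contains_similar_sequence_to", SEQUENCE),
    (SEQUENCE, RDF_TYPE, ONTOLOGY + "DNA_sequence"),
]


def to_ntriples(statements):
    """Serialise (subject, predicate, object) URI triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in statements)


print(to_ntriples(triples))
```

The metadata lives in these statements, not in the data item itself, which is how the slide's goal of separating data and metadata can be met.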
LSIDs in action • 1. Data sent/received from services • 2. New LSIDs assigned to data • 3. Data/metadata stored • 4. Data and metadata retrieved • [Diagram components: Services, Client application, LSID Assigning Service, LSID Authority, LSID Metadata Resolver, LSID Data Resolver, mIR, Freefluo Enactor, Store plug-in, Metadata Store, Metadata plug-in, Taverna Workbench, Workflow design, User context]
LSID ≠ URL • An LSID is a URN • Identifying a resource by its name, instead of its location • Persistency (theoretically??) • Legacy support • Multiple protocols: http, ftp, file systems, soap…
Your responsibility • Unique authority id • Unique object and revision ids within your namespace • Never reassign an LSID • Persistently identifying your data
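These rules can be enforced in code by whatever component hands out identifiers. The following is a minimal single-process sketch, assuming an in-memory store and a simple counter; the class LsidAssigner and the authority name example.org are hypothetical and are not the myGrid LSID Assigning Service.

```python
class LsidAssigner:
    """Issues LSIDs for one authority and namespace and never reuses one."""

    def __init__(self, authority: str, namespace: str):
        self.authority = authority
        self.namespace = namespace
        self._next_id = 1
        self._latest = {}  # object id -> latest revision number
        self._store = {}   # full LSID -> data; entries are never overwritten

    def assign(self, data) -> str:
        """Issue a brand-new object id at revision 1."""
        object_id = str(self._next_id)
        self._next_id += 1
        return self._record(object_id, 1, data)

    def revise(self, object_id: str, data) -> str:
        """Changed data gets a new revision, so the old LSID keeps its meaning."""
        if object_id not in self._latest:
            raise KeyError(f"unknown object id: {object_id!r}")
        return self._record(object_id, self._latest[object_id] + 1, data)

    def lookup(self, lsid: str):
        return self._store[lsid]

    def _record(self, object_id: str, revision: int, data) -> str:
        lsid = f"urn:lsid:{self.authority}:{self.namespace}:{object_id}:{revision}"
        if lsid in self._store:
            raise ValueError(f"refusing to reassign {lsid}")
        self._store[lsid] = data
        self._latest[object_id] = revision
        return lsid


assigner = LsidAssigner("example.org", "datathing")
first = assigner.assign("ACGT")
second = assigner.revise("1", "ACGTT")
print(first, second)
```

Persistence across restarts, which the last bullet also asks for, would need the counter and store to be kept in a database; this sketch leaves that out.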
What is not working • Security • Access control • LSID synonyms
Questions? Thank you!