Excerpts from: Studies on the Identification, Tracking and Monitoring of Genetic Resources
George M. Garrity, Michigan State University, East Lansing, MI, USA; Lorraine Thompson, USA; David Ussery, Denmark; Norman Paskin, UK; Dwight Baker, USA; Philippe Desmethe, Belgium; David Schindel, USA; Perry Ong, Philippines
Ad hoc Open-Ended Working Group on Access and Benefit Sharing, Seventh Meeting, Paris, 2-8 April 2009, Convention on Biological Diversity
The study
Our charge
• To review recent methods of identifying genetic resources directly based on DNA sequences.
• To identify methods of tracking and monitoring genetic resources through the use of persistent globally unique identifiers, including the practicality, feasibility, costs, and benefits of different options.
Our approach
• A design exercise to help develop baseline requirements for such a global tracking system to aid users and providers in complying with CBD ABS objectives.
Key questions and definitions
• How are genetic resources defined?
• Are genetic resources different from biological resources?
• Is the concept universal?
• What metadata are associated with genetic resources, and how are those metadata defined and supplied?
• What are the ramifications of new genomic methods on identifying genetic resources?
• What are persistent identifiers?
• Which persistent identifiers are used widely?
• How do persistent identifiers differ?
• Have tracking systems for genetic resources been deployed elsewhere?
• What existing knowledge and technologies can be “leveraged” in creating such a system?
Key events in parallel to the CBD timeline, 1980-2010
[Timeline figure. Events shown include: UNCED Rio de Janeiro, CBD enters into force, COP I through COP IX, AHWG; TCP/IP, DNS, name server, WWW, Netscape, Internet viruses & worms, UN goes online, Google/Napster, blogs begin, Internet VoIP, YouTube; PubMed, PubMed Central, PMC OAI compliant, PMC OAI requirement, UK PMC, NIH data directive, URN, PURL, ARK, LIMS; PCR, genetic fingerprinting, US Supreme Court decision, GenBank, BLAST, ABI Sequence, INSDC created, RDP created, Human Genome Initiative, HGP begins, H. influenzae genome, M. jannaschii genome, E. coli genome, M. tuberculosis genome, C. elegans genome, draft human genome, human genome finished, CBOL begins, 100,000 16S rDNA, 750,000 16S rDNA, 1000 bacterial genomes, NGS, NNGS.]
An example of a tracking system
Cascading workflow (all genetic resource types)
• Samples are processed to yield multiple derivative samples.
• Most yield small numbers of samples that are discarded. Others can yield hundreds to thousands, some of which may be retained for decades.
• Each sample follows a predictable path through the system.
• Derivative samples may be stored and reprocessed in the future for a variety of purposes.
• Each sample is associated with one or more unique identifiers.
• Each sample is associated with various types of metadata: sample source, testing history, contractual rights and obligations, etc.
• The entire sample history can be reconstructed “on-the-fly” if identifiers are actionable.
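The last point can be illustrated with a minimal sketch, assuming each sample record stores the identifier of the sample it was derived from. The `Sample` fields, the `sample_history` helper, and the identifiers below are hypothetical and do not describe the system shown in the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Sample:
    """One sample record. All field names are hypothetical."""
    identifier: str
    parent: Optional[str] = None  # identifier of the sample this one was derived from
    metadata: Dict[str, str] = field(default_factory=dict)


def sample_history(registry: Dict[str, Sample], identifier: str) -> List[Sample]:
    """Return the chain of samples from the original source down to `identifier`."""
    chain: List[Sample] = []
    seen = set()
    current: Optional[str] = identifier
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in sample lineage at {current!r}")
        seen.add(current)
        sample = registry[current]  # KeyError if an identifier cannot be resolved
        chain.append(sample)
        current = sample.parent
    chain.reverse()
    return chain


# Hypothetical, semantically opaque identifiers and metadata, for illustration only.
REGISTRY = {
    s.identifier: s
    for s in [
        Sample("ID-17", None, {"source": "field collection"}),
        Sample("ID-42", "ID-17", {"step": "extract"}),
        Sample("ID-58", "ID-42", {"step": "fraction"}),
    ]
}

for step in sample_history(REGISTRY, "ID-58"):
    print(step.identifier, step.metadata)
```

The lineage comes from the stored parent links, not from parsing the identifier strings, so the identifiers themselves can stay opaque.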
Examples of tracking identifiers
General properties
• Centrally controlled numbering scheme, but semantically laden
Examples
• B325,797-001-001 (sequential series): sample number, lead number, batch number
• Microorganisms: screening number; culture collection number (internal/external)
• 0534-605F: assay.series, year/week
• XX-YYYYz: collection identifier, sequential number
Lessons learned
Impact of genome sequencing
Service (2006, 311: 1544) in Science. Reproduced with permission.
The “surprises” keep coming...
Impact of 2nd and 3rd generation sequencing
From: Gupta, PK, Trends in Biotechnology (2008) 26: 602-611
Cumulative number of published genomes. Source: Liolios et al.
A job unfinished. Source: D. Ussery
A job undone is still worth something... Source: P. Chain
“Although used every day, identifiers are a mystery to many people, including people responsible for building complex information systems.” Report of the NISO Identifiers Roundtable, 2006
Identifiers
Essential elements in
• Human-human communications
• Human-machine communications
• Machine-machine communications
Ideally…
• Exist as an unambiguous string
• Context and application dependent
• Actionable
• Resolvable
Other points to consider
• Semantically opaque
• Global or local
• Unique or non-unique
• Unanticipated uses
Persistent identifiers
A name or an identifier for a resource that uniquely identifies that resource and will be forever associated with that resource. It will never be reassigned to any other resource and will not change regardless of where the resource is located or whatever protocol is used to access it. Use of a well-managed persistent identifier rather than a location will ensure that when a document is moved, or its ownership changes, the links to it will remain actionable.
From: Diana Dack, Persistence is a Virtue, Information Online Conference, Sydney, January 2001
The concept of name resolution
[Diagram labels: PID (PID1, PID2, PID3), identifies; URL (URL1, URL2, URL3), locates; name resolution; resource.]
Name registration & name resolution
[Diagram labels: user; authority; global registry; key metadata; PID (PID1, PID2, PID3), identifies; URL (URL1, URL2, URL3), locates; resource; metadata.]
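Registration and resolution can be sketched as a small in-memory registry. This is only an illustration of the idea that a PID stays fixed while the URL behind it may change. The `Registry` class, its method names, and the example PID and URLs are assumptions, not part of any real resolver.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Record:
    url: str                   # current location of the resource
    metadata: Dict[str, str]   # key metadata held alongside the PID


class Registry:
    """Toy global registry: maps each PID to one record (hypothetical API)."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def register(self, pid: str, url: str, metadata: Dict[str, str]) -> None:
        # A persistent identifier is never reassigned to another resource.
        if pid in self._records:
            raise ValueError(f"{pid!r} is already registered")
        self._records[pid] = Record(url, dict(metadata))

    def relocate(self, pid: str, new_url: str) -> None:
        # The resource moved: only the location changes, the PID does not.
        self._records[pid].url = new_url

    def resolve(self, pid: str) -> str:
        # Name resolution: PID in, current URL out.
        return self._records[pid].url

    def describe(self, pid: str) -> Dict[str, str]:
        return dict(self._records[pid].metadata)


registry = Registry()
registry.register("pid:0001", "https://old.example.org/item/1", {"type": "sample"})
registry.relocate("pid:0001", "https://new.example.org/item/1")
print(registry.resolve("pid:0001"))
```

Links that cite `pid:0001` keep working after the move because they never contained the URL.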
The identifier gradient
• A single unambiguous string: a label that identifies an entity (ISBN 0-387-98771-1, ATCC 27126)
• A numbering scheme: a method of providing consistent syntax to denote class membership of an entity
  • An arbitrary internal system
  • A formal standard or industry convention
  • Key point is establishing a 1:1 correspondence between labels and members
  • Enumeration: the numbers or labels are simply strings
The identifier gradient
• An infrastructure specification: a syntax by which an identifier can be expressed in a form suitable for use within a specific infrastructure
• Actionable identifiers: URI (URN and URL); ISBN numbers as UPC/EAN identifiers
• Does not mandate a method of creating labels
• Does not create a managed environment
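As a worked example of expressing one identifier in the syntax of another infrastructure, the sketch below converts a ten-digit ISBN to its EAN-13 form using the standard "978" prefix and the two check-digit rules. The function names are illustrative. The input is the ISBN quoted on the earlier identifier gradient slide.

```python
import re


def _clean(isbn10: str) -> str:
    """Strip hyphens and spaces and normalize a trailing 'x' to 'X'."""
    return isbn10.replace("-", "").replace(" ", "").upper()


def isbn10_is_valid(isbn10: str) -> bool:
    """Check the ISBN-10 check digit (weights 10..1, sum divisible by 11)."""
    chars = _clean(isbn10)
    if not re.fullmatch(r"[0-9]{9}[0-9X]", chars):
        return False
    values = [int(c) for c in chars[:9]]
    values.append(10 if chars[9] == "X" else int(chars[9]))
    total = sum(weight * value for weight, value in zip(range(10, 0, -1), values))
    return total % 11 == 0


def isbn10_to_ean13(isbn10: str) -> str:
    """Re-express an ISBN-10 as an EAN-13 string (prefix 978, new check digit)."""
    if not isbn10_is_valid(isbn10):
        raise ValueError(f"not a valid ISBN-10: {isbn10!r}")
    body = "978" + _clean(isbn10)[:9]
    # EAN-13 weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(body))
    check = (10 - total % 10) % 10
    return body + str(check)


print(isbn10_to_ean13("0-387-98771-1"))  # → 9780387987712
```

The label itself is unchanged in meaning; only the syntax and the check digit differ, which is what lets the same book be scanned as a barcode.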
The identifier gradient
• A fully implemented identifier system
• Includes:
  • Unique identifiers
  • A formalized infrastructure
  • Management policies for registration, structured interoperable metadata, policy, and governance mechanisms
• Examples:
  • UPC/EAN barcodes and RFID tags
  • Digital object identifiers (digital identifiers of objects)
Syntax of some other PIDs in “common” use
• Handle
  <Handle>::=<Handle Prefix> "/" <Handle Suffix>
  http://hdl.handle.net/10.1099/ijs.0.64483-0
• Persistent URLs (PURL)
  <purl>::=<protocol>/<resolver>/<name>
  http://purl.oclc.org/OCLC/OCLC/PURL/FAQ
• Life Science Identifiers (LSID)
  urn:<LSID>:<AuthorityID>:<Namespace>:<Object>:<Rev>
  http://lsid.biopathways.org/resolver/data/urn:LSID:ncbi.nlm.nih.gov:GenBank/accession:NT_001063:2
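The Handle and LSID syntaxes above can be split into their parts with a few lines of string handling. This is a minimal sketch that follows only the grammar shown on the slide and uses the slide's own examples. The type and function names are hypothetical, and treating the revision field as optional is an assumption.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class Handle(NamedTuple):
    prefix: str
    suffix: str


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_handle(handle: str) -> Handle:
    """Split '<prefix>/<suffix>' at the first slash."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a handle: {handle!r}")
    return Handle(prefix, suffix)


def handle_from_proxy_url(url: str) -> Handle:
    """Extract the handle from a proxy URL such as http://hdl.handle.net/<handle>."""
    return parse_handle(urlparse(url).path.lstrip("/"))


def parse_lsid(urn: str) -> LSID:
    """Split 'urn:lsid:<authority>:<namespace>:<object>[:<revision>]' on colons."""
    parts = urn.split(":")
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {urn!r}")
    revision = parts[5] if len(parts) == 6 else None  # assumption: revision may be absent
    return LSID(parts[2], parts[3], parts[4], revision)


print(handle_from_proxy_url("http://hdl.handle.net/10.1099/ijs.0.64483-0"))
print(parse_lsid("urn:LSID:ncbi.nlm.nih.gov:GenBank/accession:NT_001063:2"))
```

Neither parser contacts a resolver. They only show that the syntax carries an authority part and a local part, which is what a resolver uses to route a request.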
Two implementations using DOIs
• CrossRef: An independent membership association, founded and directed by STM publishers. Its mission is to connect users to primary research literature through a DOI registration agency (RA) that performs reference cross-linking, subject to publisher access controls. It is the largest and most successful implementation of DOI services.
• NamesforLife: An experimental semantic resolution service for dynamic terminologies. It provides a method for persistently linking the occurrence of a biological name or other technical term in third-party content to managed information about its origins, formal definition, current usage, and related information.
On-the-fly look-up services using DOIs
A proposed tracking system
Our recommendations
• Promptly establish the minimum information required for compliance with the International Regime (IR). Stipulate which documents are mandatory and which are optional.
• Adopt a well-developed and widely used PID system that leverages an existing infrastructure and derives support from multiple sources.
• Consider current and future needs of genetic resource providers and users. Both biological and functional diversity must be accommodated.
• Deploy light-weight applications that use browser technology for interactive use. Publish application programming interfaces to support other web services. Develop strong policies governing access to and use of the resource to avoid data abuse. Trust is a key element.
• Deploy prototype tracking systems to validate underlying concepts and refine critical elements that will be needed in a fully operational system.