
IDs in and out of the database

IDs in and out of the database. Entomological Collections Network (ECN) 2012, November 10 – 11, Knoxville, TN. Debbie Paul, Greg Riccardi. Overview: What good is identification? How are identifiers used by consumers? Providing IDs. Resolving IDs in a server.



Presentation Transcript


  1. IDs in and out of the database Entomological Collections Network (ECN) 2012 November 10 – 11, Knoxville, TN Debbie Paul, Greg Riccardi

  2. Overview • What good is identification? • How are identifiers used by consumers • Providing IDs • Resolving IDs in a server • Strategies for storing IDs in databases • Linked Data • Annotations ~ all sorts • Feedback

  3. What good is identification? • Aggregation • If you get info from 2 sources that are about the same object, you can combine the info • Resolution (finding information about object) • Types of resolution • Determine where to get information • Determine how to get information • Providing information • How to create IDs • How to publish IDs • How to fetch database information for IDs
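The aggregation point on this slide can be illustrated with a minimal sketch. It assumes each source delivers records as dictionaries with an `id` key; the field names and the two sample records are hypothetical and only serve the illustration.

```python
from collections import defaultdict


def aggregate(*sources):
    """Combine records from several sources, keyed by a shared identifier."""
    combined = defaultdict(dict)
    for source in sources:
        for record in source:
            # Records whose "id" strings are equal are treated as the same object.
            fields = {key: value for key, value in record.items() if key != "id"}
            combined[record["id"]].update(fields)
    return dict(combined)


# Hypothetical records from two providers that describe the same specimen.
SPECIMEN_ID = "urn:uuid:424854d7-baec-42cf-a142-805b64117b9f"
source_a = [{"id": SPECIMEN_ID, "catalog_number": "bbbrc000123"}]
source_b = [{"id": SPECIMEN_ID, "collection_code": "herb"}]

merged = aggregate(source_a, source_b)
```

The merge only works because both sources use exactly the same identifier string, which is why persistent, shared IDs matter for aggregation.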

  4. HTTP URIs • Biggest problem • Identification and 2 types of resolution are commingled • Resolution • Where to get information • Look somewhere • How to get information • Fetch information using some protocol

  5. DOI example • The DOI is • 10.3897/zookeys.209.3135 • URI (for aggregating) is • doi:10.3897/zookeys.209.3135 • A URL for information retrieval (proxy resolution) is • http://dx.doi.org/10.3897/zookeys.209.3135 • Information fetched from • HTML: • http://www.pensoft.net/journals/zookeys/article/3135/abstract/five-task-clusters-that-enable-efficient-and-effective-digitization-of-biological-collections • RDF: • http://data.crossref.org/10.3897/zookeys.209.3135
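The three forms of the DOI on this slide differ only by prefix, which a short sketch can make explicit. The function names are hypothetical; the proxy address is the one shown on the slide, and no network request is made.

```python
DOI = "10.3897/zookeys.209.3135"


def doi_to_uri(doi: str) -> str:
    """Return the URI form, used for comparison and aggregation."""
    return f"doi:{doi}"


def doi_to_proxy_url(doi: str, proxy: str = "http://dx.doi.org/") -> str:
    """Return a URL that asks the proxy resolver where the information lives."""
    return proxy + doi


def uri_to_doi(uri: str) -> str:
    """Strip the 'doi:' scheme to recover the bare DOI."""
    prefix = "doi:"
    if not uri.startswith(prefix):
        raise ValueError(f"not a doi URI: {uri!r}")
    return uri[len(prefix):]
```

Keeping the bare identifier separate from the resolution URL is one way to avoid mixing identification with resolution, the problem raised on the previous slide.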

  6. What’s in an ID? • For consumer: • NOTHING! No information • Might as well be UUID • Can’t type it, remember it, parse it, resolve it • Useful for comparison and aggregation • Equal strings (persistence) • Different strings about the same object • fetching information • Send the ID somewhere for info

  7. What’s in an ID? • For Provider/resolver: • Use ID to find local storage of information • E.g. • parse out the DWC triple • Extract the database table and primary key • Look up the ID in a table of IDs • Look up ID in a URI field of a database table
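Two of the provider-side strategies listed here can be sketched as follows: parsing a Darwin Core triplet and looking an ID up in a table of IDs. The sample values come from the "GUIDs are key" slide; the function names, the field names, and the in-memory `ID_TABLE` are assumptions made for the sketch, not a description of any particular system.

```python
def parse_dwc_triplet(triplet: str) -> dict:
    """Split 'institution:collection:catalog' into its three parts."""
    parts = triplet.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected three colon-separated parts: {triplet!r}")
    keys = ("institution_code", "collection_code", "catalog_number")
    return dict(zip(keys, parts))


# Hypothetical table of IDs: identifier string -> (database table, primary key).
ID_TABLE = {
    "urn:uuid:424854d7-baec-42cf-a142-805b64117b9f": ("specimen", 19537),
    "http://ids.flmnh.ufl.edu/herb/bbbrc000123": ("specimen", 19537),
}


def locate(identifier: str):
    """Return (table, primary_key) for a known identifier, else None."""
    return ID_TABLE.get(identifier)
```

Parsing only works for identifiers that carry structure, such as the triplet; opaque identifiers such as UUIDs need the lookup table.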

  8. What’s in an id for the provider? • record id: 112234 • uuid: 954c8760-e1a6-4b4b-ab82-6bf7311c25f3 • lsid: urn:lsid:example.org:specimen:22545 • an HTTP URI • ezid: http://n2t.net/ark:/99999/fk42b9hdf • doi: doi:10.1038/ng0609-637
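A provider that accepts several of these identifier kinds might first decide which kind it has been handed. The sketch below is one hedged way to do that with simple prefix checks and Python's standard `uuid` module; the category labels and the function name are assumptions. Note that the EZID example is classified as an HTTP URI, because that is the form in which it arrives.

```python
import uuid


def classify(identifier: str) -> str:
    """Guess the kind of identifier from its surface form."""
    text = identifier.strip()
    lower = text.lower()
    if text.isdigit():
        return "record-id"  # e.g. a local primary key
    if lower.startswith("urn:lsid:"):
        return "lsid"
    if lower.startswith("doi:"):
        return "doi"
    if lower.startswith(("http://", "https://")):
        return "http-uri"
    try:
        # uuid.UUID also accepts the 'urn:uuid:' prefixed form.
        uuid.UUID(text)
        return "uuid"
    except ValueError:
        return "unknown"
```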

  9. What about Specimen identifiers? • identifier on the specimen? • readable text • encoded data • barcode is a contextual identifier • identifier in the database? • http://ids.usms.edu/herb/0014097 • http://ids.usms.edu/herb/0303134303937

  10. How do providers identify? • Notice online databases and your database and find the identifiers of the various objects • Some identifiers are local (e.g. primary key) • Some identifiers are globally unique • Some identifiers are URIs

  11. Identification in the field • wireless or workbench • data collected and uploaded

  12. Storing IDs in databases • your contextual ids?, your guids? • What to use for IDs? • record id • uuid • lsid • uri • what’s in your wallet database? • Morphbank Example

  13. IDs in Morphbank • Morphbank Example • http://www.morphbank.net/818505

  14. IDs in Morphbank • Morphbank Example • http://www.morphbank.net/643261

  15. Sharing data with IDs • into a publication • uploaded to the web • data shared with a database integrator / aggregator • GBIF • iDigBio • VertNet • Morphbank • what is it exactly in the publication? • an id?, a guid? a link to more information? • what will be cited? searched for?

  16. Feedback with IDs • Annotations • Target of annotation • http://www.morphbank.net/818505 • filtered PUSH • linked data ~ the semantic web • (benefits – in a minute) • updating the database • be(a)ware • Remember previous IDs

  17. What’s coming up next? • expect guids for all sorts of objects • collection objects (example: specimen) • georeferences • taxon concepts • determinations • people

  18. GUIDs are key • 1 to many IDs known for a given object • store and share the ones you know about • Specimen RecordID: 19537 • Specimen Previous Catalog Number: 212345 • Specimen Catalog Number / bar code: bbbrc000123 • Darwin Core Triplet (DwC): flmnh:herb:bbbrc000123 • DwC Occurrence URI: urn:catalog:flmnh:herb:bbbrc000123 • Specimen GUID of type lsid: urn:lsid:biocol.org:flmnh:bbbrc000123 • Specimen Opaque Identifier (UUID): 424854d7-baec-42cf-a142-805b64117b9f • URI for UUID: urn:uuid:424854d7-baec-42cf-a142-805b64117b9f • Specimen GUID of type HTTP-URI: http://ids.flmnh.ufl.edu/herb/bbbrc000123 • *Cannot enforce single identifier per object
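Since a single identifier per object cannot be enforced, one possible storage approach is a separate identifier table with many rows per specimen. The sketch below uses Python's built-in `sqlite3` with an in-memory database; the table and column names are hypothetical, and it assumes each identifier string maps to exactly one record, which may not hold for short local values such as old catalog numbers.

```python
import sqlite3

RECORD_ID = 19537

# Identifier type and value pairs taken from the slide above.
KNOWN_IDS = [
    ("previous_catalog_number", "212345"),
    ("catalog_number", "bbbrc000123"),
    ("dwc_triplet", "flmnh:herb:bbbrc000123"),
    ("dwc_occurrence_uri", "urn:catalog:flmnh:herb:bbbrc000123"),
    ("lsid", "urn:lsid:biocol.org:flmnh:bbbrc000123"),
    ("uuid", "424854d7-baec-42cf-a142-805b64117b9f"),
    ("uuid_uri", "urn:uuid:424854d7-baec-42cf-a142-805b64117b9f"),
    ("http_uri", "http://ids.flmnh.ufl.edu/herb/bbbrc000123"),
]


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with one specimen and all its known IDs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE specimen (record_id INTEGER PRIMARY KEY);
        CREATE TABLE identifier (
            value     TEXT PRIMARY KEY,  -- assumption: values are unique
            id_type   TEXT NOT NULL,
            record_id INTEGER NOT NULL REFERENCES specimen(record_id)
        );
        """
    )
    conn.execute("INSERT INTO specimen (record_id) VALUES (?)", (RECORD_ID,))
    conn.executemany(
        "INSERT INTO identifier (value, id_type, record_id) VALUES (?, ?, ?)",
        [(value, id_type, RECORD_ID) for id_type, value in KNOWN_IDS],
    )
    conn.commit()
    return conn


def resolve(conn: sqlite3.Connection, value: str):
    """Return the local record id for any known identifier, else None."""
    row = conn.execute(
        "SELECT record_id FROM identifier WHERE value = ?", (value,)
    ).fetchone()
    return row[0] if row else None
```

Because previous identifiers stay in the table, an annotation or citation that uses an old catalog number would still resolve to the current record.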

  19. caring for guids • store them • database adjustments • tweaking current standard practices • share them • data standards • 3 ways to modify darwin core • reap the benefits

  20. caring for guids – reap the benefits • Data quality feedback • Dialog based on annotation • Tracking objects through analysis and use • Maintaining attribution to provider • Find related objects • Find a way to take advantage of efforts of many smart dedicated people • BHL, biscicol, filtered PUSH, GNA, TNRS, SGR,…

  21. Thanks from iDigBio 42!
