
Concepts and Tools Needed to Increase Bottom-Up Taxonomic Expert Participation

Concepts and Tools Needed to Increase Bottom-Up Taxonomic Expert Participation in a Global Names-Based Infrastructure. Nico Franz, David Patterson, Sudhir Kumar & Edward Gilbert, School of Life Sciences, Arizona State University. TDWG 2013 Annual Conference, Florence, Italy.

Presentation Transcript


  1. Concepts and Tools Needed to Increase Bottom-Up Taxonomic Expert Participation in a Global Names-Based Infrastructure. Nico Franz, David Patterson, Sudhir Kumar & Edward Gilbert, School of Life Sciences, Arizona State University. TDWG 2013 Annual Conference, Florence, Italy. Developing a Names-Based Architecture for Linking Biodiversity Data. October 31, 2013. Slides @ http://taxonbytes.org/tdwg-2013-concepts-and-tools-needed-for-taxonomic-expert-participation-in-a-global-names-based-infrastructure/

  2. Arizona State University's current GN involvement http://globalnames.fulton.asu.edu http://www.globalnames.org/ Biodiversity Informatics @ ASU http://taxonbytes.org/informatics http://pinkava.asu.edu/starcentral/custar/portal.php

  3. Arizona State University's current GN involvement http://globalnames.fulton.asu.edu Concept/proposal of a GN Taxonomic Clearing House (TCH) http://www.globalnames.org/ Biodiversity Informatics @ ASU http://taxonbytes.org/informatics http://pinkava.asu.edu/starcentral/custar/portal.php

  4. Two motivating quotes – 1. Counteract disenfranchisement • "My belief is that the taxonomic community feels *disenfranchised* – in various ways – and we MUST change that, in a tangible manner. [The Commissioners] do whatever we can to interact with the broader community […] to help demystify the Code and improve the perception of the Commission." • "My own personal vision is far more than that, however: I am convinced that we have a culture of taxonomists many of whom do not understand the Code, or outright oppose it (or parts thereof, such as gender agreement), and that the BEST way to get them to care about the Code is to give them an actual voice. In effect, we need to deputize them – offer a role in which every taxonomist is given a measure of authority, of control." • "Not replacing ALL of the functions and duties of the Commission, but redesigning the process so each and every taxonomist has a direct, personal stake in the enterprise (to the extent that they choose to exercise that privilege)." – ICZN Commissioner, 2013 (to D. Patterson) • TCH concept supposes that one can: • Replace Code with major aggregator project or perspective (such as CoL), and • Replace Commission with project leadership, • and retain a sense of truth. Hence – empower individual experts.

  5. Two motivating quotes – 2. Build for the taxonomic process • "There is a shared awareness among taxonomists that "outside communities" would like usable, precise classifications to apply to their research challenges. However this reasonable demand is not the same as asking for a single, semi-arbitrarily flattened view that does not actually represent the underlying complexities." • "Many taxonomy users are aware that their current system in use is ephemeral. There are valid pressures to improve long-term data integration, and *that* is what many users will value over having a single system." • "Mandating a single view should never work as something that can fairly represent and attract taxonomic research and progress. […] These are in my view worthwhile challenges that address the demands for representing taxonomic discourse and progress as well as serving the user communities with better integrated and less ephemeral products." – NMF, Aug. 2013, on Taxacom ("Global Species Lists and Taxonomy" thread) • TCH concept includes a taxonomic editing layer ("GNITE") that supports: • Multiple, partial, alternative classifications and phylogenies (a.k.a. "the process"); • Concepts, relationships, and visualizations of given/inferred concept provenance. • Hence – prepare for concept-level semantics, services.

  6. Two hurdles to a GN concept-level platform¹ • "What is a concept? Nobody really understands this." • "What about concept inflation? This is not scalable." • A way to address: promote semantic, social practices that minimize pitfalls. DOI:10.1080/14772000.2013.806371 • ¹ Not exhaustive, or even very fair to people and projects who have dealt with these "hurdles" and have overcome them.

  7. What is a concept? – shallow, technical • Name/Authority works as a most context-neutral (or -vacuous) definition. • Practical situations facilitate different inference abilities once context is given. Source: http://code.google.com/p/darwin-sw/

  8. Deeper issues – why bother? • "The soundest motivation for using taxonomic concepts in biology is not merely to improve data management (Berendsohn, 1995) but to increase the semantic precision of taxonomic names (Franz et al., 2008)." • "We suggest that this approach should be pursued if and where the (not inconsiderable) cost of doing so is offset by yielding better integration of taxonomically labeled biological information, and therefore better biological inferences." • – Franz & Cardona-Duque, 2013

  9. Think: intended ability to contribute to SW-type reasoning • "Whenever a name appears in subsequent paragraphs, we transparently signal either: • (1) that this usage refers to a single and specific previous or current concept of that name (sec.); or • Perelleschus O'Brien & Wibmer sec. Franz & O'Brien 2001

  10. Think: intended ability to contribute to SW-type reasoning • "Whenever a name appears in subsequent paragraphs, we transparently signal either: • (1) that this usage refers to a single and specific previous or current concept of that name (sec.); or • (2) that it refers more vaguely to the cumulative history of concepts associated with that name (no additional labeling); or • Perelleschus O'Brien & Wibmer sec. Franz & O'Brien 2001 • Perelleschus O'Brien & Wibmer

  11. Think: intended ability to contribute to SW-type reasoning • "Whenever a name appears in subsequent paragraphs, we transparently signal either: • (1) that this usage refers to a single and specific previous or current concept of that name (sec.); or • (2) that it refers more vaguely to the cumulative history of concepts associated with that name (no additional labeling); or • (3) that we utilize this name in an even more non-committal sense (non-focal), typically as a semantic crutch to help contextualize names whose meanings we actually intend to focus on. • Perelleschus O'Brien & Wibmer sec. Franz & O'Brien 2001 • Perelleschus O'Brien & Wibmer • Ganglionus O'Brien & Wibmer [non-focal]

  12. Think: intended ability to contribute to SW-type reasoning • "Whenever a name appears in subsequent paragraphs, we transparently signal either: • (1) that this usage refers to a single and specific previous or current concept of that name (sec.); or • (2) that it refers more vaguely to the cumulative history of concepts associated with that name (no additional labeling); or • (3) that we utilize this name in an even more non-committal sense (non-focal), typically as a semantic crutch to help contextualize names whose meanings we actually intend to focus on. • By consistently specifying the nomenclatural and/or taxonomic context in which names are used (or the inverse), and what expectations towards our readership are implied, we are a step closer to achieving a machine-interpretable annotation of these usages." – Franz & Cardona-Duque, 2013 • Perelleschus O'Brien & Wibmer sec. Franz & O'Brien 2001 • Perelleschus O'Brien & Wibmer • Ganglionus O'Brien & Wibmer [non-focal]

  13. What are speakers expecting from their (machine, KRR) audience? • "Whenever a name appears in subsequent paragraphs, we transparently signal either: • (1) that this usage refers to a single and specific previous or current concept of that name (sec.); or • (2) that it refers more vaguely to the cumulative history of concepts associated with that name (no additional labeling); or • (3) that we utilize this name in an even more non-committal sense (non-focal), typically as a semantic crutch to help contextualize names whose meanings we actually intend to focus on. • By consistently specifying the nomenclatural and/or taxonomic context in which names are used (or the inverse), and what expectations towards our readership are implied, we are a step closer to achieving a machine-interpretable annotation of these usages." – Franz & Cardona-Duque, 2013 • Perelleschus O'Brien & Wibmer sec. Franz & O'Brien 2001: heavy-duty semantic reasoning, precise • Perelleschus O'Brien & Wibmer: some reasoning, gets worse as time increases • Ganglionus O'Brien & Wibmer [non-focal]: more limited to no reasoning expectation
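
The three labeling conventions above can be read as a small data model. The following is a minimal sketch only, not the authors' implementation: the `NameUsage` class, the `parse_usage` function, and the mode names "concept", "name", and "non-focal" are hypothetical, chosen here to mirror cases (1) to (3) on the slide.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NameUsage:
    """One usage of a taxonomic name (hypothetical structure)."""
    name: str                   # name plus authority, e.g. "Perelleschus O'Brien & Wibmer"
    sec: Optional[str] = None   # source that fixes the concept ("sec." = according to)
    non_focal: bool = False     # True when the label carries "[non-focal]"

    @property
    def mode(self) -> str:
        # (3) non-focal wins over everything else; (1) needs a sec. source;
        # (2) is the unlabeled, cumulative-history usage.
        if self.non_focal:
            return "non-focal"
        return "concept" if self.sec else "name"


# name, optional "sec. <source>", optional "[non-focal]" suffix
_USAGE = re.compile(
    r"^(?P<name>.+?)"
    r"(?:\s+sec\.\s+(?P<sec>.+?))?"
    r"(?:\s*\[(?P<flag>non-focal)\])?$"
)


def parse_usage(label: str) -> NameUsage:
    """Split a label string into name, sec. source, and non-focal flag."""
    match = _USAGE.match(label.strip())
    if match is None:
        raise ValueError(f"cannot parse usage label: {label!r}")
    return NameUsage(
        name=match.group("name"),
        sec=match.group("sec"),
        non_focal=match.group("flag") is not None,
    )


if __name__ == "__main__":
    for label in (
        "Perelleschus O'Brien & Wibmer sec. Franz & O'Brien 2001",
        "Perelleschus O'Brien & Wibmer",
        "Ganglionus O'Brien & Wibmer [non-focal]",
    ):
        usage = parse_usage(label)
        print(usage.mode, "|", usage.name, "|", usage.sec)
```

A downstream reasoner could then decide how much inference to attempt from `mode` alone, which is the expectation gradient the slide describes.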

  14. Putting concepts, names, [non-focal] to use in a new classification

  15. With conventions in place, we can compartmentalize & innovate • Perelleschus (2013) revision combines name/concept taxonomy organically [figure labels: (1) Concept, (2) Name, (3) Non-focal] • Concept taxonomy "cuts through" any separation of classification vs. phylogeny; though outgroups may be viewed as [non-focal].

  16. Consistency – maximize concepts when possible, minimize names [figure panels: new species, diagnosis; key to species-level concepts, old & new names; figure showing specimens, traits; distribution map]

  17. Consistency – maximize concepts when possible, minimize names [figure panels: new species, diagnosis; key to species-level concepts, old & new names; figure showing specimens, traits; distribution map] • Names are essentially restricted to Introduction/Discussion, i.e. when the entire taxonomic history related to a name is referred to. As an expert aware of context at all times, I can almost omit them (not so with [non-focal] cases, which are needed).

  18. Concepts for ranked Linnaean names, focal & non-focal clades [figure panels: phylogenetic characters, concepts for clades; phylogenetic character matrix; phylogenetic tree]

  19. Historically endorsed concepts are readily flagged as such • Revision includes complete circumscriptions for 54 related concepts, 1936-2013

  20. Represent all pertinent prior & current classifications & phylogenies [figure panels by year: 1936, 1954, 1986, 2001, 2006, 2013; legend: marker = "carludovicae" (name), cumulative history]

  21. Reasoning over concept evolution

  22. Get ready for taxonomic KRR, I: identifying individual concepts • Name Perelleschus contributes to 5 concepts; sec. 1954, 1986, 2001, 2006, 2013

  23. Get ready for taxonomic KRR, II: assemble classifications (P/C)

  24. Get ready for taxonomic KRR, III: express concept articulations • Articulations use Franz & Peet (2009)¹ terms, which significantly improve upon TDWG-TCS • ¹ Franz & Peet. 2009. Towards a language for mapping relationships among taxonomic concepts. Systematics and Biodiversity 7: 5-20.
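
To make "articulation" concrete, here is a minimal sketch that derives one of five set-theoretic relations between two concepts from their members. It rests on loud assumptions: the symbols `==`, `>`, `<`, `><`, and `|` are used here for congruence, proper inclusion, inverse proper inclusion, overlap, and exclusion, and should be checked against Franz & Peet (2009) before reuse; the function name `articulate` and the member labels are hypothetical placeholders, not data from the Perelleschus revision. In practice experts assert articulations directly rather than computing them from complete member lists.

```python
from typing import FrozenSet


def articulate(a: FrozenSet[str], b: FrozenSet[str]) -> str:
    """Return the relation of concept `a` to concept `b`, read from their members.

    Symbols (assumed): "==" congruent, ">" properly includes,
    "<" is properly included in, "><" overlaps, "|" excludes.
    """
    if not a or not b:
        raise ValueError("concepts must have at least one member")
    if a == b:
        return "=="
    if a > b:          # strict superset
        return ">"
    if a < b:          # strict subset
        return "<"
    if a & b:          # some shared members, some unshared on both sides
        return "><"
    return "|"         # nothing shared


if __name__ == "__main__":
    # Hypothetical member sets; "m1".."m4" stand in for specimens or child concepts.
    earlier = frozenset({"m1", "m2"})
    later = frozenset({"m1", "m2", "m3"})
    other = frozenset({"m3", "m4"})
    print(articulate(later, earlier))
    print(articulate(earlier, other))
    print(articulate(later, other))
```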

  25. Concept resolution and merge taxonomy visualization via Euler/X [figure labels: 2013 higher-level concepts; 2001 higher-level concepts; 2013/2001 species concepts] • Euler/X uses ASP reasoning, RCC • Reads in 3 concept tables • Logic / consistency check • Inconsistency explanation • Provenance, repair options • Maximally informative relations (MIR) • Merge taxonomy visualization • More in SfB – Formal Models • Euler project URL: https://sites.google.com/site/eulerdi/home
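
The "logic / consistency check" step can be illustrated with a toy example. This sketch is not how Euler/X works internally: the slide states that Euler/X uses ASP reasoning, whereas the code below swaps in plain brute-force enumeration, which is only practical for a handful of concepts. The function names `relation` and `find_model` and the relation labels "equals", "includes", "is_included_in", "overlaps", and "disjoint" are hypothetical. Each concept is treated as a non-empty set, and the search asks whether any arrangement of non-empty Venn regions satisfies every asserted articulation.

```python
from typing import Dict, List, Optional, Tuple


def relation(inhabited: List[int], i: int, j: int) -> str:
    """Relation of concept i to concept j, given the non-empty Venn regions.

    A region is a bitmask: bit k set means the region lies inside concept k.
    """
    only_i = any((r >> i) & 1 and not (r >> j) & 1 for r in inhabited)
    only_j = any((r >> j) & 1 and not (r >> i) & 1 for r in inhabited)
    both = any((r >> i) & 1 and (r >> j) & 1 for r in inhabited)
    if not both:
        return "disjoint"
    if only_i and only_j:
        return "overlaps"
    if only_i:
        return "includes"
    if only_j:
        return "is_included_in"
    return "equals"


def find_model(
    n: int, constraints: Dict[Tuple[int, int], str]
) -> Optional[List[int]]:
    """Return one set of non-empty regions satisfying all constraints, else None."""
    regions = range(1, 2 ** n)  # every region inside at least one concept
    for mask in range(1, 2 ** len(regions)):
        inhabited = [r for k, r in enumerate(regions) if (mask >> k) & 1]
        # Every concept must have at least one member.
        if not all(any((r >> i) & 1 for r in inhabited) for i in range(n)):
            continue
        if all(relation(inhabited, i, j) == rel
               for (i, j), rel in constraints.items()):
            return inhabited
    return None


if __name__ == "__main__":
    # Concepts 0, 1, 2 are placeholders for three concept labels.
    consistent = {(0, 1): "includes", (1, 2): "includes", (0, 2): "includes"}
    cyclic = {(0, 1): "includes", (1, 2): "includes", (2, 0): "includes"}
    print("consistent input has a model:", find_model(3, consistent) is not None)
    print("cyclic input has a model:", find_model(3, cyclic) is not None)
```

A returned region list doubles as a crude "possible world"; a dedicated reasoner adds what this sketch lacks, namely explanations of inconsistency and repair suggestions.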

  26. Interim conclusions – concepts provide valuable TCH services • The core semantics and prototype tools are in place to: • Handle both novel nomenclatural and taxonomic/phylogenetic data via small (to large), incremental expert submissions to a suitable TCH.

  27. Interim conclusions – concepts provide valuable TCH services • The core semantics and prototype tools are in place to: • Handle both novel nomenclatural and taxonomic/phylogenetic data via small (to large), incremental expert submissions to a suitable TCH. • Concepts allow the new submissions of taxonomic effort and progress to: • Be identified as such (as are their individual authors). • Be delimited from imprecise or existing information. • Be semantically represented (parent/child hierarchies). • Be logically integrated with relevant previous concepts (Euler/X). • Be visualized in merge taxonomies that resolve provenance.

  28. Interim conclusions – concepts provide valuable TCH services • The core semantics and prototype tools are in place to: • Handle both novel nomenclatural and taxonomic/phylogenetic data via small (to large), incremental expert submissions to a suitable TCH. • Concepts allow the new submissions of taxonomic effort and progress to: • Be identified as such (as are their individual authors). • Be delimited from imprecise or existing information. • Be semantically represented (parent/child hierarchies). • Be logically integrated with relevant previous concepts (Euler/X). • Be visualized in merge taxonomies that resolve provenance. • Jointly these services are needed to (1) counter disenfranchisement, (2) build for the taxonomic process, and (3) facilitate better inferences in biology.

  29. How might this work in a TCH?

  30. Focus new development on the GN Interface for Taxonomic Editing • Prototyped for GN1 (U.S.) by Mozzherin, Patterson & Shorthouse at MBL. • In need of adding new functionality, interoperability, user community.  Upgrades to a native GN taxonomy editing layer are just one part of a grander TCH infrastructure and service package.

  31. TCH infrastructure

  32. TCH infrastructure • "Run" by experts, individually, in groups.

  33. TCH infrastructure • Taxonomists, phylogeneticists work within "native" platforms.

  34. TCH infrastructure • Strategy: initial establishment with select expert communities.

  35. TCH infrastructure • Capitalizing on existing, diversified GN1 infrastructure and services.

  36. TCH infrastructure • Expand GNITE into 3 powerful layers for single classification assembly, nomenclatural editing, and concept taxonomy.

  37. TCH infrastructure • Build a FP "Lite" system to track all TCH submissions, edits, assignments of authorship, and expert credit profiles.

  38. TCH infrastructure • GN "Union" = endorsed classification, although multiple alternatives are an essential part of TCH output service; "intelligent alerts" notify experts.

  39. Conclusions – unless we build for the process, products will suffer • "TCH embodies the view that improving existing classification repositories and services is very much a matter of improving their ability to accommodate the systematic research and publication process."

  40. Conclusions – unless we build for the process, products will suffer • "TCH embodies the view that improving existing classification repositories and services is very much a matter of improving their ability to accommodate the systematic research and publication process." • "It is not just a matter of gathering more classifications into static structures with limited options for expert access, editing, and classification provenance tracking."

  41. Conclusions – unless we build for the process, products will suffer • "TCH embodies the view that improving existing classification repositories and services is very much a matter of improving their ability to accommodate the systematic research and publication process." • "It is not just a matter of gathering more classifications into static structures with limited options for expert access, editing, and classification provenance tracking." • "Rather, it is about bottom-up collaboration that allows merger, critical input, refinement, and due recognition of, and respect for, a diversity of views that will lead to evolving authoritative taxonomic compilations."

  42. Acknowledgments • TDWG 2013 Symposium organizers – Yde de Jong & Richard L. Pyle • GN1 team – Dmitry Mozzherin, Richard Pyle, David Shorthouse, Robert Whitton • Euler team, UC Davis – Bertram Ludäscher, Mingmin Chen, Shizhuo Yu, Shawn Bowers • Juliana Cardona-Duque – Universidad de Antioquia, Medellín, Colombia • Steven Baskauf (concept/occurrence graph) – Vanderbilt University • https://sols.asu.edu • http://taxonbytes.org
