http://phylotastic.org/ A project of the NESCent HIP (hackathons, interoperability, phylogenies) working group. Making the Tree of Life Accessible for Research. This is a 20-minute overview with links to screencasts and demos, providing an introduction to the project and to the upcoming 2nd hackathon (Jan 28 to Feb 1, 2013, Tucson, AZ).
RE-USE OF TREES “Most attempts at re-use seem to end in disappointment” [1] Consumers (re-users) Producers Repositories [1] Stoltzfus, et al., 2012, “Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis”, http://www.ncbi.nlm.nih.gov/sites/entrez/23088596
USE CASE: LEAF VEIN EVOLUTION ? aextoxicaceae/aextoxicon/aextoxicon_puntatum anacardiaceae/anacardium/anacardium_excelum anacardiaceae/rhus/rhus_glabra annonaceae/dugetia/dugetia_furfuraceae . . . R.L. Walls with Linnaeus Input list from Walls, 2011 Phylomatic 98-species tree of Walls, 2011 APG framework with 1566 taxa
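The use case above comes down to pruning a large source tree to the species on an input list. A minimal Python sketch of that idea follows. It assumes a Newick string with plain leaf labels and no branch lengths; the toy tree, its topology, and the `extra_sp*` labels are invented for illustration, and this is not the Phylomatic or Phylotastic implementation.

```python
def parse_newick(text):
    """Parse a simple Newick string (leaf labels only, no branch lengths)
    into nested lists; internal node labels are skipped."""
    text = "".join(text.split()).rstrip(";")   # drop whitespace and the final ";"
    pos = 0

    def read_label():
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos]

    def read_node():
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1                           # consume "("
            children = [read_node()]
            while pos < len(text) and text[pos] == ",":
                pos += 1                       # consume ","
                children.append(read_node())
            pos += 1                           # consume ")"
            read_label()                       # discard optional internal label
            return children
        return read_label()

    return read_node()


def prune(tree, keep):
    """Keep only leaves in `keep`; collapse nodes left with a single child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [c for c in (prune(child, keep) for child in tree) if c is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else kept


def to_newick(tree):
    """Serialize nested lists back to Newick (without the trailing ";")."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(c) for c in tree) + ")"


# Input list in the family/genus/species form shown on the slide.
input_list = """aextoxicaceae/aextoxicon/aextoxicon_puntatum
anacardiaceae/anacardium/anacardium_excelum
anacardiaceae/rhus/rhus_glabra"""
keep = {line.split("/")[-1] for line in input_list.splitlines()}

# Toy source tree: arbitrary topology, hypothetical extra_sp* leaves.
toy = "((anacardium_excelum,(rhus_glabra,extra_sp1)),(aextoxicon_puntatum,extra_sp2));"
pruned = to_newick(prune(parse_newick(toy), keep)) + ";"
print(pruned)
```

Collapsing single-child nodes keeps the pruned tree free of redundant internal nodes; a real pruner would also need to sum branch lengths along collapsed paths.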
THE “TREE OF LIFE” = Some big trees*
• 4,500 mammal species
• 55,473 angiosperm species
• 1,827 angiosperm taxa
• 800 fish families
• 16,000 taxa in ToLWeb
• 73,060 eukaryotic species
• 400,000 prokaryotic 16S rDNAs
• 250,000 species NCBI taxonomy
• And other trees not listed
* Proper phylogenies as well as phylogeny-based taxonomic hierarchies
ARCHITECTURE OVERVIEW Species1 Species2 Species3 condition1 condition2 Phylotastic
PHYLOTASTIC
Phy·lo·tas·tic /fī lō ˈtăs tĭk/
• Adjective: providing computable, convenient and credible access to expert knowledge of the phylogeny of species
• Noun: an open-source project of HIP* to prototype and disseminate a distributed, web-services-based phylotastic system
• Synonyms: ToL-o-matic
• Web home: http://www.phylotastic.org
* Hackathons, Interoperability, Phylogenies, a NESCent working group
HACKATHON #1, JUNE 4 TO 8 @ NESCENT
• 30 participants
• high diversity
• 2 remote sites
Teams:
• TNRS - taxonomic name resolution
• TreeStore - triple store with REST API
• Architecture - controllers, interfaces, pruners
• Branch lengths - scaling trees using chronograms
• Shiny - other demos and cool front-end stuff
PHYLOTASTIC.ORG It’s all open source Screencasts & live demonstrations
SCREENCAST: SCRIPTABLE PRUNER, WEB FORM
Rutger Vos
• YouTube video at http://bit.ly/U1VGA1 (3 min)
• Web form invokes URL API, like this:
  http://phylotastic-wg.nescent.org/script/phylotastic.cgi?species=Felis+silvestris%2C+Canis+lupus%2C+Cavia+porcellus&tree=mammals&format=newick
• So, you can run it with curl
• Or with a simple Perl script:

    #!/usr/bin/perl -w
    # Arguments: source tree name, then a comma-separated species list
    my $base = "http://phylotastic-wg.nescent.org/script/phylotastic.cgi";
    my ( $tree, $taxa ) = @ARGV;
    $taxa =~ s/[ _]/+/g;    # encode spaces and underscores as "+"
    $taxa =~ s/,/%2C/g;     # encode commas
    # Fetch the pruned tree as Newick, save it, and open the saved file
    system( "curl \"$base?species=$taxa&tree=$tree&format=newick\" > out.tre; open out.tre" );
    exit;
SCREENCAST: MESQUITE-O-TASTIC
Peter Midford (NESCent), Arlin Stoltzfus (NIST)
• YouTube screencast at http://bit.ly/QjymbK (3 min)
• Installable Mesquite module is here: https://github.com/phylotastic/mesquite-o-tastic
RECONCILIOTASTIC
• Reconcile-tree problem
  • Very common use-case
  • Inputs are gene tree, species tree
    • Gene tree: easy to get
    • Species tree: hard to get
• Approach (see Reconciliotastic demo at http://www.phylotastic.org/demos)
  • Load gene tree (with NCBI identifiers embedded in labels)
  • Compute species list
    • Extract identifiers from labels
    • Map IDs to species sources via NCBI web service
  • Get species tree phylotastically
  • Reconcile gene tree and species tree using Zmasek’s SDI library
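The "compute species list" step above can be sketched roughly as follows in Python. The slides do not say how identifiers are embedded in the labels, so the `name|ID` label format, the ID values, and the `lookup` dictionary (standing in for the NCBI web service call) are all assumptions made for illustration.

```python
import re


def leaf_labels(newick):
    """Return leaf labels: tokens that directly follow "(" or ","."""
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def species_list(newick, id_to_species):
    """Extract the trailing ID from each leaf label and map it to a species."""
    species = set()
    for label in leaf_labels(newick):
        seq_id = label.rsplit("|", 1)[-1]    # assumed "name|ID" label format
        species.add(id_to_species[seq_id])   # stand-in for a web-service lookup
    return sorted(species)


# Hypothetical gene tree and ID-to-species mapping (placeholder IDs).
gene_tree = "((geneA|111:0.1,geneB|222:0.2):0.05,(geneC|333:0.3,geneD|111:0.1):0.2);"
lookup = {"111": "Felis silvestris", "222": "Canis lupus", "333": "Cavia porcellus"}

print(species_list(gene_tree, lookup))
```

Using a set removes duplicates when several genes come from the same species, which is the usual situation in a gene tree that needs reconciling.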
ROLE OF TNRS IN PHYLOTASTIC (BRIEF)
Case study: Riek, 2011 (Mammalian Biology 76(1):3-11)
• <1 minute: auto-extract species names from text → Phylotastic → 36 species + 2 extras*
• 5 minutes: copy & paste species named in Table 1 → Phylotastic → 33 species
• 12 minutes: manually key in species list from tree image → Phylotastic → 36 species
• Hours or days: manually reconcile names with names in source tree → Phylotastic → 40 species
* named in text but not used in phylogenetic analysis
ROLE OF TNRS: MORE DETAIL
TNRS team: Naim Matasci (iPlant), Gaurav Vaidya (U. Colorado), Siavash Mirarab (UT Austin)
• Screencast: http://bit.ly/T5ikoG (7 min)
• Riek, 2011 (case study)
• Cool demo: PDF → auto-extracted names → tree
• What Taxonomic Name Resolvers do
• What the phylotastic TNRS team did
• Using the Taxosaurus URL API
  http://api.phylotastic.org/tnrs/submit?query=Cephalophus+monticola
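A query URL of the form shown above can be built programmatically. This Python sketch only constructs the URL for a single name; the helper name `tnrs_submit_url` is hypothetical, and the slides do not describe the response format or how multiple names are submitted, so neither is assumed here.

```python
from urllib.parse import urlencode

# Endpoint taken from the slide; everything else is illustrative.
TNRS_SUBMIT = "http://api.phylotastic.org/tnrs/submit"


def tnrs_submit_url(name):
    """Build the submit URL for one taxon name (spaces are encoded as "+")."""
    return TNRS_SUBMIT + "?" + urlencode({"query": name})


print(tnrs_submit_url("Cephalophus monticola"))
```

Letting `urlencode` handle the escaping avoids hand-written substitutions for spaces and punctuation in names.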
THE OTHER KIND OF DATING WITH FOSSILS r8s, pathd8, Multidivtime Calibrating a tree using fossil timepoints
DATELIFE http://www.datelife.org
• 11 studies
• >4,000 trees
• 6,973 taxa
• 620,868 leaves
• DateLife engine (R, FastRWeb, Rserve)
• PHP
CURRENT STATUS - WYSIWYG
• Branches could shift without warning
• There are some holes
• You might crash
• We haven’t put the pieces together yet
• The interfaces are unstable
WHAT’S NEXT?
• Phylotastic hackathon #2 (Jan 2013, AZ)
  • Themes
    • Integration – get components to work together
    • Use-cases – give users what they want
    • More Shiny Stuff – make it look good
    • Your idea here
  • To apply
    • http://tinyurl.com/PhyloTastic2
• More partners & sponsors
www.phylotastic.org Send feedback to Arlin Stoltzfus (arlin@umd.ed) ACKNOWLEDGEMENTS HIP Leadership Team Participants Sponsors