Evolutionary Simulation
9:10-10:00 Tracy Heath (UT-Austin) and Sheng Guo (UPenn), “Million-Taxon Macroevolution Simulation for Phylogenetic Algorithm Benchmarks”
10:00-10:50 Paul Higgs (McMaster), “Mitochondrial Genomes: Evolution at Three Scales”
11:10-12:00 Claus Wilke (UT-Austin), “What are the Determinants of Protein Evolutionary Rates in Yeast?”
1:30-2:20 Christina Burch (UNC-Chapel Hill), “Sexual Reproduction Selects for Robustness and Negative Epistasis in Artificial Gene Networks”
2:20-3:10 Laura Landweber (Princeton), “Scrambled Genes as a Model for the Origin of Complexity”
3:30-4:20 John Yin (UW-Madison), “Evolution on a Budget: Insights from Simulations of Virus Growth”
4:20-5:10 Carlo Maley (Wistar), “Multi-scale Modeling of Evolution in Tumors”
Major Goal: Develop the Computational Infrastructure for Reconstructing Phylogeny at the Level of the Entire Tree of Life
• Funded by a 5-year Information Technology Research (ITR) grant from NSF
• Community effort of over 40 biologists, computer scientists, and mathematicians from 17 institutions in the US and 6 abroad
Simulation, Modeling, and Benchmarks
U Penn: Junhyong Kim, Sampath Kannan, Susan Davidson, Yifeng Zheng, Steve Fisher, Sheng Guo, Lisan Wang, Shirley Cohen
U Texas: David Hillis, Lauren Meyers, Eric Miller, Tracy Heath, Derrick Zwickl
NC State: Spencer Muse, Errol Strain
Yale: Paul Turner
and Bernard Moret, Tandy Warnow, Robert Jensen, Randy Linder
Goal: Develop validated datasets of sufficient complexity and scale to realistically benchmark the latest tree-reconstruction algorithms
Problems
• A typical simulation scheme requires combinatorial exploration of parameter space, resulting in an experimental design that is extremely difficult to manage
• Large-scale simulation is computationally demanding
• Branching structure specification is critical, but the options become limited for very large trees
• A credible simulation model that the community will accept is difficult to establish
Simulation Design
• Generate a very large dataset (>10^6 positions) over a very large tree (>10^6 taxa) using a suite of complex models of evolution
• Store the data in a database
• Retrieve subsets of the data by various sampling schemes
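The three steps of the design (generate, store, retrieve) can be sketched at toy scale. This is a minimal illustration under stated assumptions, not the project's pipeline: the function names, the `sequences` table, and the use of SQLite are all hypothetical, the tree is a simple pure-birth process, and the Jukes-Cantor substitution model stands in for the "suite of complex models" the slide refers to. Sizes are tiny compared with the 10^6 targets.

```python
import math
import random
import sqlite3

BASES = "ACGT"

def simulate_pure_birth_tree(n_taxa, birth_rate, rng):
    """Grow a pure-birth tree; return (parent, branch length, tip ids)."""
    parent, length, active, next_id = {0: None}, {0: 0.0}, [0], 1
    while True:
        # With k lineages alive, the time to the next split is Exp(k * rate).
        wait = rng.expovariate(len(active) * birth_rate)
        for node in active:
            length[node] += wait
        if len(active) == n_taxa:
            break  # stop just before the next split so tips have length > 0
        mother = active.pop(rng.randrange(len(active)))
        for _ in range(2):
            parent[next_id] = mother
            length[next_id] = 0.0
            active.append(next_id)
            next_id += 1
    return parent, length, active

def evolve_sequences(parent, length, n_sites, rng):
    """Evolve sequences root-to-tip under Jukes-Cantor (assumed model)."""
    seqs = {}
    for node in sorted(parent):  # ids are assigned so parents precede children
        if parent[node] is None:
            seqs[node] = [rng.choice(BASES) for _ in range(n_sites)]
            continue
        # JC69: probability that a site differs after branch length t.
        p_change = 0.75 * (1.0 - math.exp(-4.0 * length[node] / 3.0))
        seq = list(seqs[parent[node]])
        for i, base in enumerate(seq):
            if rng.random() < p_change:
                seq[i] = rng.choice([b for b in BASES if b != base])
        seqs[node] = seq
    return seqs

def store_tips(conn, seqs, tips):
    """Store tip sequences in a hypothetical 'sequences' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences (taxon INTEGER PRIMARY KEY, seq TEXT)"
    )
    conn.executemany(
        "INSERT INTO sequences VALUES (?, ?)",
        [(t, "".join(seqs[t])) for t in tips],
    )
    conn.commit()

def sample_taxa(conn, k, rng):
    """Retrieve k taxa chosen uniformly at random (one possible sampling scheme)."""
    ids = [row[0] for row in conn.execute("SELECT taxon FROM sequences ORDER BY taxon")]
    chosen = rng.sample(ids, k)
    marks = ",".join("?" * k)
    query = f"SELECT taxon, seq FROM sequences WHERE taxon IN ({marks}) ORDER BY taxon"
    return conn.execute(query, chosen).fetchall()

if __name__ == "__main__":
    rng = random.Random(2024)
    parent, length, tips = simulate_pure_birth_tree(32, 1.0, rng)
    seqs = evolve_sequences(parent, length, 40, rng)
    conn = sqlite3.connect(":memory:")
    store_tips(conn, seqs, tips)
    for taxon, seq in sample_taxa(conn, 4, rng):
        print(taxon, seq)
```

Keeping generation and retrieval separate is the point of the design: the expensive simulation runs once, and each benchmark draws its own subset from the stored data. Uniform random sampling is only one of the "various sampling schemes" the slide mentions.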
Simulation and Data Access Model
[Diagram: simulators feed a database; taxon and model sampling yield a data subset with associated subtree, passed through format translators]
Simulators
• Character Evolution Simulators: HyPhy, Micro-evolution, Others
• Tree Topology Simulators: Pure Birth, Birth-Death, Empirical Fit, Others
• Tree/Char Combined: Experimental Evolution, Virtual Cell, etc.
Model Characterization
Database
Sampling: Taxon Sampling, Model Sampling, Others
Data Subset with Associated Subtree
Format Translators: PAUP*, etc.
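The "Data Subset with Associated Subtree" output can be illustrated with a short sketch: given the full tree and a sample of taxa, keep only the ancestors of the sampled tips and collapse any internal node left with a single child, summing branch lengths along the collapsed path. The function name and the parent-map tree representation are assumptions made for illustration; the slides do not describe how the project stores or prunes trees.

```python
def induced_subtree(parent, length, sampled):
    """Return (parent, branch length) maps for the subtree spanning `sampled`.

    `parent` maps node -> parent node (None for the root) and `length`
    maps node -> length of the branch above it. Both are hypothetical
    representations chosen for this sketch.
    """
    sampled = set(sampled)

    # 1. Mark every node on a path from a sampled tip to the root.
    keep = set()
    for tip in sampled:
        node = tip
        while node is not None and node not in keep:
            keep.add(node)
            node = parent[node]

    # 2. Count kept children so single-child internal nodes can be collapsed.
    n_children = {node: 0 for node in keep}
    for node in keep:
        if parent[node] is not None:
            n_children[parent[node]] += 1

    # 3. Retain sampled tips and true branching points only.
    retained = {n for n in keep if n in sampled or n_children[n] >= 2}

    # 4. Reconnect each retained node to its nearest retained ancestor,
    #    summing the lengths of the collapsed branches in between.
    sub_parent, sub_length = {}, {}
    for node in retained:
        total = length[node]
        anc = parent[node]
        while anc is not None and anc not in retained:
            total += length[anc]
            anc = parent[anc]
        sub_parent[node] = anc
        # The new root has no branch above it in the subtree.
        sub_length[node] = total if anc is not None else 0.0
    return sub_parent, sub_length


if __name__ == "__main__":
    # Tiny hand-built tree: root 0, internal nodes 1 and 2, tips 3 to 6.
    parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
    length = {0: 0.0, 1: 1.0, 2: 2.0, 3: 0.5, 4: 0.25, 5: 1.5, 6: 0.75}
    print(induced_subtree(parent, length, [3, 5]))
```

Because path lengths between sampled tips are preserved, the pruned tree remains the true tree for the retrieved subset, so a reconstruction algorithm can be scored against it.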
“This isn’t real…”
Rationale:
• Generate community support for credible phylogenetic benchmarks
• Learn from the experiences of the broad community outside of systematics
Thanks to…
Tracy Heath, Sheng Guo, Paul Higgs, Claus Wilke, Laura Landweber, Christina Burch, John Yin, Carlo Maley, Sean Dalton
Penn Center for Bioinformatics
CIPRES members
NSF