Genes to Trees Daniel Ayres and Adam Bazinet

This project, "Genes to Trees," proposes an automated workflow for phylogenetic tree reconstruction: collecting sequence data from GenBank, curating it, building multiple sequence alignments (ClustalW, Muscle, MAFFT), and running phylogenetic analysis (PAUP, MrBayes, GARLI). The user supplies a set of DNA or amino acid sequences and taxonomic constraints. The workflow then obtains homologous sequences from GenBank, eliminates smaller groups, aligns each remaining group, removes uninformative columns, combines the alignments into a super-matrix, and produces a phylogenetic tree of closely related organisms. The authors plan to script the workflow in Perl with the BioPerl libraries, which provide modules for accessing sequence data in local and remote databases, manipulating individual sequences, searching for similar sequences, and creating and manipulating alignments. The project is relevant because its results can serve as a starting point for further analysis, multiple analyses can be run in parallel, the workflow is modular, and it is a step towards robust, high-throughput phylogenetics.

Presentation Transcript


  1. Genes to Trees. Daniel Ayres and Adam Bazinet. CMSC858P, Project 2 Proposal

  2. Phylogenetic tree reconstruction: “Genes to Trees”. Stages shown in the diagram:
     • Data collection (GenBank)
     • Phylogenetic analysis (PAUP, MrBayes, GARLI)
     • Data curation
     • Multiple sequence alignment (ClustalW, Muscle, MAFFT)
     • Visual inspection and post-processing

  3. How does it work?
     • User inputs:
       • Set of DNA or amino acid sequences
       • Taxonomic constraints
     • Workflow:
       • Homologous sequences obtained from GenBank
       • Smaller groups eliminated
       • Multiple alignment of each group made
       • Uninformative columns removed
       • “Super-matrix” of all sequences created
       • Phylogenetic analysis performed
     • Output:
       • Phylogenetic tree of closely related organisms
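The three matrix-handling steps on this slide (eliminating smaller groups, removing uninformative columns, building the super-matrix) can be sketched roughly as follows. The proposal plans a Perl/BioPerl implementation, so this Python version is only an illustration. The function names, the `min_taxa` threshold, the parsimony-informative reading of "uninformative", and the use of `?` for missing data are all assumptions that do not come from the slides.

```python
MISSING = "?"  # assumed symbol for taxa absent from a gene alignment
GAP = "-"


def drop_small_groups(groups, min_taxa=4):
    """Keep only groups with at least min_taxa sequences (threshold is assumed)."""
    return {gene: seqs for gene, seqs in groups.items() if len(seqs) >= min_taxa}


def remove_uninformative_columns(alignment):
    """Drop columns that are not parsimony-informative.

    alignment maps taxon name -> aligned sequence. A column is kept when at
    least two character states each occur in at least two taxa (an assumed
    definition of "informative"; gaps and missing data are ignored).
    """
    if not alignment:
        return {}
    rows = list(alignment.values())
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("aligned sequences must have equal length")
    keep = []
    for i in range(width):
        counts = {}
        for row in rows:
            state = row[i]
            if state not in (GAP, MISSING):
                counts[state] = counts.get(state, 0) + 1
        shared_states = [s for s, n in counts.items() if n >= 2]
        if len(shared_states) >= 2:
            keep.append(i)
    return {taxon: "".join(seq[i] for i in keep) for taxon, seq in alignment.items()}


def build_supermatrix(alignments):
    """Concatenate per-gene alignments, padding absent taxa with MISSING."""
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for gene in sorted(alignments):
        aln = alignments[gene]
        if not aln:
            continue
        width = len(next(iter(aln.values())))
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, MISSING * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}
```

In this sketch each gene group is already aligned before trimming. In the proposed workflow the alignment itself would come from an external tool such as ClustalW, Muscle, or MAFFT.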

  4. Is it feasible?
     • Scripting will be done with Perl
     • Extensive use of BioPerl libraries
       • Collection of modules for bioinformatics programming
       • Accessing sequence data from local and remote databases
       • Manipulating individual sequences
       • Searching for similar sequences
       • Creating and manipulating sequence alignments

  5. Why is this relevant?
     • Results can serve as a starting point for further analysis
     • Multiple analyses can be run in parallel
     • Workflow is modular
     • A step towards robust, high-throughput phylogenetics
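The point about running multiple analyses in parallel can be illustrated with a small sketch. It is written in Python rather than the Perl the proposal plans to use, and the `pairwise_differences` function is a hypothetical stand-in for launching a real analysis program (PAUP, MrBayes, or GARLI), whose command lines are not given in the slides. The names `run_all` and `max_workers` are likewise assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def pairwise_differences(matrix):
    """Stand-in analysis: count differing sites for each pair of taxa."""
    result = {}
    for a, b in combinations(sorted(matrix), 2):
        result[(a, b)] = sum(1 for x, y in zip(matrix[a], matrix[b]) if x != y)
    return result


def run_all(datasets, analysis=pairwise_differences, max_workers=4):
    """Run one analysis per dataset concurrently and collect results by name."""
    names = sorted(datasets)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs
        results = pool.map(analysis, (datasets[name] for name in names))
        return dict(zip(names, results))
```

Threads are a reasonable fit if each job mostly waits on an external program. Because every dataset is handled independently, the same structure also reflects the modular design the slide mentions.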