Elements of Bioinformatics 14F01 TP5 - Phylogenetic analysis

Presentation Transcript


  1. Elements of Bioinformatics 14F01 TP5 - Phylogenetic analysis. 13 December 2010. Corrections.

  2. Exercise 1/2. One example tree … PhyML settings and results:
     • Tree search: NNIs
     • Initial tree: BIONJ
     • Model of amino acid substitution: JTT
     • Number of taxa: 41
     • Log-likelihood: -7592.66297
     • Discrete gamma model: Yes (number of categories: 8; gamma shape parameter: 1.764)
     • Proportion of invariant sites: 0.061

  3. Exercise 3

  4. Exercise 4. BLAST result: the vertebrate sequences closest to gem-1 appear to be MIRO2.

  5. Further down in the same BLAST result as on the previous page, there is a 2nd vertebrate protein group annotated as MIRO1, again with similarity over the full length of the protein. We copy all the data from the species of interest to infer the phylogenies. Please note that there is no MIRO1 gene for X. tropicalis.

  6. Phylogenetic tree reconstructed using the 1-click mode with default parameter settings. The topology of the resulting tree corresponds to the species tree with one exception: chicken and frog appear to form a monophyletic group, which is probably not the case. The tree further suggests that the vertebrate genes MIRO1 and MIRO2 are paralogs, and that both are co-orthologs of the fungal gene gem-1 and the invertebrate gene MIRO.
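
The monophyly statement on slide 6 can be checked mechanically. The following is a minimal sketch, assuming a rooted tree written as nested Python tuples; the two toy topologies and the helper names (`leaf_set`, `clades`, `is_monophyletic`) are illustrative assumptions and are not taken from the exercise data.

```python
def leaf_set(node):
    """Return the set of leaf names below a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def clades(node):
    """Yield the leaf set of every clade (subtree) in a rooted tree."""
    yield leaf_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, taxa):
    """True if some clade of the rooted tree contains exactly the given taxa."""
    target = frozenset(taxa)
    return any(clade == target for clade in clades(tree))


# Hypothetical toy topologies, not the trees from the exercise.
gene_tree = ((("human", "mouse"), ("chicken", "frog")), "fish")
species_tree = ((((("human", "mouse"), "chicken"), "frog")), "fish")

print(is_monophyletic(gene_tree, {"chicken", "frog"}))     # grouped together here
print(is_monophyletic(species_tree, {"chicken", "frog"}))  # not a clade here
```

In the toy gene tree chicken and frog form a clade, whereas in the toy species tree chicken groups with the mammals first, so the same test fails there. For an unrooted tree the check would have to be phrased in terms of bipartitions instead.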

  7. Tree robustness. Bootstrap replicates are generated by random resampling, so the results of the same analysis can differ slightly from run to run. It is therefore more meaningful to perform the analysis with a larger number of replicates, e.g. 1000.
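
Why bootstrap results vary between runs can be seen from how a replicate is built: alignment columns are drawn at random with replacement. The sketch below assumes an alignment stored as a dictionary of equal-length strings; the sequences and the function name `bootstrap_alignment` are made up for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    rng: a random.Random instance, so that runs can be made reproducible.
    """
    n_sites = len(next(iter(alignment.values())))
    # The same column indices are used for every sequence, so each
    # resampled column is still a true column of the original alignment.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


# Hypothetical toy alignment, not data from the exercise.
toy = {"seqA": "MKTAYIAK", "seqB": "MKSAYLAK", "seqC": "MRTAYIGK"}

rep1 = bootstrap_alignment(toy, random.Random(1))
rep2 = bootstrap_alignment(toy, random.Random(2))
print(rep1)
print(rep2)
```

Different seeds give different replicates and hence potentially different support values; with more replicates the estimated support for each branch fluctuates less between runs.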

  8. Exercise 5

  9. Dataset 2. Phylogenetic analysis parameters: Alignment: MUSCLE (default); Data model: Gblocks (stringent); Tree-building: PhyML (JTT, gamma(4), shape parameter estimated, invariable sites fixed (0.0)); Test for topology robustness: aLRT (SH). The 1st tree includes species from Archaea and Bacteria, which form the two major branches of the tree. The 2nd dataset consists of sequences from Eukaryotes. A phylogenetic tree reconstructed from both datasets suggests that the eukaryotic sequences are more closely related to Bacteria than to Archaea. This result does not agree with the species tree. One explanation could be that Eukaryotes acquired this gene from Bacteria by horizontal gene transfer (HGT).
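
For readers who run PhyML on the command line rather than through a web pipeline, the tree-building settings listed on slide 9 could be expressed roughly as follows. This is a sketch under assumptions: it uses the option letters of the PhyML 3.x command-line program, the input file name `dataset2_gblocks.phy` is hypothetical, and the command is only assembled and printed, not executed.

```python
import shlex

# Hypothetical name for the Gblocks-curated alignment in PHYLIP format.
alignment_file = "dataset2_gblocks.phy"

phyml_command = [
    "phyml",
    "-i", alignment_file,  # input alignment
    "-d", "aa",            # amino acid data
    "-m", "JTT",           # JTT substitution model
    "-c", "4",             # 4 discrete gamma rate categories
    "-a", "e",             # gamma shape parameter estimated from the data
    "-v", "0.0",           # proportion of invariable sites fixed at 0.0
    "-b", "-4",            # SH-like aLRT branch support
]

print(shlex.join(phyml_command))
```

The alignment and curation steps (MUSCLE, Gblocks) are left out because their command-line options differ between program versions.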

  10. Phylogenetic tree of datasets 1, 2 and 3. The sequences of the 3rd dataset are eukaryotic homologs. The result suggests that the eukaryotic sequences are possibly derived from a single gene transfer event. Sequence annotation indicates that one group of the eukaryotic proteins ('SYAM') is mitochondrial. A possible scenario: Eukaryotes could have acquired a gene copy by endosymbiotic gene transfer (EGT) from the mitochondrion to the nucleus, followed by a gene duplication event. One of the copies subsequently evolved into the cytoplasmic form. Analysis protocol: Alignment: MUSCLE (default); Data model: Gblocks (stringent); Tree-building: PhyML (JTT, gamma(4), shape parameter estimated, invariable sites fixed (0.0)); Test for topology robustness: aLRT (SH).

  11. Exercise 6. Can you root this tree?

  12. [Figure: tree with branch labels Eukaryota; Archaea; Eukaryota; Bacteria; Eukaryota]

  13. [Figure: tree with branch labels Eukaryota, nuclear; Archaea; Eukaryota, mitochondrial, nuclear encoded; Bacteria; Eukaryota, chloroplast (ARATH: nuclear encoded; CYCME: chloroplast encoded)]

  14. To be improved:
     • Add more sequence data to the dataset
     • Use alignment methods such as ProbCons or T-Coffee
     • Analyze individual branches, e.g. cyanobacteria and chloroplastic data, proteobacteria and mitochondrial-derived sequences (& meaningful outgroups)
     • ML methods or MrBayes
     • Calculate support values for internal nodes
     • Run other tests, which we have not discussed here …
     • Can this tree be rooted by an outgroup? No, because no species is known that diverged before the split of the 2 major branches of life!
