
"Nothing in biology makes sense except in the light of evolution"


cwray

Presentation Transcript


  1. Theodosius Dobzhansky "Nothing in biology makes sense except in the light of evolution"

  2. Homology: bird wing, bat wing, human arm. By Bob Friedman.

  3. Homology vs analogy. A priori, sequences could be similar due to convergent evolution. Homology (shared ancestry) versus analogy (convergent evolution): bird wing, butterfly wing, bat wing, fly wing.

  4. Related proteins. Present day proteins evolved through substitution and selection from ancestral proteins. Related proteins have similar sequence AND similar structure AND similar function. In the above mantra "similar function" can refer to:
     • identical function,
     • similar function, e.g.: identical reactions catalyzed in different organisms; or same catalytic mechanism but different substrate (malic and lactic acid dehydrogenases);
     • similar subunits and domains that are brought together through a (hypothetical) process called domain shuffling, e.g. nucleotide binding domains in hexokinase, myosin, HSP70, and ATP synthases.

  5. Homology. Two sequences are homologous if there existed an ancestral molecule in the past that is ancestral to both of the sequences. Homology is a "yes" or "no" character ("don't know" is also possible). Either sequences (or characters) share ancestry or they don't (like pregnancy). Molecular biologists often use homology as synonymous with similarity or percent identity. One often reads: sequences A and B are 70% homologous. To an evolutionary biologist this sounds as wrong as 70% pregnant. Types of homology: Orthology: bifurcation in molecular tree reflects speciation. Paralogy: bifurcation in molecular tree reflects gene duplication.
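
To make the distinction concrete: percent identity is a number computed from an alignment, whereas homology is a yes/no statement about ancestry. The following is a minimal sketch, assuming two already aligned sequences of equal length with "-" marking gaps. The function name `percent_identity`, the gap convention, and the toy sequences are illustrative assumptions, not taken from the slides.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gap columns (one of several possible conventions)
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment, not real proteins.
seq_a = "MKT-LLVAGG"
seq_b = "MKTALIVA-G"
print(f"{percent_identity(seq_a, seq_b):.1f}% identical")
```

The result describes how similar the two strings are; whether they share an ancestor is a separate inference drawn from that similarity.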

  6. Sequence Similarity vs Homology. The following is based on observation and not on an a priori truth: If two (complex) sequences show significant similarity in their primary sequence, they have shared ancestry, and probably similar function (although some proteins acquired radically new functional assignments, e.g. lysozyme -> lens crystallin).

  7. The Size of Protein Sequence Space (back of the envelope calculation). Consider a protein of 600 amino acids. Assume that for every position there could be any of the twenty possible amino acids. Then the total number of possibilities is 20 choices for the first position times 20 for the second position times 20 for the third ... = 20^600 = 4 × 10^780 different proteins possible with a length of 600 amino acids. For comparison, the universe contains only about 10^89 protons and has an age of about 5 × 10^17 seconds or 5 × 10^29 picoseconds. If every proton in the universe were a supercomputer that explored one possible protein sequence per picosecond, we would only have explored 5 × 10^118 sequences, i.e. a negligible fraction of the possible sequences with length 600 (one in about 10^662).
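
The arithmetic on this slide can be checked directly, since Python integers have arbitrary precision. This is a small sketch that uses the slide's own estimates for the number of protons and the age of the universe. The helper `sci` is a made-up formatting function that truncates the mantissa rather than rounding it.

```python
import math

LENGTH = 600          # protein length from the slide
ALPHABET = 20         # amino acids available at each position
PROTONS = 10**89      # protons in the universe (the slide's estimate)
AGE_PS = 5 * 10**29   # age of the universe in picoseconds (the slide's estimate)

total = ALPHABET ** LENGTH    # exact integer: number of possible sequences
explored = PROTONS * AGE_PS   # one sequence per proton per picosecond


def sci(n: int) -> str:
    """Format a large positive integer as 'm.mm × 10^e' (mantissa truncated)."""
    digits = str(n)
    exponent = len(digits) - 1
    mantissa = int(digits[:3]) / 100
    return f"{mantissa:.2f} × 10^{exponent}"


orders = round(math.log10(total) - math.log10(explored))
print("possible sequences:", sci(total))
print("explored sequences:", sci(explored))
print(f"explored fraction: about one in 10^{orders}")
```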

  8. No similarity vs no homology. If two (complex) sequences show significant similarity in their primary sequence, they have shared ancestry, and probably similar function. THE REVERSE IS NOT TRUE: PROTEINS WITH THE SAME OR SIMILAR FUNCTION DO NOT ALWAYS SHOW SIGNIFICANT SEQUENCE SIMILARITY, for one of two reasons: a) they evolved independently (e.g. different types of nucleotide binding sites); or b) they underwent so many substitution events that there is no readily detectable similarity remaining. Corollary: PROTEINS WITH SHARED ANCESTRY DO NOT ALWAYS SHOW SIGNIFICANT SIMILARITY.

  9. Homology. Two sequences are homologous if there existed an ancestral molecule in the past that is ancestral to both of the sequences. Types of homology:
     • Orthologs: “deepest” bifurcation in molecular tree reflects speciation. These are the molecules people interested in the taxonomic classification of organisms want to study.
     • Paralogs: “deepest” bifurcation in molecular tree reflects gene duplication. The study of paralogs and their distribution in genomes provides clues on the way genomes evolved. Gene and genome duplication have emerged as the most important pathway to molecular innovation, including the evolution of developmental pathways.
     • Xenologs: gene was obtained by the organism through horizontal transfer. The classic example of xenologs is antibiotic resistance genes, but the history of many other molecules also fits into this category: inteins, self-splicing introns, transposable elements, ion pumps, other transporters.
     • Synologs: genes ended up in one organism through fusion of lineages. The paradigm is the genes that were transferred into the eukaryotic cell together with the endosymbionts that evolved into mitochondria and plastids.
     (The -logs are often spelled with "ue", as in orthologues.) See Fitch's article in TIG 2000 for more discussion.

  10. Ways to construct Protein Space. Construction of sequence space, from Eigen et al. (1988), illustrating how a high dimensional sequence space is built up. Each additional sequence position adds another dimension, doubling the diagram for the shorter sequence. Shown is the progression from a single sequence position (line) to a tetramer (hypercube). A four (or twenty) letter code can be accommodated either through allowing four (or twenty) values for each dimension (Rechenberg 1973; Casari et al. 1995), or through additional dimensions (Eigen and Winkler-Oswatitsch 1992).
     Eigen, M. and R. Winkler-Oswatitsch (1992). Steps Towards Life: A Perspective on Evolution. Oxford; New York, Oxford University Press.
     Eigen, M., R. Winkler-Oswatitsch and A. Dress (1988). "Statistical geometry in sequence space: a method of quantitative comparative sequence analysis." Proc Natl Acad Sci U S A 85(16): 5913-7.
     Casari, G., C. Sander and A. Valencia (1995). "A method to predict functional residues in proteins." Nat Struct Biol 2(2): 171-8.
     Rechenberg, I. (1973). Evolutionsstrategie; Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Stuttgart-Bad Cannstatt, Frommann-Holzboog.

  11. Diversion: From Multidimensional Sequence Space to Fractals

  12. one symbol -> 1D coordinate of dimension = pattern length

  13. Two symbols -> dimension = length of pattern. Length 1 = 1D.

  14. Two symbols -> dimension = length of pattern. Length 2 = 2D. Dimensions correspond to position. For each dimension two possibilities. Note: here is a possible bifurcation: a larger alphabet could be represented as more choices along the axis of position!

  15. Two symbols -> dimension = length of pattern. Length 3 = 3D.

  16. Two symbols -> dimension = length of pattern. Length 4 = 4D, aka hypercube.

  17. Two symbols -> dimension = length of pattern
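
The progression from line to square to cube to hypercube can be reproduced by enumerating all patterns and connecting those that differ at exactly one position. This is an illustrative sketch. The function names `sequence_space`, `hamming`, and `edges` are made up here, and the symbols "0" and "1" stand in for whatever two-letter alphabet is used.

```python
from itertools import product


def sequence_space(alphabet: str, length: int) -> list[str]:
    """All patterns of the given length; each position is one dimension."""
    return ["".join(p) for p in product(alphabet, repeat=length)]


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def edges(points: list[str]) -> list[tuple[str, str]]:
    """Pairs of patterns that differ at exactly one position."""
    return [(a, b) for i, a in enumerate(points)
            for b in points[i + 1:] if hamming(a, b) == 1]


for n in range(1, 5):
    pts = sequence_space("01", n)
    print(f"length {n}: {len(pts)} corners, {len(edges(pts))} edges")
```

Each added position doubles the number of corners, which matches the "doubling the diagram" description of the construction.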

  18. Three Symbols (the other fork)

  19. Four symbols: with an alphabet of 4, we already have a hypercube (4D) at a pattern size of 2, provided we stick to a binary pattern in each dimension.

  20. Hypercubes at 2 and 4 character alphabets: 2 character alphabet, pattern size 4; 4 character alphabet, pattern size 2.
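
One way to see why a 4 character alphabet with pattern size 2 lands on the same 4D hypercube as a 2 character alphabet with pattern size 4 is to spend two binary dimensions per position. The sketch below assumes an arbitrary 2-bit code for the letters A, C, G, T. The letters and the particular assignment are illustrative choices; any one-to-one assignment would do.

```python
# Assumed 2-bit code: each letter uses two binary dimensions.
CODE = {"A": "00", "C": "01", "G": "10", "T": "11"}


def to_binary(seq: str) -> str:
    """Map a pattern over the 4-letter alphabet to a corner of a binary hypercube."""
    return "".join(CODE[letter] for letter in seq)


# All 16 patterns of size 2 map to the 16 distinct corners of the 4D hypercube.
corners = {a + b: to_binary(a + b) for a in CODE for b in CODE}
for pattern in ("AA", "AC", "GT", "TT"):
    print(pattern, "->", corners[pattern])
print(len(set(corners.values())), "distinct corners")
```

A side effect of such a code is that some single substitutions (A to T under this assignment) flip two bits and therefore span a diagonal rather than an edge of the hypercube.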

  21. Three Symbols Alphabet suggests fractal representation

  22. 3 fractal: enlarge, fill in. Outer pattern repeats inner pattern = self similar = fractal.

  23. 3 character alphabet, 3 pattern fractal

  24. 3 character alphabet, 4 pattern fractal. Conjecture: for n -> infinity, the fractal might fill a 2D triangle. Note: check Mandelbrot.
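
The slides do not spell out how the fractal is drawn, so the following is only one possible construction: each symbol is assigned a corner of a triangle, and every further pattern position refines the location at half the previous scale. The corner layout, the factor 1/2, and the name `locate` are all assumptions made for illustration.

```python
import math
from itertools import product

# Corners of an equilateral triangle, one per symbol (assumed layout).
CORNERS = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (0.5, math.sqrt(3) / 2),
}


def locate(pattern: str) -> tuple[float, float]:
    """Place a pattern in the triangle: position i moves toward a corner at scale 1/2**i."""
    x = y = 0.0
    scale = 0.5
    for symbol in pattern:
        cx, cy = CORNERS[symbol]
        x += scale * cx
        y += scale * cy
        scale /= 2  # each further position refines at half the previous scale
    return x, y


for length in (1, 2, 3, 4):
    points = {locate("".join(p)) for p in product("abc", repeat=length)}
    print(f"{length} positions -> {len(points)} distinct points")
```

Under this particular halving rule the points stay on a Sierpinski-gasket-like pattern, whose limit has zero area and so would not fill the triangle. A different scale factor or layout could behave differently, so this does not settle the conjecture for the construction shown on the slides.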

  25. Same for 4 character alphabet: 1 position, 2 positions, 3 positions

  26. 4 character alphabet continued (with cheating: I didn’t actually add beads): 4 positions

  27. 4 character alphabet continued (with cheating: I didn’t actually add beads): 5 positions

  28. 4 character alphabet continued (with cheating: I didn’t actually add beads): 6 positions

  29. 4 character alphabet continued (with cheating: I didn’t actually add beads): 7 positions

  30. Animated GIF: 1-12 positions

  31. Protein Space in Jalview

  32. Alignment of V F A ATPase ATP binding SU (catalytic and non-catalytic SU)

  33. UPGMA tree of V F A ATPase ATP binding SU with line dropped to partition (and colour) the 4 SU types (VA cat and non cat, F cat and non cat). Note that details of the tree $%#&@.

  34. PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree

  35. Same PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree, but turned slightly. (Giardia A SU selected in grey.)

  36. Same PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree, but replacing the 1st with the 5th axis. (Eukaryotic A SU selected in grey.)

  37. Same PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree, but replacing the 1st with the 6th axis. (Eukaryotic B SU selected in grey - forgot rice.)

  38. Problems • Jalview’s approach requires an alignment - only homologous sequences can be depicted in the same space • Solution: One could use pattern absence / presence as coordinates
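
The proposed solution can be sketched without any alignment: each short pattern becomes one coordinate, set to 1 if the pattern occurs in the sequence and 0 otherwise. The example below is a minimal illustration. The reduced toy alphabet, the choice of all two-letter patterns, and the sequences are made up, and `presence_vector` is a hypothetical helper name.

```python
from itertools import product


def presence_vector(seq: str, patterns: list[str]) -> list[int]:
    """One coordinate per pattern: 1 if it occurs anywhere in seq, else 0."""
    return [1 if p in seq else 0 for p in patterns]


# Hypothetical choice: all two-letter patterns over a reduced toy alphabet.
ALPHABET = "AGKL"
PATTERNS = ["".join(p) for p in product(ALPHABET, repeat=2)]

# Made-up unaligned sequences of different lengths.
sequences = {"s1": "GAKLLA", "s2": "KLGA", "s3": "AAAAGG"}

for name, seq in sequences.items():
    vec = presence_vector(seq, PATTERNS)
    print(name, "".join(map(str, vec)))
```

Because every sequence gets a vector of the same length regardless of its own length, non-homologous sequences can be placed in the same space, which is the limitation of the alignment-based approach named above.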
