
Origins and impact of constraints in evolution of gene families

Origins and impact of constraints in evolution of gene families. Boris E. Shakhnovich and Eugene V. Koonin, Genome Research, October 19, 2006. Stella Veretnik, Journal Club, November 14, 2006.


Presentation Transcript


  1. Origins and impact of constraints in evolution of gene families. Boris E. Shakhnovich and Eugene V. Koonin, Genome Research, October 19, 2006. Stella Veretnik, Journal Club, November 14, 2006.

  2. Essential genes and their families diverge more slowly than non-essential genes, yet diverge to a greater extent than non-essential genes. Tolerance to mutations -> extent of evolution within the family. Why does this happen? What parameters are responsible? Unanswered. Evolution through paralogy: paralogous families with essential genes are E-families; paralogous families without essential genes are N-families. Definition of essential genes: genes that, when mutated, can result in a lethal phenotype.

  3. Type of selection acting on evolving genes: purifying selection.

  4. What is purifying selection? The ratio Ka/Ks < 1, where Ka is the number of nonsynonymous substitutions per nonsynonymous site and Ks is the number of synonymous substitutions per synonymous site.
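The Ka/Ks criterion on slide 4 can be illustrated with a minimal sketch. It assumes Ka and Ks have already been estimated elsewhere; the function name `classify_selection` and the tolerance `tol` around 1 are hypothetical choices for illustration and do not come from the paper or the slides.

```python
def classify_selection(ka: float, ks: float, tol: float = 0.05) -> str:
    """Classify the selection regime from Ka and Ks (substitutions per site).

    `tol` is an illustrative tolerance around Ka/Ks = 1, not a value from the paper.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio Ka/Ks")
    omega = ka / ks  # the Ka/Ks ratio
    if omega < 1 - tol:
        return "purifying"  # nonsynonymous changes are removed by selection
    if omega > 1 + tol:
        return "positive"   # nonsynonymous changes are favored
    return "neutral"        # ratio close to 1


# Made-up values: Ka much smaller than Ks indicates purifying selection.
print(classify_selection(0.02, 0.40))
```

A lower Ka/Ks within the purifying range means stronger constraint, which is the comparison drawn between E-families and N-families later in the talk.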

  5. Table values: ratio of non-essential to essential genes in E-families: 1.9, 1.3, 13.7; fraction of essential genes that are not singletons: 9.2%, 18.4%, 3.5%. Most essential genes do not have paralogs. Why? No answer in this paper… Is there something special about those that do have paralogs? How can a gene have paralogs and still be essential? All the paralogs together cannot replace all the functions of the essential gene. Once they can, the gene becomes non-essential.

  6. Divergence and diffusion graph. Edges represent homology relationships. There are significantly fewer edges between paralogs in E-families. How were the families assembled?

  7. Construction of paralogous families. Each ORF is a node on a graph. (1) Do an all-vs.-all BLAST comparison of the sequences of all translated ORFs within an organism. (2) Measure the amino acid identity level between nodes. (3) Translate amino acids to nucleotides and calculate Ks (synonymous substitutions per site) and Ka (nonsynonymous substitutions per site). The result is 3 weighted graphs (as defined by 1, 2, and 3). A paralogous family is a strongly connected component of the graph. A cutoff of Ks = 5 and an E-value of 1e-15 are used in this work. In general, there is a near-linear dependency of the cutoff on Ks.
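The family construction on slide 7 can be sketched roughly as follows. This is a sketch under stated assumptions, not the authors' pipeline: the homology graph is treated as undirected (so components are ordinary connected components), both cutoffs are applied as upper bounds, and the ORF names, the tuple layout of `hits`, and the function `build_families` are all hypothetical.

```python
from collections import defaultdict

KS_CUTOFF = 5.0        # Ks cutoff quoted on the slide
EVALUE_CUTOFF = 1e-15  # BLAST E-value cutoff quoted on the slide


def build_families(hits, essential):
    """Group ORFs into paralogous families and label each as E- or N-family.

    hits: iterable of (orf_a, orf_b, evalue, ks) tuples (assumed layout).
    essential: set of ORF names known to be essential.
    """
    graph = defaultdict(set)
    for a, b, evalue, ks in hits:
        # Assumption: an edge is kept only if it passes both cutoffs.
        if a != b and evalue <= EVALUE_CUTOFF and ks <= KS_CUTOFF:
            graph[a].add(b)
            graph[b].add(a)

    seen = set()
    families = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, members = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            members.add(node)
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        label = "E-family" if members & essential else "N-family"
        families.append((sorted(members), label))
    return families


# Made-up hits: the last two pairs fail the E-value and Ks cutoffs respectively.
hits = [
    ("orfA", "orfB", 1e-40, 1.2),
    ("orfB", "orfC", 1e-20, 3.0),
    ("orfD", "orfE", 1e-30, 0.8),
    ("orfE", "orfF", 1e-5, 0.5),
    ("orfA", "orfD", 1e-50, 7.5),
]
for members, label in build_families(hits, essential={"orfB"}):
    print(label, members)
```

ORFs with no edge that passes the cutoffs (here `orfF`) never enter the graph, which matches the idea of singletons sitting outside any paralogous family.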

  8. Do non-essential members always evolve from essential members of the family? What is the typical size of an E-family and of an N-family? Can a duplicate of a non-essential paralog become essential? Are N-families typically larger? Are there more N-families than E-families? Both? Figure: largest families. How paralogous families evolve: after duplication and divergence, the following may happen: a. Nonfunctionalization: a duplicate turns into a pseudogene (a more typical scenario for N-families). b. Subfunctionalization: multiple functions of the ancestral gene are divided between the paralogs. c. Neofunctionalization: one of the paralogs evolves a new function, while the other keeps the old function(s). Scenarios b and c are more common for E-families.

  9. Purifying selection is stronger in E-families (about 2 times): the Ka/Ks ratio is lower in E-families. Implication: N-families diverge faster… How this is done: (1) For single feature polymorphisms (SFPs), check within Saccharomyces cerevisiae. (2) For the Ka/Ks ratio, compare orthologs between closely related species (yeast: S. cerevisiae/S. paradoxus; E. coli K12/CFT073 orthologs).

  10. The rate of conversion to pseudogenes is substantially higher in N-families: a 6.8-fold difference.

  11. Paralogs become fixed more often in N-families (does this explain the larger size of N-families?). An equal rate of duplication in E-families and in N-families is assumed. What happens to the paralogs that do not go to fixation? Do they become pseudogenes, or something else?

  12. Ks is higher in E-families than in N-families. Implication: paralogs in E-families stick around for a longer time than those in N-families (3 times longer).

  13. Sequence divergence is higher in E-families. Panels: nonsynonymous substitutions among paralogs within the family; sequence identity among paralogs within the family.

  14. It is possible to identify E- and N-families using only sequence divergence information. ROC plot (axis labels: true positives, true negatives). The clustering coefficient measures how well connected the neighbors of a given node in a graph are.
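The clustering coefficient mentioned on slide 14 can be sketched as below, using the standard local definition: the fraction of pairs of a node's neighbors that are themselves connected. Whether the paper uses exactly this variant is an assumption; the adjacency-dict representation, the node names, and the function `clustering_coefficient` are hypothetical.

```python
from itertools import combinations


def clustering_coefficient(graph, node):
    """Local clustering coefficient of `node` in an undirected graph.

    graph: dict mapping each node to the set of its neighbors (assumed layout).
    Returns 0.0 for nodes with fewer than two neighbors.
    """
    neighbors = graph.get(node, set())
    k = len(neighbors)
    if k < 2:
        return 0.0
    # Count neighbor pairs that are directly connected to each other.
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph.get(u, set()))
    return 2.0 * links / (k * (k - 1))


# Made-up family graph: "a" has three neighbors, of which only b and c are linked.
family = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
for orf in sorted(family):
    print(orf, round(clustering_coefficient(family, orf), 3))
```

Since slide 6 reports significantly fewer edges between paralogs in E-families, a per-family average of this quantity is one plausible input feature for the ROC analysis, though the slides do not spell out how the classifier was built.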

  15. Transcriptional regulation of paralogs changes more in E-families: paralogs rarely share transcription factors (ChIP-chip experiments).

  16. Summary: Two types of paralogous families exist: E-families and N-families. The two types of families have dramatically different dynamics of molecular evolution. E-families diverge slowly but persist for long periods of time, thus diverging further than the paralogs in N-families. N-families undergo a more dynamic evolution: many duplicates become fixed and many others become pseudogenes; the level of sequence divergence is significantly lower. Duplicates in E-families typically assume part of the functions of the original gene and/or evolve a new function. This is less so with duplicates in N-families (no data shown for this…). My musings: In a minimalistic organism every gene would be an essential gene. A gene becomes non-essential when its functions are assumed by another gene or split between several genes. Every non-essential gene will go through the stage of being in an E-family in which there is one essential gene. N-families gradually evolve from E-families, when the essential gene(s) in the family are no longer essential. This happens when a sufficient number of duplicates exist to ensure that all functions of the original essential gene are covered.

  17. In this scenario, the E-families are the transitional link for essential genes on their way to becoming non-essential. (You could argue that a more robust organism has fewer essential genes…) Essential genes (singletons): very careful creeping forward. Transition to non-essentiality (E-families): careful evolution. Non-essential genes (N-families): careless evolution. Different selection pressures in each category? Yes. But… how does the behavior of the family change once it crosses from E-family to N-family?
