Evidence of Selection on Genomic GC Content in Bacteria

Presentation Transcript


  1. Evidence of Selection on Genomic GC Content in Bacteria • Falk Hildebrand • Adam Eyre-Walker

  2. Genomic G+C content

  3. Genomic GC content

  4. Codons • Non-synonymous • Synonymous • 2-fold: TTT TTC • 4-fold: CCT CCC CCA CCG • ATA CCC CTA CCT GCT • Codon positions 1 2 3

  5. Genomic GC content

  6. Variation

  7. Correlations

  8. Explanations • Mutation bias • Sueoka (1961) & Freese (1962) • Intrinsic and/or extrinsic • Selection • Many authors • Biased gene conversion • Anonymous referees

  9. Correlates • Genome size • positive correlation • Lifestyle • higher GC in free living • Aerobiosis • higher GC in aerobic • Nitrogen utilization • higher amongst N fixers • Temperature • higher amongst thermophiles?

  10. Evidence of selection I • Escherichia coli • Mutation pattern • 273 GC→AT versus 131 AT→GC • Predicted GC content = 0.32 • Observed GC content = 0.50 • Observed GC at neutral sites = 0.58 • Source: Lynch (2007), The Origins of Genome Architecture
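The predicted value on this slide can be reproduced with a short calculation. This is a minimal sketch, assuming that the two mutation counts are proportional to the per-site mutation rates (reasonable if the sequences surveyed are about 50% GC, as the slide reports for E. coli) and that mutation is the only force acting. The function name `equilibrium_gc` is illustrative, not from the talk.

```python
def equilibrium_gc(gc_to_at: float, at_to_gc: float) -> float:
    """Equilibrium GC fraction expected under mutation pressure alone.

    Assumes the two arguments are (proportional to) per-site rates of
    GC->AT and AT->GC mutation. At equilibrium the flux in each direction
    balances, which gives GC = at_to_gc / (gc_to_at + at_to_gc).
    """
    return at_to_gc / (gc_to_at + at_to_gc)


# Counts quoted on the slide: 273 GC->AT versus 131 AT->GC.
predicted = equilibrium_gc(273, 131)
print(round(predicted, 2))  # 0.32
```

The gap between this prediction (0.32) and the observed values (0.50 overall, 0.58 at neutral sites) is what the slide offers as evidence that mutation bias alone does not set GC content.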

  11. Evidence of selection II • Phylogenetic analyses • Mycobacterium leprae (Lynch 2007) • Escherichia coli (Balbi et al. 2009) • 5 pathogenic bacteria (Hershberg and Petrov 2010)

  12. Phylogenetic analysis • Tree with tip states G A A G G G

  13. Evidence of selection II • Phylogenetic analyses • Mycobacterium leprae (Lynch 2007) • Escherichia coli (Balbi et al. 2009) • 5 pathogenic bacteria (Hershberg and Petrov 2010) • Excess of GC→AT

  14. Test of mutation bias • If GC content is • Due to mutation bias alone • Stationary • And the infinite sites assumption holds • Then • # GC→AT mutations = # AT→GC mutations

  15. Why? • If GC stationary • #GC→AT subs = #AT→GC subs • All neutral mutations have same chance of fixation • #GC→AT muts = #AT→GC muts

  16. Identifying mutations
      Strain 1  ACT GCT TTG GCT TTA TGG
      Strain 2  ACT GCT TTG GCT TTA TGA
      Strain 3  ACT GCT TTG GCT TTA TGG
      Strain 4  ACT GCT TTC GCT TTA TGA
      Strain 5  ACC GCT TTC GCT TTA TGG
      Strain 6  ACT GCT TTG GCT TTA TGG
      Variable sites: T↔C  C↔G  G↔A

  17. Orienting mutations
      Outgroup  ACT GCT TTC GCT TTA TGG
      Strain 1  ACT GCT TTG GCT TTA TGG
      Strain 2  ACT GCT TTG GCT TTA TGA
      Strain 3  ACT GCT TTG GCT TTA TGG
      Strain 4  ACT GCT TTC GCT TTA TGA
      Strain 5  ACC GCT TTC GCT TTA TGG
      Strain 6  ACT GCT TTG GCT TTA TGG
      Mutations: T→C  C→G  G→A
      GC→AT = 1 • AT→GC = 1
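The outgroup-based orientation on this slide can be sketched in a few lines. This is an illustration only, not the authors' pipeline: it assumes the outgroup base is ancestral, considers only biallelic sites where the outgroup base is one of the two alleles, and uses hypothetical names (`classify`, `orient_mutations`). The sequences are those shown on the slide with the spaces removed.

```python
from collections import Counter

GC = set("GC")
AT = set("AT")


def classify(ancestral: str, derived: str) -> str:
    """Label a mutation by its effect on GC content."""
    if ancestral in GC and derived in AT:
        return "GC->AT"
    if ancestral in AT and derived in GC:
        return "AT->GC"
    return "other"  # G<->C or A<->T: GC content unchanged


def orient_mutations(outgroup: str, strains: list[str]):
    """Return (site, ancestral, derived, class) for each orientable polymorphism.

    Assumption: the outgroup base is the ancestral state. Sites that are not
    biallelic, or where the outgroup base is absent from the sample, are skipped.
    """
    mutations = []
    for i, anc in enumerate(outgroup):
        alleles = {s[i] for s in strains}
        if len(alleles) != 2 or anc not in alleles:
            continue
        (derived,) = alleles - {anc}
        mutations.append((i + 1, anc, derived, classify(anc, derived)))
    return mutations


outgroup = "ACTGCTTTCGCTTTATGG"
strains = [
    "ACTGCTTTGGCTTTATGG",  # Strain 1
    "ACTGCTTTGGCTTTATGA",  # Strain 2
    "ACTGCTTTGGCTTTATGG",  # Strain 3
    "ACTGCTTTCGCTTTATGA",  # Strain 4
    "ACCGCTTTCGCTTTATGG",  # Strain 5
    "ACTGCTTTGGCTTTATGG",  # Strain 6
]

mutations = orient_mutations(outgroup, strains)
counts = Counter(m[3] for m in mutations)
print(mutations)
print(counts["GC->AT"], counts["AT->GC"])  # 1 1
```

The three polymorphisms come out as T→C at site 3, C→G at site 9 and G→A at site 18, so one mutation falls in each of the two GC-changing classes, matching the tallies on the slide.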

  18. Orienting mutations
      Strain 1  ACT GCT TTG GCT TTA TGG
      Strain 2  ACT GCT TTG GCT TTA TGA
      Strain 3  ACT GCT TTG GCT TTA TGG
      Strain 4  ACT GCT TTC GCT TTA TGA
      Strain 5  ACC GCT TTC GCT TTA TGG
      Strain 6  ACT GCT TTG GCT TTA TGG
      Mutations: T→C  G→C  G→A
      GC→AT = 1 • AT→GC = 1

  19. Test of mutation bias • If GC content is • Due to mutation bias alone • Stationary • And the infinite sites assumption holds • Then • # GC→AT = # AT→GC
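One simple way to test the equality stated above for a single species is an exact binomial test against a 50:50 expectation. The transcript does not say which test the authors used, so this is only a sketch under that assumption. The function name and the counts in the example (30 versus 10) are hypothetical.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than the
    observed count k out of n, under success probability p.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small tolerance so ties are not lost to floating-point rounding.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical species: 30 GC->AT and 10 AT->GC polymorphisms.
gc_to_at, at_to_gc = 30, 10
p_value = binomial_two_sided_p(gc_to_at, gc_to_at + at_to_gc)
print(p_value)
```

A small p-value here would mean the two classes are too unequal to be explained by stationary, mutation-driven GC content under the infinite sites assumption.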

  20. Four-fold synonymous sites

  21. Codons • Non-synonymous • Synonymous • 2-fold: TTT TTC • 4-fold: CCT CCC CCA CCG • ATA CCC CTA CCT GCT • Codon positions 1 2 3
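Four-fold degenerate sites, and the GC content at them (GC4), can be located from the first two codon bases. The sketch below assumes the standard genetic code, in which eight two-base prefixes specify the amino acid regardless of the third base. The function names are illustrative, and the example sequence is the one on the slide.

```python
# Codon prefixes whose third position is four-fold degenerate in the
# standard genetic code: Leu (CT), Val (GT), Ser (TC), Pro (CC),
# Thr (AC), Ala (GC), Arg (CG), Gly (GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_third_positions(cds: str) -> list[int]:
    """0-based indices of third codon positions in four-fold degenerate codons.

    Assumes cds is in frame, starting at codon position 1.
    """
    return [
        i + 2
        for i in range(0, len(cds) - 2, 3)
        if cds[i : i + 2] in FOURFOLD_PREFIXES
    ]


def gc4(cds: str):
    """GC fraction at four-fold degenerate sites, or None if there are none."""
    sites = fourfold_third_positions(cds)
    if not sites:
        return None
    return sum(cds[i] in "GC" for i in sites) / len(sites)


cds = "ATACCCCTACCTGCT"  # ATA CCC CTA CCT GCT
print(fourfold_third_positions(cds))  # [5, 8, 11, 14]
print(gc4(cds))  # 0.25
```

ATA (isoleucine) is skipped because its third position is not four-fold degenerate. The other four codons each contribute their third base.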

  22. Data • Popset • Keyword “bacteria” • 8 or more sequences from same species • 149 bacterial species • 8 phyla, 15 classes and 77 genera • 1 or more genes • 10 or more synonymous polymorphisms • 4-fold diversity < 0.1
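The diversity filter in the last bullet needs a per-site diversity estimate. The slide does not define "diversity", so the sketch below assumes it means nucleotide diversity (the mean proportion of pairwise differences) restricted to four-fold sites. The function name is hypothetical, and the toy input reuses the six strains from slide 16.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list[str], sites: list[int]) -> float:
    """Mean pairwise differences per site, over the given 0-based site indices."""
    pairs = list(combinations(seqs, 2))
    if not pairs or not sites:
        raise ValueError("need at least two sequences and one site")
    diffs = sum(a[i] != b[i] for a, b in pairs for i in sites)
    return diffs / (len(pairs) * len(sites))


strains = [
    "ACTGCTTTGGCTTTATGG",
    "ACTGCTTTGGCTTTATGA",
    "ACTGCTTTGGCTTTATGG",
    "ACTGCTTTCGCTTTATGA",
    "ACCGCTTTCGCTTTATGG",
    "ACTGCTTTGGCTTTATGG",
]
# Third positions of the four-fold codons ACN, GCN and GCN in this alignment.
fourfold_sites = [2, 5, 11]

pi4 = nucleotide_diversity(strains, fourfold_sites)
print(pi4)
```

In this toy alignment only site index 2 varies (one strain carries C), so 5 of the 15 pairs differ at one of three sites.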

  23. Overall result P<0.0001

  24. Bias versus GC4 • GC→AT • Z = GC→AT

  25. Phylogenetic distribution

  26. Potential problems • Infinite sites assumption • Sequencing error

  27. Infinite sites assumption • Each mutation occurs at a site which is not polymorphic

  28. Infinite sites assumption • If GC content stationary • #GC→AT subs = #AT→GC subs • All neutral mutations have same chance of fixation • #GC→AT muts = #AT→GC muts

  29. Finite sites assumption • If GC content stationary • #GC→AT subs = #AT→GC subs • All neutral mutations have same chance of fixation • #GC→AT muts = #AT→GC muts • But some mutations not evident as poly

  30. Finite sites • GC rich sequence • Implies • rate of AT→GC > rate of GC→AT • Mutation rate low • #AT→GC poly = #GC→AT poly • Mutation rate high • #AT→GC poly < #GC→AT poly

  31. Finite sites theory • GC ⇄ AT with rates uμ and vμ • Assume: stationary population, stationary GC
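The stationarity assumption ties the two rates to the GC content. The short derivation below is a sketch under labelled assumptions: uμ is taken to be the per-site GC→AT rate and vμ the per-site AT→GC rate (the flattened diagram does not confirm which is which), and f denotes the stationary GC fraction.

```latex
% Assumptions: u\mu = GC->AT rate per GC site, v\mu = AT->GC rate per AT site,
% f = stationary GC fraction. At stationarity the two fluxes balance:
\[
  f\,u\mu = (1-f)\,v\mu
  \quad\Longrightarrow\quad
  f = \frac{v}{u+v},
  \qquad
  \frac{v}{u} = \frac{f}{1-f}.
\]
% Example: f = 0.9 gives v/u = 9, so in a GC-rich genome the per-site
% AT->GC rate must exceed the per-site GC->AT rate.
```

Under this reading, a GC-rich sequence requires v > u, which is the implication stated on slide 30.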

  32. Finite sites theory

  33. Finite sites theory • Curves labelled 0.95, 0.9, 0.8, 0.7, 0.6

  34. Predicting Z • Assume • finite sites • neutrality • Use GC4 to get f • Use observed diversity to estimate μ • Predict Z

  35. Zpred

  36. Z-Zpred

  37. Mutation rate variation

  38. Z-Zpred (exponential rates)

  39. Sequencing error

  40. Explanations • Non-stationary base composition • Selection for translational efficiency • Biased gene conversion • Selection upon base composition

  41. Explanations • Non-stationary base composition • Selection for translational efficiency • Biased gene conversion • Selection upon base composition

  42. Non-stationary GC content

  43. Non-stationary base composition

  44. Explanations • Non-stationary base composition • Selection for translational efficiency • Biased gene conversion • Selection upon base composition

  45. Selection on codon usage

  46. Translational efficiency

  47. Explanations • Non-stationary base composition • Selection for translational efficiency • Biased gene conversion • Selection upon base composition

  48. Biased gene conversion • Diagram bases: A T A G C G C G C T C G

  49. Four gamete test • No recombination: GA GT CA • Recombination: GA GT CA CT
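The four gamete test can be sketched as a check on every pair of biallelic sites: under the infinite sites assumption, seeing all four two-site haplotypes implies at least one recombination event between the sites. The function name is hypothetical, and the two example samples are the haplotype sets read from the flattened slide (three haplotypes versus four).

```python
from itertools import combinations


def four_gamete_violations(haplotypes: list[str]) -> list[tuple[int, int]]:
    """Site pairs (0-based) at which all four two-site haplotypes occur.

    Only pairs of biallelic sites are tested. A returned pair indicates
    recombination between the two sites, assuming no recurrent mutation.
    """
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        alleles_i = {h[i] for h in haplotypes}
        alleles_j = {h[j] for h in haplotypes}
        if len(alleles_i) != 2 or len(alleles_j) != 2:
            continue
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


print(four_gamete_violations(["GA", "GT", "CA"]))        # []
print(four_gamete_violations(["GA", "GT", "CA", "CT"]))  # [(0, 1)]
```

With only GA, GT and CA present, the data are compatible with no recombination. Adding CT completes the four gametes and flags the pair.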

  50. Biased gene conversion
