
Variation structure

BI420 – Introduction to Bioinformatics. Variation structure. Gabor T. Marth, Department of Biology, Boston College, marth@bc.edu.


Presentation Transcript


  1. BI420 – Introduction to Bioinformatics: Variation structure. Gabor T. Marth, Department of Biology, Boston College, marth@bc.edu

  2. Human variation structure is heterogeneous: chromosomal averages; polymorphism density along chromosomes

  3. Heterogeneity at the level of distributions: marker density (“dense” vs. “sparse”); allele frequency (“common” vs. “rare”)

  4. What explains nucleotide diversity? Candidate factors: G+C nucleotide content; CpG di-nucleotide content; recombination rate; functional constraints. Nucleotide diversity by region: 3’ UTR 5.00 x 10^-4; 5’ UTR 4.95 x 10^-4; exon, overall 4.20 x 10^-4; exon, coding 3.77 x 10^-4 (synonymous 366 / 653; non-synonymous 287 / 653). Variance is so high that these quantities are poor predictors of nucleotide diversity in local regions; hence random processes are likely to govern the basic shape of the genome variation landscape -> (random) genetic drift

  5. Components of drift: Genealogy. In a randomly mating population, the genealogy evolves in a non-deterministic fashion. Figure label: present generation.

  6. Components of drift: Mutation. Mutations randomly “drift”: they die out, rise to higher frequency, or become fixed.

  7. Modulators: Changing population size. Mutations randomly “drift”: they die out, rise to higher frequency, or become fixed. Figure label: genetic bottleneck.
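The drift behaviour on slides 6 and 7 can be illustrated with a minimal Wright-Fisher style sketch. This is an assumption-laden illustration, not the simulation used in the lecture: the function name `wright_fisher`, the population sizes, and the starting frequency are all hypothetical. Each generation resamples allele copies at random from the previous one, and a dip in `pop_sizes` stands in for a genetic bottleneck.

```python
import random


def wright_fisher(pop_sizes, start_freq, seed=0):
    """Track one allele's frequency under pure drift (no selection, no new mutation).

    pop_sizes  -- number of chromosome copies in each successive generation;
                  a temporary dip models a bottleneck (hypothetical values).
    start_freq -- allele frequency in the founding generation.
    """
    rng = random.Random(seed)
    freq = start_freq
    trajectory = [freq]
    for copies in pop_sizes:
        # Every copy in the next generation picks a parent copy at random,
        # so the new allele count is a binomial draw with probability `freq`.
        count = sum(rng.random() < freq for _ in range(copies))
        freq = count / copies
        trajectory.append(freq)
    return trajectory


# Hypothetical history: 50 generations at 200 copies, a 5-generation
# bottleneck of 10 copies, then recovery to 200 copies.
sizes = [200] * 50 + [10] * 5 + [200] * 50
trajectory = wright_fisher(sizes, start_freq=0.5)
print("final allele frequency:", trajectory[-1])
```

Frequencies 0 and 1 are absorbing in this sketch: once the allele is lost or fixed, it stays that way, which is the "die out" and "get fixed" outcome named on the slide. Frequency changes are larger during the small-population generations.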

  8. Modulators: Population subdivision. Subdivision promotes private polymorphisms and skews allele frequency.

  9. Modulators: Recombination. Example haplotypes: acagttatgcaga, acagttatgtaga, accgttatgcaga, accgttatgtaga, accgttatgcaga, acagttatgtaga. With recombination, different nucleotide sites within the same DNA segment no longer share the same genealogy.

  10. Modulators: Natural selection. Negative (purifying) selection; positive selection. Under selection, the genealogy is no longer independent of (and hence cannot be decoupled from) the mutation process.

  11. Modeling ancestral processes: “forward simulations” vs. the “Coalescent” process. By focusing on a small sample, the complexity of the relevant part of the ancestral process is greatly reduced. There are, however, limitations.
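To make the "small sample" point on slide 11 concrete, here is a minimal sketch of the standard neutral coalescent for a sample of n lineages. It assumes a constant-size, randomly mating diploid population with time measured in units of 2N generations; these assumptions, and the names `simulate_tmrca` and `expected_tmrca`, are illustrative and not taken from the slides. Only the sampled lineages are tracked backward in time, rather than every individual forward in time.

```python
import random


def simulate_tmrca(n, rng):
    """One random draw of the time to the most recent common ancestor (TMRCA)
    of n sampled lineages, in units of 2N generations."""
    t = 0.0
    k = n
    while k > 1:
        # With k lineages there are k*(k-1)/2 pairs, each coalescing at rate 1.
        rate = k * (k - 1) / 2
        t += rng.expovariate(rate)
        k -= 1  # one coalescence merges two lineages into one
    return t


def expected_tmrca(n):
    """Theoretical mean TMRCA under the same assumptions: 2 * (1 - 1/n)."""
    return 2 * (1 - 1 / n)


rng = random.Random(42)
draws = [simulate_tmrca(10, rng) for _ in range(5000)]
mean_tmrca = sum(draws) / len(draws)
print("simulated mean:", mean_tmrca, "theory:", expected_tmrca(10))
```

A sample of 10 lineages needs only 9 random draws per replicate, regardless of the population size, which is the reduction in complexity the slide refers to. The limitations mentioned on the slide (for example, selection) are not handled by this sketch.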

  12. Inferences from variation data. Larger mutation rate (μ) -> more mutations -> higher diversity (θ). Larger population size (N) -> more mutations -> higher diversity (θ). Conversely, higher diversity -> larger population size OR higher mutation rate (θ = 4Nμ).
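The relation θ = 4Nμ on slide 12 can be worked through numerically. The sketch below uses hypothetical values for N and μ (they are not from the lecture) to show why diversity alone cannot separate the two: different (N, μ) pairs give the same θ, so N can be recovered from θ only if μ is assumed known.

```python
import math


def theta(pop_size, mutation_rate):
    """Diversity parameter from slide 12: theta = 4 * N * mu."""
    return 4 * pop_size * mutation_rate


def implied_pop_size(theta_value, mutation_rate):
    """Invert theta = 4 * N * mu for N, given an assumed mutation rate."""
    return theta_value / (4 * mutation_rate)


# Hypothetical parameter pairs: half the population size, double the mutation rate.
theta_a = theta(pop_size=10_000, mutation_rate=1e-8)
theta_b = theta(pop_size=5_000, mutation_rate=2e-8)
print("theta_a:", theta_a, "theta_b:", theta_b)
print("same diversity:", math.isclose(theta_a, theta_b))

# The population size inferred from the same theta depends on the assumed mu.
print("N if mu = 1e-8:", implied_pop_size(theta_a, 1e-8))
print("N if mu = 2e-8:", implied_pop_size(theta_a, 2e-8))
```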

  13. Ancestral inference: modeling. Models: bottleneck; stationary; collapse; expansion. Axis labels: past, history, present. MD (simulation); AFS (direct form).

  14. Ancestral inference: model fitting. Modest but uninterrupted expansion; bottleneck.

  15. Allelic association. Possible allele combinations (2-marker haplotypes): acagttatgcaga, accgttatgcaga, acagttatgtaga, accgttatgtaga. Figure label: higher recombination rate (r).

  16. Allelic association: LD. Measure of allelic association: “linkage disequilibrium (LD)”.
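Slide 16 names LD without giving a formula. As an illustration only, the sketch below uses two commonly used LD statistics, D = p_AB - p_A * p_B and r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)); whether the lecture used these particular measures is an assumption. The haplotypes are the four 2-marker combinations from slide 15, which differ at 0-based positions 2 (a/c) and 9 (c/t). The function name `ld` and the equal haplotype counts are hypothetical.

```python
def ld(haplotypes, site_i, site_j, allele_i, allele_j):
    """Return (D, r_squared) between two sites in a list of haplotype strings.

    allele_i / allele_j are the reference alleles counted at each site.
    Raises ValueError if either site is not polymorphic in the sample.
    """
    n = len(haplotypes)
    p_a = sum(h[site_i] == allele_i for h in haplotypes) / n
    p_b = sum(h[site_j] == allele_j for h in haplotypes) / n
    p_ab = sum(h[site_i] == allele_i and h[site_j] == allele_j
               for h in haplotypes) / n
    d = p_ab - p_a * p_b  # excess of the allele pair over independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d, d * d / denom


# All four allele combinations present once each: no association.
all_four = ["acagttatgcaga", "accgttatgcaga", "acagttatgtaga", "accgttatgtaga"]
d_free, r2_free = ld(all_four, 2, 9, "a", "c")

# Only two of the four combinations present: complete association.
only_two = ["acagttatgcaga", "accgttatgtaga"]
d_linked, r2_linked = ld(only_two, 2, 9, "a", "c")

print("all four haplotypes:", d_free, r2_free)
print("two haplotypes only:", d_linked, r2_linked)
```

When all four combinations are equally common, the alleles at the two sites are independent and both statistics are zero; when only two combinations exist, knowing one allele determines the other.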

  17. Haplotype structure: “haplotype block”
