Biostatistics-Lecture 19 Linkage Disequilibrium and SNP detection

Presentation Transcript


  1. Biostatistics-Lecture 19: Linkage Disequilibrium and SNP detection. Ruibin Xi, Peking University School of Mathematical Sciences

  2. Haplotype Frequencies

  3. Linkage Equilibrium

  4. Linkage Disequilibrium

  5. Disequilibrium Coefficient D_AB
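The formulas on slides 3 to 5 did not survive in this transcript. As a reminder, the standard textbook definitions are sketched below; the symbols p_A, p_B (allele frequencies) and p_AB (frequency of the AB haplotype) are assumed notation and may differ from what the slides used.

```latex
% Linkage equilibrium: haplotype frequency factorizes into allele frequencies
p_{AB} = p_A \, p_B
% Disequilibrium coefficient: deviation from that product
D_{AB} = p_{AB} - p_A \, p_B
```

Under these definitions, D_AB = 0 corresponds to linkage equilibrium and any nonzero value to linkage disequilibrium.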

  6. D_AB is hard to interpret • Sign is arbitrary … • A common convention is to set A, B to be the common alleles and a, b to be the rare alleles • Range depends on allele frequencies • Hard to compare between markers
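To illustrate why the range depends on allele frequencies, here is a minimal sketch. It assumes the usual definition D = p_AB - p_A p_B and uses the fact that all four haplotype frequencies must be non-negative; the function name `d_bounds` is hypothetical.

```python
def d_bounds(p_a, p_b):
    """Return (D_min, D_max) permitted by allele frequencies p_A and p_B.

    The bounds follow from requiring each haplotype frequency to be >= 0:
      p_AB = p_A*p_B + D,          p_ab = (1-p_A)*(1-p_B) + D,
      p_Ab = p_A*(1-p_B) - D,      p_aB = (1-p_A)*p_B - D.
    """
    d_min = max(-p_a * p_b, -(1 - p_a) * (1 - p_b))
    d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    return d_min, d_max


if __name__ == "__main__":
    # Two marker pairs with different allele frequencies have different ranges,
    # which is why raw D values are hard to compare between markers.
    print(d_bounds(0.5, 0.5))
    print(d_bounds(0.9, 0.9))
```

With equal frequencies of 0.5 the range is symmetric, while with frequencies of 0.9 it is much narrower on the negative side.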

  7. r^2 (also called Δ^2) • Ranges between 0 and 1 • 1 when the two markers provide identical information • 0 when they are in perfect equilibrium
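The formula for r^2 is missing from the transcript. A minimal sketch, assuming the standard definition r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)) with D = p_AB - p_A p_B; the function name `r_squared` is hypothetical.

```python
def r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic markers from haplotype and allele frequencies.

    Assumes both markers are polymorphic (0 < p_a < 1 and 0 < p_b < 1);
    otherwise the denominator is zero and r^2 is undefined.
    """
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


if __name__ == "__main__":
    # Identical markers: allele A always occurs with allele B, so p_AB = p_A = p_B.
    print(r_squared(0.3, 0.3, 0.3))
    # Perfect equilibrium: p_AB equals the product p_A * p_B.
    print(r_squared(0.15, 0.5, 0.3))
```

The two calls correspond to the two extreme cases listed on the slide: identical information and perfect equilibrium.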

  8. Raw r^2 data from chr22

  9. Comparing Populations CEPH: Utah residents with ancestry from northern and western Europe (CEU)

  10. Use LD for SNP imputation and detection fastPhase

  11. Use LD for SNP imputation and detection fastPhase

  12. Model for haplotypes • Observed n haplotypes • Each with M markers • b_ij = 0, 1 • Assume each haplotype originates from one of K clusters • z_i: unknown cluster of origin of b_i • Since clusters of origin are unknown

  13. Local clustering of haplotypes • Assume z_i = (z_i1, …, z_iM) forms a Markov chain on {1, …, K} • z_im denotes the cluster of origin for b_im • Initial probabilities • Transition probabilities • Conditional on the cluster of origin • Marginal
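The initial, transition, and conditional probabilities on this slide are not preserved in the transcript. The sketch below therefore treats them as generic inputs and only shows how the marginal probability of one haplotype can be obtained by summing over the hidden cluster path with the forward algorithm. The parameter names (`init`, `trans`, `theta`) and their layout are assumptions for illustration, not the lecture's notation.

```python
def haplotype_likelihood(b, init, trans, theta):
    """Marginal probability P(b) of one haplotype under a K-state HMM.

    b:     list of M alleles, each 0 or 1
    init:  init[k] = P(z_1 = k)
    trans: trans[m][j][k] = P(z_{m+2} = k | z_{m+1} = j), for m = 0..M-2
           (0-based list index m covers the step between markers m and m+1)
    theta: theta[k][m] = P(b_m = 1 | z_m = k)
    """
    n_states = len(init)

    def emit(k, m):
        # Probability of the observed allele at marker m given cluster k.
        return theta[k][m] if b[m] == 1 else 1.0 - theta[k][m]

    # Forward variable at the first marker.
    fwd = [init[k] * emit(k, 0) for k in range(n_states)]
    # Recursion over the remaining markers.
    for m in range(1, len(b)):
        fwd = [
            emit(k, m) * sum(fwd[j] * trans[m - 1][j][k] for j in range(n_states))
            for k in range(n_states)
        ]
    # Summing over the final hidden state gives the marginal likelihood.
    return sum(fwd)


if __name__ == "__main__":
    init = [0.6, 0.4]
    trans = [[[0.9, 0.1], [0.2, 0.8]]]      # one transition for M = 2 markers
    theta = [[0.1, 0.2], [0.8, 0.7]]
    print(haplotype_likelihood([1, 0], init, trans, theta))
```

The forward recursion costs O(M K^2) per haplotype, compared with K^M terms for a direct sum over all cluster paths.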

  14. Local clustering of genotype data • We have genotype data • g_im: genotype at marker m of individual i • Takes values 0, 1, 2 • Initial probabilities (unordered clusters of origin) • Transition probabilities

  15. Local clustering of genotype data • Genotype probabilities conditional on clusters of origin • Joint likelihood
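The genotype probabilities themselves are missing from the transcript. One common form, sketched here as an assumption, treats the two alleles of an individual as independent draws given their two clusters of origin, with `theta_k1` and `theta_k2` (hypothetical names) being the probability of allele 1 in each cluster at the marker in question.

```python
def genotype_prob(g, theta_k1, theta_k2):
    """P(g | unordered cluster pair {k1, k2}) at a single marker.

    g is the count of allele 1 (0, 1 or 2). Assumes the two alleles are
    independent given their clusters of origin.
    """
    if g == 0:
        return (1 - theta_k1) * (1 - theta_k2)
    if g == 1:
        # Either chromosome may carry the single copy of allele 1.
        return theta_k1 * (1 - theta_k2) + (1 - theta_k1) * theta_k2
    if g == 2:
        return theta_k1 * theta_k2
    raise ValueError("genotype must be 0, 1 or 2")


if __name__ == "__main__":
    for g in (0, 1, 2):
        print(g, genotype_prob(g, 0.1, 0.8))
```

Because the expression is symmetric in the two clusters, the pair can be treated as unordered, which matches the slide's remark about unordered clusters of origin.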

  16. Algorithms for genotype imputation • fastPhase • BEAGLE • IMPUTE • PLINK • MaCH

  17. Algorithms for genotype imputation • fastPhase • BEAGLE • IMPUTE • PLINK • MaCH Picture taken from IMPUTE v2

  18. SNP detection with LD information • MaCH: (G: genotype, S: cluster)

  19. SNP detection with LD information • For sequencing data, G is not observed • The coverages of bases A and B are observed, so we have the HMM
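The HMM diagram for this slide is not in the transcript. As an illustration of how read coverage can stand in for the unobserved genotype, the sketch below uses a simple binomial read model with a symmetric per-base error rate. This model, the default error rate, and the coding of G as the number of B alleles are all assumptions, not necessarily what the lecture or MaCH uses.

```python
from math import comb


def read_likelihood(n_a, n_b, g, err=0.01):
    """P(n_a reads of base A, n_b reads of base B | genotype g).

    g is the number of B alleles (0, 1 or 2). Each read samples one of the
    two chromosomes at random and is misread with probability err.
    """
    # Probability that a single read shows base B.
    p_b = (g / 2) * (1 - err) + (1 - g / 2) * err
    return comb(n_a + n_b, n_b) * p_b ** n_b * (1 - p_b) ** n_a


if __name__ == "__main__":
    # Ten reads, all showing A: compare the three candidate genotypes.
    for g in (0, 1, 2):
        print(g, read_likelihood(10, 0, g))
```

In an HMM of the kind the slide refers to, a term like this would play the role of the emission probability, with the hidden states carrying the LD information across neighbouring markers.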

  20. SNP detection with LD information. Nielsen et al. 2011, Nature Reviews Genetics
