Haplotype Discovery and Modeling

nuncio

Presentation Transcript


  1. Haplotype Discovery and Modeling

  2. Identification of genes: identify the phenotype, map, clone

  3. QTL Mapping
     • A QTL (quantitative trait locus) is a gene that affects a quantitative trait.
     • The QTL detected by the markers linked with it is a chromosomal segment.
     • The DNA structure of a QTL is unknown.
     [Figure: chromosome showing Marker 1, QTL, Marker 2, Marker 3, ..., Marker k]

  4. QTL Mapping Based on Linkage
     [Figure: three-generation pedigree (generations I, II, III) with individuals labeled by two-locus genotypes such as Aabb, aaBb, aabb, AaBb, AABb and AAbb]

  5. Mapping and sequencing
     [Figure: markers at the 10000 Kb scale; DNA clones at the 100 Kb scale]

  6. SNPs (‘snips’) • A SNP is a site in the DNA where different chromosomes differ in the base they have.

  7. SNPs
     Paternal allele: CCCGCCTTCTTGGCTTTACA
     Maternal allele: CCCGCCTTCTCGGCTTTACA

     Paternal allele: CCCGCCTTCTTGGCTTTACA
     Maternal allele: CCCGCCTTCTTGGCTTTACA

  8. HapMap: detecting specific DNA sequence variants that determine complex traits
     • Single Nucleotide Polymorphisms (SNPs)
     • The International HapMap Consortium (Nature, 2003, 2005)
     [Figure: individuals sensitive to a drug vs. insensitive to a drug]

  9. Basic concepts: Allele, Haplotype, and Diplotype

  10. Basic concepts: Haplotyping a Phenotype, Quantitative Trait Nucleotide (QTN)

  11. Basic concepts: Risk Haplotype and Composite Diplotype
      Consider a QTN composed of two SNPs:
      • Risk haplotype: [AB] = R
      • Non-risk haplotypes: [Ab], [aB], [ab] = r
      • Composite diplotypes: RR, Rr, rr
      [Illustration: the three composite diplotypes, labeled RR (2), Rr (1) and rr (0)]
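The coding on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `composite_diplotype` and the string encoding of haplotypes are assumptions, and the numeric code is read here as the number of risk haplotypes a diplotype carries.

```python
def composite_diplotype(hap1, hap2, risk="AB"):
    """Collapse a diplotype into a composite diplotype.

    hap1, hap2: the two haplotypes of one individual, e.g. "AB" and "ab".
    risk: the haplotype assumed to be the risk haplotype R.
    Returns (label, count), where count is the number of risk haplotypes.
    """
    count = int(hap1 == risk) + int(hap2 == risk)
    label = {2: "RR", 1: "Rr", 0: "rr"}[count]
    return label, count


# Diplotype [AB][ab] carries one risk haplotype when [AB] is the risk haplotype.
print(composite_diplotype("AB", "ab"))
```

Changing the `risk` argument to another haplotype gives the alternative codings that are compared later in the deck.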

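  12. Study design: a random sample of unrelated individuals from a natural population

      | Group | SNP 1 | SNP 2 | Diplotype            | Obs.        | Drug response trait                        |
      |-------|-------|-------|----------------------|-------------|--------------------------------------------|
      | 1     | AA    | BB    | [AB][AB]             | n_{11/11}   | y_1 = (y_{11}, …, y_{1 n_{11/11}})^T       |
      | 2     | AA    | Bb    | [AB][Ab]             | n_{11/10}   | y_2 = (y_{21}, …, y_{2 n_{11/10}})^T       |
      | 3     | AA    | bb    | [Ab][Ab]             | n_{11/00}   | y_3 = (y_{31}, …, y_{3 n_{11/00}})^T       |
      | 4     | Aa    | BB    | [AB][aB]             | n_{10/11}   | y_4 = (y_{41}, …, y_{4 n_{10/11}})^T       |
      | 5     | Aa    | Bb    | [AB][ab] or [Ab][aB] | n_{10/10}   | y_5 = (y_{51}, …, y_{5 n_{10/10}})^T       |
      | 6     | Aa    | bb    | [Ab][ab]             | n_{10/00}   | y_6 = (y_{61}, …, y_{6 n_{10/00}})^T       |
      | 7     | aa    | BB    | [aB][aB]             | n_{00/11}   | y_7 = (y_{71}, …, y_{7 n_{00/11}})^T       |
      | 8     | aa    | Bb    | [aB][ab]             | n_{00/10}   | y_8 = (y_{81}, …, y_{8 n_{00/10}})^T       |
      | 9     | aa    | bb    | [ab][ab]             | n_{00/00}   | y_9 = (y_{91}, …, y_{9 n_{00/00}})^T       |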

  13. There are two types of parameters:
      • Haplotype frequencies (population genetic parameters p)
          [AB]: p11 = pq + D
          [Ab]: p10 = p(1-q) - D
          [aB]: p01 = (1-p)q - D
          [ab]: p00 = (1-p)(1-q) + D
        where p is the frequency of allele A at SNP 1, q is the frequency of allele B at SNP 2, and D is the linkage disequilibrium.
      • Haplotype effects and variation (quantitative genetic parameters q)
          RR: µ2 = µ + a (a = additive effect)
          Rr: µ1 = µ + d (d = dominance effect)
          rr: µ0 = µ - a
      Unifying likelihood based on marker (S) and phenotype (y) data (Liu, Johnson, Casella and Wu, 2004, Genetics)
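The two parameter sets on this slide translate directly into code. The sketch below is illustrative only: the function names and the example values of p, q and D are assumptions, not numbers from the presentation.

```python
def haplotype_frequencies(p, q, D):
    """Haplotype frequencies from allele frequencies p (A), q (B) and LD D."""
    freqs = {
        "AB": p * q + D,
        "Ab": p * (1 - q) - D,
        "aB": (1 - p) * q - D,
        "ab": (1 - p) * (1 - q) + D,
    }
    # D is only admissible if every haplotype frequency stays non-negative.
    if any(f < 0 for f in freqs.values()):
        raise ValueError("D is out of range for the given p and q")
    return freqs


def genotypic_values(mu, a, d):
    """Genotypic values of the composite diplotypes RR, Rr and rr."""
    return {"RR": mu + a, "Rr": mu + d, "rr": mu - a}


# Hypothetical inputs chosen only to exercise the formulas.
print(haplotype_frequencies(p=0.6, q=0.5, D=0.05))
print(genotypic_values(mu=10.0, a=2.0, d=0.5))
```

The four frequencies always sum to one, because the +D and -D terms cancel.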

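  14. Modeling Haplotype Frequencies

      | Group | SNP 1 | SNP 2 | Diplotype | Frequency | Obs.      |
      |-------|-------|-------|-----------|-----------|-----------|
      | 1     | AA    | BB    | [AB][AB]  | p11²      | n_{11/11} |
      | 2     | AA    | Bb    | [AB][Ab]  | 2p11p10   | n_{11/10} |
      | 3     | AA    | bb    | [Ab][Ab]  | p10²      | n_{11/00} |
      | 4     | Aa    | BB    | [AB][aB]  | 2p11p01   | n_{10/11} |
      | 5     | Aa    | Bb    | [AB][ab]  | 2p11p00   | n_{10/10} |
      |       |       |       | [Ab][aB]  | 2p10p01   |           |
      | 6     | Aa    | bb    | [Ab][ab]  | 2p10p00   | n_{10/00} |
      | 7     | aa    | BB    | [aB][aB]  | p01²      | n_{00/11} |
      | 8     | aa    | Bb    | [aB][ab]  | 2p01p00   | n_{00/10} |
      | 9     | aa    | bb    | [ab][ab]  | p00²      | n_{00/00} |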

  15. EM algorithm E step M step
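The transcript does not preserve the E-step and M-step formulas. The following is a minimal sketch of the standard EM iteration for two-SNP haplotype frequencies, assuming the nine genotype counts are given in the order of groups 1 to 9 from the table on slide 14; the function name and argument layout are assumptions. The only ambiguity is in group 5 (Aa Bb), whose diplotype may be [AB][ab] or [Ab][aB].

```python
def em_haplotype_frequencies(n, max_iter=1000, tol=1e-10):
    """Estimate p11, p10, p01, p00 from nine genotype counts (groups 1-9)."""
    n1, n2, n3, n4, n5, n6, n7, n8, n9 = n
    total = 2 * sum(n)                      # number of haplotypes in the sample
    p11 = p10 = p01 = p00 = 0.25            # arbitrary starting values
    for _ in range(max_iter):
        # E step: expected share of double heterozygotes that are [AB][ab].
        denom = p11 * p00 + p10 * p01
        phi = p11 * p00 / denom if denom > 0 else 0.5
        # M step: count haplotypes, splitting group 5 by phi.
        new = (
            (2 * n1 + n2 + n4 + phi * n5) / total,
            (2 * n3 + n2 + n6 + (1 - phi) * n5) / total,
            (2 * n7 + n4 + n8 + (1 - phi) * n5) / total,
            (2 * n9 + n6 + n8 + phi * n5) / total,
        )
        change = max(abs(x - y) for x, y in zip(new, (p11, p10, p01, p00)))
        p11, p10, p01, p00 = new
        if change < tol:
            break
    return {"AB": p11, "Ab": p10, "aB": p01, "ab": p00}


# Hypothetical counts for groups 1-9, used only to run the sketch.
print(em_haplotype_frequencies([20, 10, 5, 12, 30, 8, 4, 6, 5]))
```

Allele frequencies and linkage disequilibrium then follow from the slide 13 definitions, for example p = p11 + p10, q = p11 + p01 and D = p11·p00 - p10·p01.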

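  16. Modeling Haplotype Effects

      Composite diplotype of each group under each candidate risk haplotype:

      | Group      | SNP 1 | SNP 2 | Diplotype | [AB] | [Ab] | [aB] | [ab] |
      |------------|-------|-------|-----------|------|------|------|------|
      | 1          | AA    | BB    | [AB][AB]  | RR   | rr   | rr   | rr   |
      | 2          | AA    | Bb    | [AB][Ab]  | Rr   | Rr   | rr   | rr   |
      | 3          | AA    | bb    | [Ab][Ab]  | rr   | RR   | rr   | rr   |
      | 4          | Aa    | BB    | [AB][aB]  | Rr   | rr   | Rr   | rr   |
      | 5          | Aa    | Bb    | [AB][ab]  | Rr   | rr   | rr   | Rr   |
      |            |       |       | [Ab][aB]  | rr   | Rr   | Rr   | rr   |
      | 6          | Aa    | bb    | [Ab][ab]  | rr   | Rr   | rr   | Rr   |
      | 7          | aa    | BB    | [aB][aB]  | rr   | rr   | RR   | rr   |
      | 8          | aa    | Bb    | [aB][ab]  | rr   | rr   | Rr   | Rr   |
      | 9          | aa    | bb    | [ab][ab]  | rr   | rr   | rr   | RR   |
      | Likelihood |       |       |           | L1   | L2   | L3   | L4   |

      Genotypic values of composite diplotypes: RR: µ2, Rr: µ1, rr: µ0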

  17. Mixture Model, assuming that [AB] is the risk haplotype
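The formula on this slide did not survive in the transcript. One way to write the mixture likelihood implied by slides 12 to 16 is sketched below. It assumes that f_j(y) is a normal density with mean µ_j and common variance σ² (the slides do not state the distribution), and it uses n_g as shorthand for the number of individuals in group g. Only the double heterozygote (group 5) is a mixture, because its diplotype is [AB][ab] (Rr) with probability ω and [Ab][aB] (rr) with probability 1 - ω.

```latex
L(\mu_2,\mu_1,\mu_0,\sigma^2 \mid y, S)
  = \prod_{i=1}^{n_{11/11}} f_2(y_{1i})
    \prod_{i=1}^{n_{11/10}} f_1(y_{2i})
    \prod_{i=1}^{n_{11/00}} f_0(y_{3i})
    \prod_{i=1}^{n_{10/11}} f_1(y_{4i})
    \prod_{i=1}^{n_{10/10}} \bigl[\omega f_1(y_{5i}) + (1-\omega) f_0(y_{5i})\bigr]
    \prod_{g=6}^{9} \prod_{i=1}^{n_g} f_0(y_{gi}),
\qquad
\omega = \frac{p_{11}\,p_{00}}{p_{11}\,p_{00} + p_{10}\,p_{01}}
```

The mixing proportion ω follows from the diplotype frequencies 2p11p00 and 2p10p01 listed for group 5 on slide 14.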

  18. EM Algorithm • E step • M step
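The E-step and M-step formulas are not preserved here either. The sketch below shows one way to estimate µ2, µ1, µ0 and a common variance by EM when [AB] is taken as the risk haplotype. It makes several assumptions that are not in the slides: phenotypes are normally distributed with a shared variance, the mixing proportion `omega` is supplied by the caller and held fixed (a joint model could update it together with the haplotype frequencies), and the function and variable names are illustrative.

```python
import math

# Number of risk haplotypes for groups 1-9 when [AB] is the risk haplotype;
# None marks the phase-ambiguous double heterozygote (group 5).
COMPONENT = [2, 1, 0, 1, None, 0, 0, 0, 0]


def normal_pdf(y, mu, var):
    return math.exp(-(y - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_haplotype_effects(groups, omega, n_iter=200):
    """groups: nine lists of phenotypes (groups 1-9).
    omega: probability that a double heterozygote is [AB][ab]."""
    all_y = [y for g in groups for y in g]
    n = len(all_y)
    mean = sum(all_y) / n
    mu = [mean, mean, mean]          # mu[0] = rr, mu[1] = Rr, mu[2] = RR
    var = max(sum((y - mean) ** 2 for y in all_y) / n, 1e-12)
    for _ in range(n_iter):
        # E step: membership weights of every observation in rr, Rr, RR.
        obs = []
        for comp, ys in zip(COMPONENT, groups):
            for y in ys:
                w = [0.0, 0.0, 0.0]
                if comp is None:
                    a = omega * normal_pdf(y, mu[1], var)
                    b = (1 - omega) * normal_pdf(y, mu[0], var)
                    post = a / (a + b) if a + b > 0 else omega
                    w[1], w[0] = post, 1 - post
                else:
                    w[comp] = 1.0
                obs.append((y, w))
        # M step: weighted means, then the weighted residual variance.
        for k in range(3):
            wk = sum(w[k] for _, w in obs)
            if wk > 0:
                mu[k] = sum(w[k] * y for y, w in obs) / wk
        ss = sum(w[k] * (y - mu[k]) ** 2 for y, w in obs for k in range(3))
        var = max(ss / n, 1e-12)
    return {"mu2": mu[2], "mu1": mu[1], "mu0": mu[0], "sigma2": var}
```

With no double heterozygotes in the sample, the estimates reduce to the plain group means, which gives a quick sanity check of the update equations.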

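  19. Hypothesis Testing
      • H0: µ2 = µ1 = µ0 (that is, RR = Rr = rr, or equivalently a = d = 0)
      • H1: At least one of the equalities in H0 does not hold
      • LR = –2[ln L0(· | y) – ln L1(· | y, S, ·)], where L0 and L1 are the likelihoods maximized under H0 and H1
      • The threshold is determined empirically by permutation tests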
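A permutation threshold can be sketched as follows. This is a simplified illustration rather than the procedure used in the cited work: `lr_equal_means` assumes normally distributed phenotypes and composite diplotype labels that are known without phase ambiguity, and both function names are assumptions. Shuffling the phenotypes breaks any association with the genotypes, so the resampled statistics approximate the distribution of LR under H0.

```python
import math
import random


def lr_equal_means(y, labels):
    """LR statistic for H0: equal group means, under a normal model."""
    n = len(y)
    grand = sum(y) / n
    ss0 = sum((v - grand) ** 2 for v in y)            # residual SS under H0
    sums, counts = {}, {}
    for v, g in zip(y, labels):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    ss1 = sum((v - sums[g] / counts[g]) ** 2 for v, g in zip(y, labels))
    if ss1 == 0:
        return float("inf") if ss0 > 0 else 0.0
    # -2 (ln L0 - ln L1) simplifies to n * ln(SS0 / SS1) for normal data.
    return n * math.log(ss0 / ss1)


def permutation_threshold(statistic, y, labels, n_perm=1000, alpha=0.05, seed=0):
    """Approximate (1 - alpha) quantile of the statistic under permutation."""
    rng = random.Random(seed)
    shuffled = list(y)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)                          # break genotype-phenotype link
        null_stats.append(statistic(shuffled, labels))
    null_stats.sort()
    index = min(math.ceil((1 - alpha) * n_perm) - 1, n_perm - 1)
    return null_stats[max(index, 0)]
```

For a genome-wide scan, `statistic` could instead return the maximum LR over all SNPs for each shuffled data set, which yields a single threshold for the whole scan.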

  20. Genome-wide Scan
      [Figure: LR plotted against SNPs on the genome, with the threshold marked]

  21. Structural Variation in the Human Genome
      • Haplotype blocks: nearby SNPs are often distributed in block-like patterns.
      • Hotspots and coldspots: SNPs from different blocks have larger recombination rates than those from within blocks.
      • Tag SNPs: haplotype diversity within each block can be well explained by a small portion of SNPs.
      [Figure: Block 1, Block 2, Block 3, Block 4, ... separated by recombination hot spots]

  22. A Genetic Study: a candidate gene for human obesity
      SNP A: A, G
      SNP B: C, G
      Four haplotypes: [AC], [AG], [GC], [GG]
      • A total of 155 patients selected from a population
      • Typed for the two SNPs
      • Measured for body mass index (BMI)
      • Question: Which haplotype triggers an effect on BMI?

  23. Testing Risk Haplotype

      | Haplotype | LR             | Designation |
      |-----------|----------------|-------------|
      | [AC]      | 2.32           | r           |
      | [AG]      | 1.52           | r           |
      | [GC]      | 3.11           | r           |
      | [GG]      | 10.35 (p<0.01) | R           |

      • RR: µ2 = µ + a = 30.83 – 1.77 = 29.06 (a = additive effect)
      • Rr: µ1 = µ + d = 30.83 – 3.05 = 27.78 (d = dominance effect)
      • rr: µ0 = µ - a = 30.83 + 1.77 = 32.60

      • A patient who combines haplotype [GG] with any other haplotype is normal weight.
      • A patient who combines any two haplotypes from [AC], [AG] and [GC] is obese.
      • A patient who carries two copies of haplotype [GG] is overweight.
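The arithmetic on this slide can be checked with a short script. It reads the slide's numbers as µ = 30.83, a = -1.77 and d = -3.05 (the signs are inferred from the subtractions shown), and the variable names are illustrative.

```python
mu, a, d = 30.83, -1.77, -3.05   # values read off the slide; signs inferred

bmi = {
    "RR": round(mu + a, 2),      # two copies of [GG]
    "Rr": round(mu + d, 2),      # one copy of [GG]
    "rr": round(mu - a, 2),      # no copy of [GG]
}
print(bmi)

# mu is the midpoint of the two homozygote values, as the model requires.
midpoint = round((bmi["RR"] + bmi["rr"]) / 2, 2)
print(midpoint)
```

The midpoint check confirms that the three reported values are consistent with the µ, a, d parameterization of slide 13.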

  24. Model Extensions
      • Block-Block Interactions (Lin et al. 2007, Bioinformatics)
      • Haplotype-Environment Interactions (Wang et al. 2008, Molecular Pain)
      • Haplotype Imprinting Effects (Cheng et al., to be submitted)
      • Multivariate high-dimensional drug response (PK-PD link, efficacy and toxicity…): a systems approach

  25. 1000 Genomes Project
      • This sequencing effort will produce the most detailed map of human genetic variation to support disease studies.
      • The results will help to design personalized medication that can optimize drug therapy.
