
Analysis of whole genome association studies in pedigreed populations

Analysis of whole genome association studies in pedigreed populations. Goutam Sahana Genetics and Biotechnology Faculty of Agricultural Sciences Aarhus University, 8830 Tjele, Denmark. Concept of mapping. Identification of genetic variant underlying disease susceptibility or a trait value.


Presentation Transcript


  1. Analysis of whole genome association studies in pedigreed populations Goutam Sahana Genetics and Biotechnology Faculty of Agricultural Sciences Aarhus University, 8830 Tjele, Denmark

  2. Concept of mapping Identification of genetic variant underlying disease susceptibility or a trait value Evidence for the location of the gene = Causal variant

  3. Approaches to Mapping • Candidate gene studies • Association • Resequencing approaches • Genome-wide studies • Linkage analysis • Genome-wide association studies (Linkage disequilibrium, LD mapping)

  4. Linkage mapping • Look for marker alleles that are correlated with the phenotype within a pedigree • Different alleles can be associated with the trait in different pedigrees

  5. Association mapping • Marker alleles are correlated with a trait on a population level • Can detect association by looking at unrelated individuals from a population • Does not necessarily imply that markers are linked to (are close to) genes influencing the trait.

  6. Linkage vs. association [Figure: effect size plotted against frequency of the causal variant, with regions labelled "Linkage analysis", "Association study", "Very difficult" and "Unlikely to exist".] Modified from D. Altshuler

  7. Linkage vs. association Hirschhorn & Daly, Nature Rev. Genet. 2005

  8. Allelic Association • Direct association • Allele of interest is itself involved in the phenotype • Indirect association • Allele itself is not involved, but is in LD with the functional variant • Spurious association • Confounding factors (e.g., population stratification)

  9. Linkage disequilibrium • Non-random association between alleles at different loci. Loci are in LD if alleles occur together on haplotypes in proportions different from those expected from the allele frequencies • Two alleles that are in LD occur together more often than would be expected by chance

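  10. Linkage disequilibrium • Locus A: alleles A and a, with frequencies $p_A$ and $p_a$ • Locus B: alleles B and b, with frequencies $p_B$ and $p_b$ • Possible haplotypes: AB, Ab, aB, ab • Expected frequencies: $p_A p_B$, $p_A p_b$, $p_a p_B$, $p_a p_b$ • Observed frequencies: $p_{AB}$, $p_{Ab}$, $p_{aB}$, $p_{ab}$ • The loci are in LD when $D = p_{AB} - p_A p_B \neq 0$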
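
The definition of D on this slide can be sketched in a few lines of Python. The function name and the example frequencies are hypothetical; the normalised measures D' and r² are not on the slide and are included only because they are the usual ways of putting D on a 0 to 1 scale.

```python
def ld_measures(p_AB, p_A, p_B):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_AB is the observed frequency of haplotype AB; p_A and p_B are the
    frequencies of alleles A and B.
    """
    p_a, p_b = 1.0 - p_A, 1.0 - p_B
    D = p_AB - p_A * p_B                      # deviation from independence
    if D >= 0:
        d_max = min(p_A * p_b, p_a * p_B)     # largest possible positive D
    else:
        d_max = min(p_A * p_B, p_a * p_b)     # largest possible |negative D|
    d_prime = abs(D) / d_max if d_max > 0 else 0.0
    denom = p_A * p_a * p_B * p_b
    r2 = D * D / denom if denom > 0 else 0.0
    return D, d_prime, r2


# Hypothetical frequencies, chosen only for illustration.
D, d_prime, r2 = ld_measures(p_AB=0.4, p_A=0.5, p_B=0.5)
print(f"D = {D:.3f}, |D'| = {d_prime:.2f}, r2 = {r2:.2f}")
```

With these inputs the haplotype AB is more common than the 0.25 expected under independence, so D is positive.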

  11. LD variation across genome • The extent of LD is highly variable across the genome • The determinants of LD are not fully understood. • Factors that are believed to influence LD • Genetic drift • Population growth • Admixture or migration • Selection • Variable recombination rates

  12. Haplotype • Genotypes: Locus1 2/4; Locus2 1/3; Locus3 3/2; Locus4 4/1; Locus5 2/3; Locus6 1/2 • Haplotypes: 2-3-2-4-3-1 and 4-1-3-1-2-2 • Identification of phase: PHASE, BEAGLE
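
To see why phase has to be inferred, the sketch below enumerates every haplotype pair compatible with the six unphased genotypes on this slide. It is a brute-force illustration only, not how PHASE or BEAGLE work, and the function name is hypothetical.

```python
from itertools import product


def possible_phasings(genotypes):
    """Enumerate unordered haplotype pairs consistent with unphased genotypes.

    genotypes is a list of (allele1, allele2) tuples, one per locus.
    """
    pairs = set()
    for choice in product((0, 1), repeat=len(genotypes)):
        # choice[i] says which allele at locus i goes on the first haplotype.
        h1 = tuple(g[c] for g, c in zip(genotypes, choice))
        h2 = tuple(g[1 - c] for g, c in zip(genotypes, choice))
        pairs.add(frozenset((h1, h2)))        # unordered pair, so no double counting
    return pairs


# Genotypes at the six loci shown on the slide.
genotypes = [(2, 4), (1, 3), (3, 2), (4, 1), (2, 3), (1, 2)]
phasings = possible_phasings(genotypes)
slide_pair = frozenset(((2, 3, 2, 4, 3, 1), (4, 1, 3, 1, 2, 2)))
print(len(phasings), slide_pair in phasings)
```

With n heterozygous loci there are 2^(n-1) compatible pairs, so the number of candidates grows quickly with the number of markers.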

  13. Haplotype-based analysis • Increased ability to identify regions that are shared identical by descent among affected individuals • Haplotypes may be the causative ‘composite allele’ rather than a particular nucleotide at a particular SNP • Haplotype analysis is meaningful only if the SNPs are themselves in LD

  14. Monogenic versus complex traits

  15. Monogenic trait • A mutation in a single gene is both necessary and sufficient to produce the phenotype or to cause the disease • The impact of the gene on genetic risk is the same in all families • Follows a clear segregation pattern in families • Typically rare in the population

  16. Complex trait • Multiple genes lead to genetic predisposition to a phenotype • Pedigree reveals no Mendelian pattern • Any particular gene mutation is neither sufficient nor necessary to explain the phenotype • Environment makes a major contribution • We study the relative impact of individual genes on the phenotype

  17. Some examples

  18. Quantitative trait • A biological trait that shows continuous variation rather than falling into distinct categories • Quantitative trait locus (QTL): a genetic locus that is associated with variation in such a quantitative trait

  19. Assessing genetic contributions to complex traits • Continuous characters (wt, blood pressure) • Heritability: Proportion of observed variance in phenotype explained by genetic factors • Discrete characters (disease) • Relative risk ratio: λ= risk to relative of an affected individual/risk in general population • λ encompasses all genetic and environmental effects, not just those due to any single locus

  20. Factors that influence identification of allelic association • Effect size • Linkage disequilibrium • Disease and marker allele frequencies • Sample size Reviewed by Zondervan & Cardon, Nature Rev. Genet. 2004

  21. Odds ratio
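
The odds ratio named on this slide can be computed from a 2x2 table of allele (or exposure) counts. The sketch below uses hypothetical counts and a hypothetical function name, and adds a 95% confidence interval from the standard error of the log odds ratio (Woolf's approximation), which is an assumption about what one would normally report rather than something taken from the slide.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate confidence interval for a 2x2 table.

    a: cases carrying the allele      b: cases not carrying it
    c: controls carrying the allele   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    or_hat = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of log(OR)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper


# Hypothetical counts, for illustration only.
or_hat, lower, upper = odds_ratio(a=60, b=40, c=40, d=60)
print(f"OR = {or_hat:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 suggests the allele is more (or less) common in cases than in controls.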

  22. Sample size • No. of cases = no. of controls; D’ = 0.7; power 80%; α = 0.001 Zondervan & Cardon (Nature Rev. Genet. 2004)

  23. Population stratification • Consider two case/control samples, genotyped at a marker with alleles M and m • Sample A: χ² NS • Sample B: χ² NS

  24. Population stratification • Sample A + Sample B: χ² = 14.8, P < 0.001
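
The effect described on these two slides can be reproduced with a small sketch: two samples that each show no association can produce a significant χ² once pooled, if they differ in both allele frequency and case/control ratio. The allele counts below are hypothetical and are not the slide's data, so the pooled statistic will not equal 14.8.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(stat / 2.0))


# Allele counts as (case M, case m, control M, control m); hypothetical.
sample_a = (160, 40, 80, 20)    # M is common; cases outnumber controls
sample_b = (20, 80, 40, 160)    # M is rare; controls outnumber cases
pooled = tuple(x + y for x, y in zip(sample_a, sample_b))

chi2_a = chi2_2x2(*sample_a)
chi2_b = chi2_2x2(*sample_b)
chi2_pooled = chi2_2x2(*pooled)
print(chi2_a, chi2_b, chi2_pooled, p_value_1df(chi2_pooled))
```

Within each sample the M frequency is identical in cases and controls, so the association in the pooled table comes entirely from the sample structure.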

  25. Dealing with population structure • Genomic control (Devlin and Roeder, 1999) • Inflate the distribution of the test statistic by λ • λ estimated from data [Figure: χ² against E(χ²) for unlinked ‘null’ markers and the test locus, shown without stratification and with stratification; the test statistics are adjusted.]
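
A minimal sketch of genomic control, assuming 1-df χ² statistics: λ is estimated as the median χ² over the unlinked null markers divided by the median of a χ² distribution with 1 df (about 0.4549), and the test-locus statistic is divided by λ. Flooring λ at 1 is a common convention assumed here; the function name and the input values are hypothetical.

```python
import statistics

CHI2_1DF_MEDIAN = 0.4549364   # median of the chi-square distribution with 1 df


def genomic_control(null_chi2, test_chi2):
    """Return (lambda, adjusted test statistic).

    null_chi2: 1-df chi-square statistics at unlinked 'null' markers.
    test_chi2: 1-df chi-square statistic at the test locus.
    """
    lam = statistics.median(null_chi2) / CHI2_1DF_MEDIAN
    lam = max(1.0, lam)       # assumed convention: never deflate the statistic
    return lam, test_chi2 / lam


# Hypothetical statistics, for illustration only.
null_stats = [0.2, 0.5, 0.91, 1.3, 3.0]
lam, adjusted = genomic_control(null_stats, test_chi2=10.0)
print(f"lambda = {lam:.2f}, adjusted chi2 = {adjusted:.2f}")
```

In practice many more null markers would be used so that the median is a stable estimate.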

  26. Dealing with population structure • Structured association (Pritchard et al., 2000) • Discover structure from set of unlinked markers, i.e. assign probabilities of ancestry from k populations to each individual, and then control for it.

  27. Association analysis approaches • Case–control studies • Marker allele frequencies are determined in a group of affected individuals and compared with allele frequencies in a control population • Family-based methods • Based on unequal transmission of alleles from parents to a single affected child in each family. Associations are summed over many unrelated families

  28. Case-Control studies: χ² test • Genotypes: 2x3 contingency table • Alleles: 2x2 contingency table • Test of independence: χ² = Σ(O−E)²/E with 2 or 1 df
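
The test of independence on this slide, sketched for both table shapes. The genotype counts are hypothetical; the allele table is derived from them by counting two alleles per individual. The p-value helper covers only 1 and 2 df, which have simple closed forms.

```python
import math


def chi2_independence(table):
    """Pearson chi-square and degrees of freedom for an r x c table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df


def p_value(stat, df):
    """Upper-tail chi-square probability for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only 1 or 2 df supported in this sketch")


# Hypothetical genotype counts (MM, Mm, mm) for cases and controls.
genotypes = [[50, 40, 10], [30, 50, 20]]
# Allele counts (M, m): 2 * homozygotes + heterozygotes.
alleles = [[2 * mm_ + mx, 2 * xx + mx] for mm_, mx, xx in genotypes]

g_stat, g_df = chi2_independence(genotypes)   # 2x3 table, 2 df
a_stat, a_df = chi2_independence(alleles)     # 2x2 table, 1 df
print(g_stat, g_df, p_value(g_stat, g_df))
print(a_stat, a_df, p_value(a_stat, a_df))
```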

  29. Family based tests • Genotypes from independent family trios where the child is affected • Use the non-transmitted genotypes or alleles as internal controls to the transmitted ones

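  30. Family-based association studies [Figure: parents with genotypes 1/2 and 3/4 and an affected child 1/4; alleles 1 and 4 are transmitted, alleles 2 and 3 are non-transmitted and serve as the control.] Is an allele transmitted to affected offspring more often than it is not transmitted?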

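  31. TDT: Transmission Disequilibrium Test [Figure: family trio labelled G/g, G/G and G/g.] • Counts of transmitted (T) versus non-transmitted (NT) alleles: (T = G, NT = G) = a; (T = G, NT = g) = b; (T = g, NT = G) = c; (T = g, NT = g) = d • $TDT_G = (T_G - NT_G)^2 / (T_G + NT_G) = (b - c)^2 / (b + c) \sim \chi^2_1$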
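
The TDT statistic on this slide depends only on the two off-diagonal counts, b and c, which come from heterozygous parents. A minimal sketch with hypothetical counts and a hypothetical function name:

```python
import math


def tdt(b, c):
    """Transmission disequilibrium test.

    b: heterozygous parents transmitting G (and not g) to the affected child
    c: heterozygous parents transmitting g (and not G)
    Returns the statistic and its p-value from a chi-square with 1 df.
    """
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    stat = (b - c) ** 2 / (b + c)
    p = math.erfc(math.sqrt(stat / 2.0))   # upper tail of chi-square, 1 df
    return stat, p


# Hypothetical transmission counts, for illustration only.
stat, p = tdt(b=60, c=40)
print(f"TDT = {stat:.2f}, p = {p:.4f}")
```

Homozygous parents contribute to a and d only and therefore carry no information for the test.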

  32. TDT: Transmission Disequilibrium Test • Multiallelic markers • ETDT (Sham & Curtis, 1995) • Missing parent genotypes • TRANSMIT (Clayton, 1999) • Haplotypes • TDTHAP (Clayton & Jones, 1999) • Sibs • TDT/STDT (Spielman & Ewens, 1998) • Pedigrees • PBAT (Martin et al., 2000) • Quantitative traits • QTDT (Abecasis et al., 2000)

  33. Some limitations • Subjects: random or structured families • Parents not available • Difficult when there are very many genes, each of small effect • Environmental influence may obscure genetic effects • Genetic heterogeneity underlying disease phenotype • Hidden (unaccounted-for) relationships

  34. Rare allele • A single family is segregating [Figure: parental alleles A, a and B, b; offspring group I and offspring group II.]

  35. Complex pedigree & Quantitative traits

  36. Complex pedigree • Non-independence among pedigree members • The polygenic relationship alone is not sufficient • Association analysis should account for the point-wise relationship among individuals • Identical-by-descent probabilities

  37. Methods • Combined linkage and LD • Generalized linear models • Mixed-model (Yu et al. 2006) • Bayesian approach

  38. Combined linkage and LD • Phenotype = Fixed factors + Polygene + Haplotype • Polygene: the whole relationship in the pedigree is used • Identical-by-descent coefficients were estimated for the point-wise relationship • Phase determination: GDQTL • QTL mapping: DMU

  39. QTL for Clinical Mastitis in cattle LA

  40. QTL for Clinical Mastitis in cattle LA LD

  41. QTL for Clinical Mastitis in cattle LD/LA LA LD

  42. Simulation • 100 half-sib families (Dairy cattle pedigree) • 2000 progeny • 5 chromosomes – 100 cM (each) • SNP – 5000 • 15 QTL (1QTL-10%, 4QTL-5 %, 10QTL–2%) • 50% of the genetic variance • Heritability – 30%

  43. Generalized linear models Phenotype= Sire-family + genotype Software – TASSEL http://www2.maizegenetics.net/index.php?page=bioinformatics/tassel
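
The model on this slide, Phenotype = Sire-family + genotype, can be illustrated without any statistics package: treating sire family as a fixed effect is equivalent to centring phenotype and genotype within each family and regressing one on the other. The sketch below does exactly that for a single SNP coded as an allele count. It is not TASSEL's implementation, and the function name and data are hypothetical.

```python
from collections import defaultdict


def within_family_snp_effect(records):
    """Least-squares SNP effect with sire family fitted as a fixed effect.

    records: iterable of (sire_id, genotype, phenotype), where genotype is
    the allele count (0, 1 or 2) at one SNP.
    """
    by_sire = defaultdict(list)
    for sire, genotype, phenotype in records:
        by_sire[sire].append((genotype, phenotype))

    sxy = sxx = 0.0
    for rows in by_sire.values():
        g_mean = sum(g for g, _ in rows) / len(rows)
        y_mean = sum(y for _, y in rows) / len(rows)
        for g, y in rows:
            # Deviations from the family mean remove the sire-family effect.
            sxy += (g - g_mean) * (y - y_mean)
            sxx += (g - g_mean) ** 2
    if sxx == 0:
        raise ValueError("no within-family genotype variation")
    return sxy / sxx


# Hypothetical progeny: phenotype = family level + 2 * allele count.
records = [
    ("s1", 0, 10.0), ("s1", 1, 12.0), ("s1", 2, 14.0),
    ("s2", 0, 20.0), ("s2", 1, 22.0), ("s2", 1, 22.0), ("s2", 2, 24.0),
]
beta = within_family_snp_effect(records)
print(beta)
```

Because the two families differ in mean level, a regression that ignored sire family would confound family differences with the SNP effect.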

  44. Generalized linear models

  45. Generalized linear models

  46. Generalized linear models

  47. Mixed-model (Yu et al. 2006) • Phenotype = Fixed factors + SNP + Population + Polygene • SNP: coded 0, 1, 2 • Population: STRUCTURE • Polygene: relationship • Software: SAS mixed model (Gael Pressoir)
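
In matrix form, the model on this slide is usually written as below. The symbols follow the notation commonly attributed to Yu et al. (2006) and should be checked against the paper; they are not given on the slide.

```latex
% y: phenotypes; \beta: other fixed effects; \alpha: SNP effect;
% v: population (STRUCTURE, Q matrix) effects; u: polygenic effects; e: residual.
\[
  y = X\beta + S\alpha + Qv + Zu + e,
  \qquad
  \operatorname{Var}(u) = 2K\,V_g,
  \qquad
  \operatorname{Var}(e) = R\,V_R
\]
```

Here K is the matrix of pairwise kinship (relationship) coefficients, so the polygenic term absorbs the covariance among relatives while Q accounts for population structure.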

  48. Mixed-model

  49. Mixed-model
