1. Introduction to Genetic Epidemiology 5th Annual Interdisciplinary Genetic Research Course:
Medical, Public Health, Biostatistical & Bioethical Approaches
October 6, 2008
T.H. Beaty, Johns Hopkins School of Public Health
2. Genetic Epidemiology
3. Landmarks in Genetics
4. Landmarks in Genetics (contd)
5. Summarizing Genetic History
6. Landmarks in Epidemiology
7. Landmarks in Epidemiology
8. 50 years of genetic epidemiology (Morton 2006 J HUM GENET 51:269-277)
Before DNA polymorphisms (1956-1979)
Linkage analysis (LOD scores) began with very few markers
Population genetics measured linkage disequilibrium (LD) by D & r²
Heritability (h²) began with twin studies & evolved into path analysis with genetic & non-genetic causal factors
Pre-genome period (1980-2001)
Linkage analysis expanded to multipoint analysis
Even for Mendelian diseases, peaks are hard to narrow
With complex phenotypes, it is worse
Association studies proposed as alternative (Risch & Merikangas 1996 SCIENCE 273:1516-1517)
Family based association tests are useful alternative
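The LOD scores used in linkage analysis can be illustrated with a minimal sketch. It assumes the simplest textbook case (phase-known, fully informative meioses scored as recombinant or non-recombinant), which is an assumption for illustration and not a method taken from the slides. The function name `lod_score` is hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD score for phase-known meioses at recombination fraction theta.

    LOD(theta) = log10[ L(theta) / L(0.5) ], where
    L(theta) = theta^k * (1 - theta)^(n - k) for k recombinants in n meioses.
    """
    k, n = recombinants, meioses
    if not 0 <= k <= n:
        raise ValueError("recombinants must be between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must be in [0, 0.5]")

    def count_times_log10(count, prob):
        # Treat 0 * log(0) as 0; a positive count with prob 0 is impossible.
        if count == 0:
            return 0.0
        if prob == 0.0:
            return float("-inf")
        return count * math.log10(prob)

    log_l_theta = count_times_log10(k, theta) + count_times_log10(n - k, 1.0 - theta)
    log_l_null = n * math.log10(0.5)  # free recombination (no linkage)
    return log_l_theta - log_l_null


if __name__ == "__main__":
    # Scan a grid of theta values for 1 recombinant in 10 meioses.
    for theta in (0.01, 0.05, 0.1, 0.2, 0.3, 0.5):
        print(theta, round(lod_score(1, 10, theta), 3))
```

With very few markers, as in the early period described above, few meioses are informative, so the LOD curve is typically flat and the peak is poorly localized.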
9. 50 years of genetic epidemiology (contd)
Post-genome period (>2002)
Sequence of the human genome allows greater understanding of structure of genes, their physical position & (maybe) their function
HapMap offers virtually unlimited markers, but SNPs are not equally polymorphic in all populations
LD & haplotype blocks vary among populations
Associated markers are only predictive, but can identify causal genes
Study design will be critical
Is 500K always better than 100K?
10. Comparing genetics & epidemiology
11. Comparing genetics & epidemiology (contd)
12. Central questions in Genetic Epidemiology Does the trait cluster in families?
Can familial clustering be explained by genes or shared environment?
What is the best model of inheritance?
Can we locate genes for complex diseases/traits?
How does the gene control risk of disease?
13. Extending basic questions in genetic epidemiology (Burton et al. 2005 Lancet 366:941-951)
14. Useful References from Lancet 2005
15. Study designs for central questions
16. Study design for central questions (contd)
17. Nature is not linear
18. Different levels of study
19. Study designs can (& do) overlap
20. 1. Population based designs Population comparisons (=ecological design)
Migrant studies
Do people who move from a low risk environment to a high risk environment change their risk?
Consider issues of self-selection, assimilation, etc.
Admixture studies
Does disease risk parallel genetic admixture (% of genes of distinct ancestry)?
Admixture is only estimated
Human populations are not constant
Vital records can be an important resource, especially birth defects & disease registries
Does risk of disease change among offspring of incross vs. outcross matings?
21. 2. Case-Control Designs Case-unrelated control can identify genetic risk factors
Genetic index (e.g. inbreeding)
Genetic marker
Genetic marker can be a risk factor due to
Direct effect of marker in causal pathway
Indirect effect due to linkage disequilibrium (LD)
association between a high risk allele at an unobserved causal gene & observed marker allele
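The indirect effect through LD can be made concrete with the two standard pairwise measures, D and r². The sketch below assumes two biallelic loci and that the haplotype frequency is known (in practice it is usually estimated); the function name `ld_stats` and its parameters are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # D is the excess of the AB haplotype over its expectation
    # under independence of the two loci.
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("both loci must be polymorphic")
    return d, (d * d) / denom


if __name__ == "__main__":
    # Marker allele B always travels with high risk allele A: complete LD.
    print(ld_stats(0.5, 0.5, 0.5))
    # Alleles are independent: no LD, so the marker carries no information.
    print(ld_stats(0.25, 0.5, 0.5))
```

When r² is near 1 the observed marker is a good proxy for the unobserved causal allele; when it is near 0 the marker is uninformative even if the causal gene is nearby.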
22. 2. Case-control designs (contd) Conventional case-control design:
Representative sample from case & control populations
Tests for difference in allele or genotypic frequencies
Problems with confounding (population stratification)
Case-related control design
Representative sample of cases & their unaffected sibs (or cousins)
Minimize chances of confounding
Overmatched for genetic background: less statistical power
Can test for linkage directly
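The test for a difference in allele frequencies in the conventional design can be sketched as a Pearson chi-square on a 2x2 table of allele counts. This is a minimal illustration under stated assumptions (unrelated cases and controls, one biallelic marker, no correction for population stratification), not the analysis used in the lecture. The function name `allelic_chi_square` is hypothetical.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square (1 df) comparing allele counts in cases vs. controls.

    Each argument is a count of alleles (two per genotyped person).
    Returns (chi_square, p_value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # 50 cases and 50 controls, i.e. 100 alleles in each group (made-up counts).
    chi2, p = allelic_chi_square(60, 40, 40, 60)
    print(round(chi2, 3), round(p, 4))
```

A significant result here says only that allele frequencies differ between the two samples; if cases and controls come from genetically different subpopulations, the same result can arise from confounding, which is the motivation for the case-related control design.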
23. Variations on case-control design Incomplete case-control designs can test for Gene-Environment interaction (GxE)
Case-only designs
Incomplete variations (G & E on cases, only E on controls, etc.)
Family based controls
Create controls from parental mating type
Under H0, marker alleles are transmitted to case as often as not
Rejecting H0 implies linkage & linkage disequilibrium (=association)
Simplex families can now contribute to tests for linkage
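The family based control idea above is commonly implemented as a transmission disequilibrium test (TDT). The slides do not name the test, so treating it as the TDT is an assumption. The sketch uses the McNemar-type statistic on transmissions from heterozygous parents; the function name `tdt` is hypothetical.

```python
import math


def tdt(transmitted, not_transmitted):
    """McNemar-type TDT statistic for one marker allele.

    transmitted:     times heterozygous parents passed the allele to the case
    not_transmitted: times heterozygous parents passed the other allele
    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    # Under the null, each heterozygous parent transmits the allele
    # with probability 1/2, so b and c should be about equal.
    chi2 = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up counts from 100 heterozygous parents of affected children.
    print(tdt(60, 40))
```

Only heterozygous parents contribute, and the untransmitted parental alleles act as the matched control, which is why the test is not confounded by population stratification.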
24. 3. Family Designs
25. Families come in different shapes & sizes Sample fixed sets of relatives
Adoption studies address fundamental questions about genes vs. environment
Adoptee, adoptive parents, biological parents, unrelated sibs in adoptive family
Twins: estimate heritability by comparing MZ & DZ twins
Affected sib pairs (+parents) to test for linkage
Are these representative of all families?
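The MZ vs. DZ comparison can be sketched with Falconer's classical approximation, h² = 2(r_MZ - r_DZ). The slides do not say which estimator is meant, so this choice is an assumption; it also relies on the usual simplifications (additive genetic effects, equal shared environment for MZ & DZ pairs). The function name `falconer_h2` is hypothetical.

```python
def falconer_h2(r_mz, r_dz):
    """Falconer's approximation to heritability from twin correlations.

    r_mz: trait correlation between monozygotic co-twins
    r_dz: trait correlation between dizygotic co-twins
    """
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie in [-1, 1]")
    # MZ twins share all segregating genes, DZ twins half on average,
    # so doubling the difference estimates the additive genetic share.
    return 2.0 * (r_mz - r_dz)


if __name__ == "__main__":
    # Made-up correlations for a quantitative trait.
    print(round(falconer_h2(0.8, 0.5), 2))
```

Estimates outside the range 0 to 1 are possible with this formula and usually signal that the simplifying assumptions do not hold for the trait or the sample.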
26. Families come in different shapes & sizes (contd) Sample nuclear families (parents & offspring)
Measure familial aggregation/correlation
Fit models of inheritance
Collect data on family history in extended families
Expected risk of disease can be computed as (person-years at risk) * (age specific risk)
Requires good information on baseline incidence rates
Expected number of cases (E) based on population risk per person-year
Observed number of cases (O) typically by report
Compute Family History Score as Poisson statistic:
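The formula itself did not survive on this slide. One commonly used form, given here as an assumption and not as the author's exact definition, standardizes the excess of observed over expected cases by the Poisson standard deviation: FHS = (O - E) / sqrt(E). The sketch below computes E from person-years and age specific rates as described above; the function names, age bands, and rates are hypothetical.

```python
import math


def expected_cases(relatives, rates):
    """Sum person-years at risk times age specific risk over all relatives.

    relatives: list of dicts mapping age band -> person-years at risk
    rates:     dict mapping age band -> baseline risk per person-year
    """
    return sum(
        person_years * rates[band]
        for relative in relatives
        for band, person_years in relative.items()
    )


def family_history_score(observed, expected):
    """Assumed form of the score: (O - E) / sqrt(E)."""
    if expected <= 0:
        raise ValueError("expected number of cases must be positive")
    return (observed - expected) / math.sqrt(expected)


if __name__ == "__main__":
    # Hypothetical baseline rates and a family with two at-risk relatives.
    rates = {"40-49": 0.001, "50-59": 0.003}
    family = [{"40-49": 10, "50-59": 10}, {"40-49": 10}]
    e = expected_cases(family, rates)
    print(e, family_history_score(1, e))
```

Because E depends on family size and the ages of relatives, the score allows large, old families and small, young families to be compared on one scale, provided the baseline incidence rates are reliable.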
27. Family History Scores Summarize familial risk in families ascertained through probands (cases/controls)
Kerber (1995 GENET EPI 12:291-301) Breast cancer cases & controls drawn from the Utah Population Data Base
Can be used to identify highest risk families
Schwartz et al (1988 AM J EPI 128:524-535) Cancer risk in families of cases drawn from a cancer registry
Can be useful for public health
28. Public Health uses for family history
30. CDC resources:
31. CDC resources http://hugenavigator.net/
32. If you sample families in a representative manner,
Quantitative traits or a common qualitative phenotype can be used to
Estimate heritability (h²) or
Find best fitting model of inheritance (segregation analysis)
If genetic markers are available, these families can be used to
Test for linkage to unobserved genes controlling qualitative phenotype
Drs. Liang & Xu will discuss this tomorrow
Search for quantitative trait loci (QTL) that control quantitative phenotypes
33. Family studies & representative sampling (contd) Joint models for segregation analysis & linkage are feasible
Linkage analysis is still limited to families informative for meiosis
Multiplex families with >1 affected
Simplex families have only 1 affected member
Linkage will always reflect a subset of all families
Heterogeneity between simplex & multiplex families should be considered
34. Families ascertained through proband Proband (typically affected) brings the rest of family into the study
Segregation analysis can identify the best model of inheritance if ascertainment is considered
Models have many parameters to estimate
Even so, they may not completely correct for ascertainment
Families vary considerably in information content
Correcting for ascertainment bias is necessary
35. Linkage vs. Association
Linkage:
Requires multiplex families
Bigger is better
Guaranteed to work for Mendelian diseases
Genome wide studies are feasible
Still useful for complex diseases
Locus heterogeneity (linked & unlinked families) is a problem
Meta-analysis may strengthen evidence but narrowing peaks is still hard
Association:
Unrelated cases & controls can be used
Can incorporate tests for G, E, GxE, GxG, etc.
Meta-analysis can measure consistency across studies
Or lack thereof
Allelic heterogeneity is a problem
Different high risk alleles
Genome wide studies are now feasible (but expensive)
Interpreting them is a challenge
36. Genes as risk factors Epidemiology study designs treat genetic markers as a risk factor
Test H0: Genotype (G) is independent of risk, P(case)
Odds ratio (OR) measures association between marker & risk of disease
OR(case|G+) = (AD)/(CB)
Dr. Liang will discuss this
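The odds ratio above can be sketched numerically. The 2x2 table that defined A, B, C & D is missing from the slide, so the cell layout used here is an assumption: A = cases with G+, B = controls with G+, C = cases with G-, D = controls with G-. The confidence interval uses Woolf's log method, which is a standard choice but is not mentioned in the source. The function name `odds_ratio` is hypothetical.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for a 2x2 genotype-by-status table.

    Assumed layout: a = G+ cases, b = G+ controls,
                    c = G- cases, d = G- controls.
    Returns (or_estimate, ci_lower, ci_upper).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    or_estimate = (a * d) / (c * b)
    # Woolf's method: standard error of log(OR) from the cell counts.
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    log_or = math.log(or_estimate)
    return (
        or_estimate,
        math.exp(log_or - z * se_log_or),
        math.exp(log_or + z * se_log_or),
    )


if __name__ == "__main__":
    # Made-up counts: 100 cases and 100 controls.
    print(odds_ratio(20, 10, 80, 90))
```

An interval that includes 1 is consistent with H0, that genotype is independent of risk.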
37. What can you do with a genetic risk factor? Are genes just inherited risk factors?
How can you use genetic risk factors in public health?
Causal mutations can be used to screen
Women at high risk of breast cancer for BRCA1 & BRCA2 mutations
Couples at risk of having CF child
Linked markers can be used for genetic counseling or mapping
Genetic markers that are true risk factors can be used in screening
But you must be confident in the estimated risks
APOE ε4 for Alzheimer's Disease?
Is there an intervention?
These may depend on population or environment
38. Big Picture: Public Health Genetics is different from Genetic Epidemiology Public health genetics is broader than genetic epidemiology
Application vs. Research
Screening, intervention, & treatment are part of public health genetics
Policy is key part of public health genetics
39. Public Health Genetics (contd) Deals with both Mendelian & complex diseases
Mendelian diseases in the aggregate are a major public health burden
Screening the population can identify high risk individuals or groups
Screening for complex diseases will be more demanding & will require greater efforts to validate estimates of risk
40. Trends in science: Genomic Medicine & Human Genome Epidemiology Khoury, Little & Burke (2004) Human Genome Epidemiology Oxford Univ Press
Recent advances in genetics hold considerable promise for medicine & public health
Many reports of genes for common diseases, but few are consistent
There is some hype involved
What to do with new information as it emerges?
How to validate it?
How to act on it?
41. Trends in science (contd) Genomic medicine could predict risk of common diseases based on genotypes
Genetic vs. genomic? One gene vs. many genes
Pharmacogenomics could tailor pharmaceutical treatment based on genotypes
Both require solid epidemiologic data to generate & confirm predictive value of genotype on risk
This requires many studies, not one
This may vary among populations
This may depend on environment
42. Continuum from gene discovery to disease prevention (Khoury et al, 2004)
43. Summary of introduction Genetic epidemiology is a wide ranging scientific discipline
Focus on identifying genes involved in complex diseases
Variety of study designs are used
Variety of statistical methods are available
Complex diseases are complex
Nature has many surprises awaiting us