Population Genetic Structure and Race/Ethnicity
Outline • How is genetic variation structured in the human population? • What is the relationship between population genetic variation and race/ethnicity? • Racially (ancestrally) admixed populations: African Americans and Latino Americans • Interpreting racial/ethnic differences and confounding
Outline • Admixture analysis • Admixture mapping • Pharmacogenetics – irinotecan and colon cancer
What is the evidence regarding genetic structure in the human population?
Genetic Markers • 1900-1980: Blood groups, serum proteins (N of about 40) • 1980-1990: DNA Restriction Fragment Length Polymorphisms (RFLPs) (biallelic, N of thousands) • 1990-2000: Short Tandem Repeat (STR) Markers (multiallelic, N of tens of thousands) • 2000-2009: Single Nucleotide Polymorphisms (SNPs, N of millions)
Results from Population Genetics Studies – Tree Diagrams • Based on comparisons of populations • Genetic distances are calculated between populations based on allele frequencies at a collection of loci; the larger the allele frequency differences, the greater the distance • Tree diagrams created based on these genetic distances
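The distance step above can be sketched as follows. This is a minimal illustration, assuming a simple measure (the mean squared allele-frequency difference across loci); the slides do not say which distance the cited studies used, and the function name, population labels, and frequencies are all made up.

```python
from itertools import combinations

def allele_freq_distance(freqs_a, freqs_b):
    """Mean squared allele-frequency difference across loci (illustrative measure)."""
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations need frequencies at the same loci")
    return sum((a - b) ** 2 for a, b in zip(freqs_a, freqs_b)) / len(freqs_a)

# Hypothetical frequencies of one allele at four biallelic loci (not real data).
populations = {
    "pop_A": [0.10, 0.80, 0.30, 0.55],
    "pop_B": [0.15, 0.75, 0.35, 0.50],
    "pop_C": [0.70, 0.20, 0.90, 0.10],
}

# Pairwise distances: the input a tree-building method would start from.
distances = {
    (p, q): allele_freq_distance(populations[p], populations[q])
    for p, q in combinations(populations, 2)
}

# Print pairs from closest to most distant.
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(d, 4))
```

Larger frequency differences give larger distances, so pop_A and pop_B would join first in a tree built from this matrix.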
Bowcock et al, Nature 1994 • 30 microsatellite loci • 14 populations, 148 subjects: • African - CAR pygmy, Zaire pygmy, Lisongo • Caucasian – Northern European, Italians • Oceania – Melanesian, New Guinean, Australian • East Asia – Chinese, Japanese, Cambodian • Americas – Maya, Surui, Karitiana
Calafell et al, Eur J Hum Genet, 1998 • 45 microsatellite loci • 10 populations, 504 subjects • African: CAR pygmy, Zaire pygmy • Caucasian: Dane, Druze • Oceania: Melanesian (Nasioi) • East Asia: Chinese, Japanese, Yakut • Americas: Maya, Surui
Genetic Cluster Analysis • Based on individuals • Genetic marker data used to create clusters of individuals with similar genotype profiles • Number of clusters must be specified • Individuals can be assigned proportionate membership in different clusters
Noah Rosenberg et al, Science, 2002 • Human Genome Diversity Panel • 52 Indigenous Populations from 5 Continents: Africa, Americas, Asia, Europe, Oceania, total of 1,056 people • 377 STR Markers
Jun Li et al, Science, 2008 • Human Genome Diversity Panel, 938 individuals from 51 populations, 5 continents • 650,000 SNP Markers
Principal Components Analysis • Method that can be applied to dense genotype data • Defines a small number of orthogonal variables (linear combinations of genotypes) that explain the variability in a sample • Based on “genetic similarity” matrix of all pairs of individuals in the sample • Can reveal genetic clusters based on degree of relatedness of individuals
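A rough sketch of the idea, under stated assumptions: genotypes are coded 0/1/2, the similarity matrix is the average product of mean-centered genotypes, and the first component is found by power iteration (swapped in here for a full eigendecomposition to keep the example dependency-free). The function name and the six toy individuals are hypothetical.

```python
import math

def top_principal_component(genotypes, iterations=200):
    """Scores of individuals on the first PC of the genetic similarity matrix."""
    n, m = len(genotypes), len(genotypes[0])
    # Center each marker (column) so similarity reflects deviation from the mean.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # n x n similarity matrix over all pairs of individuals.
    sim = [[sum(a * b for a, b in zip(ri, rj)) / m for rj in centered]
           for ri in centered]
    # Power iteration converges to the leading eigenvector of this symmetric matrix.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        new = [sum(sim[i][k] * vec[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0:
            raise ValueError("no genetic variation in the sample")
        vec = [x / norm for x in new]
    return vec

# Toy data (not real): three individuals from each of two hypothetical groups.
genotypes = [
    [2, 2, 0, 0, 1], [2, 1, 0, 0, 1], [2, 2, 0, 1, 1],
    [0, 0, 2, 2, 1], [0, 1, 2, 2, 1], [0, 0, 2, 1, 1],
]
scores = top_principal_component(genotypes)
print([round(s, 3) for s in scores])
```

In this toy sample the first three individuals receive scores of one sign and the last three the opposite sign, which is the sense in which a principal component can reveal clusters.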
Genes mirror geography within Europe John Novembre, Toby Johnson, Katarzyna Bryc, Zoltán Kutalik, Adam R. Boyko, Adam Auton, Amit Indap, Karen S. King, Sven Bergmann, Matthew R. Nelson, Matthew Stephens & Carlos D. Bustamante Nature, 2008
Population structure within Europe. J Novembre et al. Nature (2008) doi:10.1038/nature07331
Conclusion • The primary source of genetic structure in the human population is based on continent of origin • There are five major groupings – Africa, the Americas, East Asia, Europe/West Asia, and Oceania • These groupings are not perfectly discrete, but neither is genetic variation continuous • An additional, less prominent level of structure exists between national/ethnic groups within the major continental groupings
What about non-Indigenous Populations? • These studies address questions of ancient human evolution, but not recent events. • For example, they are not representative of current Western Hemisphere populations, as in the U.S., Central and South America
How are race, ethnicity and ancestry defined? • Various definitions • Race often defined in terms of geographic ancestry • Ancestry defined in terms of country or nationality of origin • Ethnicity defined in terms of shared socio-political-religious affiliation • All are inter-related
How are race, ethnicity and ancestry defined? • U.S. Census Race Categories: • African/African American/Black • European/White • Asian • Pacific Islander • Native American/Alaskan Native • 2 or More Races • U.S. Census Ethnicity Category: • Hispanic/Latino
What is the evidence regarding genetic structure and race? • How much correlation is there between self-identified race/ethnicity (SIRE) (for example, using the categories above) and genetic structure in the population?
Family Blood Pressure Program (FBPP) • Study of genetic and environmental determinants of hypertension in families • Four networks, 15 field centers (collection sites), four major race/ethnicity groups: Caucasian (CAU), African American (AFR), East Asian (Chinese, Japanese) (EAS), Hispanic (Mexican American) (HIS)
FBPP Subjects • Total of 3,636 individuals included (one per family) • CAU 1349, 6 sites • AFR 1308, 4 sites • HIS 412, 1 site • EAS 567 (407 CHI, 160 JAP), 5 sites • 18 SIRE-site combinations total
FBPP Genetic Markers • Genome Screen STR markers, all typed at the NHLBI-sponsored Mammalian Genotyping Service, Marshfield, WI • Total number of markers included = 366 • Genetic distance analysis among SIRE-site groups • Genetic cluster analysis
Multidimensional Scaling • Based on distance matrix calculated by allele frequency differences between population groups. • Provides a 2-dimensional picture of distance relationships among the populations
GCA Classification versus SIRE • Concordant: 3,631 • Discordant: 5 • Discordance Rate: .0014 • Conclusion: Very high correspondence between race/ethnicity groupings and genetic clusters
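The discordance rate on this slide follows directly from the two counts; a quick check of the arithmetic (variable names are only for illustration):

```python
concordant = 3631
discordant = 5

# The two counts add up to the 3,636 FBPP individuals analyzed.
total = concordant + discordant
discordance_rate = discordant / total

print(total)                       # 3636
print(f"{discordance_rate:.4f}")   # 0.0014
```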
Analysis of Group Differences • For the major race/ethnicity groups, SIRE and GCA give nearly identical results with enough genetic markers • Important environmental/social/cultural differences also exist between SIRE groups • Therefore, race/ethnicity represents both social and genetic factors.
Analysis of Group Differences • High correlation between SIRE and GCA leads to strong confounding between genetic and non-genetic factors when examining group differences in prevalence of diseases or traits. • Therefore, no inferences can be made about etiology of group differences from the observed differences alone.
Example – Kistka et al, Am. J. Obstetrics and Gynecology, 2007, “Racial disparity in the frequency of recurrence of preterm birth”: “In this report, we further analyzed the pattern of recurrent preterm birth stratified by race and found that the tendency to repeat preterm birth during the same week occurs for both whites and blacks, but the median age for preterm birth is shifted 2 weeks earlier in blacks. These findings together highlight the importance of race, particularly after correction for other risk factors, and suggest a probable genetic component that may underlie the public health problem presented by the racial disparity in preterm birth.”
• Lack of evidence of explanatory factors does not imply a genetic cause for group differences • Genetic explanations need to be direct, not indirect • Genetic explanations should not be the default position
Genetic Admixture • Even though the four ethnic groups were easily separable based on genetic markers, African Americans and Latino Americans typically have ancestry from multiple continents. Using the same genetic markers, it is possible to estimate for each individual the proportions of ancestry, or individual ancestry (IA), from each continental/ancestral group.
Admixture Estimates - FBPP • Estimation of ancestry requires genotypes of individuals representing the original indigenous ancestors. These analyses included 1,378 unrelated Caucasians from the FBPP, 127 unrelated sub-Saharan Africans and 50 Native Americans from the World Diversity Panel.
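One way to picture how ancestral reference samples feed an individual ancestry estimate is a two-population maximum-likelihood sketch. This is not the method used in the FBPP analyses (the slides do not specify it). It assumes biallelic markers, known allele frequencies in each ancestral population, independent markers, and a simple grid search; all names and numbers are hypothetical.

```python
import math

def estimate_ancestry(genotypes, freq_pop1, freq_pop2, grid_steps=100):
    """Grid-search maximum-likelihood estimate of the pop1 ancestry proportion.

    genotypes: copies (0, 1, or 2) of the reference allele at each marker.
    freq_pop1, freq_pop2: ancestral allele frequencies, strictly between 0 and 1.
    """
    best_q, best_loglik = 0.0, -math.inf
    for step in range(grid_steps + 1):
        q = step / grid_steps
        loglik = 0.0
        for g, p1, p2 in zip(genotypes, freq_pop1, freq_pop2):
            # Expected allele frequency for someone with pop1 ancestry proportion q.
            p = q * p1 + (1 - q) * p2
            # Binomial log-likelihood of g copies out of 2 (constant term omitted).
            loglik += g * math.log(p) + (2 - g) * math.log(1 - p)
        if loglik > best_loglik:
            best_q, best_loglik = q, loglik
    return best_q

# Hypothetical ancestral allele frequencies at four markers (not real data).
freq_pop1 = [0.9, 0.1, 0.9, 0.1]
freq_pop2 = [0.1, 0.9, 0.1, 0.9]

print(estimate_ancestry([1, 1, 1, 1], freq_pop1, freq_pop2))  # heterozygous everywhere
print(estimate_ancestry([2, 0, 2, 0], freq_pop1, freq_pop2))  # pop1-like at every marker
```

Markers whose frequencies differ sharply between the ancestral groups carry most of the information, which is why the reference genotypes matter.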
Admixture Analysis • Distinguishing between genetic and non-genetic sources of group differences can be examined within a single admixed population. • Depends on variation in admixture levels within that population • Examine correlation of individual ancestry (IA) with trait of interest (e.g. does blood pressure correlate with African ancestry?)
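The correlation step can be sketched with a plain Pearson coefficient. The values below are invented purely to make the example run and say nothing about any real cohort; a real analysis would also adjust for covariates.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up values for illustration only; not FBPP data.
individual_ancestry = [0.60, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95]
trait_values = [118.0, 131.0, 122.0, 127.0, 120.0, 133.0, 125.0]

r = pearson_r(individual_ancestry, trait_values)
print(round(r, 3))
```

The approach only works when IA varies within the population: if every individual had the same ancestry proportion, the variance in the denominator would be zero.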
Admixture Analysis - FBPP • 3,207 African Americans • 1,506 Mexican Americans • Estimated IA and its correlation with blood pressure, hypertension, and BMI
Admixture Analysis • Caveat: Still possibly subject to residual correlation and confounding • For example, within African Americans, discrimination may be related to both skin pigment and adverse health outcomes • Skin pigment is likely to be genetically correlated with degree of European versus African ancestry
Admixture Mapping • As opposed to ancestry estimates based on the entire genome, which may be confounded with non-genetic factors, ancestry at specific genetic locations is less likely to be so confounded • The power of the method depends on how large the effect of an allele is on the trait, and the difference in the frequency of that allele between ancestral groups
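A simplified sketch of the locus-specific comparison, assuming local ancestry at one locus is already known for each affected individual (as 0, 0.5, or 1: the fraction of the two chromosome copies derived from one ancestral population) along with that individual's genome-wide ancestry. The statistic and the data are illustrative assumptions, not the method of any cited study; in practice local ancestry has to be inferred from marker data.

```python
import math

def local_ancestry_excess_z(local_ancestry, global_ancestry):
    """Z-like statistic: mean (local - genome-wide) ancestry over its standard error."""
    diffs = [loc - glob for loc, glob in zip(local_ancestry, global_ancestry)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the paired differences.
    var = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / math.sqrt(var / n)

# Made-up values for eight affected individuals (not real data).
local_at_locus = [1.0, 1.0, 0.5, 1.0, 0.5, 1.0, 1.0, 0.5]
genome_wide = [0.80, 0.75, 0.70, 0.85, 0.80, 0.90, 0.80, 0.70]

z = local_ancestry_excess_z(local_at_locus, genome_wide)
print(round(z, 3))
```

Because each individual's locus is compared with that same individual's genome-wide average, factors that track overall ancestry largely cancel out of the comparison.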
Admixture Mapping • If the admixture occurred recently in history (e.g. over the past 10 generations), then the ancestry excess will extend over large segments of the chromosome • Thus, markers in the vicinity of the trait locus will also show excess ancestry from the population with the higher allele frequency
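To give a sense of scale for "large segments", a common rule of thumb puts the average ancestry segment at roughly 1/g Morgans (100/g centimorgans) after g generations. The sketch below uses that approximation, which assumes a single admixture event and ignores admixture proportions; the function name is hypothetical.

```python
def expected_segment_length_cm(generations):
    """Approximate mean ancestry-segment length in centimorgans (about 1/g Morgans)."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / generations

# Recent admixture leaves long segments; older admixture leaves short ones.
for g in (5, 10, 100):
    print(g, expected_segment_length_cm(g))
```

Under this approximation, admixture about 10 generations ago leaves segments on the order of 10 cM, so a modest number of well-spaced markers can track ancestry around a trait locus.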