What is the value of indigenous populations to medical genetics research? Rosalind M. Harding Departments of Zoology and Statistics University of Oxford
The value of indigenous populations is given by… • An evolutionary framework for studies in medical genetics • Hypotheses: to explain disease • Data: distributions of polymorphism, especially in a wide range of indigenous populations • Database resources: e.g. The Human Genome Diversity Project (HGDP) • Theory and statistics: including the concept of population • Aims relevant to medical genetics: • to understand geographically variable selective pressures • to understand the genetic demography of modern humans • An aim that isn’t relevant: reconstructing population divergence history and tracing ancestry to particular (racial) sub-populations.
Evolutionary hypotheses for disease • Mendelian paradigm: disease alleles are rare, due to purifying selection, and therefore recent. • James Neel’s ‘thrifty gene’ hypothesis (1962): metabolically thrifty alleles have been advantageous in populations subject to famine (e.g. Pacific Islanders, Arizona Pima Indians) but now cause diseases associated with obesity, e.g. diabetes • Common Disease/Common Variant (CDCV) hypothesis: alleles which contribute to disease risk may be common polymorphisms… • And old, due to drift and weak negative selection in a population of small effective size, and subsequent population expansion (Lander, 1996, Science 274:536-538) • And old, due to past positive selection - adaptation to ancestral Pleistocene environments in Africa (Di Rienzo & Hudson, 2005, TIG 21:596-601) • And young, due to recent positive selection - adaptation to environments outside of Africa (Young et al., 2005, PLoS Genetics 1(6):e82)
Features of polymorphism data • Many polymorphic autosomal genes have total coalescence times in the range 0.5 – 2 million years • Expect many alleles older than 50,000 years - these are most likely to have arisen in Africa • Typically, common SNPs are shared between Africa, Europe and Asia, though SNP allele frequencies vary. (Much less sharing of haplotypes) • Differentiation is modest: FST ~ 10-15% • Common neutral SNPs are likely to be older than 50,000 years, but SNPs can become common in much less time under selection.
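The claim that common neutral SNPs tend to be old can be illustrated with a rough calculation. The sketch below uses the classical Kimura-Ohta diffusion approximation for the expected age of a neutral allele currently at frequency p in a population of constant effective size Ne. The formula is a swapped-in textbook result, not something given in the talk. The values Ne = 10,000 and 25 years per generation are illustrative assumptions, and the constant-size model ignores the population expansion mentioned earlier.

```python
import math


def expected_neutral_allele_age(p, n_e):
    """Expected age, in generations, of a neutral allele at frequency p.

    Kimura-Ohta diffusion approximation for a diploid population of
    constant effective size n_e: -4 * n_e * p * ln(p) / (1 - p).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return -4.0 * n_e * p * math.log(p) / (1.0 - p)


def age_in_years(p, n_e=10_000, years_per_generation=25):
    """Convert the expected age to years.

    n_e and years_per_generation are assumed values for illustration only.
    """
    return expected_neutral_allele_age(p, n_e) * years_per_generation
```

Under these assumptions, `age_in_years(0.1)` comes out at roughly 256,000 years, well beyond the 50,000-year mark. Selection is not modelled here, so the sketch says nothing about how quickly a selected SNP can become common.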
The Human Genome Diversity Project • What is it? A resource for studies of genetic diversity • Aim: to promote the understanding of how, when and why patterns of human genetic diversity formed • Added benefit and justification: to prove useful to biomedical research • Troubled history since 1994. Why? • A fear that indigenous people might be exploited, for example, by the use of their DNA for commercial purposes (bio-piracy). • Uncertainties about the strategy for collecting samples (focus on anthropologically interesting populations or aim for grid-like geographic coverage?) Cavalli-Sforza LL (2005) The Human Genome Diversity Project: past, present and future. Nature Reviews Genetics 6:333-340.
The Human Genome Diversity Project: current status • 1,064 cell lines from 52 indigenous populations • Housed and distributed by the Centre for the Study of Human Polymorphism (CEPH) at the Fondation Jean Dausset in Paris • Sampling strategy guided by anthropology
What have we learnt from HGDP samples? • 93-95% of microsatellite allele frequency differences are found within the 52 sample populations and 5-7% (FST) between them. • Sample populations can be clustered, e.g. for k=5, FST=3-5% • Inferences: • clusters represent ancestral populations • clustering reconstructs an evolutionary divergence history (roughly the same as found by other genetic analyses, e.g. mtDNA) • Relevance for medical genetics? Rosenberg et al. (2002) Science 298: 2381-2385
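To make the within/between split concrete, here is a minimal sketch of Wright's FST computed as (HT - HS) / HT. It is simplified to a single biallelic locus with equally weighted populations, whereas the HGDP figures above come from multi-allelic microsatellites and a more elaborate analysis. The function names and the example frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst(freqs):
    """Wright's FST = (HT - HS) / HT for one biallelic locus.

    freqs holds the frequency of the same allele in each population;
    every population is given equal weight (a simplifying assumption).
    """
    if not freqs:
        raise ValueError("need at least one population")
    p_bar = sum(freqs) / len(freqs)
    h_t = expected_heterozygosity(p_bar)  # heterozygosity of the pooled population
    if h_t == 0.0:
        return 0.0  # allele fixed or absent everywhere: no differentiation
    # mean heterozygosity within populations
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t
```

For example, two populations with made-up frequencies 0.4 and 0.6 give `fst([0.4, 0.6])` of about 0.04: only 4% of the diversity lies between them, even though the frequencies visibly differ.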
Some theory: the island model concept of population • Cultural/ethnic/biological identity of individuals defines population membership • Populations approximate Mendelian gene pools (random mating within, restricted mating between) • Isolation of populations leads to genetic divergence (FST) • The greater the genetic difference in the present, the older the divergence • Genetic divergence builds up in a hierarchy of splits (phylogeny) • Therefore, if we accept the island model concept of population we can reconstruct evolutionary history. • But is this ambition relevant to medical genetic studies? No… and yes!
Are patterns of polymorphism better described by gradients? • If more of the variance in allele frequencies is explained by correlations with geography, use geography rather than FST. • Gradients reveal the importance of geographic and ecological variables. They may be due to: • Isolation of individuals (rather than population units) by distance • Geographically variable selection • An old example: • α+-globin deletion variants (causing α+-thalassaemia in homozygotes but giving protection against malaria) are common in Oceania • frequencies are correlated with malarial endemicity, following latitudinal and altitudinal changes in selection pressure. • population history is not irrelevant (α+-thalassaemia is present in Polynesia where there is no malaria) but it is a minor factor • A new example: Serre & Pääbo (2004)
Influence of sampling strategy • Figure panels: population sampling; individual sampling (geographic spread) Serre & Pääbo (2004) Genome Research 14:1679-1685
What is the genetic support for discontinuities between populations? • Serre & Pääbo (2004) • found gradients extending across the world • dispute major discontinuities between continental clusters of populations (often called races) • explain apparent discontinuities (often attributed to race) as an artifact of population sampling • Rosenberg et al. (2005) reply • affirm discontinuities as real and due to small jumps in genetic distance for population pairs either side of geographic barriers (but not to be attributed to race). Rosenberg et al. (2005) PLoS Genetics 1(6):e70
So what is the anxiety here? • Individuals differ in genetic susceptibilities and in response to pharmacological treatments • Is it worth trying to predict these differences for individuals based on their population membership or racial identity? • Opinions are divided. • Even if there are shortcuts to prediction for population membership, are they worth anything? Could they be worse than useless? • Answers here depend on population concepts and the underlying assumptions including those for reconstruction of ancestral (racial) populations.
Implications for SNP distributions across populations • Summary so far: Population differences are minor and based on small allele frequency differences accumulated over many loci. • Inference 1: Population differences in complex traits, e.g. hypertension and skin colour, are likewise based on small allele frequency differences accumulated over many loci • Inference 2: Evolution on recent time scales (last ~20,000 years) is key to understanding distributions of polymorphism underlying complex traits within and among contemporary populations. • Inference 3: The long time scale (tens to hundreds of millennia) matters for individual differences, but its relevance for population differences is open to dispute, and perhaps negligible. • Conclusion: Analysis across indigenous populations within an evolutionary framework is valuable for understanding polymorphism in studies useful to medical genetics, but use a geographic sampling strategy!
A case study using the CEPH-HGDP samples Young et al. (2005) Differential susceptibility to hypertension is due to selection during the out-of-Africa expansion. PLoS Genetics 1(6):e82 Fig 2: Heat adaptation is strongly associated with absolute latitude, temperature and precipitation among the 53 populations of the CEPH-HGDP samples. Geographic/ecological variables explain worldwide variation in heat adaptation alleles at 5 genes involved in blood pressure regulation; the frequency of one of these alleles accounts for a major portion of the worldwide variation in blood pressure.
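An association between allele frequency and a geographic variable such as absolute latitude can be summarised by a correlation coefficient, with its square giving the share of variance in allele frequency explained by that variable. The sketch below is a plain Pearson correlation written for illustration. It is not the analysis used by Young et al., and any latitudes or frequencies passed to it here are hypothetical inputs, not HGDP data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0.0 or s_yy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / math.sqrt(s_xx * s_yy)


def variance_explained(latitudes, allele_freqs):
    """r squared: share of allele-frequency variance explained by latitude."""
    return pearson_r(latitudes, allele_freqs) ** 2
```

A call such as `variance_explained(abs_latitudes, freqs)`, with one entry per sampled population, returns a value between 0 and 1. Comparing that value with FST for the same locus is one simple way to ask whether geography or population membership describes the pattern better.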
Conclusions for medical genetics • Question: Why should we expect major susceptibility SNP allele frequencies to vary substantially between populations? • Answer: Not for any reason of ancient population (racial) divergence. A primary and interesting reason for elevated allele frequencies will be selection along geographic gradients. • The value of indigenous populations for medical genetics is seen in studies of geographical variation designed to detect variable selective pressures.
Acknowledgements • All the authors of publications cited in this presentation • Reardon J. Race to the Finish. Identity and Governance in an Age of Genomics. 2005: Princeton University Press • A book I can recommend on the history and politics of the HGDP. • Clegg JB and Weatherall DJ (1999) Thalassemia and malaria: new insights into an old problem. Proceedings of the Association of American Physicians 111(4):278-282. • A reference for the malaria example • A pdf file of this talk is available from my webpage http://www.stats.ox.ac.uk/~harding/research.htm