Gil McVean
What makes us different? Image: Wikimedia commons
The genetic axes
Axes: Strong to Weak; Inherited to Somatic
Examples placed on the axes: Genetic disorders, Cancer, Complex disease, Aging
Images: Wikimedia commons
Characterising individual genomes
Images: Wikimedia commons; Illumina Cambridge Ltd
Why 1000 genomes?
• To find all common (>5%) variants in the accessible human genome
• To find at least 95% of variants at 1% frequency in populations of medical genetics interest
• To find 95% of variants at 0.1% frequency in genes
• To provide a fully public framework for interpreting rare genetic variation in the context of disease
• Screening
• Imputation
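The 95% target at 1% frequency has a simple sampling consequence. As a rough sketch, assume a variant counts as found whenever at least one copy is present among the sampled chromosomes. This is an idealisation that ignores sequencing coverage and genotyping error, and the function names are illustrative, not taken from the project. Under that assumption, the chance of sampling a variant at frequency f among n chromosomes is 1 - (1 - f)^n:

```python
import math


def prob_sampled(freq: float, n_chromosomes: int) -> float:
    """Probability that at least one of n chromosomes carries a variant at frequency freq."""
    return 1.0 - (1.0 - freq) ** n_chromosomes


def min_chromosomes(freq: float, target: float) -> int:
    """Smallest number of chromosomes for which prob_sampled(freq, n) >= target."""
    # Solve 1 - (1 - freq)^n >= target for n and round up to a whole chromosome.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - freq))


if __name__ == "__main__":
    n = min_chromosomes(0.01, 0.95)
    print(n, round(prob_sampled(0.01, n), 4))
```

Under these assumptions, a 1% variant needs 299 sampled chromosomes (about 150 diploid individuals) per population to be present with 95% probability; real detection power from low-coverage data will be lower.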
Population sequencing
Figure labels: Haplotypes; 2x; 10x
www.1000genomes.org http://browser.1000genomes.org
Good, but not perfect
Figure labels: Post-hoc filtering; Not genotyped
• 4 million sites that differ from the human reference genome
• 12,000 changes to proteins
• 100 changes that knock out gene function
• 5 rare variants that are known to cause disease
Most variation is common – Most common variation is cosmopolitan
Number of variants in typical genome
• Found in all continents: 92%
• Found only in Europe: 0.3%
• Found only in the UK: 0.1%
• Found only in you: 0.002%
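To give the percentages a sense of scale, they can be turned into approximate variant counts. This sketch assumes they apply to the roughly 4 million sites at which a genome differs from the reference, a figure quoted elsewhere in the talk; that pairing is an assumption made here, not something the slide states.

```python
# Assumption: the slide's percentages are fractions of ~4 million variant sites.
TYPICAL_VARIANT_SITES = 4_000_000

# Percentages as given on the slide.
SHARING_PERCENT = {
    "all continents": 92.0,
    "only in Europe": 0.3,
    "only in the UK": 0.1,
    "only in you": 0.002,
}


def approx_counts(total_sites: int, percentages: dict) -> dict:
    """Convert each percentage into an approximate number of variant sites."""
    return {label: round(total_sites * pct / 100) for label, pct in percentages.items()}


if __name__ == "__main__":
    for label, count in approx_counts(TYPICAL_VARIANT_SITES, SHARING_PERCENT).items():
        print(f"Found {label}: ~{count:,}")
```

The four categories are not exhaustive (they sum to less than 100%), so the counts do not add up to the total.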
Imputation from 1000 Genomes
• Imputation similar for all variant types across populations
• Comparable to imputation from high quality SNP haplotypes
What have we learned about low-frequency genetic variation from the 1000 Genomes Project?
• How many rare (<0.5%) and low-frequency (0.5-5%) variants are there, how do their numbers vary between populations, and what does this tell us about demography?
• To what extent has natural selection shaped the distribution of rare variants within and between populations?
• What are the implications of these findings for the interpretation of genetic variation in individual genomes?
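The frequency classes used in these questions can be written down as a small classifier. This is a minimal sketch using the thresholds quoted in the talk (rare below 0.5%, low-frequency from 0.5% to 5%, common above 5%). The function names are illustrative, and the treatment of the exact boundary values is an assumption read off the stated ranges.

```python
from collections import Counter

RARE_MAX = 0.005      # rare: frequency below 0.5%
LOW_FREQ_MAX = 0.05   # low-frequency: 0.5% up to and including 5%


def frequency_class(allele_freq: float) -> str:
    """Assign a variant allele frequency (0..1) to a frequency class."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError(f"allele frequency out of range: {allele_freq}")
    if allele_freq < RARE_MAX:
        return "rare"
    if allele_freq <= LOW_FREQ_MAX:
        return "low-frequency"
    return "common"


def tally_classes(allele_freqs) -> Counter:
    """Count how many variants fall into each frequency class."""
    return Counter(frequency_class(f) for f in allele_freqs)


if __name__ == "__main__":
    print(tally_classes([0.001, 0.005, 0.05, 0.2, 0.003]))
```

Running the tally separately per population is one simple way to compare how the rare and low-frequency counts differ between populations.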
Rare variant differentiation within ancestry groupings increases as variant frequency decreases
Rare variants identify recent historical links between populations
• 48% of IBS variants shared with American populations
• ASW shows stronger sharing with YRI than LWK
The proportion of rare variants is predicted by conservation, with the exception of splice-disrupting and STOP+ variants
Patterns of variation inform about selective constraint
Figure label: CTCF-binding motif
Variants under selection showed elevated levels of population differentiation
Figure: Proportion of pairwise comparisons where nonsynonymous variants are more differentiated than synonymous ones
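The summary statistic named in the figure can be sketched as follows. The talk does not say which differentiation measure was used, so this example takes per-pair differentiation values as given inputs. The function name, the population labels and the numbers are placeholders for illustration only, not project results.

```python
def proportion_more_differentiated(nonsyn: dict, syn: dict) -> float:
    """Fraction of population pairs where the nonsynonymous value exceeds the synonymous one.

    Both arguments map a population pair (a frozenset of two labels) to a
    differentiation value estimated elsewhere, by whatever measure is chosen.
    """
    shared_pairs = nonsyn.keys() & syn.keys()
    if not shared_pairs:
        raise ValueError("no population pairs in common")
    wins = sum(1 for pair in shared_pairs if nonsyn[pair] > syn[pair])
    return wins / len(shared_pairs)


if __name__ == "__main__":
    # Toy values for three hypothetical populations A, B and C.
    nonsyn = {frozenset("AB"): 0.12, frozenset("AC"): 0.30, frozenset("BC"): 0.08}
    syn = {frozenset("AB"): 0.10, frozenset("AC"): 0.25, frozenset("BC"): 0.09}
    print(proportion_more_differentiated(nonsyn, syn))
```

A proportion above one half indicates that nonsynonymous variants tend to be the more differentiated class across population pairs.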
Rare variant differentiation can confound the genetic study of disease
Mathieson and McVean (2012)
Implications
• Rare variants have spatial and ancestry-related distributions that reflect recent demographic events and selection.
• Purifying selection elevates local differentiation of rare variants.
• The functional and aetiological interpretation of rare variants in the context of disease needs to be aware of the local genetic background.
The final resource – mid 2013
AFRICA
• Gambian in Western Division, The Gambia (GWD)
• Malawian in Blantyre, Malawi (MAB)
• Mende in Sierra Leone (MSL)
• Esan in Nigeria (ESN)
SOUTH ASIAN
• Punjabi in Lahore, Pakistan (PJL)
• Bengali in Bangladesh (BEB)
• Sri Lankan Tamil in the UK (STU)
• Indian Telugu in the UK (ITU)
AMERICAS
• African American in Jackson, MS (AJM)
Sample sizes shown on the slide (not matched to populations in this transcript): 100, 200, 100, 100, 100, 100, 80
What more could we learn about human population genetics?
• There is a need to continue developing public resources that describe genetic variation across new populations, with high-resolution spatial information.
• This will not just shed light on population history and selection, but will also be important for interpreting (rare) genetic variation in individual genomes.
• The Phase 1 1000 Genomes data have made clear the extent of variation in conserved regulatory sequence within genomes.
• How does this relate to variation in function in different cell types?
• Many of the most interesting parts of the genome (for the study of selection) are still poorly covered by high-throughput sequencing (HTS) data.
• Need to collect ‘bespoke’ data types for some genomic regions.
The 1000 Genomes Project Consortium http://www.1000genomes.org/