ACGTTTGACTGAGGAGTTTACGGGAGCAAAGCGGCGTCATTGCTATTCGTATCTGTTTAG 010101100010010100001010101010011011100110001100101000100101 Introduction: Human Population Genomics
How soon will we all be sequenced?
• Cost
• Killer apps
• Roadblocks?
[Figure labels: "Applications", "Cost", "Time"; time points marked "2013?" and "2018?"]
Human population migrations
• Out of Africa, Replacement
  • Single mother of all humans (Eve) ~150,000 years ago
  • Single father of all humans (Adam) ~70,000 years ago
  • Humans out of Africa ~50,000 years ago replaced others (e.g., Neanderthals)
• Multiregional Evolution
  • Generally debunked; however,
  • ~5% of the human genome in Europeans and Asians is Neanderthal or Denisovan
Coalescence Y-chromosome coalescence
Why humans are so similar
A small population that interbred reduced the genetic variation.
Out of Africa ~50,000 years ago
Migration of Humans http://info.med.yale.edu/genetics/kkidd/point.html
Some Key Definitions
Name   Sequence     Genotype
Mary   AGCCCGTACG   G/G
John   AGCCCGTACG   G/G
Josh   AGCCCGTACG   G/T
Kate   AGCCCGTACG   G/G
Pete   AGCCCGTACG   G/G
Anne   AGCCCGTACG   G/G
Mimi   AGCCCGTACG   G/G
Mike   AGCCCTTACG   T/T
Olga   AGCCCTTACG   T/G
Tony   AGCCCTTACG   T/G
• Alleles: G, T
• Major allele: G
• Minor allele: T
• Recombinations (figure: Mom and Dad chromosomes): at least 1 per chromosome; on average ~1 per 100 Mb
• Heterozygosity: Prob[2 alleles picked at random with replacement are different]
  • In this example: 2 * 0.75 * 0.25 = 0.375
  • H = 4Nu/(1+4Nu)
• Linkage Disequilibrium: the degree of correlation between two SNP locations
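The heterozygosity arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch, not part of the original deck: the function names are hypothetical, and reading N as the effective population size and u as the per-generation mutation rate is an assumption, since the slide does not define them.

```python
from collections import Counter

# Genotypes copied from the slide, in the order they are listed there.
genotypes = ["G/G", "G/G", "G/T", "G/G", "G/G", "G/G", "G/G", "T/T", "T/G", "T/G"]

def allele_frequencies(genotypes):
    """Count every allele copy (two per diploid genotype) and normalize."""
    counts = Counter(allele for g in genotypes for allele in g.split("/"))
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def heterozygosity(freqs):
    """Probability that two alleles drawn with replacement differ: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())

def expected_heterozygosity(N, u):
    """H = 4Nu / (1 + 4Nu); N and u are interpreted as described in the lead-in (assumption)."""
    theta = 4 * N * u
    return theta / (1 + theta)

freqs = allele_frequencies(genotypes)
major = max(freqs, key=freqs.get)   # most frequent allele
minor = min(freqs, key=freqs.get)   # least frequent allele
H = heterozygosity(freqs)
print(freqs, major, minor, H)
```

With two alleles, 1 - p^2 - q^2 reduces to 2pq, which is the 2 * 0.75 * 0.25 product shown on the slide.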
Human Genome Variation
[Figure: types of variation]
• SNP
• Novel Sequence
• Mobile Element or Pseudogene Insertion
• Inversion
• Translocation
• Tandem Duplication
• Microdeletion
• Transposition
• Novel Sequence at Breakpoint
• Large Deletion
Example sequence pairs shown in the figure: TGCTGAGA / TGCCGAGA; TGCTCGGAGA / TGC---GAGA; TGC--AGA / TGCCGAGA
The Fall in Heterozygosity
$$F_{ST} = \frac{H - H_{POP}}{H}$$
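A small numeric sketch of the F_ST ratio follows. The slide does not define its symbols, so this assumes the usual reading: H is the heterozygosity of the pooled population and H_POP is the average heterozygosity within subpopulations. It also assumes a biallelic site and equally sized subpopulations; the allele frequencies used are hypothetical.

```python
def het(p):
    """Heterozygosity 2p(1-p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)

def fst(subpop_freqs):
    """F_ST = (H - H_POP) / H for equally sized subpopulations (assumption)."""
    p_total = sum(subpop_freqs) / len(subpop_freqs)                # pooled allele frequency
    h_total = het(p_total)                                         # H
    h_pop = sum(het(p) for p in subpop_freqs) / len(subpop_freqs)  # H_POP
    if h_total == 0.0:
        return 0.0  # site is monomorphic everywhere; no differentiation to measure
    return (h_total - h_pop) / h_total

# Hypothetical frequencies of one allele in two subpopulations.
print(fst([0.9, 0.1]))   # strongly differentiated
print(fst([0.5, 0.5]))   # identical subpopulations
```

Identical subpopulations give F_ST = 0, and the value grows as the subpopulations diverge, which is the "fall in heterozygosity" the slide title refers to.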
The HapMap Project
Code  Population                                    Samples
ASW   African ancestry in Southwest USA             90
CEU   Northern and Western Europeans (Utah)         180
CHB   Han Chinese in Beijing, China                 90
CHD   Chinese in Metropolitan Denver                100
GIH   Gujarati Indians in Houston, Texas            100
JPT   Japanese in Tokyo, Japan                      91
LWK   Luhya in Webuye, Kenya                        100
MXL   Mexican ancestry in Los Angeles               90
MKK   Maasai in Kinyawa, Kenya                      180
TSI   Toscani in Italia                             100
YRI   Yoruba in Ibadan, Nigeria                     100
Genotyping: probe a limited number (~1M) of known, highly variable positions of the human genome
Linkage Disequilibrium & Haplotype Blocks
Minor alleles at two SNPs: A and G, with frequencies $p_A$ and $p_G$
Linkage Disequilibrium (LD): $D = P(A \text{ and } G) - p_A p_G$
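The definition of D can be illustrated on a handful of phased haplotypes. This is a sketch under stated assumptions: the haplotype list is made up for illustration, each haplotype is a two-character string (allele at the first SNP, then allele at the second), and the function name is hypothetical.

```python
def linkage_disequilibrium(haplotypes, allele1="A", allele2="G"):
    """D = P(allele1 and allele2 on the same haplotype) - p_allele1 * p_allele2."""
    n = len(haplotypes)
    p_a = sum(h[0] == allele1 for h in haplotypes) / n
    p_g = sum(h[1] == allele2 for h in haplotypes) / n
    p_ag = sum(h[0] == allele1 and h[1] == allele2 for h in haplotypes) / n
    return p_ag - p_a * p_g

# Hypothetical phased haplotypes: A and G tend to travel together.
haplotypes = ["AG", "AG", "AG", "CT", "CT", "CT", "CT", "CT", "AT", "CG"]
D = linkage_disequilibrium(haplotypes)
print(D)
```

D is zero when the two alleles occur together exactly as often as independence predicts, and positive when they co-occur more often than that.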
Population Sequencing – 1000 Genomes Project
The 1000 Genomes Project Consortium et al. Nature 467, 1061-1073 (2010) doi:10.1038/nature09534
Association Studies
Control: A/G A/G G/G G/G A/G G/G G/G
Disease: A/A A/G A/A A/G A/G A/A A/A
p-value
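One common way to get a p-value from genotypes like these is an allele-count chi-square test on a 2x2 table. The slide does not say which test it has in mind, so the sketch below is an assumption, and the function names are hypothetical. It uses Pearson's chi-square without continuity correction, with one degree of freedom.

```python
import math

# Genotypes copied from the slide.
control = "A/G A/G G/G G/G A/G G/G G/G".split()
disease = "A/A A/G A/A A/G A/G A/A A/A".split()

def allele_counts(genotypes, alleles=("A", "G")):
    """Return [count of A, count of G], counting both copies in each genotype."""
    flat = [a for g in genotypes for a in g.split("/")]
    return [flat.count(a) for a in alleles]

def chi_square_2x2(row1, row2):
    """Pearson chi-square statistic and p-value (1 degree of freedom) for a 2x2 table."""
    a, b = row1
    c, d = row2
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

chi2, p_value = chi_square_2x2(allele_counts(control), allele_counts(disease))
print(chi2, p_value)
```

With only seven individuals per group the chi-square approximation is rough; an exact test would usually be preferred at this sample size.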
Wellcome Trust Case Control
Many associations of small effect sizes (<1.5)
Nature 464, 713-720 (1 April 2010)
Nature 447, 661-678 (7 June 2007)
Disease Clustering PLoS Genet 5(12): e1000792. doi:10.1371/journal.pgen.1000792. 2009.
Disease Clustering
• RA (rheumatoid arthritis) vs. ATD (autoimmune thyroid disease)
• RA vs. MS (multiple sclerosis)
• No recorded co-occurrence of RA and MS
Ancestry Inference Danish ? French Spanish Mexican
Fixation, Positive & Negative Selection
How can we detect negative selection?
How can we detect positive selection?
[Figure panels: Negative Selection, Neutral Drift, Positive Selection]
Conservation and Human SNPs
[Figure labels: Neutral, CNS]
• CNSs have fewer SNPs
• SNPs have shifted allele frequency spectra
How can we detect positive selection?
Ka/Ks ratio: ratio of nonsynonymous to synonymous substitutions
• Very old, persistent, strong positive selection for a protein that keeps adapting
• Examples: immune response, spermatogenesis
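To make the Ka/Ks idea concrete, here is a simplified counting sketch for two aligned coding sequences. It is not the method from the slides: it normalizes differences by the number of synonymous and nonsynonymous sites, applies no multiple-hit correction, and only handles codon pairs that differ at a single position. The function names and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def synonymous_sites(codon):
    """Sum over the 3 positions of the fraction of single-base changes that keep the amino acid."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                s += 1.0 / 3.0
    return s

def pn_ps(seq1, seq2):
    """Return (pN, pS): nonsynonymous and synonymous differences per respective site."""
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, and a multiple of 3")
    syn_sites = syn_diffs = nonsyn_diffs = 0.0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        # Average the synonymous-site count of the two codons.
        syn_sites += (synonymous_sites(c1) + synonymous_sites(c2)) / 2.0
        n_diff = sum(x != y for x, y in zip(c1, c2))
        if n_diff == 0:
            continue
        if n_diff > 1:
            raise ValueError("this sketch only handles codons differing at one position")
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            syn_diffs += 1
        else:
            nonsyn_diffs += 1
    nonsyn_sites = len(seq1) - syn_sites
    if syn_sites == 0 or nonsyn_sites == 0:
        raise ValueError("no synonymous or no nonsynonymous sites to normalize by")
    return nonsyn_diffs / nonsyn_sites, syn_diffs / syn_sites

# Hypothetical aligned sequences: GCT->GCC is synonymous (Ala), AAA->AGA is not (Lys->Arg).
p_n, p_s = pn_ps("ATGGCTAAA", "ATGGCCAGA")
print(p_n, p_s, p_n / p_s)
```

A ratio well below 1 suggests that amino acid changes are being removed (negative selection), while a ratio above 1 is the signature of the persistent positive selection described on the slide.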
Long Haplotypes – iHS test
• Less time:
  • Fewer mutations
  • Fewer recombinations