Human Evolution: Searching for Selection Andrew Shah Algorithms in Biology 374 Spring 2008
Overview • Given a DNA sequence, how do we know when natural selection has occurred? • Different methods of answering this question • How does having the entire genome available change this?
Natural Selection Introduction
Natural Selection • What sort of artifacts would this leave within the genome? Introduction
Natural Selection • The frequency of the long gene increases from one generation to the next. • It eventually reaches 100%, or fixation. Introduction
Natural Selection: Gene Perspective • Same process at the gene level • Let the yellow dot represent the advantageous allele • It begins at a small frequency (0.125 in this case) Introduction
Natural Selection: Gene Perspective • During selection • The allele has risen in frequency! • Because of linkage, the nearby alleles have also risen in frequency Introduction
Natural Selection: Gene Perspective • The allele has reached fixation! • As time goes on, the nearby genes will slowly begin to reach fixation as well • Diversity has been lost Introduction
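The rise of an advantageous allele toward fixation can be sketched with a simple deterministic model. This is only an illustration: it assumes a haploid population and a constant selective advantage `s`, neither of which is stated in the slides, and the function name `selection_trajectory` is hypothetical. The starting frequency 0.125 is the one used in the slides.

```python
def selection_trajectory(p0, s, generations):
    """Deterministic haploid selection: p' = p(1 + s) / (1 + p*s).

    p0 is the starting allele frequency and s is an assumed
    selective advantage of the favored allele.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Carriers leave (1 + s) offspring for every 1 left by non-carriers.
        p = p * (1 + s) / (1 + p * s)
        freqs.append(p)
    return freqs


traj = selection_trajectory(p0=0.125, s=0.1, generations=200)
print(f"start={traj[0]:.3f}, end={traj[-1]:.6f}")
```

Under these assumptions the frequency never decreases and approaches 1, which is the pattern that neutral drift alone is unlikely to produce so quickly.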
Natural Selection: Gene Perspective • Effect of Selection on the Genome • Next Challenge: How does this effect differ from what happens without selection? Introduction
Neutral Theory (N.T.) • Problem: Need to distinguish natural selection • Therefore: Need a null hypothesis • Solution: Create model that approximates neutral evolution Kimura, 1960s Introduction
N.T. & Genetic Drift • Most variation is neutral with respect to selection • Therefore most changes in frequency are due to genetic drift Introduction
N.T. & Genetic Drift • A neutral gene has an equal probability of increasing or decreasing in frequency in the next generation Introduction
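Genetic drift of a neutral allele can be illustrated with a short simulation. The sketch below uses the standard Wright-Fisher resampling model, which the slides do not name, so treat it as an assumption; `simulate_drift`, `num_copies`, and `seed` are hypothetical names chosen for the example.

```python
import random


def simulate_drift(p0, num_copies, generations, seed=0):
    """Wright-Fisher drift: each generation resamples num_copies gene copies."""
    rng = random.Random(seed)
    p = p0
    freqs = [p]
    for _ in range(generations):
        # Every copy in the next generation descends from a carrier
        # with probability p, so the count of carriers is binomial.
        carriers = sum(1 for _ in range(num_copies) if rng.random() < p)
        p = carriers / num_copies
        freqs.append(p)
    return freqs


for seed in range(3):
    print(simulate_drift(p0=0.5, num_copies=100, generations=50, seed=seed)[-1])
```

Because the expected frequency in the next generation equals the current one, the allele is as likely to drift up as down, and different seeds wander in different directions.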
N.T. & Mutation • New alleles are introduced at a constant rate (at a particular point) • To think about: How will this help us search for selection? Introduction
N.T. & Mutation Introduction
N.T. & Recombination • Recombination occurs at a near-constant rate at a given position Introduction
Testing the N. T. • How would natural selection differ from these assumptions? Introduction
“Positive Natural Selection in the Human Lineage” P. C. Sabeti, S. F. Schaffner, B. Fry, J. Lohmueller, P. Varilly, O. Shamovsky, A. Palma, T. S. Mikkelsen, D. Altshuler, E. S. Lander
Testing for Selection • Review of the current state of research on selection in the genome • Five statistical tests which use divergence from neutral theory to test for selection • Ideas? • Functional Alteration, Decreased Diversity, High-Frequency Derived Alleles, Population Differences, Long Haplotypes Sabeti et al.
I. Functional Alteration • Take a section of the genome and compare synonymous vs. non-synonymous mutations between two species • Synonymous mutation: a nucleotide change that leaves the encoded amino acid unchanged Sabeti et al.
I. Functional Alteration Silent/ Synonymous Non-Synonymous Sabeti et al.
I. Functional Alteration • Long time scale, because it is an interspecies metric • Limited value: only finds ongoing or recurrent selection • Use a Ka/Ks statistical test, or the McDonald-Kreitman test Sabeti et al.
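The synonymous vs. non-synonymous comparison can be sketched by classifying each differing codon between two aligned coding sequences. This is a simplified count, not a full Ka/Ks estimate (which also normalizes by the number of synonymous and non-synonymous sites); the function name `count_substitutions` and the example sequences are made up for illustration, and the table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in-frame coding DNA")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1      # same amino acid: silent change
        else:
            nonsyn += 1   # amino acid altered
    return syn, nonsyn


# GCT -> GCC keeps alanine; AAA -> AGA changes lysine to arginine.
print(count_substitutions("ATGGCTAAA", "ATGGCCAGA"))
```

An excess of non-synonymous changes relative to synonymous ones is the kind of signal the Ka/Ks and McDonald-Kreitman tests look for.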
II. Decreased Diversity • Way of detecting a selective sweep • Requires knowing which gene is ancestral and which are derived • A derived gene is one that is a descendant of the ancestral one; it can be inferred by comparison to other species Sabeti et al.
II. Decreased Diversity • The two small bars represent mutations. They are derived genes of the blue ancestor gene. Sabeti et al.
II. Decreased Diversity • After the selective sweep the frequency of the derived alleles has jumped vis-a-vis the ancestral gene Sabeti et al.
II. Decreased Diversity A real example: derived alleles in red Sabeti et al.
II. Decreased Diversity • Key idea: need to have ancestral genes present • The genes must not have reached fixation! • The pattern will be that of normal diversity of alleles but with skewed distribution of variation • Statistical Tests: Tajima’s D, Fu and Li’s D* Sabeti et al.
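Tajima's D, named above, compares two estimates of diversity: the average number of pairwise differences and the number of segregating sites. The sketch below follows the standard published formula; the function name `tajimas_d` and the toy haplotypes are assumptions made for illustration, not data from the slides.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length, aligned haplotype strings."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    # S: number of segregating (polymorphic) sites
    seg = sum(1 for j in range(length) if len({h[j] for h in haplotypes}) > 1)
    if n < 4 or seg == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    # pi: average number of differences over all pairs of sequences
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))


rare = ["AAAA", "AAAT", "AATA", "ATAA"]    # every variant is a singleton
common = ["AAA", "AAA", "TTT", "TTT"]      # variants at intermediate frequency
print(tajimas_d(rare), tajimas_d(common))
```

A negative value reflects an excess of rare variants and a positive value an excess of intermediate-frequency variants; both are departures from the neutral expectation of roughly zero.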
III. New Alleles (AKA High Frequency of Derived Alleles) • Another technique for detecting selective sweep • Gene ‘hitch-hiking’ • Limited diversity because of fixation • Key idea: low frequency of new genes, but high diversity of rare alleles Sabeti et al.
III. New Alleles (AKA High Frequency of Derived Alleles) • Gene has reached fixation • Low diversity in this region compared to other regions Sabeti et al.
III. New Alleles (AKA High Frequency of Derived Alleles) • Next, mutations slowly increase the diversity • Because they are all new, the frequency remains low Sabeti et al.
III. New Alleles (AKA High Frequency of Derived Alleles) • As more time progresses, any pre-selective sweep alleles die out, and diversity is replaced by many derived alleles Sabeti et al.
III. New Alleles (AKA High Frequency of Derived Alleles) Real world example: Red dots indicate rare alleles Sabeti et al.
III. New Alleles (AKA High Frequency of Derived Alleles) • Key Idea: The genes will have reached fixation and decreased diversity • The diversity will all be in the form of rare alleles (because they are new) • Statistical Test: Fay and Wu’s H Sabeti et al.
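Fay and Wu's H can be sketched from the derived-allele counts at each segregating site. The code assumes the ancestral state of every site is already known (for example from an outgroup species); `fay_wu_h` and the example counts are hypothetical, and the statistic is computed in its unnormalized form, H = pi - theta_H.

```python
def fay_wu_h(derived_counts, n):
    """Unnormalized Fay and Wu's H.

    derived_counts: for each segregating site, how many of the n sampled
    chromosomes carry the derived allele (must satisfy 0 < count < n).
    """
    if any(not 0 < i < n for i in derived_counts):
        raise ValueError("each derived count must be between 1 and n - 1")
    pairs = n * (n - 1)
    # pi weights sites by heterozygosity, i * (n - i)
    pi = sum(2 * i * (n - i) for i in derived_counts) / pairs
    # theta_H weights sites by the squared derived count, i * i
    theta_h = sum(2 * i * i for i in derived_counts) / pairs
    return pi - theta_h


print(fay_wu_h([9, 9, 8], n=10))   # derived alleles at high frequency
print(fay_wu_h([1, 1, 2], n=10))   # derived alleles at low frequency
```

The two examples have the same pairwise diversity, so the difference in H comes entirely from whether the derived alleles sit at high or low frequency; a strongly negative H indicates an excess of high-frequency derived alleles.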
Comparing Methods • What is the difference between decreased diversity and increased frequency of new alleles? Sabeti et al.
IV. Population Differences • Requires population split • Disproportionate shift in gene frequencies • Limited utility Sabeti et al.
IV. Population Differences Sabeti et al.
IV. Population Differences Tall Tree Island Sabeti et al.
IV. Population Differences • Two separated populations: a specific gene will show a disproportionate shift in frequency with respect to the other genes • Limited to cases where there are two populations • Statistical Test: F_ST, p_excess Sabeti et al.
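The F_ST statistic can be illustrated for a single biallelic locus in two populations. This sketch uses the textbook heterozygosity-based definition, F_ST = (H_T - H_S) / H_T, and assumes two populations of equal size; the function name `fst_two_populations` and the example frequencies are made up for illustration.

```python
def fst_two_populations(p1, p2):
    """F_ST at one biallelic locus, given the allele frequency in each
    of two equally sized populations."""
    p_bar = (p1 + p2) / 2
    # H_T: expected heterozygosity if the populations were pooled
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        raise ValueError("locus is not polymorphic in the pooled sample")
    # H_S: average expected heterozygosity within each population
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


print(fst_two_populations(0.5, 0.5))   # identical frequencies
print(fst_two_populations(0.2, 0.8))   # strongly shifted frequencies
```

A locus whose F_ST is much higher than that of the other genes is a candidate for selection acting in one population but not the other.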
V. Long Haplotypes • Based on Linkage Disequilibrium (LD) • Long haplotype block at high frequency Sabeti et al.
V. Long Haplotypes • Under neutral conditions, a new allele has low frequency and high linkage disequilibrium Sabeti et al.
V. Long Haplotypes • As time goes on and the neutral allele increases in frequency, recombination erodes the LD Sabeti et al.
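Linkage disequilibrium between two sites can be measured with the standard D and r² statistics. The sketch below assumes each haplotype is a two-character string, one character per locus; `linkage_disequilibrium` and the toy haplotypes are hypothetical and are not the long-haplotype test itself, only the LD measure it builds on.

```python
def linkage_disequilibrium(haplotypes, allele_a, allele_b):
    """Return (D, r_squared) for allele_a at locus 1 and allele_b at locus 2."""
    n = len(haplotypes)
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h[0] == allele_a and h[1] == allele_b for h in haplotypes) / n
    # D: how far the joint frequency departs from independence
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom


print(linkage_disequilibrium(["AB", "AB", "ab", "ab"], "A", "B"))  # complete LD
print(linkage_disequilibrium(["AB", "Ab", "aB", "ab"], "A", "B"))  # no LD
```

Recombination moves a sample from the first pattern toward the second over time, so high LD around an allele that is already common suggests it rose in frequency quickly.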
V. Long Haplotypes Sabeti et al.
Genome-Wide Scanning • Better estimation of background rate • Helps to confirm previous studies • Suggests future areas of research • MORE POWER Sabeti et al.
Genome-Wide Scanning • SNP: Single Nucleotide Polymorphism, a single-base variant (excludes other types of mutations) that occurs at > 1% frequency • SNPs are the basis of many genome-wide analyses Sabeti et al.
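The > 1% frequency criterion can be sketched as a filter on the minor allele frequency at a site. The code assumes the criterion applies to the less common allele at a biallelic site; `is_common_snp` and the sample data are hypothetical.

```python
from collections import Counter


def is_common_snp(alleles, threshold=0.01):
    """True if the site is biallelic and its minor allele frequency
    exceeds the threshold (1% by default)."""
    counts = Counter(alleles)
    if len(counts) != 2:
        return False  # monomorphic, or more than two alleles observed
    minor_allele_frequency = min(counts.values()) / len(alleles)
    return minor_allele_frequency > threshold


print(is_common_snp(["A"] * 98 + ["G"] * 2))   # minor allele at 2%
print(is_common_snp(["A"] * 100))              # no variation at this site
```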
“Forces Shaping the Fastest Evolving Regions in the Human Genome” K. S. Pollard, S. R. Salama, B. King, A. D. Kern, T. Dreszer, S. Katzman, A. Siepel, J. S. Pedersen, G. Bejerano, R. Baertsch, K. R. Rosenbloom, J. Kent, D. Haussler
Background • Exploits the very recent sequencing of the chimp and human genomes • Uses the rate of allele replacement as a test for selection • Assumption is that rapidly changing parts of the genome have been under selective pressure Pollard et al.