Comp 790-087
Genetics, Evolution, and the Coalescent Theory
• Administrative Details
• Course Overview
• Coalescent Theory?
Comp 790 – Introduction & Coalescence
Course Overview
• Synopsis
  • Graduate-level project course
  • Guided reading, discussions, and project with write-up
• Website
  • To appear at: http://www.unc.edu/courses/2009spring/comp/790/087/
• Course Grading
  • Class Participation 10%
  • 2 In-class Presentations 40%
  • Final Project, Presentation, & Write-up 50%
Syllabus
• ⅓ Guided reading/discussion of text
• ⅓ Student presentations of recent papers
• ⅓ Project Proposals
Coalescent Theory
• Ancestral properties can be inferred from extant populations
• Alternatives to Correctness
  • Most-Likely
  • Most Parsimonious
  • Other optimality criteria
• Background
  • Biology (genetics)
  • Statistics
  • Computational Modeling
Non-Classical Genetics
• Coalescence differs from classical genetics
• Analysis rather than synthesis
• Depends on models, which attempt to explain observations
• Less emphasis on Darwin’s natural selection
• Considers population dynamics
  • Isolation
  • Bottlenecks
Historical Human Migrations
Population Dynamics
• It is helpful to view evolutionary trees in the context of geography and population structure
• These factors affect the prevalence and distribution of genes
• Genetic diversity depends largely on population isolation and population bottlenecks, as well as:
  • Constant population size (resource limited)
  • Sudden increases in population (explosions)
  • Patterns of growth (exponential, uniform, etc.)
It’s About Genes
• Genetics is most clearly understood by considering its subject to be genes rather than organisms
• Organisms are merely vessels for assuring the survival of genes
• Successful genes live on long after their host organism
• An objective of a gene is to replicate itself
“[Genes] that survived were the ones that built survival machines for themselves to live in. But making a living got steadily harder as new rivals arose with better and more effective survival machines. Survival machines got bigger and more elaborate, and the process was cumulative and progressive…” -- Dawkins, The Selfish Gene
Why Computer Science
• Classically, genetics, both generative (classical) and coalescent (population), has focused on mathematical/statistical models
• As model complexity increases, it becomes harder to find closed-form solutions
• The field relies more and more on computational modeling to ascertain structure
• Also, complicated models often lead to common models… today’s subject
Wright-Fisher Model
• One of the first and simplest models of population genealogies was introduced by Wright (1931) and Fisher (1930)
• The model emphasizes transmission of genes from one generation to the next
• For simplicity we’ll first focus on a population of fixed size in which each individual starts with a distinct gene variant
Simple Haploid Model
• Rules
  • Antecedent genes are chosen randomly, with replacement, from their parental generation
  • No selection
  • Fixed population size
G0: ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
G1: ['J', 'A', 'H', 'B', 'I', 'E', 'D', 'G', 'A', 'B']
G2: ['A', 'J', 'E', 'G', 'D', 'E', 'B', 'I', 'A', 'A']
G3: ['A', 'A', 'E', 'J', 'I', 'A', 'I', 'A', 'J', 'B']
G4: ['E', 'A', 'B', 'B', 'A', 'E', 'A', 'A', 'A', 'A']
G5: ['A', 'A', 'B', 'A', 'A', 'E', 'A', 'A', 'A', 'B']
G6: ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'A', 'A']
G7: ['B', 'A', 'A', 'B', 'A', 'A', 'A', 'A', 'A', 'A']
What will this population eventually look like?
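The rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the code that produced the G0 to G7 listing: the function names `next_generation` and `simulate` and the optional `rng` parameter are assumptions made for the example.

```python
import random

def next_generation(population, rng=random):
    """Each child copies the gene of a parent drawn uniformly at random,
    with replacement, so the population size stays fixed and there is no selection."""
    return [rng.choice(population) for _ in population]

def simulate(population, generations, rng=random):
    """Return the list of generations, starting with the founders."""
    history = [list(population)]
    for _ in range(generations):
        history.append(next_generation(history[-1], rng))
    return history

if __name__ == "__main__":
    founders = list("ABCDEFGHIJ")  # ten individuals, each with a distinct variant
    for g, pop in enumerate(simulate(founders, 7)):
        print(f"G{g}: {pop}")
```

Because children only copy existing genes, the set of variants can shrink from one generation to the next but never grow, which hints at the answer to the question above.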
Assumptions of Wright/Fisher
• Discrete and non-overlapping generations
• Haploid individuals
• Population size is constant
• All individuals are equally fit
• No population or social structure
• Genes segregate independently
Some Graphical Abstractions
• Replace letters with colors
• Draw lineages
• Sort topologically
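One way to get at lineages is to record, for every child, the index of its parent, and then walk those indices backward. The sketch below assumes that representation. "Sort topologically" is read here as choosing a left-to-right drawing order per generation so that lineage lines do not cross, which is one possible interpretation of the slide. All function names are hypothetical.

```python
import random

def simulate_parents(n, generations, rng=random):
    """parents[t][i] is the index, in generation t, of the parent of
    individual i in generation t + 1 (haploid model, fixed size n)."""
    return [[rng.randrange(n) for _ in range(n)] for _ in range(generations)]

def trace_lineage(parents, individual):
    """Walk backward from an individual in the last generation to its founder.
    Returns the indices along the lineage, newest first."""
    path = [individual]
    for gen in reversed(parents):
        path.append(gen[path[-1]])
    return path

def untangle(parents, n):
    """Pick a drawing order for each generation so that children appear in the
    same left-to-right order as their parents (lineages do not cross)."""
    orders = [list(range(n))]
    for gen in parents:
        position = {ind: pos for pos, ind in enumerate(orders[-1])}
        orders.append(sorted(range(n), key=lambda child: position[gen[child]]))
    return orders

if __name__ == "__main__":
    parents = simulate_parents(10, 7)
    print("lineage of individual 0:", trace_lineage(parents, 0))
    print("drawing order, last generation:", untangle(parents, 10)[-1])
```

Storing parent indices rather than gene labels keeps the genealogy itself, so the same run can be recolored or redrawn without re-simulating.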
Repeats
Every population results in just one gene
Onset of Uniformity
• 10000 trials
• Mode = 11 (616)
• Mean = 17.5
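Statistics like these can be estimated by repeating the haploid simulation and counting generations until only one variant is left. The sketch below assumes a population of 10 distinct founders, as in the earlier listing. The function names are made up for illustration, and the exact mode and mean will vary with the random seed and number of trials.

```python
import random
from collections import Counter

def generations_to_uniformity(n, rng=random):
    """Count generations until a population of n distinct genes holds one variant."""
    population = list(range(n))
    generations = 0
    while len(set(population)) > 1:
        population = [rng.choice(population) for _ in range(n)]
        generations += 1
    return generations

def summarize(n, trials, rng=random):
    """Return the mode (with its count) and mean of the time to uniformity."""
    times = [generations_to_uniformity(n, rng) for _ in range(trials)]
    mode, count = Counter(times).most_common(1)[0]
    return {"mode": mode, "mode_count": count, "mean": sum(times) / trials}

if __name__ == "__main__":
    print(summarize(n=10, trials=10000))
```

The mode being well below the mean reflects a long right tail: most runs fix fairly quickly, but a few take much longer.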
Diploid Model
• Our model is obviously too simple; let’s add more realism
• Organisms are diploid (have 2, perhaps different, copies of each gene)
• Sexual reproduction
• Half female, half male
Females Males
['AA', 'BB', 'CC', 'DD', 'EE', 'FF', 'GG', 'HH', 'II', 'JJ']
['CF', 'AG', 'BI', 'CH', 'CH', 'EG', 'EG', 'EH', 'EI', 'DG']
['CE', 'IE', 'CG', 'HE', 'HE', 'FG', 'IG', 'BE', 'BG', 'HE']
['CB', 'EI', 'HE', 'HI', 'HE', 'CG', 'EF', 'EI', 'HF', 'GI']
['EE', 'HF', 'BC', 'IG', 'BH', 'HI', 'BI', 'EF', 'HG', 'HC']
['CH', 'BH', 'BH', 'EI', 'BH', 'BE', 'BI', 'HC', 'EH', 'IH']
['BE', 'BE', 'IB', 'BE', 'BH', 'BH', 'BH', 'HH', 'CB', 'EH']
['BH', 'EC', 'BB', 'BB', 'IC', 'BH', 'EH', 'EH', 'BE', 'EB']
['CE', 'BE', 'BE', 'BE', 'CH', 'BE', 'CB', 'HE', 'IB', 'CB']
['EH', 'BB', 'BB', 'BH', 'EC', 'EE', 'EC', 'CE', 'CI', 'CE']
['CI', 'BE', 'EC', 'BE', 'BC', 'BC', 'BE', 'BE', 'BE', 'BC']
['EB', 'EB', 'BB', 'CB', 'EB', 'CC', 'IB', 'BC', 'IE', 'CB']
['BB', 'EC', 'BE', 'BC', 'CC', 'CC', 'EI', 'BI', 'BC', 'BC']
['EB', 'CB', 'CC', 'BC', 'BB', 'CC', 'BI', 'CC', 'BC', 'CC']
['BC', 'BC', 'CC', 'BC', 'BC', 'EC', 'BB', 'BC', 'BC', 'BB']
['BB', 'BC', 'CB', 'CC', 'CB', 'CC', 'CB', 'CB', 'CE', 'CB']
['CC', 'CB', 'CC', 'CB', 'BB', 'CC', 'CB', 'CC', 'CC', 'BC']
['CC', 'CC', 'CC', 'BB', 'CC', 'BC', 'CC', 'CC', 'CC', 'CB']
['CC', 'BC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CB', 'CC', 'CC']
['CC', 'BC', 'BC', 'CC', 'CC', 'CC', 'CC', 'CC', 'BC', 'CC']
['CC', 'CC', 'CC', 'BC', 'CC', 'CC', 'CC', 'CC', 'BC', 'CB']
['CC', 'BC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC']
['CC', 'CC', 'CC', 'CC', 'CC', 'BC', 'CC', 'CC', 'CC', 'CC']
['CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC']
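A diploid, two-sex version of the simulation might look like the sketch below. It rests on assumptions inferred from the listing rather than stated on the slide: the first half of each generation is treated as female and the second half as male, and each child takes one randomly chosen allele from a random mother and one from a random father. The function names are illustrative only.

```python
import random

def next_generation(population, rng=random):
    """Build the next generation of two-letter genotypes.
    Assumption: first half of the list is female, second half male."""
    half = len(population) // 2
    females, males = population[:half], population[half:]
    children = []
    for _ in population:
        mother, father = rng.choice(females), rng.choice(males)
        # One allele from each parent, maternal allele written first.
        children.append(rng.choice(mother) + rng.choice(father))
    return children

def is_uniform(population):
    """True when every gene copy in the population is the same variant."""
    return len(set("".join(population))) == 1

def generations_to_uniformity(population, rng=random):
    generations = 0
    while not is_uniform(population):
        population = next_generation(population, rng)
        generations += 1
    return generations

if __name__ == "__main__":
    pop = [c * 2 for c in "ABCDEFGHIJ"]  # 'AA', 'BB', ..., 'JJ'
    print(pop)
    while not is_uniform(pop):
        pop = next_generation(pop)
        print(pop)
```

Ten diploid individuals carry twenty gene copies, so there is twice as much material to lose before the population becomes uniform.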
Same Result
Females Males
['AA', 'BB', 'CC', 'DD', 'EE', 'FF', 'GG', 'HH', 'II', 'JJ']
['AG', 'BF', 'AI', 'EJ', 'DH', 'AH', 'CH', 'AI', 'DI', 'AH']
['BI', 'AI', 'FA', 'AC', 'GH', 'AH', 'BI', 'DH', 'FI', 'AH']
['HH', 'AI', 'BH', 'II', 'AI', 'AI', 'GI', 'AD', 'IA', 'AA']
['HG', 'IG', 'ID', 'IA', 'ID', 'HG', 'HI', 'AA', 'AI', 'IA']
['IA', 'DG', 'DH', 'HA', 'GA', 'DI', 'GH', 'GG', 'II', 'IH']
['HI', 'GI', 'GH', 'DH', 'GG', 'HH', 'II', 'HI', 'ID', 'GH']
['HH', 'GI', 'HI', 'HI', 'HH', 'IH', 'HH', 'HI', 'GG', 'HH']
['GH', 'HH', 'HH', 'HH', 'IH', 'II', 'HH', 'HI', 'HG', 'HH']
['HI', 'HG', 'HH', 'HH', 'IH', 'HH', 'GH', 'HH', 'HI', 'HH']
['HH', 'HH', 'HH', 'HI', 'HG', 'HH', 'HH', 'HH', 'HH', 'IH']
['HH', 'HH', 'HH', 'HH', 'HH', 'HH', 'IH', 'GH', 'HH', 'GH']
['HI', 'HH', 'HG', 'HG', 'HH', 'HI', 'HG', 'HG', 'HH', 'HH']
['GG', 'GH', 'HI', 'HH', 'IG', 'GH', 'IG', 'HG', 'HH', 'HH']
['IH', 'HH', 'HG', 'HI', 'GH', 'HH', 'GI', 'IG', 'HH', 'GH']
['IH', 'HG', 'HG', 'IH', 'HH', 'HG', 'II', 'HH', 'HG', 'IH']
['II', 'HI', 'HI', 'HH', 'HH', 'HH', 'IH', 'HH', 'GI', 'HH']
['IH', 'II', 'HH', 'HH', 'II', 'HH', 'HH', 'IH', 'HH', 'HH']
['HH', 'HH', 'HH', 'II', 'IH', 'IH', 'IH', 'HH', 'IH', 'IH']
['IH', 'HH', 'IH', 'II', 'IH', 'HH', 'HH', 'II', 'IH', 'IH']
['HI', 'II', 'IH', 'IH', 'IH', 'HH', 'HI', 'IH', 'HI', 'HH']
['IH', 'HH', 'HI', 'HI', 'HH', 'HI', 'HH', 'II', 'IH', 'HH']
['II', 'HH', 'HH', 'HH', 'HH', 'HH', 'HI', 'HH', 'HH', 'II']
['IH', 'HI', 'HI', 'HH', 'HI', 'HH', 'HH', 'HH', 'IH', 'HI']
['IH', 'HH', 'HI', 'IH', 'HH', 'HH', 'HI', 'HI', 'HI', 'HH']
['HI', 'HH', 'HH', 'HH', 'HH', 'HH', 'HI', 'HI', 'HH', 'HH']
['HH', 'HH', 'IH', 'HH', 'HI', 'HH', 'HH', 'HH', 'HH', 'HH']
['HH', 'HH', 'HH', 'IH', 'HH', 'HH', 'HH', 'HH', 'HH', 'HH']
['HH', 'HH', 'HH', 'HH', 'HH', 'HH', 'HH', 'HH', 'HH', 'HH']
Females Males
['AA', 'BB', 'CC', 'DD', 'EE', 'FF', 'GG', 'HH', 'II', 'JJ']
['EF', 'EI', 'DI', 'BF', 'EJ', 'BI', 'AI', 'CI', 'EJ', 'EI']
['FB', 'EE', 'EI', 'JB', 'FE', 'BJ', 'JE', 'EI', 'EE', 'DI']
['FJ', 'IE', 'FE', 'JE', 'EJ', 'FI', 'FE', 'BJ', 'EB', 'FJ']
['FF', 'II', 'EB', 'FI', 'EI', 'EJ', 'EB', 'IB', 'EI', 'EF']
['FE', 'IE', 'BE', 'BB', 'BE', 'EB', 'EB', 'IE', 'BI', 'FJ']
['BB', 'FE', 'FE', 'EE', 'EE', 'BJ', 'II', 'EE', 'BE', 'FE']
['BB', 'EB', 'EB', 'EE', 'EI', 'EE', 'FI', 'EE', 'EE', 'EE']
['EF', 'BE', 'BE', 'EE', 'BE', 'IE', 'BI', 'EE', 'EE', 'EE']
['EE', 'BE', 'FE', 'EB', 'EE', 'BE', 'BE', 'BE', 'FE', 'EE']
['EF', 'EE', 'EE', 'EE', 'EE', 'EE', 'FE', 'EB', 'BE', 'EE']
['EE', 'FB', 'EE', 'EE', 'EE', 'EB', 'EE', 'EE', 'EE', 'EE']
['EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'BE', 'EE', 'EE']
['EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'EE']
Females Males
['AA', 'BB', 'CC', 'DD', 'EE', 'FF', 'GG', 'HH', 'II', 'JJ']
['AF', 'AJ', 'AF', 'CG', 'BG', 'BH', 'AH', 'EG', 'DG', 'DJ']
['AA', 'AH', 'CA', 'AJ', 'BJ', 'GB', 'AB', 'GH', 'BB', 'AA']
['JH', 'AB', 'AA', 'AG', 'AB', 'JB', 'AH', 'AA', 'AA', 'AG']
['GA', 'AA', 'AG', 'BH', 'AB', 'BA', 'GA', 'AA', 'AA', 'AA']
['AA', 'AA', 'AA', 'AA', 'AG', 'AA', 'AA', 'AA', 'AB', 'AA']
['AA', 'AA', 'AB', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA']
['BA', 'AA', 'AA', 'AA', 'AA', 'BA', 'AA', 'AA', 'AA', 'AA']
['AB', 'AA', 'AA', 'AB', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA']
['AA', 'AA', 'BA', 'AA', 'BA', 'AA', 'AA', 'AA', 'AA', 'AA']
['AA', 'AA', 'AA', 'BA', 'BA', 'AA', 'AA', 'AA', 'AA', 'AA']
['AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'BA', 'AA', 'BA', 'AA']
['AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AB', 'AB', 'AA']
['AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA']
Similar Distributions
• 10000 trials
• Mode = 24 (291)
• Mean = 35.5
• Both statistics are roughly 2x the haploid values (Mode = 11, Mean = 17.5)
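A rough way to see where the factor of 2 comes from is the standard coalescent approximation. The symbols below are assumptions introduced for this note: M is the number of gene copies in the population (10 in the haploid runs, 20 in the diploid runs), and T is the number of generations until all M lineages trace back to one ancestral copy, which is taken here as a stand-in for the time to uniformity. While k lineages remain, the expected wait for the next merger is about M divided by the number of pairs.

```latex
\[
E[T] \;\approx\; \sum_{k=2}^{M} \frac{M}{\binom{k}{2}}
     \;=\; 2M \sum_{k=2}^{M} \left( \frac{1}{k-1} - \frac{1}{k} \right)
     \;=\; 2M \left( 1 - \frac{1}{M} \right)
     \;=\; 2(M-1)
\]
\[
M = 10 \;\Rightarrow\; E[T] \approx 18,
\qquad
M = 20 \;\Rightarrow\; E[T] \approx 38
\]
```

These values are close to the simulated means of 17.5 and 35.5, and their ratio is also near 2. The approximation ignores simultaneous mergers within a generation and the two-sex structure of the diploid runs, so it should be read as a sanity check rather than an exact prediction.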
Punnett Squares
What if we introduce inbreeding by choosing mates from common litters in successive generations?
Different Problem?
Parents Offspring
('A', 'A') ('B', 'B') [('A', 'B'), ('A', 'B'), ('A', 'B'), ('A', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('A', 'A') [('A', 'A'), ('A', 'A'), ('A', 'B'), ('A', 'B')]
('A', 'A') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'A'), ('A', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('B', 'B') [('A', 'B'), ('A', 'B'), ('B', 'B'), ('B', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('B', 'B') [('A', 'B'), ('A', 'B'), ('B', 'B'), ('B', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'A') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'A'), ('A', 'B')]
('A', 'B') ('A', 'A') [('A', 'A'), ('A', 'A'), ('A', 'B'), ('A', 'B')]
('A', 'B') ('A', 'A') [('A', 'A'), ('A', 'A'), ('A', 'B'), ('A', 'B')]
('A', 'A') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'A'), ('A', 'B')]
('A', 'A') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'A'), ('A', 'B')]
('A', 'A') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'A'), ('A', 'B')]
('A', 'A') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'A'), ('A', 'B')]
('A', 'A') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'A'), ('A', 'B')]
('A', 'A') ('A', 'A') [('A', 'A'), ('A', 'A'), ('A', 'A'), ('A', 'A')]
Parents Offspring
('A', 'A') ('B', 'B') [('A', 'B'), ('A', 'B'), ('A', 'B'), ('A', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('A', 'A') [('A', 'A'), ('A', 'A'), ('A', 'B'), ('A', 'B')]
('A', 'B') ('A', 'A') [('A', 'A'), ('A', 'A'), ('A', 'B'), ('A', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('B', 'B') [('A', 'B'), ('A', 'B'), ('B', 'B'), ('B', 'B')]
('B', 'B') ('B', 'B') [('B', 'B'), ('B', 'B'), ('B', 'B'), ('B', 'B')]
Parents Offspring
('A', 'A') ('B', 'B') [('A', 'B'), ('A', 'B'), ('A', 'B'), ('A', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('A', 'B') [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'A') ('A', 'A') [('A', 'A'), ('A', 'A'), ('A', 'A'), ('A', 'A')]
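The parent and offspring listings above are consistent with repeated sibling mating driven by a Punnett square. The sketch below is one way to reproduce that kind of run. It assumes the next pair of parents is two distinct offspring drawn at random from the four-entry litter, which the slides do not state, and the names `punnett` and `inbreed` are hypothetical.

```python
import random

def punnett(parent1, parent2):
    """The four equally likely offspring genotypes of a cross,
    each written with its two alleles in sorted order."""
    return [tuple(sorted((a, b))) for a in parent1 for b in parent2]

def inbreed(parent1, parent2, rng=random):
    """Mate siblings from each litter until both parents are the same homozygote.
    Returns a list of (parent1, parent2, litter) rows, one per generation."""
    history = []
    while True:
        litter = punnett(parent1, parent2)
        history.append((parent1, parent2, litter))
        if len(set(parent1 + parent2)) == 1:
            return history  # only one allele remains in the line
        # Assumption: mates are two different entries of the litter.
        parent1, parent2 = rng.sample(litter, 2)

if __name__ == "__main__":
    for p1, p2, litter in inbreed(("A", "A"), ("B", "B")):
        print(p1, p2, litter)
```

Even with only two alleles and four gene copies per generation, the line always ends up fixed on a single allele; only the number of generations it takes varies from run to run.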
Distributions
Next time
• Commonly occurring distributions
  • Geometric
  • Exponential