Fitness landscapes Sergey Gavrilets Departments of Ecology and Evolutionary Biology and Mathematics, University of Tennessee, Knoxville
Table of contents • General notion of fitness landscapes • Fitness landscapes in simple population genetic models • Rugged landscapes • Single-peak landscapes • Flat landscapes • Holey landscapes
Sewall Wright (1889-1988) • A founder of theoretical population genetics (with Fisher and Haldane) • Introduced the notion of “fitness landscapes” (a.k.a. adaptive landscapes, adaptive topographies, surfaces of selective values) in 1931 • His last publication on fitness landscapes was published in 1988
Papers on fitness landscapes Some of the journals that publish these papers: JOURNAL OF THEORETICAL BIOLOGY, PROTEIN ENGINEERING, PHYSICAL REVIEW E, CANCER RESEARCH, EVOLUTION, JOURNAL OF MATHEMATICAL BIOLOGY, LECTURE NOTES IN COMPUTER SCIENCE, CURRENT OPINION IN BIOTECHNOLOGY, MARINE ECOLOGY-PROGRESS SERIES, INTEGRATED COMPUTER-AIDED ENGINEERING, PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, BIOLOGY & PHILOSOPHY, INTERNATIONAL JOURNAL OF TECHNOLOGY MANAGEMENT, BIOSYSTEMS, JOURNAL OF GENERAL VIROLOGY, ECOLOGY LETTERS, RESEARCH POLICY, SYSTEMS RESEARCH AND BEHAVIORAL SCIENCE, ANNALS OF APPLIED PROBABILITY, BIOPOLYMERS
Working example: one-locus two-allele model of viability selection • Two alleles at a single locus: A and a • Allele frequencies: p and 1-p • Three diploid genotypes: AA, Aa and aa • Genotype frequencies: • Viabilities: • Average fitness of the population:
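The working example can be sketched numerically. This is a minimal sketch, assuming random mating so that the genotype frequencies take the Hardy-Weinberg proportions p², 2p(1-p) and (1-p)². The viability names `w_AA`, `w_Aa`, `w_aa` and the function names are illustrative assumptions, not notation taken from the slides.

```python
def mean_fitness(p, w_AA, w_Aa, w_aa):
    """Average fitness of the population at allele frequency p.

    Assumes Hardy-Weinberg genotype frequencies (random mating).
    """
    q = 1.0 - p
    return w_AA * p * p + w_Aa * 2.0 * p * q + w_aa * q * q


def next_p(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of viability selection."""
    q = 1.0 - p
    # Marginal fitness of allele A, weighted by its current frequency.
    return p * (w_AA * p + w_Aa * q) / mean_fitness(p, w_AA, w_Aa, w_aa)


if __name__ == "__main__":
    # Hypothetical viabilities with heterozygote advantage.
    p = 0.1
    for _ in range(200):
        p = next_p(p, 0.8, 1.0, 0.6)
    print(p, mean_fitness(p, 0.8, 1.0, 0.6))
```

With heterozygote advantage, the population climbs the one-dimensional landscape of average fitness to an interior peak at p = (w_Aa - w_aa)/(2w_Aa - w_AA - w_aa).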
L=2,A=3 case Dimensionality: D=L(A-1) for haploids and D=2L(A-1) for diploids
Dimensionality of the population state space • General case: • Randomly mating population under constant viability selection:
Average fitness of the population in a 2-locus 2-allele model with additive fitnesses D=2 (because of linkage equilibrium)
Fitness landscapes for mating pairs: mating preference in Drosophila silvestris, D. heteroneura and hybrids
Fitness landscapes for quantitative characters • Relationship between a set of Q quantitative characters that an individual has and its fitness; dimensionality of phenotype space is Q • Relationship between the average fitness of the population and its genetic structure; dimensionality is equal to the number of phenotypic moments affecting the average fitness
Average fitness of the population under stabilizing selection
Metaphor of fitness landscapes • Two or three dimensional visualization of certain features of multidimensional fitness landscapes [Wright 1932]
Hill climbing on a rugged fitness landscape (Kauffman and Levin 1987) • L diallelic haploid loci • Fitnesses are assigned randomly • The walk starts on a randomly chosen genotype • At each time step, the walk samples one of the L one-step neighbors. If the neighbor has higher fitness, the walk moves there. Otherwise, no change happens. The walk stops when it reaches a local fitness peak, so that all L neighbors have smaller fitness
Sample of Kauffman and Levin’s results • Expected number of local peaks is • Expected fraction of fitter neighbors dwindles by ½ on each improvement step • Average number of steps till a local peak is • Ratio of accepted to tried mutations scales as • From most starting points, a walk can climb only to an extremely small fraction of the local peaks. Any one local peak can be reached only from an extremely small fraction of starting points. • “Complexity catastrophe”: as L increases, the heights of accessible peaks fall towards the average fitness
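The walk described above can be sketched for small L. This is a minimal sketch, assuming fitnesses are independent uniform draws on (0,1) and genotypes are encoded as L-bit integers; the function names are illustrative. Under these assumptions a genotype is a local peak with probability 1/(L+1), because it must be the fittest among itself and its L neighbors, so the expected number of local peaks is 2^L/(L+1), which the exhaustive count can be compared against.

```python
import random


def random_landscape(L, rng):
    """Assign an independent random fitness to each of the 2**L genotypes."""
    return [rng.random() for _ in range(2 ** L)]


def is_local_peak(g, fitness, L):
    """True if all L one-step neighbors of genotype g have smaller fitness."""
    return all(fitness[g ^ (1 << i)] < fitness[g] for i in range(L))


def adaptive_walk(fitness, L, rng):
    """Hill climbing: sample one neighbor per step, move only if it is fitter."""
    g = rng.randrange(2 ** L)
    accepted = 0
    tried = 0
    while not is_local_peak(g, fitness, L):
        neighbor = g ^ (1 << rng.randrange(L))  # flip one randomly chosen locus
        tried += 1
        if fitness[neighbor] > fitness[g]:
            g = neighbor
            accepted += 1
    return g, accepted, tried


def count_local_peaks(fitness, L):
    """Exhaustive count of local peaks (feasible only for small L)."""
    return sum(is_local_peak(g, fitness, L) for g in range(2 ** L))


if __name__ == "__main__":
    rng = random.Random(0)
    L = 10
    landscape = random_landscape(L, rng)
    print(adaptive_walk(landscape, L, rng))
    print(count_local_peaks(landscape, L), 2 ** L / (L + 1))
```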
Single-peak fitness landscape Ronald Fisher (1890-1962)
Fisher’s geometric model of adaptation • Each organism is characterized by Q continuous variables • There is a single optimum phenotype O and fitness decreases monotonically with increasing (Euclidean) distance from the optimum • Let d/2 be the current distance to the optimum O • A mutation is advantageous if it moves the organism closer to O • Let r be the mutation size (i.e. distance between the current state and the mutant)
For large Q, the probability that a mutation is advantageous is P(x)=1-F(x), where F is the cumulative distribution function of a standard normal distribution and x=r√Q/d is the scaled mutation size. Because P decreases as x grows, mutations of small size are the most important in evolution
Corrections to the Fisher model • Kimura (1983): the probability that an advantageous mutation with effect s is fixed is 2s. Therefore, the rate of adaptive substitutions is proportional to 2x(1-F(x)). Thus, mutations of intermediate size are most important. • Orr (1998): the distance to the optimum continuously decreases. The distribution of factors fixed during adaptation is exponential.
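The contrast between Fisher's and Kimura's conclusions can be checked numerically. This is a minimal sketch, assuming the standard scaling x = r√Q/d for the mutation size (an assumption stated here, since the scaling is not spelled out above) and using a simple grid search; the function names are illustrative.

```python
import math


def normal_sf(x):
    """1 - F(x) for the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def prob_advantageous(r, Q, d):
    """Fisher: probability that a mutation of size r is advantageous (large Q).

    Assumes the scaled size x = r * sqrt(Q) / d, where d/2 is the distance
    to the optimum.
    """
    x = r * math.sqrt(Q) / d
    return normal_sf(x)


def kimura_rate(x):
    """Kimura: relative rate of adaptive substitution, 2x(1 - F(x))."""
    return 2.0 * x * normal_sf(x)


def best_scaled_size(step=0.001, x_max=5.0):
    """Grid search for the scaled size x that maximizes kimura_rate."""
    grid = [i * step for i in range(1, int(x_max / step) + 1)]
    return max(grid, key=kimura_rate)


if __name__ == "__main__":
    print(prob_advantageous(0.0, 100, 1.0))  # infinitesimal mutations
    print(best_scaled_size())                # intermediate size favored
```

Fisher's probability is highest for the smallest mutations, while weighting by the fixation probability moves the maximum to an intermediate scaled size.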
“Error threshold” (Manfred Eigen) • Assume that there is a single optimum genotype (“master sequence”) that has fitness 1; all other genotypes have fitness 1-s. Let n be the mutation rate per sequence per generation • If n<s, the equilibrium frequency of the master sequence is 1-n/s • If n>s, the master sequence is not maintained in the population
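The threshold can be reproduced with a deterministic recursion. This is a minimal sketch, assuming an infinite population, selection followed by mutation, and no back mutation to the master sequence; n is read as the probability that a master copy mutates away in one generation. The function names and parameter values are illustrative.

```python
def next_master_frequency(x, n, s):
    """Master-sequence frequency after one generation.

    Assumes selection (fitness 1 versus 1 - s) followed by mutation away
    from the master sequence at rate n, with back mutation neglected.
    """
    mean_fitness = x + (1.0 - x) * (1.0 - s)
    return x * (1.0 - n) / mean_fitness


def equilibrium_master_frequency(n, s, generations=20000, x0=0.5):
    """Iterate the recursion long enough to approach equilibrium."""
    x = x0
    for _ in range(generations):
        x = next_master_frequency(x, n, s)
    return x


if __name__ == "__main__":
    print(equilibrium_master_frequency(0.01, 0.05))  # below the threshold
    print(equilibrium_master_frequency(0.06, 0.05))  # above the threshold
```

Setting the next frequency equal to the current one gives a mean fitness of 1-n, which yields the equilibrium frequency 1-n/s whenever n<s.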
Flat fitness landscape (of the neutral theory of molecular evolution) Motoo Kimura (1924-1994)
Evolution of flat landscapes • Random walk on a hypercube • Equilibrium distribution: equal probability to be at any vertex; time to reach the equilibrium distribution is order steps • Transient dynamics of the distance to the initial state • The index of dispersion (i.e. var(x)/E(x), where x is the number of steps per unit of time) is equal to 1.
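The transient dynamics of the distance to the initial state can be sketched as follows. This is a minimal sketch, assuming L diallelic loci and that each step flips one uniformly chosen locus; under that assumption the expected Hamming distance after t steps is (L/2)(1-(1-2/L)^t), which the simulation is compared against. The function names are illustrative.

```python
import random


def expected_distance(L, t):
    """Expected Hamming distance to the initial genotype after t steps."""
    return 0.5 * L * (1.0 - (1.0 - 2.0 / L) ** t)


def simulate_distance(L, t, replicates, rng):
    """Average distance to the starting vertex over independent random walks."""
    total = 0
    for _ in range(replicates):
        differs = [False] * L  # which loci currently differ from the start
        for _ in range(t):
            i = rng.randrange(L)
            differs[i] = not differs[i]
        total += sum(differs)
    return total / replicates


if __name__ == "__main__":
    rng = random.Random(1)
    print(simulate_distance(20, 15, 2000, rng), expected_distance(20, 15))
```

The distance rises quickly at first and then saturates at L/2, the average distance between two randomly chosen vertices.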
Evolution of flat landscapes (cont.) • In a population of N alleles, any two alleles can be traced back to a common ancestor about N generations ago (under the Fisher-Wright binomial scheme for random genetic drift) • The average number of mutations fixed per generation is equal to the mutation rate n • The average genetic distance between two organisms is 2Nn • Population can be clustered into 2(2Nn)/d clusters such that the average distance within the same cluster is d.
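The first and third statements above are linked by a short calculation: two lineages that split about N generations ago each accumulate n mutations per generation, giving a distance of about 2Nn. This is a minimal sketch, assuming a haploid Fisher-Wright population of N alleles in which each lineage picks its parent uniformly at random; the function names are illustrative.

```python
import random


def coalescence_time(N, rng):
    """Generations back until two sampled lineages share a parent."""
    generations = 1
    # Each generation back, the two lineages pick parents independently.
    while rng.randrange(N) != rng.randrange(N):
        generations += 1
    return generations


def mean_coalescence_time(N, replicates, rng):
    """Monte Carlo estimate of the mean pairwise coalescence time."""
    return sum(coalescence_time(N, rng) for _ in range(replicates)) / replicates


def expected_pairwise_distance(N, n):
    """Two lineages, about N generations each, n mutations per generation."""
    return 2.0 * N * n


if __name__ == "__main__":
    rng = random.Random(2)
    print(mean_coalescence_time(50, 4000, rng))
    print(expected_pairwise_distance(1000, 0.001))
```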
How many dimensions do real fitness landscapes have? • The world as we perceive it is three dimensional • Superstring theory: 10 to 12 dimensions are required to explain the physical world • Biological evolution takes place in a space with millions of dimensions (3/27/03)
Extremely high dimensionality of the genotype space results in: • redundancy in the genotype-fitness map • a possibility that high-fitness genotypes form networks that extend throughout the genotype space (=> substantial genetic divergence without going through adaptive valleys) • increased importance of chance and contingency in evolutionary dynamics (=> mutational order as a major source of stochasticity)
Russian roulette model • A genotype is viable with probability p and inviable otherwise • There exists a giant cluster of viable genotypes if p>0.5927 (site percolation on a two-dimensional square lattice)
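The two-dimensional picture can be sketched by marking sites of a square lattice viable at random and measuring the largest connected cluster. This is a minimal sketch, assuming nearest-neighbor (four-way) connectivity on a finite grid, so the transition near p of roughly 0.59 is smoothed by the finite size; the function name is illustrative.

```python
import random


def largest_cluster_fraction(size, p, rng):
    """Fraction of all lattice sites that lie in the largest viable cluster."""
    viable = [[rng.random() < p for _ in range(size)] for _ in range(size)]
    seen = [[False] * size for _ in range(size)]
    largest = 0
    for r in range(size):
        for c in range(size):
            if not viable[r][c] or seen[r][c]:
                continue
            # Depth-first search over the cluster containing (r, c).
            seen[r][c] = True
            stack = [(r, c)]
            cluster = 0
            while stack:
                y, x = stack.pop()
                cluster += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < size and 0 <= nx < size
                            and viable[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            largest = max(largest, cluster)
    return largest / (size * size)


if __name__ == "__main__":
    rng = random.Random(0)
    for p in (0.3, 0.5, 0.7, 0.8):
        print(p, largest_cluster_fraction(60, p, rng))
```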
Percolation on a hypercube Each genotype has L “neighbors.” In the L-dimensional hypercube (e.g. if there are L diallelic loci), viable genotypes form a percolating neutral network if p>1/L (assuming that L is very large).
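The hypercube version can be explored for small L by exhaustive search. This is a minimal sketch, assuming L diallelic loci encoded as L-bit integers with neighbors differing at exactly one locus; because the threshold 1/L is an asymptotic statement for very large L, small-L runs only illustrate the qualitative change. The function name is illustrative.

```python
import random


def largest_viable_cluster(L, p, rng):
    """Return (size of largest viable cluster, number of viable genotypes)."""
    n = 1 << L
    viable = [rng.random() < p for _ in range(n)]
    seen = [False] * n
    largest = 0
    for start in range(n):
        if not viable[start] or seen[start]:
            continue
        # Depth-first search through viable one-step neighbors.
        seen[start] = True
        stack = [start]
        size = 0
        while stack:
            g = stack.pop()
            size += 1
            for i in range(L):
                h = g ^ (1 << i)  # neighbor differing at locus i
                if viable[h] and not seen[h]:
                    seen[h] = True
                    stack.append(h)
        largest = max(largest, size)
    return largest, sum(viable)


if __name__ == "__main__":
    rng = random.Random(0)
    L = 12
    for p in (0.02, 1.0 / L, 0.2, 0.5):
        print(p, largest_viable_cluster(L, p, rng))
```

Well above 1/L nearly all viable genotypes join one connected network, whereas well below it the viable genotypes stay in small isolated clusters.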
Uniformly rugged landscape Fitness w is drawn from a uniform distribution on (0,1): The nearly neutral network of genotypes with fitnesses between w1 and w2 percolates if w2-w1>1/L.
Metaphor of holey fitness landscapes • Disregards fitness differences between different genotypes belonging to the network of high-fitness genotypes and treats all other genotypes as holes • Microevolution and local adaptation ~ climbing from a “hole” • Macroevolution ~ movement along the holey landscape • Speciation takes place when populations come to be on opposite sides of a “hole” in the landscape
The origin of the idea • Verbal arguments: Bateson (1909), Dobzhansky (1937), Muller (1940, 1942), Maynard Smith (1970, 1983), Nei (1976), Barton and Charlesworth (1984), Kondrashov and Mina (1986) • Formal models: Nei (1976), Wills (1977), Nei et al. (1983), Bengtsson and Christiansen (1983), Bengtsson (1985), Barton and Bengtsson (1986)
Dobzhansky model (1937) Theodosius Dobzhansky (1900-1975) “This scheme may appear fanciful, but it is worth considering further since it is supported by some well-established facts and contradicted by none.” (Dobzhansky, 1937, p.282)
Maynard Smith (1970): “It follows that if evolution by natural selection is to occur, functional proteins must form a continuous network which can be traversed by unit mutational steps without passing through nonfunctional intermediates” (p.564)
Terminology • A neutral network is a contiguous set of genotypes (sequences) possessing the same fitness. • A nearly neutral network is a contiguous set of genotypes possessing approximately the same fitness. • A holey fitness landscape is a fitness landscape in which relatively infrequent high-fitness genotypes form a contiguous set that extends throughout the genotype space.
Conclusions from models The existence of percolating nearly-neutral networks of high-fitness combinations of genes which allow for “nearly-neutral” divergence is a general property of fitness landscapes with a very large number of dimensions.
Experimental evidence • Direct analyses of relationships between genotype and fitness in plants, Drosophila, mammals and moths • Ring species and hybrid zones • Artificial selection experiments • Natural hybridization in plants and animals • Intermediate forms in the fossil record • Properties of RNA and proteins • Patterns of molecular evolution • Artificial life
Applications • Speciation • Hybrid zones • Morphological macroevolution • RNA and proteins • Adaptation • Molecular evolution • Gene and genome duplication • Canalization of development