Bioinformatics and Mathematical Genetics
Simon Myers
myers@stats.ox.ac.uk
An example of a modern genetic dataset
• We are gathering data in Oxford (in collaboration with Chicago colleagues)
• 10 chimpanzees
• Having their DNA sequence read across the ~3 billion base chimp genome
• First time this has been done in chimps
• Modern technology means this is an affordable project
Q: Why are we doing this?
A: Because of the insights it can yield into evolutionary processes, not just in chimps, but in humans
Data for a small region
~300,000,000,000 base pairs of sequence in total
A project to sequence 1000 human genomes: ~20,000,000,000,000 base pairs
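As a rough check on these totals, the arithmetic below works out the average number of times each base would be read (the sequencing depth) if the total were spread evenly across individuals. This is only a sketch: it assumes the 300,000,000,000 figure refers to the 10-chimpanzee project, and it uses a round 3 billion base genome for both species. The function name `implied_depth` is made up for illustration.

```python
def implied_depth(total_bases, n_individuals, genome_size):
    """Average number of reads covering each base, assuming even coverage."""
    return total_bases / (n_individuals * genome_size)

GENOME_SIZE = 3_000_000_000  # assumed round figure for both chimp and human

# Assumption: the ~300 billion base total is for the 10 chimpanzees.
chimp_depth = implied_depth(300_000_000_000, 10, GENOME_SIZE)

# The 1000 human genomes project total quoted on the slide.
human_depth = implied_depth(20_000_000_000_000, 1000, GENOME_SIZE)

print(f"chimp project: about {chimp_depth:.1f}x per individual")
print(f"human project: about {human_depth:.1f}x per individual")
```

Under these assumptions each genome is read several times over, which is one reason the raw totals are so much larger than the number of individuals times the genome length.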
Opportunities (I)
• Genome-wide data carries an immense amount of information
• We can compare genomes between humans and chimps:
  • Chimpanzees and humans are 98.6% similar at the DNA level... but
  • Where do we differ?
  • Can we explain what makes us human?
  • Many places that differ are the result of random chance
• We can look within the set of chimps
  • How are chimps evolving?
  • How do individual chimpanzees differ at the DNA level?
  • Lots of similar variation data has been or is being gathered in humans
  • Is the chimpanzee data similar, or different, in terms of overall patterns?
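A similarity figure like the one above is usually computed as the percentage of matching positions between two aligned sequences. The sketch below shows one simple convention, which skips any column where either sequence has a gap ("-"). The two sequences are invented toy strings, not real human or chimpanzee data, and the function name `percent_identity` is hypothetical. Other conventions (for example, counting gaps as differences) give different numbers.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of matching bases over aligned, gap-free columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # convention assumed here: ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared

# Made-up toy alignment: one mismatch and one gapped column.
toy_human = "ACGTTGCA-ACGT"
toy_chimp = "ACGTAGCATACGT"

identity = percent_identity(toy_human, toy_chimp)
print(f"{identity:.1f}% identical over gap-free columns")
```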
Opportunities (II)
• Variation patterns that we see reflect evolutionary forces
  • Mutation, selection, recombination, migration, ...
  • Do these forces work in similar ways in both species?
  • Are, e.g., similar types of gene under selection in humans and chimpanzees?
  • We can learn how dynamic, or conserved, these forces are
• A particular interest is recombination:
  • This does differ, strongly, between the species, at least at scales of thousands of bases
  • Is there any sharing? At what scales?
  • How does DNA sequence relate to recombination in chimps and humans?
  • Knowing the answer will lead directly to information about constraints on recombination
  • In turn, perhaps insights in humans: disease, infertility, and differences among populations
[Figure labels: Father, Mother, Child]
Challenges
...as night follows day
A lot of data
• Computationally intensive
• Need for careful algorithm construction
Go from “raw” sequence to something useful:
• Must align to compare species, dealing with errors in the data
Understand how the forces we care about influence the data
• Evolutionary modelling
• Think about relationships among individuals in the sample
• Development of inference techniques
• Must be applicable to these large datasets
The aim of this module is to give an introduction to:
• Approaches to address these types of challenges
• What we have already learnt using bioinformatics and mathematical genetics
Brief overview
• Mixture of lectures, practicals, exercises and some reading
• Week 1: Genomes, genetic variation and evolutionary forces
  • Today: how and why do genomes evolve?
  • Later in the week:
    • Alignment of genomes, to e.g. discover genes
    • Phylogenetics: to build trees
    • Modelling evolution, to relate biological parameters and data
    • Exploring variation patterns in practice
    • Inference on biological parameters
• Week 2: Phenotype and function
  • 3-day project, led by Jotun Hein, to identify, and analyse, the evolution of a unique functional element in our genome
  • This is also the basis of the assessment, by presentation
  • Relating variation among individuals to human phenotypes
    • What mutations cause disease, differences in metabolite levels, ...
Introductory practical
The first practical explores the role of randomness in evolution
Some mutations are beneficial to those who carry them; others are deleterious
• Individuals with beneficial mutations have, on average, more children
• However, inheritance is highly stochastic
  • Luck is involved in successfully finding a mate, and raising children
  • Parents pass on a random subset of the mutations they carry to their children
• Randomness and selection act in opposition
  • How does randomness affect selection?
  • How often do “neutral” mutations, that are neither beneficial nor deleterious, succeed?
• These questions can be explored with the Wright-Fisher model
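The questions above can be explored with a small simulation. The sketch below is one simple version of the Wright-Fisher model and is not necessarily the setup used in the practical: it assumes a haploid population of constant size, non-overlapping generations, and a single new mutation whose carriers have relative fitness 1 + s. The function names, the population size of 20 and the value s = 0.5 are all choices made for illustration.

```python
import random

def mutation_fixes(pop_size, s, rng):
    """Follow one new mutation until it is lost or fixed; True if it fixes."""
    count = 1  # the mutation starts as a single copy
    while 0 < count < pop_size:
        freq = count / pop_size
        # Selection: carriers are weighted by relative fitness 1 + s.
        p = freq * (1 + s) / (freq * (1 + s) + (1 - freq))
        # Randomness: each child independently inherits from a carrier
        # with probability p (a binomial draw of size pop_size).
        count = sum(rng.random() < p for _ in range(pop_size))
    return count == pop_size

def fixation_probability(pop_size, s, n_runs, seed=0):
    """Estimate the fraction of new mutations that eventually fix."""
    rng = random.Random(seed)
    fixed = sum(mutation_fixes(pop_size, s, rng) for _ in range(n_runs))
    return fixed / n_runs

neutral = fixation_probability(pop_size=20, s=0.0, n_runs=2000)
beneficial = fixation_probability(pop_size=20, s=0.5, n_runs=2000)

print(f"neutral mutation fixes in about {neutral:.3f} of runs")
print(f"beneficial mutation (s=0.5) fixes in about {beneficial:.3f} of runs")
```

In this model a neutral mutation starting from one copy should fix with probability close to 1/pop_size, so the neutral estimate should come out near 0.05. Even a strongly beneficial mutation is often lost by chance in its first few generations, which is the sense in which randomness and selection act in opposition.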