Biology and Bioinformatics

BI820 – Seminar in Quantitative and Computational Problems in Genomics. Biology and Bioinformatics. Gabor T. Marth. Department of Biology, Boston College marth@bc.edu. The animal cell. DNA – the carrier of the genetic code. DNA organization – chromosomes. Translation of genetic information.


Presentation Transcript


  1. BI820 – Seminar in Quantitative and Computational Problems in Genomics Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu

  2. The animal cell

  3. DNA – the carrier of the genetic code

  4. DNA organization – chromosomes

  5. Translation of genetic information

  6. DNA sequencing informatics

  7. DNA organization

  8. Genome annotation

  9. De novo gene prediction

  10. Similarity-based gene prediction

  11. Gene localization

  12. Genetic mapping

  13. Gene function

  14. Expression analysis

  15. Protein structure

  16. RNA structure

  17. Protein structure prediction

  18. RNA structure prediction

  19. DNA evolution

  20. Evolution of chromosome organization

  21. Evolution of gene structure

  22. Evolution of DNA sequence

  23. Comparative genomics

  24. Phylogenetics

  25. Mechanisms of molecular evolution

  26. Sequence variations • The Human Genome Project produced a reference genome sequence that is 99.9% common to each human being • Sequence variations make our genetic makeup unique • Single-nucleotide polymorphisms (SNPs) are the most abundant, but other types of variation exist and are important

  27. Why do we care about variations? • phenotypic differences • inherited diseases • demographic history

  28. How do we find polymorphisms? • look at multiple sequences from the same genome region • diverse sequence resources can be used (EST, WGS, BAC) • diversion: sequencing informatics

  29. SNP discovery – Methods • Sequence clustering • Cluster refinement • Multiple alignment • SNP detection
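The last step listed on this slide, SNP detection from a multiple alignment, can be illustrated with a deliberately naive sketch. It assumes the sequences are already aligned to equal length and simply reports columns where more than one base is observed. The function name `find_snp_columns` is hypothetical, and the sketch ignores base quality values and sequencing error, which a real discovery tool would have to model. The two input strings are the example sequences shown on the mining-projects slide.

```python
from collections import Counter


def find_snp_columns(aligned_seqs, gap="-"):
    """Return (position, allele_counts) for every alignment column
    that contains more than one distinct base (0-based positions)."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    candidates = []
    for pos in range(length):
        # Collect the bases in this column, skipping gap characters.
        column = [seq[pos].upper() for seq in aligned_seqs if seq[pos] != gap]
        counts = Counter(column)
        if len(counts) > 1:
            candidates.append((pos, dict(counts)))
    return candidates


if __name__ == "__main__":
    reads = ["ACCTAGGAGACTGAACTTACTG", "ACCTAGGAGACCGAACTTACTG"]
    print(find_snp_columns(reads))  # one candidate: T/C at 0-based position 11
```

With only two sequences every mismatch is reported as a candidate; with deeper alignments one would typically also require a minimum count for the minor allele before calling a site polymorphic.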

  30. SNP discovery – Computer tools

  31. SNP discovery – Mining Projects • ~30,000 clones • 25,901 clones (7,122 finished, 18,779 draft with base quality values) • 21,020 clone overlaps (124,356 fragment overlaps) • 507,152 high-quality candidate SNPs (validation rate 83-96%); Marth et al., Nature Genetics 2001 [Figure: overlapping clones CloneX and CloneY; example overlap alignment ACCTAGGAGACTGAACTTACTG / ACCTAGGAGACCGAACTTACTG]

  32. SNP databases and characteristics • access to variation data • SNP properties • reliability of information • characterizing known polymorphic sites in sample collections – genotyping

  33. Where do variations come from? • sequence variations are the result of mutation events • mutations are propagated down through generations [Figure: genealogy descending from the MRCA, with sampled sequences TAAAAAT and TAACAAT]

  34. Mutation rate • higher mutation rate (µ) gives rise to more SNPs [Figure: two genealogies, each descending from an MRCA; sequences accgctatgtaga, accgttatgtaga, accgctatataga, actgttatgtaga]
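The slide states the relationship between µ and the number of SNPs only qualitatively. One standard way to quantify it, assumed here and not taken from the slides, is Watterson's result for a neutral, constant-size population: the expected number of segregating sites in a sample of n sequences is θ · (1 + 1/2 + ... + 1/(n−1)), with θ = 4Nµ for a diploid population of effective size N. The function name and the parameter values below are illustrative only.

```python
def expected_segregating_sites(n_samples, pop_size, mu_per_site, n_sites):
    """Expected number of segregating sites under the standard neutral
    model (constant diploid population size, infinite-sites mutation).

    n_samples   -- number of sampled sequences (>= 2)
    pop_size    -- effective population size N (assumed diploid)
    mu_per_site -- mutation rate per site per generation
    n_sites     -- length of the region in base pairs
    """
    if n_samples < 2:
        raise ValueError("need at least two sequences")
    theta = 4.0 * pop_size * mu_per_site * n_sites
    # Harmonic-type sum a_n = 1 + 1/2 + ... + 1/(n-1)
    a_n = sum(1.0 / i for i in range(1, n_samples))
    return theta * a_n


if __name__ == "__main__":
    # Illustrative values only: doubling mu doubles the expectation.
    low = expected_segregating_sites(10, 10_000, 1e-8, 1_000)
    high = expected_segregating_sites(10, 10_000, 2e-8, 1_000)
    print(low, high)
```

Because θ is the product 4Nµ, the same expression also grows with the effective population size N, which is the quantity the demographic-history slide is concerned with.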

  35. Recombination [Figure: multiple sampled copies of the sequence accgttatgtaga]

  36. Demographic history • different world populations have varying long-term effective population sizes (e.g. African N is larger than European) [Figure: small (effective) population size N vs. large (effective) population size N]

  37. Modeling • demographic scenarios: stationary, expansion, collapse, bottleneck • history runs from past to present • MD (simulation) • AFS (direct form)

  38. Ancestral inference • bottleneck • modest but uninterrupted expansion

  39. The signatures of selection • selective mutations influence the genealogy itself; in the case of neutral mutations the processes of mutation and genealogy are decoupled

  40. Association and haplotype structure • “linkage disequilibrium” • “haplotype blocks”
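The slide names linkage disequilibrium without defining it. As a minimal sketch, assuming phased two-site haplotypes at biallelic sites, the commonly used statistics are D = p_AB − p_A·p_B and r² = D² / (p_A(1−p_A)·p_B(1−p_B)). The function name and the toy haplotypes below are hypothetical and not taken from the slides.

```python
def linkage_disequilibrium(haplotypes, allele_a, allele_b):
    """Return (D, r_squared) for two biallelic sites.

    haplotypes -- sequence of 2-character strings, one per chromosome,
                  giving the allele at site 1 and at site 2
    allele_a   -- reference allele at site 1
    allele_b   -- reference allele at site 2
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(1 for h in haplotypes if h[0] == allele_a) / n
    p_b = sum(1 for h in haplotypes if h[1] == allele_b) / n
    p_ab = sum(1 for h in haplotypes
               if h[0] == allele_a and h[1] == allele_b) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either site is monomorphic in the sample.
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


if __name__ == "__main__":
    print(linkage_disequilibrium(["AC", "AC", "GT", "GT"], "A", "C"))  # complete association
    print(linkage_disequilibrium(["AC", "AT", "GC", "GT"], "A", "C"))  # independent sites
```

Runs of neighbouring sites with high pairwise r² are one informal way to think about the “haplotype blocks” mentioned on the slide, although block definitions vary between methods.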

  41. Computer simulations: the Coalescent
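The slide gives only the title, so the following is a minimal sketch of the basic (Kingman) coalescent under assumptions not stated in the deck: a constant-size, neutral, non-recombining population, with time measured in units of 2N generations (N being the diploid effective size). While k lineages remain, the waiting time to the next coalescence is exponential with rate k(k−1)/2. All function names are hypothetical.

```python
import random


def simulate_coalescent_times(n_samples, rng):
    """Waiting times T_n, T_(n-1), ..., T_2 between coalescent events,
    in units of 2N generations (constant population size assumed)."""
    times = []
    for k in range(n_samples, 1, -1):
        rate = k * (k - 1) / 2.0  # number of lineage pairs that can coalesce
        times.append(rng.expovariate(rate))
    return times


def tmrca_and_tree_length(n_samples, rng):
    """Return (time to the MRCA, total branch length) for one genealogy."""
    times = simulate_coalescent_times(n_samples, rng)
    tmrca = sum(times)
    # While k lineages exist, k branches each grow by the waiting time.
    total_length = sum(k * t for k, t in zip(range(n_samples, 1, -1), times))
    return tmrca, total_length


if __name__ == "__main__":
    rng = random.Random(1)
    reps = [tmrca_and_tree_length(10, rng) for _ in range(20000)]
    mean_tmrca = sum(r[0] for r in reps) / len(reps)
    # Theoretical expectation for n = 10 is 2 * (1 - 1/10) = 1.8
    print(round(mean_tmrca, 3))
```

Mutations can then be placed on the genealogy as a Poisson process along the branches, and the demographic scenarios from the modeling slide (expansion, collapse, bottleneck) would correspond to rescaling the waiting times as the population size changes.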

  42. Medical utility? • molecular markers • clinical phenotype • functional understanding

  43. Mapping disease-causing loci • genetic linkage • association between allele and phenotype

  44. Forensic applications
