
Epidemiology 217 Molecular and Genetic Epidemiology Bioinformatics & Proteomics



  1. Epidemiology 217: Molecular and Genetic Epidemiology
     Bioinformatics & Proteomics
     John Witte

  2. Coding Genotypes

  3. Post-Genomic Era: Lots of Data!

  4. “The study of genetic and other biological information using computer and statistical techniques.” A Genome Glossary, Science, Feb 16, 2001

  5. Bioinformatics in Genetic Epi
     Some key aspects:
     • Data management
     • Candidate regions / genes (selection and SNP mining)
     • Genetic analyses (e.g., genotyping)
     • Statistical analyses

  6. Data Management
     [Diagram: a "CaP Genes Databases Hub" connected to the Laboratory, Demographic, Clinical, Genomic, Health and Habits, and Nutritional databases.]

  7. From gene to polymorphisms
     Given a gene, how do I…
     • Find its polymorphisms?
     • Find information about those polymorphisms?

  8. Hands-on guide for browsing and analyzing genomic data.
     Contains worked examples, providing:
     • an overview of the types of data available,
     • details on how these data can be browsed, and
     • step-by-step instructions for using many of the most commonly used tools for sequence-based discovery.
     www.nature.com/cgi-taf/dynapage.taf?file=/ng/journal/v35/n1s/

  9. Nature Genetics: A User's Guide to the Human Genome
     3 of the 13 worked example questions:
     • How does one find a gene of interest and determine that gene's structure?
     • How would one retrieve the sequence of a gene, along with all annotated exons and introns, as well as a certain number of flanking bases for use in primer design?
     • A user wishes to find all the single nucleotide polymorphisms that lie between two sequence-tagged sites. Do any of these single nucleotide polymorphisms fall within the coding region of a gene? Where can any additional information about the function of these genes be found?
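
The third question above (which SNPs lie between two sequence-tagged sites, and which of those fall in a coding region) comes down to interval checks on chromosome coordinates. The following is a minimal sketch under stated assumptions, not the procedure from the guide itself: the SNP names, positions, marker coordinates, and coding-exon intervals are all hypothetical, and in practice they would be looked up in a resource such as dbSNP or the UCSC browser.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    name: str
    pos: int  # chromosome coordinate (hypothetical values below)


def snps_between(snps, left, right):
    """Return the SNPs whose position lies between two marker coordinates (inclusive)."""
    lo, hi = sorted((left, right))
    return [s for s in snps if lo <= s.pos <= hi]


def in_coding(pos, coding_exons):
    """True if pos falls inside any (start, end) coding-exon interval (inclusive)."""
    return any(start <= pos <= end for start, end in coding_exons)


# Hypothetical inputs, for illustration only.
snps = [Snp("snpA", 1050), Snp("snpB", 1500), Snp("snpC", 2400), Snp("snpD", 3100)]
coding_exons = [(1400, 1600), (2000, 2200)]
sts_left, sts_right = 1000, 2500  # coordinates of the two sequence-tagged sites

hits = snps_between(snps, sts_left, sts_right)
coding_hits = [s.name for s in hits if in_coding(s.pos, coding_exons)]

print([s.name for s in hits])  # SNPs between the two markers
print(coding_hits)             # the subset inside a coding exon
```

With these made-up coordinates, three SNPs lie between the markers and one of them (snpB) falls in a coding exon.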

  10. Look for SNPs in Databases
      • General databases:
        --- dbSNP (http://www.ncbi.nlm.nih.gov/)
        --- UCSC Genome Bioinformatics (http://genome.ucsc.edu/)
        --- HapMap (http://www.hapmap.org/)
        --- The SNP Consortium (TSC) (http://snp.cshl.org/)
        --- Human gene variation base (HGVbase) (http://hgvbase.cgb.ki.se)
      • Special databases:
        --- The UW-FHCRC Variation Discovery Resource (SeattleSNPs) (http://pga.gs.washington.edu/)
        --- Cancer Genome Anatomy Project - SNP500Cancer Database (http://snp500cancer.nci.nih.gov/home_1.cfm)
        --- InnateImmunity (http://innateimmunity.net)
        --- Drug response (http://pharmgkb.org)
      • More…

  11. UCSC Browser
      [Screenshot; labeled features: Gene structure, Comparative Genomics, SNPs.]

  12. SeattleSNPs
      • Resequencing the complete genomic region of each gene among 24 African-American (AA) subjects and 23 European (CEPH) subjects
        --- 2000 bp upstream of first exon
        --- 1500 bp downstream of poly-A signal
        --- All exons and introns for genes below 35 kbp
      • Summary data (2/18/05)
        --- Number of genes sequenced: 208
        --- Total kilobases sequenced: 4408.78
        --- Number of SNPs found: 23,590
        --- SNPs in AA sample: 20,765
        --- SNPs in CEPH sample: 12,937
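
The summary figures above support a few quick derived quantities. This is only a back-of-the-envelope sketch using the numbers on the slide; the shared-SNP count additionally assumes that every one of the 23,590 SNPs was seen in at least one of the two samples, which the slide does not state.

```python
# Figures copied from the SeattleSNPs summary data (2/18/05).
genes_sequenced = 208
total_kb = 4408.78
snps_total = 23_590
snps_aa = 20_765
snps_ceph = 12_937

# Average SNP density and per-gene averages.
snps_per_kb = snps_total / total_kb
kb_per_gene = total_kb / genes_sequenced
snps_per_gene = snps_total / genes_sequenced

# ASSUMPTION: total = union of the AA and CEPH SNP sets, so by
# inclusion-exclusion the overlap is AA + CEPH - total.
shared_snps = snps_aa + snps_ceph - snps_total

print(f"SNPs per kb:   {snps_per_kb:.2f}")
print(f"kb per gene:   {kb_per_gene:.1f}")
print(f"SNPs per gene: {snps_per_gene:.1f}")
print(f"SNPs seen in both samples (under the assumption above): {shared_snps}")
```

This works out to roughly 5.35 SNPs per kilobase, or about one SNP every 190 bp on average across the resequenced regions.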

  13. From Genomics to Proteomics
      • Our ~25,000 genes carry the blueprint for making proteins, of which all living matter is made.
      • Each protein has a particular shape and function that determine its role in the body.
      • Proteomics is the study of protein shape, function, and patterns of expression.

  14. Anatomy of a gene
      [Diagram: DNA (5′ to 3′), pre-splicing RNA, post-splicing RNA, protein. Labeled elements: Enhancer; Promoter; Exon, coding; Exon, non-coding (5′ UTR, 3′ UTR); Intron; Polyadenylation.]

  15. Proteomics
      • Characterize proteins derived from genetic code
      • Compare variations in their expression levels under different conditions
      • Study their interactions
      • Identify their functional role

  16. Proteome Complexity
      • Recall that the genome is relatively static.
      • In contrast, many cellular proteins are continually moving and undergoing changes such as:
        --- binding to a cell membrane,
        --- partnering with another protein,
        --- gaining or losing a chemical group such as a sugar, fat, or phosphate, or
        --- breaking into two or more pieces.

  17. Size of Proteome?
      • > 1 million proteins >>> 25,000 genes in humans.
      • Large number due to complexity (a given gene can make many different proteins).
      • Features such as folds and motifs allow them to be categorized into groups and families.
      • This should help make it easier to undertake proteomic research.
      • But no proteome has yet been sequenced.

  18. How to Analyze Proteomes
      • Broad range of technologies
      • Central paradigm:
        --- 2-D gel electrophoresis (2D-GE) and mass spectrometry (MS).
        --- 2D-GE is used to separate the proteins by isoelectric point and then by size.
        --- MS determines their identity and characteristics.
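
The two-step separation described above (first by isoelectric point, then by size) can be pictured as a two-key ordering. The sketch below is a toy illustration of that idea, not a model of gel physics: the protein names and their pI and mass values are hypothetical, and a real gel places spots at continuous coordinates rather than in a sorted list.

```python
from typing import NamedTuple


class Protein(NamedTuple):
    name: str
    pi: float        # isoelectric point, in pH units (hypothetical values below)
    mass_kda: float  # molecular mass, in kilodaltons (hypothetical values below)


def gel_order(proteins):
    """Order proteins by the first dimension (pI), then by the second (mass)."""
    return sorted(proteins, key=lambda p: (p.pi, p.mass_kda))


# Hypothetical sample, for illustration only.
sample = [
    Protein("protA", 6.8, 45.0),
    Protein("protB", 5.2, 30.0),
    Protein("protC", 6.8, 20.0),
    Protein("protD", 9.1, 60.0),
]

for p in gel_order(sample):
    print(f"{p.name}: pI={p.pi}, mass={p.mass_kda} kDa")
```

Here protA and protC share the same pI and are told apart only by the second dimension, which is the point of running two separations instead of one.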

  19. Bioinformatics in Proteomics
      • Creation and maintenance of databases of protein info.
      • Development of methods to predict the structure and/or function of newly discovered proteins and structural RNA sequences.
      • Clustering protein sequences into families of related sequences and the development of protein models.
      • Aligning similar proteins and generating phylogenetic trees to examine evolutionary relationships.
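
As one concrete instance of "aligning similar proteins", the sketch below computes a global alignment score with the Needleman-Wunsch dynamic program. It is a simplified illustration, not what any particular tool does: the match/mismatch/gap values are arbitrary assumptions (real protein alignments typically use a substitution matrix and more elaborate gap penalties), and the short peptide strings are made up.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not a standard
    protein scoring scheme.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


# Made-up peptide fragments, for illustration only.
print(global_alignment_score("MKV", "MKV"))  # identical sequences
print(global_alignment_score("MKV", "MV"))   # one residue deleted
```

Pairwise scores like these are also the usual starting point for the clustering and tree-building mentioned above: sequences that score highly against each other are grouped together.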
