Introduction


  1. Introduction • Day 1: Introduction • Day 2: Sequence Analysis • Day 3: Databases • Day 3: Dynamic Programming mario@sanbi.ac.za

  2. Goals of Bioinformatics • Understand living cells and how they function on a molecular level • Done by analysing molecular sequence and structural data • Rationale is the “central dogma” of biology

  3. Genomic Data (2009) http://www.ncbi.nlm.nih.gov/sites/entrez?db=genome

  4. Genomic Data (2010) http://www.ncbi.nlm.nih.gov/sites/entrez?db=genome

  5. Bioinformatics Limitations • Relying completely on the information is dangerous if the information is inaccurate • Quality of bioinformatics predictions depends on the quality of the data and the sophistication of the algorithms • Bioinformatics and experimental biology are complementary: bioinformatics results need to be consistent with experimental biology

  6. Bioinformatics Limitations • Data (e.g. sequence, expression) may contain errors • Downstream interpretation of sequence data will be wrong if the sequences or their annotation are wrong • Many algorithms lack the capability and sophistication to truly reflect reality • Outcome of computation also depends on available computing power

  7. Definitions • Sequence alignment • Dynamic Programming • Global/ Local Alignment • Sequence Identity • Phylogenetics • Paralog/ homolog • Proteomics • Genomics • Transcriptomics • Annotation • BLAST • Sequence assembly • Contig

  8. ‘Omics’ • Genomics • Proteomics • Transcriptomics • Phylogenomics etc. • Genomics is divided into: Structural, Functional

  9. Structural Genomics • Deals with genome structures • Focus on study of • Genome mapping • Genome sequencing and assembly • Genome annotation • Genome comparison

  10. Structural Genomics: Genome mapping • Identify relative locations of genes, mutations or traits

  11. Structural Genomics: Genome mapping • Map types in order of increasing resolution: Cytological Map, Genetic Map, Physical Map, DNA Sequence • Image adapted from “Essential Bioinformatics” by Jin Xiong • For more info: see Chap 5 in “Genomes”, T.A. Brown (572.86 MAL) in the UWC Library, or the online version at http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=genomes.chapter.6196

  12. Structural Genomics: Genome mapping (same figure as slide 11)

  13. Structural Genomics: Genome mapping (same figure as slide 11, with positions marked: * * * * *)

  14. Structural Genomics: Genome mapping (same figure as slide 11, with positions marked and an example DNA sequence: agctggatttgcgcgcaa)

  15. Structural Genomics: Genome sequencing • Shotgun sequencing • Genome is fragmented and cloned • Both ends of the cloned DNA are sequenced at random • The high number of random sequences statistically ensures that the whole genome is covered • Software is used to assemble the random fragments into a single, contiguous genome
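
The claim that many random reads statistically cover the genome can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the function names, genome length and read length are illustrative assumptions, and the formula used for comparison is the common Poisson approximation (fraction covered is roughly 1 - e^(-c), where c is the number of reads times the read length, divided by the genome length).

```python
import math
import random

def simulate_coverage(genome_len, read_len, n_reads, seed=0):
    """Fraction of genome positions hit by at least one randomly placed read."""
    rng = random.Random(seed)
    covered = [False] * genome_len
    for _ in range(n_reads):
        # Pick a random start so the read fits inside the genome.
        start = rng.randrange(genome_len - read_len + 1)
        for i in range(start, start + read_len):
            covered[i] = True
    return sum(covered) / genome_len

def expected_coverage(genome_len, read_len, n_reads):
    """Poisson approximation: covered fraction ~ 1 - exp(-c), c = N * L / G."""
    c = n_reads * read_len / genome_len
    return 1.0 - math.exp(-c)

if __name__ == "__main__":
    # Hypothetical toy numbers: 10 kb genome, 100 bp reads, 8x average depth.
    print(simulate_coverage(10_000, 100, 800))
    print(expected_coverage(10_000, 100, 800))
```

The simulation places reads on a linear genome, so positions near the ends are covered slightly less often than the approximation suggests.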

  16. Structural Genomics: Shotgun sequencing http://www.scq.ubc.ca/wp-content/uploads/2006/08/shotgun1.gif

  17. Structural Genomics: Genome sequencing • Hierarchical sequencing • Genomic fragments of 100-300 kb are cloned into BACs (bacterial artificial chromosomes) • Using a physical map, the order and locations of the BAC clones on the chromosome can be determined • Successive sequencing of adjacent BAC clones results in coverage of the complete genome

  18. Structural Genomics: Hierarchical sequencing http://www.scq.ubc.ca/wp-content/uploads/2006/08/topdownseq.gif

  19. Structural Genomics: Shotgun vs Hierarchical • Figure panels: Shotgun, Hierarchical

  20. Structural Genomics: Genome assembly • Sequence fragments are stitched together through the overlapping sequences between fragments • Example fragments from the slide figure: CCAATAA CACCATT TATAAT AATTGGCA TTGAATA
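
Stitching fragments together through their overlaps can be sketched as a greedy merge: repeatedly join the two fragments with the longest suffix-to-prefix overlap. This is a toy illustration under stated assumptions, not the method of any particular assembler. It assumes error-free reads from one strand, and the names `overlap`, `greedy_assemble` and `min_len` are hypothetical.

```python
def overlap(a, b, min_len=2):
    """Length of the longest suffix of a that is also a prefix of b.

    Overlaps shorter than min_len are ignored and reported as 0.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(fragments, min_len=2):
    """Greedily merge the pair with the largest overlap until none remains."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left: the remaining fragments are separate contigs
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags

if __name__ == "__main__":
    # Three hypothetical reads taken from the sequence ATGGCGTGCAATG.
    print(greedy_assemble(["ATGGCGT", "GCGTGCA", "TGCAATG"]))
```

Greedy merging can be misled by repeats, and this sketch does not handle sequencing errors, reverse complements or fragments fully contained in others.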

  21. Structural Genomics: Genome annotation • Happens before submission to a database • Gene prediction: GenScan, FgenesH • Verify predictions • BLAST search against a sequence database • Compare to experimentally determined cDNA and EST sequences: GeneWise, Spidey, SIM4, EST2Genome • Manual checking by human curators • Functional assignment • BLAST homology search against a protein database • Search protein motif and domain databases: Pfam and InterPro

  22. Structural Genomics: Genome annotation http://hinvlite.sanbi.ac.za

  23. Structural Genomics: Genome comparison • Comparison of gene number, gene location and gene content • Reveals extent of conservation between genomes • Reveals core set of genes crucial for survival: the “Minimal Genome”
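
Once genes have been assigned comparable identifiers across genomes, the core set shared by all of them is a set intersection. The sketch below assumes that this matching step has already been done, which is the hard part in practice. The organism labels and the function name `core_genes` are made up for illustration, and the gene names are placeholders only.

```python
def core_genes(genomes):
    """Return the genes present in every genome.

    `genomes` maps a genome name to an iterable of gene identifiers that
    are assumed to be comparable across genomes.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)

if __name__ == "__main__":
    # Hypothetical gene content for three hypothetical genomes.
    genomes = {
        "organism_A": ["dnaA", "rpoB", "gyrA", "lacZ"],
        "organism_B": ["dnaA", "rpoB", "gyrA", "nifH"],
        "organism_C": ["dnaA", "rpoB", "recA"],
    }
    print(sorted(core_genes(genomes)))
```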

  24. Structural Genomics: Genome comparison http://www.sanger.ac.uk/Software/ACT/

  25. Functional Genomics • Focus on gene function at the genome level, using high-throughput methods • Conducted using sequence-based and microarray-based methods

  26. Functional Genomics: Sequence-based • Expressed Sequence Tags (ESTs) • Provide a rough estimate of the genes actively expressed under specific physiological conditions • Serial Analysis of Gene Expression (SAGE) • Provides quantitative analysis of mRNA expression • The occurrence and quantity of a specific fragment indicate the level of gene expression

  27. Functional Genomics: ESTs • Selected mRNA sequences are reverse transcribed into cDNA clones • cDNA clones are then sequenced • Sequences are obtained from the 5’ or 3’ end • Typically 500 bp long • http://www.ncbi.nlm.nih.gov/About/primer/est.html

  28. Functional Genomics: ESTs • EST Limitations • Often low quality • Contamination (vector) • Chimeras • Represent partial genes • Despite this, ESTs are still widely used (www.ncbi.nlm.nih.gov/dbEST)

  29. Functional Genomics • EST gene index construction • Organise and consolidate ESTs so that the data can be used to extract full-length cDNAs • Remove contaminants • Mask repeats • Cluster sequences • Within a cluster, assemble overlapping ESTs into contigs/consensus sequences • Annotation: similar to the process for a genome • Examples: UniGene, StackPack, TGI

  30. Functional Genomics: SAGE • A short DNA fragment (15-20 bp) is cut from a cDNA and used as a unique marker for that transcript • Fragments are concatenated, cloned and sequenced • http://www.sagenet.org/protocol/MANUAL1e.pdf
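
The idea that tag counts indicate expression level can be sketched by extracting one tag per cDNA and tallying the tags. This is a simplified illustration, not the laboratory protocol: it assumes the tag lies immediately downstream of the 3’-most CATG site (the recognition site of NlaIII, a commonly used anchoring enzyme), and the function names and default tag length are illustrative choices.

```python
from collections import Counter

def sage_tag(cdna, anchor="CATG", tag_len=15):
    """Return the tag directly downstream of the 3'-most anchor site.

    Returns None if the anchor is absent or too close to the 3' end.
    """
    pos = cdna.rfind(anchor)
    if pos == -1:
        return None
    start = pos + len(anchor)
    tag = cdna[start:start + tag_len]
    return tag if len(tag) == tag_len else None

def count_tags(cdnas, anchor="CATG", tag_len=15):
    """Tally tags over a collection of cDNA sequences."""
    tags = (sage_tag(c, anchor, tag_len) for c in cdnas)
    return Counter(t for t in tags if t is not None)

if __name__ == "__main__":
    # Hypothetical cDNAs; a short tag length keeps the toy example readable.
    cdnas = ["AAACATGTTTTCATGACGTACGTAA"] * 2 + ["GGGGGGGG"]
    print(count_tags(cdnas, tag_len=5))
```

A higher count for a tag suggests a more abundant transcript, provided the tag is unique to that transcript.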

  31. Functional Genomics: Microarrays • Immobilised probes (oligonucleotides or cDNA) are ‘spotted’ on a chip • Probes are representative of a complete genome • Fluorescent cDNA from the organism is allowed to hybridise with the probes • Intensity of fluorescence per spot reflects the amount of mRNA present
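
Because spot intensity reflects mRNA abundance, two conditions are often compared through the log2 ratio of their intensities per spot. The sketch below assumes a two-sample design (sample versus control), which the slide does not specify. The function name, the `floor` parameter and the intensity values are illustrative, and real analyses also apply background correction and normalisation.

```python
import math

def log2_ratios(sample, control, floor=1.0):
    """Log2 intensity ratio (sample / control) for spots present in both.

    Intensities below `floor` are raised to `floor` to avoid dividing by
    zero or taking the logarithm of zero.
    """
    ratios = {}
    for spot, s_val in sample.items():
        if spot in control:
            s = max(s_val, floor)
            c = max(control[spot], floor)
            ratios[spot] = math.log2(s / c)
    return ratios

if __name__ == "__main__":
    # Hypothetical fluorescence intensities for two spots.
    sample = {"gene1": 800.0, "gene2": 100.0}
    control = {"gene1": 200.0, "gene2": 400.0}
    print(log2_ratios(sample, control))
```

A positive value indicates a brighter spot in the sample than in the control, and a negative value the reverse.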

  32. Proteomics: Technology • 2D-PAGE gel: separates proteins based on charge and mass • Tools: Melanie, CAROL, Comp2Dgel, SWISS-2DPAGE • Mass Spectrometry (MS): the peptide is fragmented, aspirated and the mass-to-charge ratio is determined • Database searching: using the peptide fingerprint obtained from MS, a database can be searched • ExPASy: AAcompIdent, TagIdent, PeptIdent, CombSearch • ProFound, Mascot
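
Searching a database with a peptide fingerprint amounts to comparing observed peptide masses with the masses predicted for each candidate protein. The sketch below shows one such comparison under simplifying assumptions: digestion by trypsin (cleavage after K or R, but not before P), no missed cleavages, no modifications, and neutral monoisotopic masses. It does not reproduce the scoring of any of the tools named above, and the function names and tolerance are illustrative. It needs Python 3.7 or later, because `re.split` is used with a zero-width pattern.

```python
import re

# Monoisotopic residue masses in daltons (standard values, 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water per free peptide (N-terminal H, C-terminal OH)

def tryptic_digest(protein):
    """Split after every K or R that is not followed by P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def match_fingerprint(observed_masses, protein, tol=0.5):
    """Count observed masses within tol Da of a predicted tryptic peptide."""
    predicted = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(
        any(abs(m - t) <= tol for t in predicted) for m in observed_masses
    )

if __name__ == "__main__":
    protein = "GAKLLRPMKE"  # hypothetical sequence
    print(tryptic_digest(protein))
    print(match_fingerprint([274.2, 500.0], protein))
```

A database search would repeat this comparison for every protein in the database and rank proteins by how well their predicted peptides explain the observed masses.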

  33. Proteomics: Technology • Differential In-gel Electrophoresis (DIGE) • Proteins from experimental and control samples are labeled with different colored dyes • Differentially expressed proteins can be co-separated and visualised on the same gel

  34. Proteomics: Technology • Protein Microarrays • Chip contains immobilised proteome • Used to study protein function • Assays: protein-protein interactions, protein-DNA/RNA interactions, protein-ligand interactions, enzyme activity

  35. Proteomics: Post-translational Modifications • For activity, many proteins have to be covalently modified before or after the folding process • Proteolytic cleavage, formation of disulfide bonds, addition of phosphoryl, methyl, acetyl groups, etc. • Modifications impact protein function • Bioinformatics can predict sites for modification • AutoMotif, Cysteine, FindMod and GlyMod (available from ExPASy), RESID
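
One simple way to predict candidate modification sites is to scan a protein for a known sequence motif. The sketch below uses the N-glycosylation sequon N-{P}-[ST]-{P} (asparagine, then any residue except proline, then serine or threonine, then any residue except proline) as an example. It is an illustrative choice and does not represent how the tools listed above work. The function name and test sequence are hypothetical, and a motif match only marks a candidate site, not a confirmed modification.

```python
import re

# Lookahead so that overlapping candidate sites are all reported.
N_GLYC_SEQUON = re.compile(r"(?=N[^P][ST][^P])")

def n_glycosylation_sites(protein):
    """Return 1-based positions of asparagines that start an N-{P}-[ST]-{P} match."""
    return [m.start() + 1 for m in N_GLYC_SEQUON.finditer(protein)]

if __name__ == "__main__":
    # Hypothetical sequence: N at position 7 is followed by P, so it is skipped.
    print(n_glycosylation_sites("MNGSAANPTLNQTK"))
```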

  36. Proteomics: Protein Sorting • Sub-cellular localisation is integral to protein function • Many proteins are only active after being transported to specific compartments • Identifying protein localisation is important in functional annotation • SignalP, TargetP, PSORT

  37. Proteomics: Protein-protein Interactions • Experimental determination • Prediction based on • Domain fusion • Gene neighbours • Sequence homology • Phylogenetic information • Hybrid methods
