“Proteomics & Bioinformatics”

“Proteomics & Bioinformatics”. MBI, Master's Degree Program in Helsinki, Finland. Lecture 4. 10 May, 2007. Sophia Kossida, BRF, Academy of Athens, Greece; Esa Pitkänen, University of Helsinki, Finland; Juho Rousu, University of Helsinki, Finland. Proteomics and biology / Applications.

Presentation Transcript


  1. “Proteomics & Bioinformatics”. MBI, Master's Degree Program in Helsinki, Finland. Lecture 4, 10 May, 2007. Sophia Kossida, BRF, Academy of Athens, Greece; Esa Pitkänen, University of Helsinki, Finland; Juho Rousu, University of Helsinki, Finland

  2. Proteomics and biology / Applications. Databases & tools: • Protein expression profiling: identification of proteins in a particular sample as a function of a particular state of the organism or cell • Proteome mining: identifying as many as possible of the proteins in your sample • Post-translational modifications: identifying how and where the proteins are modified • Functional proteomics • Protein-protein interactions • Protein-network mapping: determining how the proteins interact with each other in living systems • Protein quantitation or differential analysis • Structural proteomics

  3. Databases and tools Melanie

  4. General workflow of proteomics analysis Proteins/peptides Digestion and/or separation 2D gel image acquisition and storage External data sources taxonomy, ontologies, bibliography… Applications Systems biology (pathways, interactions..) biomarker-discovery, drug targets MALDI, MS/MS Store peak lists and all meta data Identification Quantification PMF MS/MS DIGE LC-MS & Tags

  5. General workflow of proteomics analysis. Proteins/peptides; digestion and/or separation. • Make 2D • 2D PAGE databases: Swiss-2DPAGE, GelBank, Cornea 2D-PAGE, World-2DPAGE • Imaging tools: Melanie, PDQuest, Progenesis, Delta2D • Sequence databases: EMBL Nucleotide Sequence Database, GenBank, UniProtKB/Swiss-Prot & TrEMBL, Ensembl, EST database, PIR • External data sources: KEGG, PDB, DIP, OMIM, Reactome, PROSITE, Pfam, SPIN, BOND, STRING, AmiGO, DAVID, PubMed, MEDLINE • Storing/organising: ProteinScape, MSight • MALDI, MS/MS: Mascot, Sequest, Aldente, Popitam, Phenyx, FindMod, ProFound, PepFrag, MS-Fit, OMSSA, SearchXLinks, TagIdent • Identification • Quantification

  6. General workflow of proteomics analysis. Proteins/peptides; digestion and/or separation; Make 2D; 2D PAGE databases. • Imaging software: the ability to compare two gels (images) and then identify differentially expressed spots • Melanie • PDQuest • Progenesis • Delta2D • 2D gel databases: data integration on the web; image data and textual information • Swiss-2DPAGE • GelBank • Cornea 2D-PAGE • World-2DPAGE • ProteinScape: platform for storing and organizing data • MSight: representation of mass spectra along with data from the separation
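
The gel comparison step that imaging packages perform can be illustrated with a minimal sketch. It assumes that spots have already been detected and matched between two gels, so each gel is simply a mapping from a spot ID to a normalized intensity. The function name, spot IDs, intensity values, and the two-fold threshold are hypothetical choices for illustration and do not represent the method used by Melanie, PDQuest, Progenesis, or Delta2D.

```python
from math import log2


def differential_spots(gel_a, gel_b, min_fold=2.0):
    """Return matched spots whose intensity differs by at least min_fold.

    gel_a and gel_b map spot IDs to normalized intensities (assumed > 0).
    Spots present in only one gel are reported separately.
    """
    changed = {}
    for spot_id in gel_a.keys() & gel_b.keys():
        # log2 ratio: positive means higher in gel_b, negative means lower
        log_ratio = log2(gel_b[spot_id] / gel_a[spot_id])
        if abs(log_ratio) >= log2(min_fold):
            changed[spot_id] = round(log_ratio, 3)
    only_a = sorted(gel_a.keys() - gel_b.keys())
    only_b = sorted(gel_b.keys() - gel_a.keys())
    return changed, only_a, only_b


# Toy intensities, invented for illustration only.
control = {"s1": 100.0, "s2": 50.0, "s3": 80.0, "s4": 10.0}
treated = {"s1": 110.0, "s2": 200.0, "s3": 20.0, "s5": 30.0}

changed, only_control, only_treated = differential_spots(control, treated)
```

Working on log ratios treats up- and down-regulation symmetrically, which is why the threshold is compared against the absolute value.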

  7. 2D Gel Databases • Swiss-2DPAGE: www.expasy.ch • GelBank: http://www.gelscape.ualberta.ca:8080/htm/gdbIndex.html • Cornea 2D-PAGE: http://www.cornea-proteomics.com/ • World-2DPAGE, index of 2D gel databases: http://ca.expasy.org/ch2d/2d-index.html

  8. Swiss 2D PAGE viewer

  9. Gel bank

  10. Cornea

  11. World-2DPAGE http://ca.expasy.org/ch2d/2d-index.html

  12. Make 2D database. A software package to create, convert, publish, interconnect and keep up to date 2-DE databases, provided by ExPASy. The database is queryable via description, accession or spot clicking. Cross-references are provided to other federated 2D PAGE database entries, Medline and SWISS-PROT. Entries are linked to images showing the experimentally determined and theoretical protein locations. Search via clickable images or keywords. Data can be marked as public, or as fully or partially private. A highly secured administration Web interface makes external data integration, data export, data privacy control, database publication and version control very easy to perform. It runs on most UNIX-based operating systems (Linux, Solaris/SunOS, IRIX). Being continuously developed, the tool is evolving in concert with the current Proteomics Standards Initiative of the Human Proteome Organization (HUPO).

  13. Federated database: a collection of databases that are treated as one entity and viewed through a single user interface (pc.mag.com). • Robustness • Consistency • Maintenance of the database • Data quality. Limitations of current databases: • Do not contain strict/detailed descriptions of protocol (buffers, sample volume, staining techniques: all important information for gel comparisons). • Designed as 2D (and not proteomics) databases and therefore not readily expandable to incorporate other proteomics data, e.g. MS, MDLC. • Designed for reference gels, not ongoing projects.

  14. Guidelines for building a federated 2-DE database http://ca.expasy.org/ch2d/fed-rules.html • Individual entries in the database must be accessible by a keyword search. Other methods are possible but not required. • The database must be linked to other databases by active hypertext cross-references, linking together all related databases. Database entries must be linked at least to the main index. • A main index has to be supplied that provides a means of querying all databases through one unique query point. • Individual protein entries must be available through clickable images. • 2-DE analysis software designed for use with federated databases must be able to access individual entries in any federated 2-DE database. (For a complete reference, see Appel et al., Electrophoresis 17, 1996, 540-546.)
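
Three of these guidelines (keyword search on each database, cross-references between databases, and a main index acting as one unique query point) can be sketched in a few lines. This is only a toy in-memory model under stated assumptions: the class names `GelDatabase` and `MainIndex`, the database names, and the accessions are all hypothetical, and real federated 2-DE databases are linked over the web rather than held in one process.

```python
class GelDatabase:
    """Toy stand-in for one member of a federation of 2-DE databases."""

    def __init__(self, name, entries):
        self.name = name
        # entries: accession -> {"description": str, "xrefs": [(db, acc), ...]}
        self.entries = entries

    def search(self, keyword):
        """Keyword search over the entry descriptions of this database."""
        kw = keyword.lower()
        return [acc for acc, entry in self.entries.items()
                if kw in entry["description"].lower()]


class MainIndex:
    """Single query point that fans a query out to every member database."""

    def __init__(self, databases):
        self.databases = {db.name: db for db in databases}

    def search(self, keyword):
        hits = []
        for db in self.databases.values():
            hits.extend((db.name, acc) for acc in db.search(keyword))
        return sorted(hits)

    def resolve(self, db_name, accession):
        """Follow a cross-reference into another member database."""
        return self.databases[db_name].entries[accession]


# Hypothetical databases and accessions, invented for illustration.
db_a = GelDatabase("DB_A", {
    "A001": {"description": "Serum albumin, plasma gel",
             "xrefs": [("DB_B", "B007")]},
})
db_b = GelDatabase("DB_B", {
    "B007": {"description": "Albumin, liver gel", "xrefs": []},
    "B008": {"description": "Actin", "xrefs": []},
})
index = MainIndex([db_a, db_b])

hits = index.search("albumin")
linked = [index.resolve(db, acc)["description"]
          for db, acc in db_a.entries["A001"]["xrefs"]]
```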

  15. Image analysis software • ImageMaster 2D / Melanie • PDQuest (Bio-Rad, USA) • Progenesis (Nonlinear, UK) • Delta2D (Decodon, Germany)

  16. Melanie http://au.expasy.org/melanie/

  17. Melanie http://www.2d-gel-analysis.com/

  18. PDQuest http://www.bio-rad.com/

  19. Progenesis http://www.nonlinear.com/products/progenesis/

  20. Delta 2D http://www.decodon.com/Solutions/Delta2D/

  21. ProteinScape: platform for storing, organizing and analyzing data generated during the proteomics workflow. • Hierarchy: Project → Sample → Gel → Spots → MS Data → Search Events
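
The hierarchy named on this slide can be pictured as nested records. The sketch below is an assumption about how such a hierarchy might be modeled, not ProteinScape's actual data model; every class name, field name, and value is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SearchEvent:
    engine: str    # e.g. the search engine used for this event
    top_hit: str   # best-scoring protein identifier


@dataclass
class MSData:
    peak_list: List[float]
    searches: List[SearchEvent] = field(default_factory=list)


@dataclass
class Spot:
    spot_id: str
    ms_data: List[MSData] = field(default_factory=list)


@dataclass
class Gel:
    gel_id: str
    spots: List[Spot] = field(default_factory=list)


@dataclass
class Sample:
    name: str
    gels: List[Gel] = field(default_factory=list)


@dataclass
class Project:
    name: str
    samples: List[Sample] = field(default_factory=list)


def count_search_events(project):
    """Walk the hierarchy from Project down to Search Events and count them."""
    return sum(len(ms.searches)
               for sample in project.samples
               for gel in sample.gels
               for spot in gel.spots
               for ms in spot.ms_data)


# Toy project, invented for illustration.
ms = MSData(peak_list=[842.51, 1045.56],
            searches=[SearchEvent("engine_1", "toy_protein_1"),
                      SearchEvent("engine_2", "toy_protein_1")])
project = Project("demo", [Sample("plasma", [Gel("gel_1", [Spot("s1", [ms])])])])
n_events = count_search_events(project)
```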

  22. MSight Specifically developed for the representation of mass spectra along with data from the separation http://www.expasy.org/MSight

  23. General workflow of proteomics analysis. Proteins/peptides; digestion and/or separation; 2D gel image acquisition and storage. Sequence databases: EMBL Nucleotide Sequence Database, GenBank, UniProtKB/Swiss-Prot & TrEMBL, Ensembl, EST database, PIR. MALDI, MS/MS: store peak lists and all meta data. PMF, MS/MS, DIGE, LC-MS & Tags. Identification, Quantification

  24. EMBL Nucleotide Sequence Database. Collaboration between GenBank (USA), the DNA Data Bank of Japan (DDBJ) and EBI. Newly collected sequence data is exchanged, and each database is updated daily.

  25. EBI

  26. GenBank. GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. GenBank is available for searching at NCBI. Each entry includes a concise description of the sequence, the scientific name and the taxonomy of the source organism, and a table of features that identifies coding regions and other sites of biological significance, such as transcription units, sites of mutations or modifications, and repeats. Protein translations for coding regions are included in the feature table. Bibliographic references are included along with a link to the Medline unique identifier for all published sequences. http://www.psc.edu/general/software/packages/genbank/genbank.html
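
To make the entry layout concrete, the sketch below pulls the description, source organism, feature table, and sequence out of a GenBank-style flat-file record. The record is a toy invented for illustration (it is not a real GenBank entry), and the parser is deliberately simplified: it assumes one-line header fields and one-line qualifiers, which real entries often exceed. For real data an established parser is the better choice.

```python
TOY_RECORD = """\
LOCUS       TOY0001                   12 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Toy record invented for illustration.
ACCESSION   TOY0001
SOURCE      synthetic construct
  ORGANISM  synthetic construct
FEATURES             Location/Qualifiers
     source          1..12
     CDS             1..12
                     /translation="MAK"
ORIGIN
        1 atggccaaat aa
//
"""


def parse_genbank_record(text):
    """Extract a few fields from one simplified GenBank-style record."""
    record = {"definition": "", "organism": "", "features": [], "sequence": ""}
    section = "header"
    for line in text.splitlines():
        if line.startswith("//"):          # end-of-record marker
            break
        if line.startswith("FEATURES"):
            section = "features"
        elif line.startswith("ORIGIN"):
            section = "origin"
        elif section == "header":
            # Header keywords occupy the first 12 columns.
            key, value = line[:12].strip(), line[12:].strip()
            if key == "DEFINITION":
                record["definition"] = value
            elif key == "ORGANISM":
                record["organism"] = value
        elif section == "features":
            stripped = line.strip()
            if stripped.startswith("/") and record["features"]:
                # Qualifier line belonging to the most recent feature.
                name, _, value = stripped[1:].partition("=")
                record["features"][-1]["qualifiers"][name] = value.strip('"')
            elif stripped:
                key, location = stripped.split()[:2]
                record["features"].append(
                    {"key": key, "location": location, "qualifiers": {}})
        elif section == "origin":
            # Drop the leading position number, join the sequence blocks.
            record["sequence"] += "".join(line.split()[1:])
    return record


entry = parse_genbank_record(TOY_RECORD)
cds = [f for f in entry["features"] if f["key"] == "CDS"][0]
```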

  27. Search GenBank http://www.ncbi.nlm.nih.gov/Genbank/index.html

  28. DDBJ

  29. INSDC

  30. UniProt: Universal Protein Resource • Joins the information contained in UniProtKB/Swiss-Prot, UniProtKB/TrEMBL and PIR. • It comprises three components: • UniProt Knowledgebase (curated protein information, including function, classification, and cross-references) • UniProt Reference Clusters (combines closely related sequences into a single record to speed searches) • UniProt Archive (a repository reflecting the history of all protein sequences)
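
The idea behind reference clusters, merging related sequences into one record so that a search touches fewer entries, can be shown in its simplest form by grouping identical sequences. This is a deliberately reduced sketch: it merges only exact duplicates, whereas clustering of "closely related" sequences requires a sequence identity threshold and an alignment step that are not shown. The function name and accessions are hypothetical.

```python
from collections import defaultdict


def cluster_identical(sequences):
    """Group accessions that share exactly the same sequence.

    sequences maps an accession to its amino acid sequence.
    Returns a sorted list of clusters, each a sorted list of accessions.
    """
    clusters = defaultdict(list)
    for accession, seq in sequences.items():
        clusters[seq.upper()].append(accession)
    return sorted(sorted(members) for members in clusters.values())


# Toy sequences and accessions, invented for illustration.
toy = {
    "T1": "MAKLV",
    "T2": "makLV",   # same sequence, different letter case
    "T3": "MAKLI",
}
clusters = cluster_identical(toy)
```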

  31. ExPASy Proteomics Server (Expert Protein Analysis System). The proteomics server of the Swiss Institute of Bioinformatics (SIB, http://www.isb-sib.ch/) is dedicated to the analysis of protein sequences and structures as well as 2D-PAGE. http://ca.expasy.org/

  32. UniProtKB/Swiss-Prot. The UniProtKB/Swiss-Prot Protein Knowledgebase is an annotated protein sequence database established in 1986. It is maintained collaboratively by the SIB (Swiss Institute of Bioinformatics) and the European Bioinformatics Institute (EBI). http://ca.expasy.org/sprot/

  33. Swiss Prot

  34. TrEMBL • UniProtKB/TrEMBL is a computer-annotated protein sequence database complementing the UniProtKB/Swiss-Prot Protein Knowledgebase. • It contains the translations of all coding sequences (CDS) present in the EMBL/GenBank/DDBJ Nucleotide Sequence Databases and also protein sequences extracted from the literature or submitted to UniProtKB/Swiss-Prot. • The database is enriched with automated classification and annotation.
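
Translating a coding sequence into protein is the core operation behind such a database. The sketch below uses the standard genetic code only and assumes the CDS starts in frame at position 1; alternative genetic codes, partial codons, and annotation are outside its scope. The function name is an illustrative choice.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_cds(cds):
    """Translate a coding sequence, stopping at the first stop codon.

    Codons containing ambiguous bases are translated as 'X'.
    A trailing partial codon is ignored.
    """
    cds = cds.upper().replace("U", "T")   # accept RNA input as well
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


protein = translate_cds("atggccaaataa")
```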

  35. PIR http://pir.georgetown.edu/pirwww/

  36. ESTdb: Expressed Sequence Tags. An EST is a unique DNA sequence within a coding region of a gene that is useful for identifying full-length genes and serves as a landmark for mapping. dbEST is a division of GenBank that contains sequence data and other information on “single-pass” cDNA sequences from a number of organisms. http://www.ncbi.nlm.nih.gov/dbEST/

  37. Ensembl. Ensembl is a joint project between the EMBL-EBI and the Wellcome Trust Sanger Institute that aims at developing a system that maintains automatic annotation of large eukaryotic genomes. Access to all the software and data is free and without constraints of any kind. http://www.ebi.ac.uk/ensembl/

  38. IPI- International Protein Index

  39. General workflow of proteomics analysis. Proteins/peptides; digestion and/or separation; 2D gel image acquisition and storage. Tools: Mascot, Sequest, Aldente, Popitam, Phenyx, FindMod, ProFound, PepFrag, MS-Fit, OMSSA, SearchXLinks, TagIdent. MALDI, MS/MS: store peak lists and all meta data. PMF, MS/MS, DIGE, LC-MS & Tags. Identification, Quantification

  40. Proteomics tools http://restools.sdsc.edu/biotools/biotools19.html http://ca.expasy.org/tools/

  41. PROWL

  42. Identification and Characterization Tools • PMF data: Mascot (Matrix Science), Aldente (ExPASy), ProFound (Rockefeller University), MS-Fit (Prospector; UCSF) • MS/MS data: Sequest, Mascot, OMSSA, X!Hunter

  43. Identification and Characterization Tools • Popitam (ExPASy, SIB) • Phenyx (GeneBio, Switzerland) • PepFrag (Rockefeller University, USA) • SearchXLinks (Caesar, Germany)

  44. Popitam. Popitam is designed to characterize peptides with unexpected modifications (e.g. post-translational modifications or mutations) by tandem mass spectrometry (ExPASy, SIB). http://expasy.org/cgi-bin/popitam/help.pl

  45. Popitam results

  46. Phenyx Phenyx is a software platform for the identification and characterization of proteins and peptides from mass spectrometry data. Developed by GeneBio in collaboration with SIB http://www.phenyx-ms.com/about/about_phenyx.html

  47. PEPFRAG Searches known protein sequences with peptide fragment mass information http://prowl.rockefeller.edu/
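
Searching sequences with mass information can be illustrated in its simplest form: digest each candidate protein in silico, compute the peptide masses, and count how many observed masses fall within a tolerance. This sketch is closer to peptide mass fingerprinting than to a fragment-ion search and is not PepFrag's algorithm. It assumes trypsin specificity (cleavage after K or R, not before P), no missed cleavages, no modifications, and neutral monoisotopic masses; the protein names, sequences, and tolerance are hypothetical.

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[r] for r in peptide) + WATER


def match_masses(observed, proteins, tolerance=0.01):
    """Count, per protein, the observed masses matching a tryptic peptide."""
    hits = {}
    for name, seq in proteins.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(seq)]
        hits[name] = sum(
            1 for m in observed
            if any(abs(m - t) <= tolerance for t in theoretical))
    return hits


# Toy proteins and "observed" masses, invented for illustration.
toy_proteins = {"toy1": "GKAR", "toy2": "VVVK"}
observed = [peptide_mass("GK"), peptide_mass("AR")]
hits = match_masses(observed, toy_proteins)
```

Ranking proteins by the number of matched masses is the crudest possible score; production search engines use probabilistic scoring instead.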

  48. SearchXLinks http://www.searchxlinks.de/ Analysis of mass spectra of modified, cross-linked, and digested proteins, the amino acid sequence of which is known

  49. Identification and Characterization Tools • FindMod predicts potential protein post-translational modifications (PTM) and finds potential single amino acid substitutions in peptides. • FindPept identifies peptides that result from unspecific cleavage of proteins from experimental masses, taking into account artefactual chemical modifications, post-translational modifications (PTM) and protease autolytic cleavage. • GlycoMod predicts possible oligosaccharide structures that occur on proteins from their experimentally determined masses. • AACompIdent achieves identification with amino acid composition. • TagIdent identifies proteins with isoelectric point (pI), molecular weight (Mw), and sequence tag, generating a list of proteins close to a given pI and Mw. • MultiIdent achieves cross-species identification with multiple parameters (pI, Mw, sequence tag and peptide mass fingerprinting data). http://au.expasy.org/tools/findmod/
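
The TagIdent idea of listing proteins close to a given pI and Mw, optionally narrowed by a sequence tag, amounts to a windowed filter. The sketch below assumes that pI and Mw have already been computed for every database entry; the function name, the window sizes, and the toy entries are hypothetical and do not reproduce the tool's actual parameters or scoring.

```python
def tagident_like_search(proteins, pi, mw, tag="",
                         pi_window=0.5, mw_percent=10.0):
    """Return IDs of proteins near the given pI and Mw that contain the tag.

    proteins: list of dicts with "id", "pI", "Mw" (Da) and "sequence".
    pi_window: allowed absolute deviation in pI units.
    mw_percent: allowed deviation in Mw, as a percentage of the query Mw.
    """
    mw_delta = mw * mw_percent / 100.0
    hits = []
    for protein in proteins:
        if abs(protein["pI"] - pi) > pi_window:
            continue
        if abs(protein["Mw"] - mw) > mw_delta:
            continue
        if tag and tag.upper() not in protein["sequence"].upper():
            continue
        hits.append(protein["id"])
    return hits


# Toy entries with made-up pI and Mw values, for illustration only.
toy_db = [
    {"id": "toyA", "pI": 5.6, "Mw": 42000.0, "sequence": "MDDDIAALVVDNGSG"},
    {"id": "toyB", "pI": 5.8, "Mw": 45000.0, "sequence": "MKWVTFISLLLLFSS"},
    {"id": "toyC", "pI": 9.1, "Mw": 43000.0, "sequence": "MDDDIAALKRRRQQQ"},
]

near = tagident_like_search(toy_db, pi=5.7, mw=43000.0)
tagged = tagident_like_search(toy_db, pi=5.7, mw=43000.0, tag="AALV")
```

A relative Mw window reflects the fact that mass estimates from gels become less precise in absolute terms as mass increases.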