Proteomics Jen, Mona & Krishna
Introduction • What is the proteome? • The proteome is the entire complement of proteins, including the modifications made to a particular set of proteins, produced by an organism or system at a particular time and under particular conditions. • It varies with time and with the distinct requirements, or stresses, that a cell or organism undergoes.
What is proteomics? • Proteomics is the large-scale study of proteins, particularly their functions and structures. • A short list of protein modifications that might be studied under proteomics includes: • phosphorylation • ubiquitination • methylation • acetylation • glycosylation • oxidation • nitrosylation, etc.
Why proteomics? • Gives a better understanding of an organism than genomics alone. • Limitations of genomics that make proteomics a better approach: 1. The level of transcription of a gene gives only a rough estimate of its level of expression into a protein. 2. Many transcripts give rise to more than one protein, through alternative splicing or alternative post-translational modifications. 3. Many proteins form complexes with other proteins or RNA molecules, and only function in the presence of these other molecules.
4. Proteins experience post-translational modifications that profoundly affect their activities. 5. Protein degradation rate plays an important role in protein content. • Any cell may make different sets of proteins at different times or under different conditions, and any one protein can undergo a wide range of post-translational modifications. • Proteomics is therefore a more informative approach than genomics, but also a more complex one.
Branches of proteomics • Proteomics analysis: determining which proteins are post-translationally modified • Expression proteomics: profiling of expressed proteins using quantitative methods • Cell mapping proteomics: identification of protein complexes
Methods 1. Gel-based proteomics (2DE): • Older approach. • Separates proteins according to charge (isoelectric point) in the first dimension and according to size in the second dimension. • Proteins are commonly separated using polyacrylamide gel electrophoresis (PAGE). • Identifies individual proteins in complex samples or multiple proteins in a single sample.
2. Mass spectrometry based proteomics: • Gives highly accurate mass measurements, even for very low-mass molecules such as peptides. • Proteins are cleaved into peptides with a protease, and the peptide masses are measured with a mass spectrometer (e.g. TOF). • The mass spectrum of the peptides is converted to a list of peptide masses, which is searched against protein sequence databases (or translated genome databases). • Since each protein has a characteristic peptide mass fingerprint, the peptide masses can identify the protein in the database.
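The peptide mass fingerprinting steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production search engine: it assumes trypsin as the protease (the slides only say "protease"), uses a simplified cleavage rule (cut after K or R unless followed by P, no missed cleavages), and searches a toy in-memory database whose protein names and sequences are hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide (H on N-term, OH on C-term)


def tryptic_digest(sequence):
    """Split a protein after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fingerprint(sequence):
    """Sorted list of theoretical peptide masses for one protein."""
    return sorted(peptide_mass(p) for p in tryptic_digest(sequence))


def score(observed, sequence, tol=0.2):
    """Count observed masses that match a theoretical mass within tol Da."""
    theoretical = fingerprint(sequence)
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def identify(observed, database, tol=0.2):
    """Return the database entry with the most matching peptide masses."""
    return max(database, key=lambda name: score(observed, database[name], tol))


# Hypothetical toy database; real searches use full sequence databases.
DATABASE = {
    "protein_A": "MKTAYIAKQRGLSPEK",
    "protein_B": "GASPVTKCLINDRQEMHFK",
}

if __name__ == "__main__":
    observed = fingerprint(DATABASE["protein_B"])  # stand-in for a measured spectrum
    print(identify(observed, DATABASE))
```

Real search tools also handle missed cleavages, modifications, charge states and statistical scoring; the simple match count here only shows the principle.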
3. Protein arrays • The idea is similar to cDNA arrays. • A substrate is bound to the surface of the array. • The sample is introduced and binding takes place. • Detection and analysis follow. • Protein-protein, protein-DNA or protein-RNA interactions can be analysed.
Applications • Identification of potential new drugs for the treatment of diseases: this relies on genome and proteome information to identify proteins associated with a disease, which computer software can then use as targets for new drugs. • Biomarkers: a number of techniques make it possible to test for proteins produced during a particular disease, which helps to diagnose the disease quickly.
Examples of biomarkers • Alzheimer's disease: elevations in beta secretase create amyloid/beta-protein; targeting this enzyme decreases the amyloid/beta-protein and slows the progression of the disease. • Heart disease: standard protein biomarkers for CVD include interleukin-6, interleukin-8, serum amyloid A protein, fibrinogen, and troponins.
Introduction – Current State • Many different informational protein databases available online • Most databases are focused on protein identification • Research community provides the data that drives the database contents • Validation of Mass Spec data • Single vs. Multiple Species Support
Overview of Databases • NCBI – Protein / Peptidome • Human Gene and Protein Database (HGPD) • Human Proteinpedia / Human Protein Reference Database (HPRD) • Dynamic Proteomics • Open Proteomics Database • Global Proteome Machine Database • Peptide Atlas • Proteomics Identifications Database (PRIDE) • UniProt Knowledgebase
NCBI – Protein / Peptidome • Two databases contained in the Entrez suite • Multi-species result sets • Protein • Provides gene information pertaining to the expressed protein queried • Peptidome • Mass Spec based protein identification database • Experiment based result sets
Human Gene and Protein Database (HGPD) • Several cDNA contributors, spanning the globe • Gateway Expression System • Allows for a reproducible clone library. Clones are available for purchase. • Wheat germ cell-free protein synthesis • Protein expression portion of the database. Allows for visualization of the SDS-PAGE results.
Human Proteinpedia / Human Protein Reference Database (HPRD) • Modeled after Wikipedia • Users submit and edit the data in the database • Differences from Wikipedia • The original submitter is expected to provide experimental evidence for the data • Only the original submitter can edit that specific data later. • Allows several protein features to be annotated • Post-translational modification • Tissue expression • Cell line expression • Subcellular localization • Enzyme substrates • Protein-protein interactions
Human Proteinpedia / Human Protein Reference Database (HPRD) • No visual protein expression data • Protein amino acid sequence given • Raw and processed mass spec files are available as experimental evidence • Provides links to the protein in other databases
Dynamic Proteomics • A different type of database, focusing on the dynamics of proteins in cells treated with an anti-cancer drug • Shows different uses for data repositories for proteomics • Not just an all-encompassing data source with generic data. • Uses simple databases and web front ends to make more specific types of data available to the community. • Also provides links to other databases • Can compare multiple sequences at once to search the cDNA library.
Dynamic Proteomics • Time-lapse microscopy movies illustrate the protein dynamics in individual living human cancer cells in response to an anti-cancer drug. • (Time-lapse video)
Open Proteomics Database • University of Texas • Multi-species results • Smaller pool of data submitted for query
Global Proteome Machine Database • Private industry involvement • Mass Spec Validation • Protein Identification • Utilizes data from other databases • Differs from the scheme of just linking to other protein databases
Peptide Atlas • Seattle Proteome Center • Focused on subset of human proteins • Heart, Lung, Blood • Funded by NIH • Part of the Trans-Proteomic Pipeline software suite
Proteomics Identifications Database (PRIDE) • One of the earlier proteomic databases • European Bioinformatics Institute • Larger selection of species specific data • Java based, available for local deployment
UniProt Knowledgebase • Swiss Institute of Bioinformatics • Also curated by European Bioinformatics Institute • Funded by NIH • Forced the conversion of earlier non-public versions to become free and open
Overview of Tools • ExPASy Proteomics Server • Trans-Proteomic Pipeline
ExPASy Proteomics Server • Swiss Institute of Bioinformatics tool suite • Protein identification by amino acid sequence • Isoelectric point computation • Prediction of post-translational modifications and amino acid substitutions • Prediction of protein cleavage sites • Protein identification by molecular weight
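As an illustration of what an isoelectric point computation involves, the sketch below estimates the pI of a sequence by bisection on the net charge given by the Henderson-Hasselbalch equation. It is not the ExPASy implementation: the pKa values are illustrative textbook-style assumptions, and the function names are hypothetical.

```python
# Illustrative pKa values (assumptions; published tables differ slightly).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}             # protonated form carries +1
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}    # deprotonated form carries -1
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence, ph):
    """Net charge of an unmodified peptide at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # N-terminal amine
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # C-terminal carboxyl
    for aa in sequence:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(sequence, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH where the net charge is zero.

    Net charge decreases monotonically with pH, so bisection converges.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid   # still positive: the pI lies at a higher pH
        else:
            hi = mid   # negative or zero: the pI lies at a lower pH
    return (lo + hi) / 2.0


if __name__ == "__main__":
    for seq in ("KKKK", "DDDD", "MKTAYIAKQRGLSPEK"):
        print(seq, round(isoelectric_point(seq), 2))
```

The same pI value is what the first dimension of 2DE separates on, so a lysine-rich peptide and an aspartate-rich peptide would focus at opposite ends of the pH gradient.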
Trans-Proteomic Pipeline • Seattle Proteome Center
Challenges • Large number of data sources • Parallel efforts • Validation of Mass Spec data
Future Considerations • Selection of a few ‘primary’ data repositories • Consolidation of multiple redundant efforts being funded by the same agency • Particularly NIH • Data standards to streamline the submission of results into multiple data sources. • Reduction of the need to perform many searches to find information about a protein • mzXML is a start, but only covers mass spec data
Database References • NCBI Protein: http://www.ncbi.nlm.nih.gov/protein/ • NCBI Peptidome: http://www.ncbi.nlm.nih.gov/pepdome • Human Gene and Protein Database (HGPD): http://riodb.ibase.aist.go.jp/hgpd/cgi-bin/index.cgi • Human Proteinpedia: http://www.humanproteinpedia.org/index_html • Human Protein Reference Database (HPRD): http://www.hprd.org/ • Dynamic Proteomics: http://alon-serv.weizmann.ac.il/dynamprotb/seqsrch • Open Proteomics Database: http://bioinformatics.icmb.utexas.edu/OPD/ • Global Proteome Machine Database: http://thegpm.org • Peptide Atlas: http://www.peptideatlas.org/ • Proteomics Identifications Database (PRIDE): http://www.ebi.ac.uk/pride/ • UniProt Knowledgebase: http://www.uniprot.org/
Tool References • ExPASy Proteomics Server: http://www.expasy.ch/ • Trans-Proteomic Pipeline: http://tools.proteomecenter.org/wiki/index.php?title=Software:TPP
Applications of Proteomics Mona Motwani
Discovery of protein biomarkers • A biomarker can be defined as any laboratory measurement or physical sign used as a substitute for a clinically meaningful end point that measures directly how a patient feels, functions or survives. As applied to proteomics, a biomarker is an identified protein (or set of proteins) that is unique to a particular disease state. • Biomarkers of drug efficacy and toxicity are becoming a key need in the drug development process. • Mass spectral-based proteomic technologies are ideally suited for the discovery of protein biomarkers in the absence of any prior knowledge of quantitative changes in protein levels. • The success of any biomarker discovery effort will depend upon the quality of the samples analysed, the ability to generate quantitative information on relative protein levels and the ability to readily interpret the data generated.
Study of Tumor Metastasis and Cancers • Identifying protein molecules whose expression correlates with the metastatic process helps to understand metastatic mechanisms and thus facilitates the development of strategies for therapeutic intervention and clinical management of cancer. • Information contained within proteomic patterns has been demonstrated to detect ovarian, breast and prostate cancers with sensitivities and specificities greater than 90%.
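For readers unfamiliar with the two figures quoted above, the following sketch shows how sensitivity and specificity are computed from the results of a diagnostic test. The function name and the patient counts are hypothetical and chosen only for illustration; they are not data from the studies the slide refers to.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for paired boolean labels.

    truth[i] is True if sample i really has the disease;
    predicted[i] is True if the proteomic pattern classified it as diseased.
    """
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)          # true positives
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)      # false negatives
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)  # true negatives
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy sample")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical cohort: 50 cancer samples (47 detected), 66 controls (63 cleared).
    truth = [True] * 50 + [False] * 66
    predicted = [True] * 47 + [False] * 3 + [False] * 63 + [True] * 3
    sens, spec = sensitivity_specificity(truth, predicted)
    print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Sensitivity is the fraction of diseased samples the test catches; specificity is the fraction of healthy samples it correctly clears. A screening test needs both to be high, because a rare disease turns even a small false-positive rate into many false alarms.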
Field of Neurotrauma • Neurotrauma results in complex alterations to the biological systems within the nervous system, and these changes evolve over time. • Near-completion of the Human Genome Project has stimulated scientists to begin looking for the next step in unraveling normal and abnormal functions within biological systems. Consequently, there is a new focus on the role of proteins in these processes. • Proteomics is a burgeoning field that may provide a valuable approach to evaluating the post-traumatic central nervous system (CNS). However, the sensitivity of the tissue and the detection of potential biomarkers are major concerns.
Renal disease diagnosis • Proteomics has also found significant application in studying the effects of chemical insults on the kidney, particularly as a result of environmental toxins, drugs and other bioactive agents. • Combining classic analytical techniques, such as two-dimensional gel electrophoresis, with more sophisticated techniques, such as MS and liquid chromatography, has enabled considerable progress in cataloguing and quantifying the proteins present in urine and various kidney tissue compartments in both normal and diseased physiological states. • Critical developmental tasks still to be accomplished: completely defining the proteome in the various biological compartments (e.g. tissues, serum and urine) in both health and disease, which is a major challenge given the dynamic range and complexity of such proteomes; and achieving the routine ability to quantify proteomic expression profiles accurately and reproducibly and to develop diagnostic platforms.
Neurology • In neurology and neuroscience, many applications of proteomics have involved neurotoxicology and neurometabolism, as well as the determination of specific proteomic aspects of individual brain areas and body fluids in neurodegeneration. • Investigation of brain protein groups in neurodegeneration, such as enzymes, cytoskeleton proteins, chaperones, synaptosomal proteins and antioxidant proteins, is in progress as phenotype-related proteomics. • The concomitant detection of several hundred proteins on a gel provides sufficiently comprehensive data to determine a pathophysiological protein network and its peripheral representatives. An additional advantage is that hitherto unknown proteins have been identified as brain proteins.
Autoantibody profiling • Proteomics technologies enable profiling of autoantibody responses using biological fluids derived from patients with autoimmune disease. • They provide a powerful tool to characterize autoreactive B-cell responses in diseases including rheumatoid arthritis, multiple sclerosis, autoimmune diabetes, and systemic lupus erythematosus. • Autoantibody profiling may serve purposes including classification of individual patients and subsets of patients based on their 'autoantibody fingerprint', examination of epitope spreading and antibody isotype usage, discovery and characterization of candidate autoantigens, and tailoring antigen-specific therapy.
Alzheimer's disease • In Alzheimer's disease, elevations in beta secretase create amyloid/beta-protein, which causes plaque to build up in the patient's brain. This plaque is thought to play a role in dementia. • Targeting this enzyme decreases the amyloid/beta-protein and so slows the progression of the disease. • One procedure to test for the increase in amyloid/beta-protein is immunohistochemical staining, in which antibodies bind to amyloid/beta-protein antigens in the tissue.
Heart disease • Heart disease is commonly assessed using several key protein-based biomarkers. Standard protein biomarkers for CVD include interleukin-6, interleukin-8, serum amyloid A protein, fibrinogen, and troponins. • Cardiac troponin I (cTnI) increases in concentration within 3 to 12 hours of initial cardiac injury and can remain elevated for days after an acute myocardial infarction (MI). • A number of commercial antibody-based assays, as well as other methods, are used in hospitals as primary tests for acute MI.
Future Challenges • There is a need for biomarkers with more accurate diagnostic capability, particularly for early-stage disease. • Reproducibility also needs attention, for example by adding a quality control sample to each chip array and normalizing spectral data with commercially available or in-house computer programs. • Another challenge facing proteomics techniques lies largely in the application of bioinformatics, i.e. spectral data management and analysis. The vast amount of spectral data generated demands advanced data management and analysis strategies. • Finally, the obvious challenge, as stated by many investigators, is the identification of the important proteins and peptides that contribute to the proteomic analysis.
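The slide does not say which normalization is meant. One simple and widely used option is total ion current (TIC) normalization, in which every intensity in a spectrum is divided by the summed intensity of that spectrum. The sketch below assumes that choice; the function name and the intensity values are hypothetical.

```python
def normalize_tic(intensities):
    """Scale a spectrum so that its intensities sum to 1 (total ion current)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum has no signal to normalize")
    return [value / total for value in intensities]


if __name__ == "__main__":
    # Hypothetical spectra of the same quality control sample run on two chips;
    # the second chip simply recorded twice as much signal overall.
    chip_1 = [10.0, 30.0, 60.0]
    chip_2 = [20.0, 60.0, 120.0]
    print(normalize_tic(chip_1))
    print(normalize_tic(chip_2))
```

After normalization the two quality control spectra become directly comparable, so any remaining difference between chips reflects a change in peak pattern and not in overall signal level.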