Eukaryotic Secretome Prediction and Knowledge-Base Development
Xiang-Jia “Jack” Min, Ph.D., Assistant Professor
2nd International Conference on Proteomics & Bioinformatics, Las Vegas, July 2-4, 2012
DNA → RNA → protein → phenotype
Genome
→ Transcription → mRNA (protein-coding DNA sequences): Transcriptome
→ Translation → Protein sequences: Proteome
→ Secretion → Proteins with secretory signal peptide: Secretome
Fungi (yeasts, moulds, mushrooms) → secreted enzymes, small molecules, biomaterials → enzymes, bio-fuels, biomaterials
Genome → Transcription → Transcriptome → Translation → Proteome → Secretion → Secretome
How to identify secreted proteins?
• Direct identification using proteomics methods (Tsang et al. 2009)
• Computational prediction from the predicted proteome
• EST data mining
Secreted Proteins
• Classical secreted proteins have a signal peptide at the N-terminus.
• Not all proteins that have a signal peptide are secreted: signal peptide ≠ secreted protein.
• SignalP: predicts whether a protein contains a signal peptide.
• Phobius: signal peptide and transmembrane domain prediction.
• WolfPsort: a multiple subcellular location predictor.
• TargetP: detects proteins targeted to mitochondria.
• TMHMM: transmembrane domain prediction.
• PS-Scan: detection of ER-retention signals.
Data
Kingdom     Secreted   Non-secreted
Fungi            241          5,992
Animals        5,568         19,048
Plants           216          7,528
Protists          32          1,979
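The counts on the Data slide can be turned into class proportions to show how unbalanced each dataset is. The following is a minimal sketch, not part of the original talk; the names `DATASETS` and `secreted_fraction` are illustrative assumptions.

```python
# Dataset sizes as listed on the Data slide: (secreted, non-secreted) per kingdom.
DATASETS = {
    "Fungi": (241, 5992),
    "Animals": (5568, 19048),
    "Plants": (216, 7528),
    "Protists": (32, 1979),
}


def secreted_fraction(secreted, non_secreted):
    """Return the percentage of proteins in a dataset that are secreted."""
    return 100.0 * secreted / (secreted + non_secreted)


for kingdom, (sec, non) in DATASETS.items():
    print(f"{kingdom:<9} {secreted_fraction(sec, non):5.1f}% secreted")
```

In every kingdom the secreted class is the minority, so a predictor that labels everything "non-secreted" would still score a high specificity. This is one reason to report a balance-aware measure such as MCC alongside sensitivity and specificity.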
Method
• Sensitivity (%) = TP / (TP + FN) × 100
• Specificity (%) = TN / (TN + FP) × 100
• Matthews' Correlation Coefficient (MCC):
  MCC (%) = (TP × TN − FP × FN) × 100 / [(TP + FP)(TP + FN)(TN + FP)(TN + FN)]^(1/2)
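The three measures on the Method slide can be computed directly from the confusion-matrix counts. The sketch below follows the slide's formulas; the function names are illustrative, and returning 0 when the MCC denominator is zero is an assumed convention that the slide does not specify.

```python
import math


def sensitivity(tp, fn):
    """Sensitivity (%) = TP / (TP + FN) x 100."""
    return 100.0 * tp / (tp + fn)


def specificity(tn, fp):
    """Specificity (%) = TN / (TN + FP) x 100."""
    return 100.0 * tn / (tn + fp)


def mcc(tp, fp, tn, fn):
    """Matthews' correlation coefficient as a percentage, per the slide's formula."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: report 0 when any marginal total is empty.
        return 0.0
    return 100.0 * (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only (not taken from Tables 1-3).
tp, fp, tn, fn = 90, 10, 880, 20
print(f"Sn = {sensitivity(tp, fn):.1f}%")
print(f"Sp = {specificity(tn, fp):.1f}%")
print(f"MCC = {mcc(tp, fp, tn, fn):.1f}%")
```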
Table 1. Prediction accuracies of secreted proteins in fungi. TP: true positives; FP: false positives; TN: true negatives; FN: false negatives. Sn: sensitivity; Sp: specificity; MCC: Matthews' correlation coefficient. Min XJ (2010) JPB 3:143-147.
Table 2. Prediction accuracies of secreted proteins in animals. TP: true positives; FP: false positives; TN: true negatives; FN: false negatives. Sn: sensitivity; Sp: specificity; MCC: Matthews' correlation coefficient. Min XJ (2010) JPB 3:143-147.
Table 3. Prediction accuracies of secreted proteins in plants. TP: true positives; FP: false positives; TN: true negatives; FN: false negatives. Sn: sensitivity; Sp: specificity; MCC: Matthews' correlation coefficient. Min XJ (2010) JPB 3:143-147.
Summary
• Different prediction tools have different accuracies when predicting secretomes for species in different kingdoms.
• Combining these tools often increases the prediction accuracy. However, different combinations are needed for species in different kingdoms.
• Optimal methods are proposed.
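One way to picture "combining tools" is as a filter over the calls each tool makes for a protein. The sketch below is a hypothetical rule for illustration only and is not the optimal per-kingdom method proposed in the talk. The `ToolCalls` fields, the `predicted_secreted` function and the `max_tm_helices` parameter are all assumptions, and parsing each tool's real output is left out.

```python
from dataclasses import dataclass


@dataclass
class ToolCalls:
    """Per-protein calls, assumed to be parsed elsewhere from each tool's output."""
    signal_peptide: bool  # e.g. from a signal peptide predictor such as SignalP
    tm_helices: int       # e.g. transmembrane helix count from TMHMM
    mitochondrial: bool   # e.g. a mitochondrial targeting call from TargetP
    er_retention: bool    # e.g. an ER-retention motif hit from PS-Scan


def predicted_secreted(calls: ToolCalls, max_tm_helices: int = 0) -> bool:
    """Hypothetical combined rule: signal peptide present and no contrary evidence."""
    return (
        calls.signal_peptide
        and calls.tm_helices <= max_tm_helices
        and not calls.mitochondrial
        and not calls.er_retention
    )


# A signal peptide alone is not enough: a membrane protein is filtered out.
candidate = ToolCalls(signal_peptide=True, tm_helices=0,
                      mitochondrial=False, er_retention=False)
membrane = ToolCalls(signal_peptide=True, tm_helices=3,
                     mitochondrial=False, er_retention=False)
print(predicted_secreted(candidate), predicted_secreted(membrane))
```

Because the best combination differs by kingdom, a rule like this would in practice be chosen per kingdom, with each candidate combination scored by Sn, Sp and MCC on the datasets above.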
FunSecKB (diagram)
• Views; User Inputs; Database; External Links
• gi, accession, UniProt ID, keywords, species
• RefSeq, UniProt
• Prediction Tools: SignalP, Phobius, WolfPsort, TargetP, TMHMM, PS-SCAN, fragAnchor
• Subcellular Location; Manual Curation
Lum G & Min XJ (2011) Database.
Summary of FunSecKB
• Currently the database contains a total of 478,073 fungal protein sequences.
• 23,878 predicted and/or curated secreted proteins.
• A total of 118 fungal species, including 52 fungal species having a complete proteome.
Acknowledgements
• Gengkon Lum (M.S. Graduate)
• Jessica Orr (Undergraduate)
• Docylyne Shelton (Undergraduate)
• Braden Walters (Undergraduate)