Progress in Transmembrane Protein Research 12 Month Report Tim Nugent

Assignment of PROSITE motifs to topological regions
• We explored whether motifs from the PROSITE database could serve as constraints in subsequent topology prediction steps, by identifying a bias in how often each motif occurs inside (cytoplasmic) versus outside (extracellular).
[Figure labels: Extracellular, Cytoplasm]
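The inside/outside tally described above can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in the report: the function names, the per-residue topology string ('i', 'o', 'M') and the toy sequence are all assumptions made for the example. The motif shown is the PROSITE N-glycosylation pattern N-{P}-[ST]-{P}, rewritten as a regular expression.

```python
import re
from collections import Counter

# PROSITE N-glycosylation pattern N-{P}-[ST]-{P} as a regex.
# The lookahead lets overlapping occurrences be reported.
N_GLYC = re.compile(r"(?=(N[^P][ST][^P]))")


def motif_region_counts(pattern, sequence, topology):
    """Tally motif hits by the topology label at the first residue of each hit.

    `topology` is a hypothetical per-residue annotation string:
    'i' = inside loop, 'o' = outside loop, 'M' = membrane.
    """
    if len(sequence) != len(topology):
        raise ValueError("sequence and topology must have equal length")
    counts = Counter()
    for match in pattern.finditer(sequence):
        counts[topology[match.start()]] += 1
    return counts


def outside_fraction(counts):
    """Fraction of loop hits found on the outside; None if there are no loop hits."""
    loops = counts["i"] + counts["o"]
    return counts["o"] / loops if loops else None


# Toy example (invented sequence, for illustration only).
seq = "ANGSAANVTA" + "L" * 8 + "KNPTR"
top = "o" * 10 + "M" * 8 + "i" * 5
counts = motif_region_counts(N_GLYC, seq, top)
print(counts, outside_fraction(counts))
```

A motif whose outside fraction stays far from 0.5 across a large annotated set is a candidate constraint; with few hits the estimate is unreliable, so a real analysis would need a minimum-count cut-off or a significance test.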

Alpha-helical protein PROSITE motif assignments

Using PROSITE motifs to enhance topology prediction

CLN3 Topology Prediction

CLN3 Topology Prediction
• The model agrees with all published experimental data.
• Potential amphipathic helix.
• Bias in hydrophobic/polar residue placement.
• Two arginine residues in close proximity – possible anion channel?

Using Support Vector Machines for Topology Prediction
• Earlier approaches relied on physicochemical properties such as hydrophobicity to identify transmembrane helices (e.g. Kyte-Doolittle).
• More recently, methods based on machine learning algorithms such as hidden Markov models (e.g. TMHMM, Phobius) and neural networks (MEMSAT3) have been developed.
• These have achieved significant improvements in prediction accuracy (~80%).
• However, none of the top-scoring methods use SVMs.
• While hidden Markov models and neural networks may have multiple outputs, SVMs are binary classifiers.
• To handle TM topology prediction, multiple SVMs will have to be combined, e.g.
  • TM helix / Loop
  • Inside Loop / Outside Loop
  • Signal Peptide / TM helix
  • Re-entrant Loop / TM helix
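For reference, the hydrophobicity-based baseline mentioned above can be sketched as a sliding-window average of the Kyte-Doolittle scale. The scale values are the published ones; the 19-residue window and the 1.6 threshold are the commonly quoted settings for membrane-spanning segments and are assumptions here, as is the choice to truncate the window at the termini. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy in a window centred on each residue.

    Near the termini the window is truncated rather than padded
    (an assumption made for this sketch).
    """
    half = window // 2
    profile = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        values = [KD[aa] for aa in seq[lo:hi]]
        profile.append(sum(values) / len(values))
    return profile


def predict_helix(seq, window=19, threshold=1.6):
    """Per-residue True/False call: True where the windowed mean exceeds the threshold."""
    return [h > threshold for h in hydropathy_profile(seq, window)]


# Toy example: a hydrophobic stretch flanked by charged residues.
calls = predict_helix("K" * 20 + "L" * 25 + "D" * 20)
print("".join("M" if c else "-" for c in calls))
```

This yields a single helix/loop decision per residue, which is the same binary question the first SVM in the list above is asked to answer from richer profile input.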

Helix / Loop SVM Prediction Accuracy
• TM helix / Loop SVM:
  • PSI-BLAST profiles
  • Normalised by Z-score
  • 29-residue sliding window
  • 3rd-order polynomial kernel function
• Matthews correlation coefficient (MCC) = 0.75
• Precision = 0.86
• Recall = 0.81
• TP = 8384
• FP = 1355
• TN = 17773
• FN = 1969
• Kyte-Doolittle MCC: 0.64
• MEMSAT3 MCC: 0.76
• Overlap of at least 37 sequences between the Möller dataset and the novel training set.
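The feature construction listed above (profile columns normalised by Z-score, a 29-residue sliding window, a third-order polynomial kernel) might look like the sketch below. Several details are assumptions, since the slide does not state them: that normalisation is applied per profile column, that positions beyond the termini are zero-padded, and the kernel parameters `gamma` and `coef0`. The function names are hypothetical, and a real run would pass these features to an SVM package rather than evaluate the kernel by hand.

```python
import math


def zscore_columns(profile):
    """Z-score normalise each column of a per-residue profile (a list of rows).

    Columns with zero variance are left centred at 0 (divided by 1.0).
    """
    n = len(profile)
    cols = list(zip(*profile))
    means = [sum(c) / n for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / n) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in profile]


def window_features(profile, window=29):
    """One flattened feature vector per residue, zero-padded beyond the termini."""
    half = window // 2
    pad = [0.0] * len(profile[0])
    features = []
    for i in range(len(profile)):
        vec = []
        for j in range(i - half, i + half + 1):
            vec.extend(profile[j] if 0 <= j < len(profile) else pad)
        features.append(vec)
    return features


def poly_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


# Toy profile: 5 residues x 20 columns (one column per amino acid type).
toy = [[float((r * 7 + c) % 5) for c in range(20)] for r in range(5)]
feats = window_features(zscore_columns(toy), window=29)
print(len(feats), len(feats[0]), poly_kernel(feats[0], feats[1]))
```

With a 20-column profile, each residue is represented by 29 × 20 = 580 values, so the window size directly sets the dimensionality the kernel operates on.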

Inside Loop / Outside Loop SVM Prediction Accuracy
• Inside Loop / Outside Loop SVM:
  • 27-residue sliding window
• Matthews correlation coefficient (MCC) = 0.60
• Precision = 0.80
• Recall = 0.80
• TP = 4060
• FP = 1028
• TN = 4081
• FN = 1007
• Signal Peptide / TM Helix and Re-entrant Loop / TM Helix SVMs are in training!
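Precision, recall and MCC all follow from the four confusion counts, so the two sets of counts listed above can be used as a consistency check on the summary figures. The sketch below uses the standard definitions; the function name is hypothetical.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (precision, recall, MCC) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


# Counts as listed on the two accuracy slides.
helix_loop = binary_metrics(tp=8384, fp=1355, tn=17773, fn=1969)
inside_outside = binary_metrics(tp=4060, fp=1028, tn=4081, fn=1007)

for name, (p, r, m) in [("helix/loop", helix_loop), ("inside/outside", inside_outside)]:
    print(f"{name}: precision={p:.2f} recall={r:.2f} MCC={m:.2f}")
```

From these counts the helix/loop classifier evaluates to precision 0.86, recall 0.81 and MCC 0.75, and the inside/outside classifier to precision 0.80, recall 0.80 and MCC 0.60.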

SVM Results – Glycerol uptake facilitator

SVM Results – Photosystem II subunit A

SVM Results – Particulate Methane Monooxygenase subunit C

SVM Results – Cytochrome b6f subunit A

Further work
• Expand the training set: ~45 sequences to add.
• Additional sequences where the TM helices are known but the topology is not can be used to train the Helix/Loop classifier.
• Parameter optimisation:
  • Window size
  • Kernel type
• Signal peptide SVM.
• Re-entrant loop SVM.
• Combine SVM raw scores/probabilities into a topology.
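One simple way the last item could be approached is sketched below. It is an illustrative assumption, not the method planned in the report: helix/loop scores are thresholded at zero into helix segments, and because loops must alternate sides of the membrane, only the side of the N-terminus is free, so both orientations are scored against the inside/outside SVM output and the better one is kept. The sign conventions (positive = helix, positive = inside), the minimum helix length and all function names are hypothetical.

```python
def helix_segments(helix_scores, min_len=15):
    """Runs of positive helix/loop scores at least `min_len` long, as half-open (start, end)."""
    segments, start = [], None
    # A trailing negative sentinel closes a run that reaches the C-terminus.
    for i, score in enumerate(list(helix_scores) + [-1.0]):
        if score > 0 and start is None:
            start = i
        elif score <= 0 and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    return segments


def assign_topology(helix_scores, io_scores, min_len=15):
    """Return a per-residue string of 'M' (membrane), 'i' (inside) and 'o' (outside)."""
    n = len(helix_scores)
    loops, prev = [], 0
    for start, end in helix_segments(helix_scores, min_len):
        loops.append((prev, start))
        prev = end
    loops.append((prev, n))

    def build(first_side):
        labels, total, side = ["M"] * n, 0.0, first_side
        for a, b in loops:
            sign = 1.0 if side == "i" else -1.0
            for k in range(a, b):
                labels[k] = side
                total += sign * io_scores[k]  # agreement with the inside/outside SVM
            side = "o" if side == "i" else "i"  # loops alternate across each helix
        return total, "".join(labels)

    return max(build("i"), build("o"))[1]


# Toy scores: two helices, N- and C-termini scored as inside.
helix = [-1.0] * 5 + [1.0] * 20 + [-1.0] * 5 + [1.0] * 20 + [-1.0] * 5
inout = [1.0] * 5 + [0.0] * 20 + [-1.0] * 5 + [0.0] * 20 + [1.0] * 5
print(assign_topology(helix, inout))
```

This greedy scheme ignores signal peptides and re-entrant loops; a fuller treatment would more likely use dynamic programming over all four classifiers' scores.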

Whole-Proteome TM Protein Analysis

Identifying Pore-forming TM Helices
