Characterization of Transmembrane Helices. Madhavi Ganapathiraju. Summary: Classification procedures for TM prediction using the LSA features have been completed. A web tool for TM prediction has been designed; it is being developed by Christopher Jursa.
Characterization of Transmembrane Helices
Madhavi Ganapathiraju
JKS-seminar
Summary
• Classification procedures for TM prediction using the LSA features have been completed
• A web tool for TM prediction has been designed; it is being developed by Christopher Jursa
• TMPDB, a set of 119 transmembrane proteins, has also been processed and included in evaluations
• KChannelDB, the database of KChannel proteins subdivided into families of 1, 2, 4 and 6 TMs each, has been collected and processed; the first 2 families have been evaluated
• Decision tree and support vector machine classifiers have been evaluated
• A paper summarizing the work has been written
• The Qok metric was found to be incorrect in previous evaluations; it has been corrected
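Because the Qok correction affects how every result is read, a minimal sketch of one common way to compute it may help. It assumes Qok is the percentage of proteins in which all observed and predicted TM helices match one-to-one, with a helix counted as matched when the two segments overlap by at least 3 residues. That definition, the overlap cutoff, and all names below (`overlap`, `protein_ok`, `qok`) are assumptions for illustration, not the code used in this work.

```python
def overlap(a, b):
    """Number of residues shared by two inclusive (start, end) segments."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def protein_ok(observed, predicted, min_overlap=3):
    """True if observed and predicted helices match one-to-one.

    Greedy matching; assumes each list is sorted and its segments
    do not overlap each other. min_overlap=3 is an assumed cutoff.
    """
    if len(observed) != len(predicted):
        return False
    used = set()
    for obs in observed:
        match = next(
            (i for i, pred in enumerate(predicted)
             if i not in used and overlap(obs, pred) >= min_overlap),
            None,
        )
        if match is None:
            return False
        used.add(match)
    return True


def qok(proteins, min_overlap=3):
    """Percentage of proteins whose TM helices are all predicted correctly."""
    ok = sum(protein_ok(obs, pred, min_overlap) for obs, pred in proteins)
    return 100.0 * ok / len(proteins)


# Made-up (observed, predicted) segment lists for four proteins.
proteins = [
    ([(10, 30), (45, 65)], [(12, 31), (44, 66)]),  # both helices matched
    ([(10, 30), (45, 65)], [(12, 31)]),            # one helix missed
    ([(5, 25)], [(27, 47)]),                       # no overlap
    ([(5, 25)], [(20, 40)]),                       # overlap of 6 residues
]
score = qok(proteins)
```

A single missed or extra helix makes the whole protein count as not OK, which is why Qok tends to be much lower than per-residue or per-segment accuracy.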
Recap: TM prediction method
Neural Net Classifier
[Figure: network diagram with an input layer, a hidden layer and an output layer. The 4 dimensions of the vector obtained by LSA (Dimension 1 to Dimension 4) form the input.]
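The classifier in the figure can be sketched as a single forward pass. The 4 LSA inputs and the one hidden layer come from the slide. The hidden layer size, the sigmoid activations, the single output unit, the 0.5 cutoff and the random weights are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: 4 LSA dimensions -> hidden layer -> one output in (0, 1)."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


# Hypothetical sizes and untrained random weights, for illustration only.
N_INPUT, N_HIDDEN = 4, 3
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUT)] for _ in range(N_HIDDEN)]
b_hidden = [0.0] * N_HIDDEN
w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
b_out = 0.0

lsa_vector = [0.12, -0.40, 0.05, 0.33]  # made-up 4-dimensional LSA feature vector
score = forward(lsa_vector, w_hidden, b_hidden, w_out, b_out)
label = "TM" if score >= 0.5 else "non-TM"  # assumed decision cutoff
```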
Decision Tree & SVM Classifiers
• Used MATLABarsenal, the wrapper tools developed by Rong (LTI), to see the performance of classifiers on the feature set
  • Decision Trees
  • SVM (2nd degree polynomial kernel)
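For reference, a 2nd degree polynomial kernel can be sketched in a few lines. The form below, (gamma * x·y + coef0)^degree, is the common textbook one; the `gamma` and `coef0` defaults are assumptions, since the settings used with MATLABarsenal are not given here.

```python
def poly_kernel(x, y, degree=2, coef0=1.0, gamma=1.0):
    """Polynomial kernel (gamma * <x, y> + coef0) ** degree.

    degree=2 matches the slide; coef0 and gamma are assumed values.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


# Two made-up 4-dimensional feature vectors.
a = [1.0, 0.0, 2.0, -1.0]
b = [0.5, 1.0, 1.0, 0.0]
k_ab = poly_kernel(a, b)  # dot product is 2.5, so (2.5 + 1) ** 2
```

With degree 2 the kernel implicitly includes all pairwise products of the 4 LSA dimensions, so the SVM can separate classes that are not linearly separable in the original features.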
Evaluation Data Sets
• Benchmark
  • 36 proteins with high resolution TM information
• TMPDB
  • 119 proteins of known 3D structure
• KChannelDB
  • Multiple sequence alignments of KChannel proteins with 1 and 2 TM segments
Results: 36 high resolution proteins
Evaluations have been performed by submitting data to the benchmark server
Results: TMPDB
Other things
• Processed KChannelDB proteins for evaluation
• Initial evaluations are done, but not ready for discussion
• …
TMPro web service
• The TMPro website is being developed by Christopher, Dr. Karimi’s student
• It should be up in 2 weeks’ time
• Developed standalone versions of the feature processing required by the web service for DT and SVM
Charge rich proteins
• I seem not to have mailed myself the latest figures; I will show them separately
Ongoing work
• Qok is not high for the TMPDB data set
• To overcome this, error analysis is being performed
  • Measure how far from the “truth” the prediction is (what threshold would have classified the segment correctly as TM or non-TM)
  • Characteristics of the misclassified segments
    • Are they traditional globular hydrophobic segments only? Can aromatic and other properties be used to recover from the error?
• Combination with TMHMM prediction for improved performance
  • Rule-based combination on aromatic property has previously been shown to improve TMHMM predictions (March/June 2005?) on high resolution proteins
  • Do this on the TMPDB set as well
• Other NN architectures to be studied? Erroneous TM segments to be studied further with the DT rules that fail
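The "how far from the truth" measurement in the error analysis can be sketched as a per-segment gap between the classifier score and the decision threshold. This is only one possible reading of that bullet: the score range, the 0.5 threshold, the segment names and the function `threshold_gap` are hypothetical.

```python
def threshold_gap(score, is_tm, threshold=0.5):
    """Distance the threshold would have to move to classify a segment correctly.

    Returns 0.0 when the segment is already classified correctly.
    The default threshold of 0.5 is an assumption.
    """
    predicted_tm = score >= threshold
    if predicted_tm == is_tm:
        return 0.0
    return abs(score - threshold)


# Made-up segments: (name, classifier score, true label is TM?)
segments = [
    ("seg1", 0.42, True),   # missed TM segment, close to the threshold
    ("seg2", 0.81, True),   # correctly predicted TM segment
    ("seg3", 0.55, False),  # false positive, close to the threshold
    ("seg4", 0.10, True),   # missed TM segment, far from the threshold
]

gaps = {name: threshold_gap(score, is_tm) for name, score, is_tm in segments}

# Misclassified segments, worst first: large gaps are unlikely to be
# rescued by tuning the threshold alone.
misclassified = sorted(
    (name for name, gap in gaps.items() if gap > 0.0),
    key=lambda name: gaps[name],
    reverse=True,
)
```

Segments with a small gap might be recovered by a modest threshold change, while those with a large gap are the natural candidates for the closer look at aromatic and other properties.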