A Multi-PCA Approach to Glycan Biomarker Discovery using Mass Spectrometry Profile Data
Anoop Mayampurath, Chuan-Yih Yu
Info-690 (Glycoinformatics) Final Project Presentation
Background
[1] Kyselova et al., "Alterations in the Serum Glycome Due to Metastatic Prostate Cancer", Journal of Proteome Research, 2007, 6:1822-1832
[2] Tang et al., "Identification of N-Glycan Serum Markers Associated with Hepatocellular Carcinoma from Mass Spectrometry Data", Journal of Proteome Research, 2009, Article ASAP
[3] Ressom et al., "Analysis of MALDI-TOF Mass Spectrometry Data for Discovery of Peptide and Glycan Biomarkers of Hepatocellular Carcinoma", Journal of Proteome Research, 2008, 7:603
Objective
• Given a set of N mass spectra (disease and healthy), develop an algorithm that identifies "significant" spectra and glycan peaks
• From the significant glycan peaks
  • Nature of regulation between disease and healthy
  • Study of effects such as fucosylation and linkage
• From the significant spectra
  • A smaller set of m << N spectra that helps in analysis
  • Glycan annotation
  • Check for overlapping glycans
• What is meant by "significant"?
  • Elements that exhibit coherent patterns and large variation between disease and healthy
• Datasets
  • 151 MALDI-TOF mass spectra: 73 cancer, 78 normal
Details
• Background subtraction
• Peak picking
• Identification of common glycans across all 151 spectra
• Filtering using a fit coefficient cutoff of 0.5
  • A glycan is retained if 30% of the spectra have a fit coefficient greater than 0.5 for it
• An N × p matrix X is obtained (N: number of glycans, p: number of spectra)
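The filtering step can be sketched as below. This is only an illustration, not the presenters' code: the function name `filter_glycans`, the array layout (glycans as rows, spectra as columns), and the reading of "30%" as "at least 30%" are all assumptions.

```python
import numpy as np

def filter_glycans(intensity, fit, cutoff=0.5, min_fraction=0.3):
    """Hypothetical helper: keep glycans (rows) whose fit coefficient
    exceeds `cutoff` in at least `min_fraction` of the spectra (columns).

    intensity, fit: arrays of shape (n_glycans, n_spectra).
    Returns the filtered intensity matrix X and the boolean row mask.
    """
    intensity = np.asarray(intensity, dtype=float)
    fit = np.asarray(fit, dtype=float)
    # Fraction of spectra in which each glycan is well fitted.
    frac_good = (fit > cutoff).mean(axis=1)
    keep = frac_good >= min_fraction  # assumption: "30%" means "at least 30%"
    return intensity[keep], keep
```

The returned matrix plays the role of X in the slides, with one row per retained glycan and one column per spectrum.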
Multi-PCA algorithm
• Perform PCA on X
• Compute the inner product of each glycan with the principal component
• Sort glycans by inner product (which measures correlation)
• Shave off the 10% of glycans with the lowest inner product score
• Repeat
[4] Hastie et al., "'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns", Genome Biology 2000, 1(2):1-21
Multi-PCA Algorithm
[Diagram: matrix X; sort by inner product, shave off 10% of glycans]
• The algorithm was iterated until 10 glycans remained. These glycans are expected to change coherently in intensity while showing high variance between cancer and normal samples
• We also switched dimensions to shave off spectra. The algorithm was iterated until 6 spectra remained
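The shaving loop described above can be sketched roughly as follows. This is a minimal illustration in the spirit of gene shaving [4], not the presenters' implementation. It assumes rows are centered before PCA, that the leading principal component is the one used, and that rows are ranked by the absolute value of the inner product. The name `shave` and its parameters are hypothetical.

```python
import numpy as np

def shave(X, target_rows, shave_fraction=0.10):
    """Iteratively drop the rows of X least aligned with the leading
    principal component until `target_rows` rows remain.
    Returns the sorted indices of the surviving rows."""
    X = np.asarray(X, dtype=float)
    kept = np.arange(X.shape[0])
    while kept.size > target_rows:
        sub = X[kept]
        # Assumption: center each row across the columns before PCA.
        centered = sub - sub.mean(axis=1, keepdims=True)
        # First right singular vector = leading principal component.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        pc1 = vt[0]
        # Inner product of each row with the component (sign ignored).
        scores = np.abs(centered @ pc1)
        # Shave about 10% per pass, at least one row, never overshooting.
        n_drop = max(1, int(np.floor(shave_fraction * kept.size)))
        n_drop = min(n_drop, kept.size - target_rows)
        order = np.argsort(scores)  # ascending: weakest rows first
        kept = np.sort(kept[order[n_drop:]])
    return kept
```

With X holding glycans as rows, `shave(X, 10)` would mirror the glycan pass, and `shave(X.T, 6)` would mirror the switched-dimension pass over spectra.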
Results
[Figures: total intensity plotted against mass value. Annotations: "Not present in original composition file", "Filtered out"]
Significant Spectra • No overlapping glycans were found
Future Directions
• Fragmentation of glycans to study the effect of linkage among glycans
• Glycan microarray
• More detail on overlapping glycans (substitute single score by combined score)
• Orthogonalize the data to see other patterns
Acknowledgements
• Prof. Haixu Tang, School of Informatics & Computing
• Prof. Yehia Mechref, Dept of Chemistry