InCoB 2009, Singapore
René Hussong et al.
Highly accelerated feature detection in mass spectrometry data using modern graphics processing units
Bioinformatics 25 (2009)
Junior Research Group for Protein-Protein-Interactions and Computational Proteomics
Saarland University, Saarbruecken, Germany
Outline
∙ Introduction & Motivation
  - The Differential Proteomics Pipeline
∙ Computational Proteomics
  - Signal Processing and Feature Detection
  - The Isotope Wavelet Transform
∙ Parallelization via GPUs
∙ Results & Discussion
The Differential Proteomics Pipeline
Two samples (e.g. sick vs. healthy) → Mass Spectrometer → List of differentially expressed proteins
Applications range from basic pharmaceutical research through medical diagnostics and therapy to biotechnology and engineering.
Principle of Biological Mass Spectrometry
Proteins are digested into peptides. Peptides are ionized and accelerated.
[Figure: Proteins → (digest) → Peptides → Fingerprint; axes: intensity vs. mass]
Principle of Biological Mass Spectrometry
[Figure: Fingerprint; axes: intensity vs. mass; annotation: mass of a single neutron]
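The neutron annotation on this slide can be made concrete with a little arithmetic. The sketch below assumes that adjacent isotope peaks of a peptide differ by roughly one neutron mass, so that on an m/z axis they appear 1/z of that apart for charge state z. The function names and the use of the proton mass for the charge carrier are illustrative assumptions, not taken from the talk.

```python
NEUTRON_MASS_DA = 1.008665  # mass of a single neutron, in Da
PROTON_MASS_DA = 1.007276   # mass of a proton, in Da (assumed charge carrier)

def isotope_mz_positions(mono_mass, charge, n_peaks=4):
    """m/z positions of the first n_peaks isotope peaks of one peptide.

    Peak k is assumed to carry k extra neutrons, so peaks are
    NEUTRON_MASS_DA / charge apart on the m/z axis.
    """
    return [(mono_mass + k * NEUTRON_MASS_DA) / charge + PROTON_MASS_DA
            for k in range(n_peaks)]

def charge_from_spacing(mz_a, mz_b):
    """Estimate the charge state from the spacing of two adjacent peaks."""
    return round(NEUTRON_MASS_DA / abs(mz_b - mz_a))
```

For example, `charge_from_spacing(*isotope_mz_positions(1000.0, 2, 2))` recovers the charge from the peak distance alone, which is why the peak spacing in a fingerprint carries the charge information.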
(Simple) Feature Finding
Typically done by simple thresholding:
∙ Needs additional preprocessing steps, e.g.:
  - Baseline elimination (e.g. by morphological filters)
  - Noise reduction and/or smoothing
∙ (Mostly) needs resampling
∙ Needs additional postprocessing steps, e.g.:
  - Peak clustering (so-called “deconvolution”)
  - Model fitting, charge prediction
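A minimal sketch of such a thresholding pipeline shows how many separate steps it involves. It makes simplifying assumptions that are not from the talk: the baseline is a single constant (rather than a morphological filter), smoothing is a plain moving average, and the function names are hypothetical.

```python
def moving_average(values, half_width):
    """Smooth by averaging each point with up to half_width neighbours per side."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def pick_peaks(intensities, threshold, baseline=0.0, half_width=1):
    """Indices of local maxima above threshold after preprocessing.

    Step 1: subtract a constant baseline (assumed known) and clip at zero.
    Step 2: smooth with a moving average.
    Step 3: keep local maxima whose smoothed intensity exceeds threshold.
    """
    corrected = [max(v - baseline, 0.0) for v in intensities]
    smooth = moving_average(corrected, half_width)
    peaks = []
    for i in range(1, len(smooth) - 1):
        if (smooth[i] > threshold
                and smooth[i] >= smooth[i - 1]
                and smooth[i] > smooth[i + 1]):
            peaks.append(i)
    return peaks
```

Each picked index is still only a single peak. Grouping peaks into isotope patterns and predicting the charge would be further postprocessing steps, as listed on the slide.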
The Isotope Wavelet Transform
Convolution with a kernel function
• by construction robust against noise and baseline artifacts
• also acts as a filter for chemical noise
• simultaneously predicts the charge state
• needs no explicit resampling
• only a single parameter (threshold)
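The idea of convolving with a charge-specific kernel can be sketched as follows. The kernel here is a toy stand-in, not the formula from the talk: it is assumed to be a step function times an oscillation with the isotope period times a Poisson-like envelope. The parameter `lam`, the window length `n_periods` and all function names are hypothetical. The kernel is evaluated at the raw sample positions, so no resampling is needed, and the charge whose transform responds most strongly is taken as the prediction.

```python
import math

NEUTRON_MASS_DA = 1.008665  # assumed isotope peak spacing for charge 1, in Da

def toy_kernel(t, lam):
    """Toy isotope-like kernel (an assumption, not the slide's formula).

    t is the distance from the pattern start in units of the isotope spacing.
    """
    if t < 0.0:
        return 0.0  # step function: the kernel only looks to the right
    envelope = math.exp(-lam) * lam ** t / math.gamma(t + 1.0)
    return math.cos(2.0 * math.pi * t) * envelope

def transform(mz, intensity, charge, lam=1.0, n_periods=4):
    """Kernel response at every data point; mz must be sorted ascending."""
    out = []
    n = len(mz)
    for b in range(n):
        acc = 0.0
        j = b
        while j < n:
            t = (mz[j] - mz[b]) * charge / NEUTRON_MASS_DA
            if t > n_periods:
                break  # outside the kernel support
            acc += intensity[j] * toy_kernel(t, lam)
            j += 1
        out.append(acc)
    return out

def predict_charge(mz, intensity, charges=(1, 2, 3), lam=1.0):
    """Return the charge with the strongest response, plus all scores."""
    scores = {z: max(transform(mz, intensity, z, lam)) for z in charges}
    return max(scores, key=scores.get), scores
```

The plain sum ignores the local sampling width, which is a simplification. A more careful version would weight each term by the distance between neighbouring sample positions.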
Parallelization via CUDA b-th data point
Parallelization via CUDA
[Figure labels: b-th data point; T0 … Tn]
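One way to read the figure labels is that the response at each data point b can be computed independently of all others, so the work can be handed out to threads T0 … Tn. The sketch below illustrates only this decomposition, using a Python thread pool as a stand-in. It is an assumption about the mapping, not the CUDA implementation from the talk, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def response_at(b, mz, intensity, kernel, window):
    """Kernel response at data point b.

    Only reads the shared input arrays and returns one value, so calls for
    different b never interfere with each other.
    """
    acc = 0.0
    j = b
    while j < len(mz) and mz[j] - mz[b] <= window:
        acc += intensity[j] * kernel(mz[j] - mz[b])
        j += 1
    return acc

def parallel_transform(mz, intensity, kernel, window, n_threads=4):
    """Compute the response for every data point, one task per point."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() keeps the results in the order of the data points
        return list(pool.map(
            lambda b: response_at(b, mz, intensity, kernel, window),
            range(len(mz))))
```

In CPython the threads mainly demonstrate the independence of the per-point computations. On a GPU, the index b would instead be derived from the thread and block indices.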
Parallelization via CUDA and TBB
Compared configurations:
- 1x CPU, 2.3 GHz
- 1x NVIDIA Tesla C870
- 2x NVIDIA Tesla C870 via Intel Threading Building Blocks
>200x speedup
Open Issues – Future Work
∙ Solutions for machine-specific ‘artifacts’, e.g.
  - Tailing effects in TOF-Analyzers
  - Severe mass discretization in high resolution data
∙ Tests for MSn spectra
  - Refined averagine model
∙ Separating overlapping patterns
GPU solutions
Availability: OpenMS
∙ An open source C++ library for mass spectrometry
∙ Designed for “users” as well as for “developers”
∙ TOPP - “The OpenMS proteomics pipeline”
  - suite of independent software tools
  - includes file handling / conversion
  - peak picking and feature detection
  - includes visualizer TOPPView
  …
http://www.openms.de
References
Hussong, R, Gregorius, B, Tholey, A, and Hildebrandt, A (2009). Highly accelerated feature detection in proteomics data sets using modern graphics processing units. Bioinformatics 25.
Schulz-Trieglaff, O, Hussong, R, Gröpl, C, Leinenbach, A, Hildebrandt, A, Huber, C, and Reinert, K (2008). Computational Quantification of Peptides from LC-MS Data. Journal of Computational Biology 15(7).
Sturm, M, Bertsch, A, Gröpl, C, Hildebrandt, A, Hussong, R, Lange, E, Pfeifer, N, Schulz-Trieglaff, O, Zerck, A, Reinert, K, and Kohlbacher, O (2008). OpenMS - An open-source software framework for mass spectrometry. BMC Bioinformatics 9(163).
Hussong, R, Tholey, A, and Hildebrandt, A (2007). Efficient Analysis of Mass Spectrometry Data Using the Isotope Wavelet. In: COMPLIFE 2007: The Third International Symposium on Computational Life Science. American Institute of Physics (AIP) 940.
Schulz-Trieglaff, O, Hussong, R, Gröpl, C, Hildebrandt, A, and Reinert, K (2007). A Fast and Accurate Algorithm for the Quantification of Peptides from Mass Spectrometry Data. In: Proceedings of the Eleventh Annual International Conference on Research in Computational Molecular Biology (RECOMB). Lecture Notes in Bioinformatics (LNBI) 4453.
The Isotope Wavelet Transform
[Figures: kernel function for charge state 1, mass 1000 Da; kernel function for charge state 1, mass 2000 Da]
The Isotope Wavelet Transform
[Figure panels: MS spectrum (charge state 3); charge-1-transform; charge-2-transform; charge-3-transform]
The Sweep Line Idea
2 additional parameters:
- RT_cutoff
- RT_interleave
[Figure axes: RT [s] vs. m/z [Th]]
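The slide does not define the two parameters, so the sketch below rests on an assumed reading: RT_cutoff is the minimum number of scans in which a pattern must be detected at a given m/z to count as a feature, and RT_interleave is the maximum number of consecutive scans in which it may be missing without ending the run. The function name and the boolean input format are hypothetical.

```python
def sweep_line(scan_hits, rt_cutoff, rt_interleave):
    """Group per-scan detections at one m/z position into features.

    scan_hits: one boolean per consecutive scan (ordered by retention time),
               True if the pattern was detected in that scan.
    Returns a list of (first_scan, last_scan) index pairs.
    """
    features = []
    start = last_hit = None
    votes = 0
    for i, hit in enumerate(scan_hits):
        if hit:
            if start is None:
                start, votes = i, 0  # open a new run
            votes += 1
            last_hit = i
        elif start is not None and i - last_hit > rt_interleave:
            # gap too long: close the run and keep it if it has enough votes
            if votes >= rt_cutoff:
                features.append((start, last_hit))
            start = last_hit = None
    if start is not None and votes >= rt_cutoff:
        features.append((start, last_hit))
    return features
```

With `rt_interleave=1` a single missing scan does not split a feature, while `rt_interleave=0` demands strictly consecutive detections.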
Open Issues – Future Work
[Figure: digest → Fingerprint (intensity vs. mass); Fragment → Fingerprint, charge state 1 (intensity vs. mass/charge)]
Open Issues – Future Work ∙ Separating overlapping patterns
The Adaptive Isotope Wavelet Kernel
- θ denotes the Heaviside step function
- λ(m) is a linear function fit to the averagine model