1 / 22

Peak Detection with Chemical Noise Removal Using Short-Time FFT for a Kind of MALDI Data

Peak Detection with Chemical Noise Removal Using Short-Time FFT for a Kind of MALDI Data. Xiaobo Zhou HCNR-CBI, Harvard Medical School and Brigham & Women’s Hospital, Boston, MA. MS Profiling and Biomarkers. Biosci Rep 25(1-2):107. MALDI-TOF Mass Spectrometry. Biochemistry @ NCBI Books.

shina
Download Presentation

Peak Detection with Chemical Noise Removal Using Short-Time FFT for a Kind of MALDI Data

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Peak Detection with Chemical Noise Removal Using Short-Time FFT for a Kind of MALDI Data Xiaobo Zhou HCNR-CBI, Harvard Medical School and Brigham & Women’s Hospital, Boston, MA.

  2. MS Profiling and Biomarkers Biosci Rep 25(1-2):107

  3. MALDI-TOF Mass Spectrometry Biochemistry @ NCBI Books

  4. Outline • Introduction • Peak detection method • Random noise removal • Chemical noise removal • Peak identification • Results • Conclusion

  5. Introduction • Peak detection is the first step in the biomarker extraction from the MS data. • Proper peak detection methods should be based on theproperties of different types of MS data. • Properties of theMALDI (Matrix Assisted Laser Desorption Ionization) data interested: • high mass accuracy and resolution over a wide mass range: m/z can be as large as 20000 Da. • the unique noise pattern: low frequency noise peaks at 1 Da apart.

  6. Example of a spectrum • Fig. 1 The largest m/z is more than 9000Da. The two enlarged regions show the detailed signals when m/z is relatively small and large in this spectrum. When m/z is small, the pattern of the chemical noise can be seen clearly, while when m/z is becoming larger, it is weaker. Small m/z Large m/z

  7. Example of chemical noise • Fig. 2 Part of a raw spectrum. The chemical noise has a unique pattern that peaks at 1 Da apart.

  8. Existed methods • Most methods: • Based on the a priori knowledgeof the massof the peptides • Drawback: effective form/z less than 4000Da • Yu W. et al. : • Gaussian filter to smooth the signal • Gabor quadrature filter to extract the envelope signal • Drawback: the threshold of defining peaksis got via an empirical approach without considering the noise model. Some small peaks can be overlooked / omitted. • Jurgen K. et al: • To remove chemical noise by Fourier transform with fixed window size. • Drawback: useful for the eletrospray quadrupole TOF data, whose maximum m/z is less than 2000Da.

  9. Peak detection method • Proposed model: :number of the spectrum :number of the time of flight (TOF) point :the true intensity :the observed intensity :chemical noise :random noise

  10. Three steps • Random noise removal undecimated wavelet transform to get the smoothed signal software: Rice Wavelet Toolbox (RWT) website: http://www-dsp.rice.edu/software/rwt.shtml • Chemical noise removal short time discrete Fourier transform with adaptive window size to get an estimate of chemical noise • Peak identification • Normalization • Threshold Approach via SNR • Identification

  11. Chemical noise removal • Why STDFT? • The shape of the chemical noise is similar to the sinusoidal signal. • The number of points in a 1Da region is different and decreasing from lower m/z to higher m/z. • Short Time Discrete Fourier Transform • Choose the window size the region with the same number of points in 1Da (figure in the following page)

  12. Window size of STDFT • Fig. 3 Approximate number of data points (window size) in 1Da region as a function of m/z. With the increasing of m/z, the region with same number of time point is becoming larger.

  13. Estimate of the chemical noise Small m/z Large m/z • Fig. 4 Estimate of the chemical noise for the two enlarged regions in Fig. 1. Black curve: smoothed signal; Blue curve: estimate of the chemical noise. If the smoothed signal minus the estimate of the chemical noise is less than a certain threshold, it is set to be zero.

  14. Peak identification • Normalization of the processed spectrum The spectrum is divided by the mean intensity of the processed spectrum to correct the systematic differences • Definition of the Signal-to-noise ratio (SNR) • Peak identification The peak is located in the position where it has the highest intensity in a cluster of peaks

  15. Peak identification • Fig. 5 Examples of envelope extraction and peak detection. The blue curves show the envelope extraction. Each concave part of the curve corresponds to the same protein. The red circles label thepeaks detected by our method. Small m/z Large m/z

  16. Results: • Data sets -The data got from the serum of a cohort of patients undergoing endarterctomy (EA) (therapy) for occluded carotid arteries (disease). -Here we use the data of a group of all the patients: the symptomatic EA patients (EAS). -After the sample preparation and the data acquisition, we finally got 35 spectra.

  17. Numerical experiment settings • The threshold of wavelet transform: 6 • The threshold of chemical noise removal: 80 percentiles of the value got by subtracting the estimated chemical noise from the smoothed signal in each adaptive region • We define true peaks: the number of peaks in one location is more than 15% of the number of spectra in one group in the same 1Da region. • Position of the peaks: the mean m/z over all the peaks in 1 Da.

  18. Peaks in the EAS group • Fig. 6 All peaks detected in EAS group when SNR is 2. One bad spectrum is removed. Totally 34 spectra are considered.

  19. SNR comparison • Fig. 7 Comparison of SNR between the same spectrum without and with removal of chemical noise. Left: the SNR got without chemical noise removal. Right: SNR got with chemical noise removal.

  20. Conclusions • Proposed an automatic peak detection method for a kind of MALDI data: chemical noise removal to increase the SNR of some small peaks. • Future works: • Peak alignment: to align all the peaks corresponding to the same protein but with shifting • Feature extraction: to select the biomarkers

  21. References • Berndt, P., Hobohm, U., Langen, H., Reliable automatic protein identification from matrix-assisted laser desorption ionization mass spectrometric peptide finger prints, Electrophoresis, 1999, 20, 3521-3526. • Breen, E. J, Hopwood, F.G., Williams, K.L., Wilkins, M. R., Automatic poisson peak harvesting for high throughput protein identification, Electrophoresis, 2000, 21, 2243-2251. • Coombes, K. R., Tsavachidis, S., Morris, J. S., Baggerly, K. A., Hung, M. C., H. M. Kuerer, PImproved peak detection and quantification of mass spectrometry data acquired from surface-enhanced laser desorption and ionization by denoising spectra with the undecimated discrete wavelet transform, Proteomics, 2005, 5, 4107-4117. • Du, P., Kibbe, W. A., Lin, S. M., Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching, Bioinformatics, 2006, 22, 2059-2065. • Gras, R., Muller, M., Gasteiger, E., Gay, S., Binz, P. A., Bienvenut, W., Hoogland, C., Sanchez, J. C., Bairoch, A., Hochstrasser, D. R., Appel, R. D., Improving protein identification from peptide mass fingerprinting through a parameterized multi-level scoring algorithm and an optimized peak detection,Electrophoresis, 1999, 20, 3535-3550.

  22. References • Jurgen, K., Marc, G., Keith, R., Matthias, W., Noise filtering techniques for electrospray quadrupole time of flight mass spectra, J. Am. Soc. Mass Spectrom., 2003, 14, 766-776. • Krutchinsky, A. N., Chait, B. T., On the nature of the chemical noise in MALDI mass spectra, Journal American Society for Mass Spectrometry, 2002, 13, 129-134. • Morris, J. S., Coombes, K. R., Koomen , J., Baggerly, K. A., Kobayashi, R., Feature extraction and quantification for mass spectrometry in biomedical applications using the mean spectrum, Bioinformatics, 2005, 21, 1764-1775. • Samuelsson, J., Dalevi, D., Levander, F., Rognvaldsson, T., Modular, scriptable and automated analysis tools for high-throughput peptide mass fingerprinting, Bioinformatics, 2004, 20, 3628-3635. • Yasui, Y., McLerran, D., Adam, B., Winget, M., Thornquist, M., Feng, Z., An automated peak-identification / calibration procedure for high-dimensional protein measures from mass spectrometers, Journal of Biomedicine and Biotechnology, 2003, 4, 242-248. • Yu, W., Wu, B., Lin, N., Stone, K., Williams, K., Zhao, H., Detecting and Aligning Peaks in Mass Spectrometry Data with Applications to MALDI, Computational Biology and Chemistry, 2006, 30, 27-38.

More Related