Welcome! Mass Spectrometry meets Cheminformatics
WCMC Metabolomics Course 2013
Tobias Kind
Course 6: Concepts for LC-MS
[Figure labels: Chemistry, Biology, Informatics]
http://fiehnlab.ucdavis.edu/staff/kind
CC-BY License
General LC-MS data processing for small molecules
[Figure: MS2 spectrum]
• Confirm with MS/MS or MSn fragmentation
Deconvolution and evaluation of LC-MS data
• LC-MS run: 40 minutes, C8 column, Agilent-UPLC, Chlamydomonas extract (picture: nrel.gov)
• LC-MS chromatogram + extracted mass spectrum
• FT-ICR-MS mass spectrum: MS1 @ 50,000 resolving power
• Check charge state = 1
• 756.57 represents [M+H]+
• Chromatogram source: N. Saad, DY Lee, FiehnLab
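The charge-state check can be sketched from the spacing of neighbouring isotope peaks: for charge z, the 13C isotope peak sits roughly 1.00335/z above the monoisotopic peak. This is a minimal sketch, not the vendor software's method; the function name and the A+1 peak positions are assumptions made for illustration.

```python
C13_C12_DELTA = 1.003355  # mass difference between 13C and 12C in Da


def charge_state(mono_mz, next_isotope_mz):
    """Estimate the charge from the m/z spacing of two adjacent isotope peaks."""
    spacing = next_isotope_mz - mono_mz
    if spacing <= 0:
        raise ValueError("isotope peak must lie above the monoisotopic peak")
    return round(C13_C12_DELTA / spacing)


# Hypothetical A+1 peak about one 13C above m/z 756.577
print(charge_state(756.577, 757.580))
```

A spacing near 1.0 points to a singly charged ion, a spacing near 0.5 to a doubly charged one.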
Deconvolution and evaluation of LC-MS data
• Example with HighChem Mass Frontier
• LC-MS detected compounds are marked with a blue triangle
• 141 peaks extracted
• Extracted MS1 peak: library search useless (only a single peak)
Adduct removal and detection during ESI-LC-MS runs
• [M+H]+ = 756.577
• M = 755.5627
• Download Adduct-Calculator [LINK]
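Adduct removal is a subtraction of the adduct's mass shift from the measured m/z. The sketch below is a minimal stand-in, not the linked Adduct-Calculator: the table holds only four common positive-mode adducts, and the function names are assumptions. With the two-decimal value 756.57 quoted earlier, it returns the neutral mass 755.5627 shown on this slide.

```python
# Mass shifts (Da) added to the neutral monoisotopic mass M for common singly
# charged positive-mode adducts (the missing electron is already accounted for).
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def neutral_mass(mz, adduct="[M+H]+"):
    """Remove the adduct: neutral mass M from a measured m/z (charge 1)."""
    return mz - ADDUCT_SHIFT[adduct]


def adduct_mz(neutral, adduct):
    """Predict the m/z at which a given adduct of M would appear."""
    return neutral + ADDUCT_SHIFT[adduct]


# Neutral mass under each adduct hypothesis for the peak at m/z 756.57
for name in ADDUCT_SHIFT:
    print(name, round(neutral_mass(756.57, name), 4))
```

Each adduct hypothesis gives a different neutral mass, which is why the adduct must be settled before searching formula or isomer databases.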
Problem: Detection of molecular ion
• Problem: Is this the pure mass spectrum or from overlapping peaks?
• Is it M+H or M+Na or any of the 40 other adducts?
• Example data from crocin standard mixture (expected MW: 976.965)
[Figure: spectra at CID = 0 V and CID = 100 V, each labeled [M+Na]+]
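One way to approach the M+H versus M+Na question is to look for peak pairs in the same spectrum whose m/z difference equals a known adduct difference. This is a hedged sketch: the three pair differences are derived from standard adduct shifts, while the peak list, the tolerance and the function name are hypothetical.

```python
from itertools import combinations

# m/z differences (Da) between two adducts of the same neutral molecule
PAIR_DELTAS = {
    ("[M+H]+", "[M+Na]+"): 21.981942,
    ("[M+H]+", "[M+NH4]+"): 17.026547,
    ("[M+Na]+", "[M+K]+"): 15.973940,
}


def find_adduct_pairs(mz_values, tol=0.005):
    """Return (low_mz, low_adduct, high_mz, high_adduct) for matching pairs."""
    hits = []
    for low, high in combinations(sorted(mz_values), 2):
        for (low_adduct, high_adduct), delta in PAIR_DELTAS.items():
            if abs((high - low) - delta) <= tol:
                hits.append((low, low_adduct, high, high_adduct))
    return hits


# Hypothetical peak list: one compound seen as [M+H]+, [M+Na]+ and [M+K]+
peaks = [301.1410, 323.1229, 339.0969]
for hit in find_adduct_pairs(peaks):
    print(hit)
```

A single isolated peak cannot be assigned this way, which is the situation the slide describes.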
Adduct removal and detection during LC-MS runs
• The adduct can be removed before or after formula generation.
• For good isotopic pattern matching, remove the adduct after formula generation.
• When the 7 Golden Rules apply the LEWIS and SENIOR check, the adduct needs to be removed.
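Why valence checks need the neutral formula can be illustrated with ring and double bond equivalents (RDBE). This is a simplified stand-in for the LEWIS and SENIOR checks, not the Seven Golden Rules implementation: it assumes trivalent N and P, ignores halogens, and the function names are hypothetical.

```python
import re


def parse_formula(formula):
    """Turn a formula string such as 'C46H77NO7' into an element-count dict."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def rdbe(formula):
    """Ring and double bond equivalents, with N and P treated as trivalent."""
    c = parse_formula(formula)
    return (c.get("C", 0) - c.get("H", 0) / 2
            + (c.get("N", 0) + c.get("P", 0)) / 2 + 1)


def passes_basic_check(formula):
    """A neutral even-electron molecule has a non-negative integer RDBE."""
    value = rdbe(formula)
    return value >= 0 and value == int(value)


print(rdbe("C46H77NO7"), passes_basic_check("C46H77NO7"))  # neutral molecule
print(rdbe("C46H78NO7"), passes_basic_check("C46H78NO7"))  # same plus one H
```

The formula with the extra hydrogen of [M+H]+ still attached gives a half-integer value and fails the check, while the neutral formula passes.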
Formula Generation from accurate mass measurement
• CHN5O13P15 or CHN11P19 ???
• Apply the Seven Golden Rules for correct molecular formulas
• Apply heuristic, mathematical and chemometric rules for filtering elemental compositions
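Formula generation from an accurate mass can be sketched as a brute-force search over element counts. This minimal sketch covers C, H, N and O only and applies no chemical rules, so it also shows why filtering is needed afterwards. The element limits, the function names and the test mass 180.06339 (glucose, C6H12O6) are assumptions for illustration, not values from the course.

```python
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def formula_string(c, h, n, o):
    """Write element counts in C, H, N, O order, omitting zeros and ones."""
    parts = (("C", c), ("H", h), ("N", n), ("O", o))
    return "".join(f"{el}{k if k > 1 else ''}" for el, k in parts if k > 0)


def generate_formulas(neutral_mass, ppm=5.0, max_c=20, max_n=5, max_o=10):
    """Enumerate CHNO formulas whose monoisotopic mass is within +/- ppm."""
    tol = neutral_mass * ppm / 1e6
    hits = []
    for c in range(1, max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                rest = neutral_mass - (c * MONO["C"] + n * MONO["N"] + o * MONO["O"])
                if rest < -tol:
                    break  # already too heavy; adding more O only makes it worse
                h = max(round(rest / MONO["H"]), 0)  # fill the remainder with H
                error = h * MONO["H"] - rest  # calculated minus measured mass
                if abs(error) <= tol:
                    hits.append((formula_string(c, h, n, o),
                                 round(error / neutral_mass * 1e6, 2)))
    return sorted(hits, key=lambda hit: abs(hit[1]))


for formula, error_ppm in generate_formulas(180.06339):
    print(formula, error_ppm)
```

Widening the ppm window or adding elements such as P and S grows the candidate list quickly, which is where isotopic abundances and the valence rules come in.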
Isotopic Pattern generator from formula
• Example from MWTWIN
• Usually included in every LC-MS software
Isotopic pattern equally important as accurate mass
• Experimental result: A+1 = 47.44%, A+2 = 10.92%
• Abundances are calculated for all candidate molecular formulae
• We can discard all results outside the error box.
• The current box reflects +/- 10% error.
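The error-box filter can be sketched with a first-order estimate of the A+1 abundance, summing the heavy-to-light isotope ratio of each atom. This is an approximation, not a full isotopic pattern generator such as MWTWIN. Treating the +/- 10% box as absolute percentage points and pairing the slide's A+1 value with C46H77NO7 are assumptions; the function names are hypothetical.

```python
import re

# Approximate heavy/light natural abundance ratios that feed the A+1 peak
# (13C/12C, 2H/1H, 15N/14N, 17O/16O). P is monoisotopic and contributes nothing.
A1_RATIO = {
    "C": 0.0107 / 0.9893,
    "H": 0.000115 / 0.999885,
    "N": 0.00364 / 0.99636,
    "O": 0.00038 / 0.99757,
}


def a1_percent(formula):
    """First-order A+1 abundance in percent of the monoisotopic peak."""
    total = 0.0
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += A1_RATIO.get(element, 0.0) * int(number or 1)
    return 100.0 * total


def within_box(predicted, measured, box=10.0):
    """True if the predicted abundance lies inside measured +/- box."""
    return abs(predicted - measured) <= box


measured_a1 = 47.44  # experimental A+1 from the slide
for candidate in ("C46H77NO7", "C20H40O2"):  # second formula is hypothetical
    predicted = a1_percent(candidate)
    print(candidate, round(predicted, 2), within_box(predicted, measured_a1))
```

The same comparison would be repeated for A+2, and a candidate is kept only if every isotope peak falls inside the box.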
Problems during LC-MS peak detection and MS deconvolution
[Figure: chromatogram between 10.80 and 11.90 minutes; multiple peaks are detected, but it is a single component only]
• Multiple peaks detected – Solution: adjust deconvolution settings
• Mass spectra not clean – Solution: manual peak extraction
• Not enough peaks detected – Solution: adjust the signal-to-noise (S/N) settings
• Finding optimum settings:
  – is non-trivial, and the optimum can change in different matrices
  – can be evaluated on standards and quality check mixtures
  – can be obtained by self-sharpening algorithms
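How an S/N setting changes the number of detected peaks can be sketched with a toy peak picker. This is a minimal illustration, not the algorithm of any LC-MS package: noise is taken as the median intensity, a peak is any local maximum above the threshold, and the trace and function name are made up.

```python
from statistics import median


def pick_peaks(intensities, min_sn=3.0):
    """Return indices of local maxima whose signal/noise reaches min_sn."""
    noise = median(intensities) or 1.0  # crude noise estimate; avoid dividing by 0
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        is_local_max = y > intensities[i - 1] and y >= intensities[i + 1]
        if is_local_max and y / noise >= min_sn:
            peaks.append(i)
    return peaks


# Hypothetical chromatogram trace with a small, a large and a medium peak
trace = [1, 1, 2, 1, 1, 10, 1, 1, 4, 1, 1]
for threshold in (5.0, 3.0, 1.5):
    print(threshold, pick_peaks(trace, threshold))
```

In this sketch a lower threshold admits more peaks and a higher one fewer, so the setting has to be tuned against standards rather than fixed once.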
UPLC-FT-MS data extraction with MassFrontier
[Figure: Mass Frontier 5.0 report. MS1: FTMS + p NSI Full, scan #1414, peak at m/z 756.577. MS2: ITMS + c NSI d Full, precursor 756.58, scan #1415, labeled peaks include m/z 236.20, 271.06, 335.56, 391.34, 434.52, 478.45, 496.46 and 687.72]
• Fragment peaks: m/z 478.45 and 496.46
• Approach:
  – generate molecular formulas using the Seven Golden Rules;
  – find matching isomers in molecular databases;
  – confirm possible matches by in-silico fragmentation (usually impossible)
Seven Golden Rules – generate possible molecular formulas
• 5 formula candidates left with 30 ppm mass accuracy and 10% isotopic abundances
• These are candidates with a good isotopic pattern match. These 5 were found in PubChem.
  – C42H78NO8P: 1 isomer hit
  – C42H77NO10: 1 isomer hit
  – C39H73N5O9: 0 isomer hits
  – C43H82NO7P: 2 isomers found
  – C43H73N5O6: 2 isomers found
  – C45H77N3O6: 1 hit found
  – C45H69N7O3: 1 hit found
• Scan speed problem: due to poor ion statistics, only a few scans are collected
• Mass accuracy and isotopic abundance accuracy are bad
In-silico fragmentation with MassFrontier using fragmentation library of 20,000 mechanisms from literature
In-silico fragmentation with MassFrontier
• Experimental peaks m/z 478.45 and 496.46 were detected in the MS/MS spectrum
• In-silico fragmentation should match the experimental fragmentation.
• In-silico: using a computer library of 20,000 fragmentation rules from the MS literature
  – C42H78NO8P: fragments N.A.
  – C42H77NO10: fragments N.A.
  – C45H69N7O3: fragment m/z 496
  – C43H82NO7P: fragments m/z 478 and 496 (LWHKEEJOPDYARM-WYCAKVPNBV)
  – C43H82NO7PH: fragments N.A.
• Possible solution (2 fragments match)
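The comparison of predicted and experimental fragments can be sketched as a simple ranking by the number of matching peaks. The candidate formulas and fragment masses are the ones listed on this slide; matching at nominal (integer) mass and the function names are assumptions, and this is not how MassFrontier scores candidates.

```python
def count_matches(experimental_mz, predicted_nominal):
    """Count predicted nominal masses that appear among the experimental peaks."""
    observed = {round(mz) for mz in experimental_mz}
    return len(observed & set(predicted_nominal))


def rank_candidates(experimental_mz, predictions):
    """Sort candidate formulas by number of explained fragments, best first."""
    return sorted(predictions,
                  key=lambda formula: count_matches(experimental_mz, predictions[formula]),
                  reverse=True)


experimental = [478.45, 496.46]
predicted = {
    "C42H78NO8P": [],          # no fragments predicted (N.A.)
    "C42H77NO10": [],          # no fragments predicted (N.A.)
    "C45H69N7O3": [496],
    "C43H82NO7P": [478, 496],
}
for formula in rank_candidates(experimental, predicted):
    print(formula, count_matches(experimental, predicted[formula]))
```

A candidate for which no fragments can be predicted is not ruled out by this ranking; it simply cannot be confirmed.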
Discussion of general LC-MS approach
• Example discussion:
  – Fragment at m/z 236 not explained; molecular ion may be wrong
  – Substance can be a potential new compound or not in the database
  – Must be confirmed by NMR or external standard
• General problems:
  – Best approach is to generate MS, MSn and MSe mass spectral libraries
  – Adduct removal is a problem
  – Building target lists is always good (know what to expect)
  – Focus on certain substance classes only
  – Focus on single compound only
  – Substance must be known for in-silico approach
  – Fragmentation rules must be captured for in-silico approach
  – In-silico approaches work best for peptides, carbohydrates, lipids (due to known and stable fragmentations)
• Importance: taxonomic species-compound relationship databases or pathway DBs (KNApSAcK database, KEGG, MetaCyc)
All theory is lost – if the compound is truly unknown or not in a public database
• Rank 147 in 7GR with original settings
• C46H77NO7 not found in ChemSpider, PubChem, LipidMaps (2013); only found in the internal LipidBlast database.
• UPLC-FTICR settings readjusted to 3 ppm mass accuracy and 5% isotopic abundance error
• Actual mass error: -1.718266 ppm
• Max. isotopic abundance error: 4.100%
• Additional evidence: there is no PC in Chlamydomonas, but mostly the betaine lipid DGTS. DGTS accounts for about 10% of total lipid in Chlamydomonas (Peter Schlapfer, Waldemar Eichenberger, Plant Science Letters, Volume 32, Issues 1-2, October-November 1983, Pages 243-252; W. Eichenberger, A. Boschetti, FEBS Letters, Volume 88, Issue 2, 15 April 1978, Pages 201–204).
• Rank 18 in 7GR with original settings (from 56)
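The mass error and the search window are simple ratios in parts per million. In this sketch the inputs are the rounded precursor m/z (756.5765) and exact mass (756.57782) quoted for the DGTS entry, so the result comes out close to, but not identical with, the -1.718266 ppm reported above. The function names are assumptions.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in ppm (negative when the measurement is too low)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def mass_window(mz, ppm):
    """Lower and upper m/z bounds of a +/- ppm search window."""
    half_width = mz * ppm / 1e6
    return mz - half_width, mz + half_width


print(round(ppm_error(756.5765, 756.57782), 2))
print(mass_window(756.57782, 3))   # readjusted 3 ppm setting
print(mass_window(756.57782, 30))  # original 30 ppm setting
```

At this m/z the 30 ppm window is about ten times wider than the 3 ppm window, which is one reason the correct formula ranks so much lower with loose settings.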
LipidBlast comes to help unravel the mystery
• Best tentative structure proposal so far: betaine lipid DGTS(18:3/18:3)
• Experimental MS/MS: m/z 236 instead of m/z 184, so DGTS instead of PC
• Precursor m/z 756.576500 (M+H)+
LipidBlast in-silico MS/MS entry:
Name: DGTS 36:6; [M+H]+; DGTS(18:3/18:3)
MW: 756
Exact Mass: 756.57782
ID#: 9031
DB: dgts+hpos-it.msp
Comment: Parent=756.57782 Mz_exact=756.57782 ; DGTS 36:6; [M+H]+; DGTS(18:3/18:3); C46H77NO7
7 m/z Values and Intensities:
m/z         Intensity   Annotation
236.14979   200.00      C10H21NO5H+ (236.14)
478.35339   600.00      [M+H]-sn1-H2O || [M+H]-sn2-H2O
496.36395   600.00      [M+H]-sn1 || [M+H]-sn2
625.48320   400.00      [M+H]-131
639.49885   250.00      [M+H]-117
738.56726   999.00      [M-H2O]+
756.57782   10.00       [M+H]
Kind T, Liu KH, Lee do Y, Defelice B, Meissen JK, Fiehn O. LipidBlast in silico tandem mass spectrometry database for lipid identification. Nature Methods. 2013 Aug;10(8):755-8. doi: 10.1038/nmeth.2551. Epub 2013 Jun 30.
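Matching the experimental fragments against the library entry can be sketched as a nearest-peak lookup within a tolerance. The library m/z values and annotations are those of the DGTS 36:6 entry above, and the experimental peaks are m/z values reported for the MS2 spectrum. The 0.3 Da tolerance (chosen for ion trap data) and the function name are assumptions, and this is not the scoring used by LipidBlast or a library search program.

```python
LIBRARY_DGTS_36_6 = [
    (236.14979, "C10H21NO5H+"),
    (478.35339, "[M+H]-sn1-H2O || [M+H]-sn2-H2O"),
    (496.36395, "[M+H]-sn1 || [M+H]-sn2"),
    (625.48320, "[M+H]-131"),
    (639.49885, "[M+H]-117"),
    (738.56726, "[M-H2O]+"),
    (756.57782, "[M+H]"),
]


def annotate(experimental_mz, library, tol=0.3):
    """Map each experimental m/z to the nearest library annotation, or None."""
    result = {}
    for mz in experimental_mz:
        nearest = min(library, key=lambda entry: abs(entry[0] - mz))
        result[mz] = nearest[1] if abs(nearest[0] - mz) <= tol else None
    return result


for mz, label in annotate([236.20, 478.45, 496.46], LIBRARY_DGTS_36_6).items():
    print(mz, label)
```

All three experimental fragments find a library peak within the tolerance, including the m/z 236 fragment that the database candidates left unexplained.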
The Last Page - What is important to remember:
• Always use peak picking and mass spectral deconvolution for LC-MS data
• Apply accurate mass and accurate isotopic abundances together for formula generation
• Make use of high resolving power whenever possible
• Use MS/MS data and mass spectra from different ionization voltages
• Use existing MS/MS libraries or create your own MSn tree libraries
• Use molecular isomer databases for obtaining possible structure candidates
• Confirm if possible with MSn data or other possible filter constraints
• Validation, validation, validation: pipelines and workflows must be validated with unknown (unknown) compounds
• A compound is tentative until finally approved with a reference standard or by matching multiple orthogonal parameters such as MS/MS, retention time or NMR.
Recent literature (2012/2013)
• Oberacher H. Applying Tandem Mass Spectral Libraries for Solving the Critical Assessment of Small Molecule Identification (CASMI) LC/MS Challenge 2012. Metabolites 2013, 3(2), 312-324. doi:10.3390/metabo3020312
• Scheubert K, Hufsky F, Böcker S. Computational mass spectrometry for small molecules. J Cheminform. 2013; 5: 12. Published online 2013 March 1. doi: 10.1186/1758-2946-5-12
• Watson D. A Rough Guide to Metabolite Identification Using High Resolution Liquid Chromatography Mass Spectrometry in Metabolomic Profiling in Metazoans. Computational and Structural Biotechnology Journal, Volume 4, Issue 5, January 2013, e201301005. http://dx.doi.org/10.5936/csbj.201301005
• Brown M, Wedge DC, Goodacre R, Kell DB, Baker PN, Kenny LC, Mamas MA, Neyses L, Dunn WB. Automated workflows for accurate mass-based putative metabolite identification in LC/MS-derived metabolomic datasets. {PUTMEDID_LCMS} Bioinformatics. 2011 Apr 15;27(8):1108-12. doi: 10.1093/bioinformatics/btr079. Epub 2011 Feb 16.
• Creek DJ, Jankevics A, Burgess KE, Breitling R, Barrett MP. IDEOM: an Excel interface for analysis of LC-MS-based metabolomics data. Bioinformatics. 2012 Apr 1;28(7):1048-9. doi: 10.1093/bioinformatics/bts069. Epub 2012 Feb 4.
• Kuhl C, Tautenhahn R, Böttcher C, Larson TR, Neumann S. CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets. Anal Chem. 2012 Jan 3;84(1):283-9. doi: 10.1021/ac202450g. Epub 2011 Dec 12.
• Dunn WB, Erban A, Weber RJM, Creek DJ, Brown M, Breitling R, Hankemeier T, Goodacre R, Neumann S, Kopka J, Viant MR. Mass appeal: metabolite identification in mass spectrometry-focused untargeted metabolomics. March 2013, Volume 9, Issue 1 Supplement, pp 44-66.
• Metabolite profiling and beyond: approaches for the rapid processing and annotation of human blood serum mass spectrometry data. Metabolomics and Metabolite Profiling, Analytical and Bioanalytical Chemistry, June 2013, Volume 405, Issue 15, pp 5037-5048 [LINK]
• Peironcely JE, Rojas-Chertó M, Tas A, Vreeken R, Reijmers T, Coulier L, Hankemeier T. Automated pipeline for de novo metabolite identification using mass-spectrometry-based metabolomics. Anal Chem. 2013 Apr 2;85(7):3576-83. doi: 10.1021/ac303218u. Epub 2013 Mar 21.
http://fiehnlab.ucdavis.edu/staff/kind/Metabolomics/Structure_Elucidation/