Dealing With the Unknown. Metabolomics & Metabolite Atlases. Ben Bowen Pathway Tools Workshop 2010. Acknowledgements. Trent Northen Richard Baran Wolfgang Reindl Do Yup Lee Jane Tanamachi Jill Banfield Curt Fisher Paul Wilmes
Dealing With the Unknown Metabolomics & Metabolite Atlases Ben Bowen Pathway Tools Workshop 2010
Acknowledgements Trent Northen Richard Baran Wolfgang Reindl Do Yup Lee Jane Tanamachi Jill Banfield Curt Fisher Paul Wilmes US Department of Energy BER Genome Sciences Program
LC-MS/MS Workflow: metabolite solvent extraction; HPLC (C18; HILIC); Agilent 6520 QTOF MS/MS; metabolite ‘features’ & quantification. Sample independent: suitable for unsequenced organisms and communities. Example features: C18NEG/255.22807/3.39329/Hexadecanoic acid; C18NEG/255.22862/4.89002/Hexadecanoic acid; C18NEG/248.8424/1.47135/2,4-Dibromophenol; C18NEG/112.98576/27.34079/Acetylenedicarboxylate; C18NEG/270.82471/1.34821/; C18NEG/168.88735/1.29241/
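The feature strings on this slide look like slash-delimited records. A minimal parsing sketch follows, assuming the fields are method, m/z, retention time, and an optional name. That field order is inferred from the examples and not stated on the slide, and the `Feature` class and `parse_feature` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    method: str          # column + polarity, e.g. "C18NEG" (assumed meaning)
    mz: float            # mass-to-charge ratio
    rt: float            # retention time (units not given on the slide)
    name: Optional[str]  # None when the feature carries no annotation


def parse_feature(record: str) -> Feature:
    """Split one 'method/mz/rt/name' record; a trailing ';' is ignored."""
    method, mz, rt, name = record.strip().rstrip(";").split("/", 3)
    return Feature(method, float(mz), float(rt), name.strip() or None)


records = [
    "C18NEG/255.22807/3.39329/Hexadecanoic acid;",
    "C18NEG/270.82471/1.34821/",
]
features = [parse_feature(r) for r in records]
```

Unannotated features (the last two on the slide) come back with `name=None`, which keeps unknowns in the same table as identified compounds.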
How a data point becomes a compound
From Feature to Formula • Selection of features • Pure spectra • Isotopic pattern fitting • Stable isotope labeling
From Formula to Compound • Exact match to MS/MS spectra • Partial match to MS/MS spectra • Exchangeable hydrogen • Retention time • Authentic standards • Other (NMR & synthesis)
Annotation of Metabolite Atlases • Define feature in database • Sample metadata • Extraction methods • LC/MS methods • mz@rt annotations
Photo: John Waterbury, Woods Hole Oceanographic Institution (DOE)
Systems biology depends on accurate models. Analysis of MetaCyc shows that many unique formulas appear in only a few reactions or pathways: pathway-specific markers, or sparsity of knowledge? • Models provide a framework to prove or disprove observations. • Models highlight gaps in annotations when new compounds are discovered.
Using inexact mass for formula ID • C & N isotopic labels • Isotopic pattern fitting • Reduce degeneracy about m/z value
Mass and Degeneracy are Correlated Heuristically Filtered Brute Force Method
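To illustrate why degeneracy grows with mass, here is a minimal sketch of a heuristically filtered brute-force search. It is restricted to C, H, N and O, and the two filters (a valence limit on hydrogens and a parity rule) are common textbook heuristics chosen for illustration; the slide does not say which filters were actually used. The function name, tolerance, and element set are all assumptions.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def candidate_formulas(neutral_mass, ppm=5.0):
    """Enumerate (C, H, N, O) counts whose mass lies within `ppm` of the target."""
    tol = neutral_mass * ppm * 1e-6
    hits = []
    for c in range(1, int(neutral_mass // MONO["C"]) + 1):
        rem_c = neutral_mass - c * MONO["C"]
        for n in range(int((rem_c + tol) // MONO["N"]) + 1):
            rem_n = rem_c - n * MONO["N"]
            for o in range(int((rem_n + tol) // MONO["O"]) + 1):
                rem_o = rem_n - o * MONO["O"]
                h = round(rem_o / MONO["H"])
                if h < 0 or abs(rem_o - h * MONO["H"]) > tol:
                    continue
                # Heuristic 1: hydrogens cannot exceed the valence limit.
                if h > 2 * c + n + 2:
                    continue
                # Heuristic 2: neutral closed-shell CHNO needs H + N even.
                if (h + n) % 2 != 0:
                    continue
                hits.append((c, h, n, o))
    return hits


# Neutral monoisotopic mass of C11H12N2O2.
hits = candidate_formulas(204.08987763)
```

The number of entries in `hits` is the degeneracy for that mass and tolerance; widening `ppm` or raising the mass lets more combinations through.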
Large-scale formula determination using stable isotopic labeling. PROBLEM: Difficult to ID many metabolites given low coverage of authentic standards. Approach: Stable isotope labeling (SIL) for direct empirical formula determination. Cultures: CONTROL; Na15NO3; NaH13CO3. Reference: Baran et al., Untargeted metabolite profiling of Synechococcus sp. PCC 7002 reveals a large fraction of unexpected metabolites (Analytical Chemistry 2010).
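The idea behind SIL-based formula determination can be sketched as follows: if a metabolite is fully labeled, the m/z shift between the control and the 13C culture counts its carbons, and the shift in the 15N culture counts its nitrogens. This is a simplified illustration that assumes complete labeling and a known charge; `count_atoms` is a hypothetical helper, not the method from the cited paper.

```python
DELTA_13C = 1.0033548  # mass difference 13C - 12C (u)
DELTA_15N = 0.9970349  # mass difference 15N - 14N (u)


def count_atoms(mz_control, mz_13c, mz_15n, charge=1):
    """Infer carbon and nitrogen counts from fully labeled m/z shifts."""
    n_carbon = round((mz_13c - mz_control) * charge / DELTA_13C)
    n_nitrogen = round((mz_15n - mz_control) * charge / DELTA_15N)
    return n_carbon, n_nitrogen


# Illustrative values for a singly charged ion with 11 C and 2 N.
counts = count_atoms(205.09715, 216.13405, 207.09122)
```

Knowing the C and N counts removes most candidate formulas for a given mass, which is why labeling reduces degeneracy so sharply.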
Less Degeneracy Isn’t Better: We Prefer to Work With Unique Chemical Formulae. Panels: Heuristically Filtered Only; Unfiltered + SIL; Heuristically Filtered + SIL
Initial focus is on Synechococcus sp., a simple yet important model system • Simple system for method development • Widely distributed and globally important in carbon cycling • Photosynthetic bacteria • Small genome (3299 ORFs) • Fast growing and easy to grow • No metabolite background (salt media) • Adaptable: 0-2 M salt, T up to 45 °C
Benefits of Using SIL • Are the signals being measured biological? • What type of ion is the signal? • Has this signal been seen before? • What compound(s) is it? • What else in the sample behaves like that compound?
Stable isotope labeling: Control; 15N ([15N]NaNO3); 13C ([13C]NaHCO3)
m/z RT
Non-biological features dominate • Manually curated • Computationally Identified • Sets are constructed by grouping features by retention time
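Grouping features by retention time can be sketched with a simple gap rule: sort by retention time and start a new set whenever the gap to the previous feature exceeds a threshold. The threshold value and the `(mz, rt)` tuple layout are assumptions for illustration; the slide does not describe the actual grouping procedure.

```python
def group_by_rt(features, max_gap=0.05):
    """Group (mz, rt) tuples whose neighbouring retention times differ by <= max_gap."""
    groups = []
    for mz, rt in sorted(features, key=lambda f: f[1]):
        if groups and rt - groups[-1][-1][1] <= max_gap:
            groups[-1].append((mz, rt))   # close to the previous feature: same set
        else:
            groups.append([(mz, rt)])     # gap too large: start a new set
    return groups


example = [(255.228, 3.39), (256.231, 3.40), (112.986, 27.34), (277.210, 3.41)]
groups = group_by_rt(example)
```

Features that co-elute (adducts, isotopologues, fragments of one compound) end up in the same set, which makes it easier to separate them from background signals.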
Results • ~100 distinct metabolites detected • 82 assigned chemical formulas • 74 unique • 45 outside of Syn7002Cyc • 24 outside of MetaCyc or KEGG • 54 identified or putatively identified metabolites • Using authentic standards or MS/MS
Most dominant biological features Putative hexose(amine)-based trisaccharide:
Histidine-betaine derivatives • Previously attributed only to non-yeast fungi and Actinomycetales bacteria • Culture purity validated by PCR of ribosomal RNA markers and sequencing
N2-acetyllysine Lysine biosynthesis VI (Syn7002Cyc) Lysine biosynthesis V (Syn7002Cyc)
Analyze selected features by MS/MS Target features at specific m/z & r.t.
MS/MS structural confirmation • Commercial standards • METLIN • MassBank • Collaborating to expand the number of authentic standards (Siuzdak, Mukhopadhyay) and make these publicly available.
De novo MS/MS analysis 5-methyluridine
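Proton Painting: $\mathrm{C}_i\mathrm{H}_j\mathrm{O}_k\mathrm{N}_x\mathrm{P}_y\mathrm{S}_z \rightarrow \mathrm{C}_i(\mathrm{H}^{N}_{j_1}\mathrm{H}^{EX}_{j_2})\mathrm{O}_k\mathrm{N}_x\mathrm{P}_y\mathrm{S}_z$, with $j = j_1 + j_2$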
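One way to read this slide is that the j hydrogens are split into j1 non-exchangeable and j2 exchangeable ones, and that j2 can be counted from the mass gained after exchange with deuterium. The sketch below works under that reading, which is an assumption, and uses neutral masses to sidestep the question of whether the charge-carrying proton is also exchanged. `split_hydrogens` is a hypothetical helper.

```python
H_TO_D_SHIFT = 2.01410178 - 1.00782503  # mass gained per H replaced by D (u)


def split_hydrogens(total_h, mass_unlabeled, mass_deuterated):
    """Return (j1, j2): non-exchangeable and exchangeable hydrogen counts."""
    exchangeable = round((mass_deuterated - mass_unlabeled) / H_TO_D_SHIFT)
    if not 0 <= exchangeable <= total_h:
        raise ValueError("mass shift is inconsistent with the hydrogen count")
    return total_h - exchangeable, exchangeable


# Illustrative values: a C4H8N2O3 molecule gaining four deuterium atoms.
unlabeled = 132.05349
j1, j2 = split_hydrogens(8, unlabeled, unlabeled + 4 * H_TO_D_SHIFT)
```

The count of exchangeable hydrogens constrains how many O-H, N-H and S-H groups a structure can contain, which helps separate isomers that share a formula.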
Chemical properties in addition to m/z decyldimethylammoniopropanesulfonate Glycylglycine
Lipids from microbial communities • Unlabeled • 15N labeled • 2H labeled (exchangeable) • Sample independent
Absolute abundance of L-PE features is much higher in a “friable” sample. Samples: AB Muck DS2; AB Muck Friable
Relative abundance of various PEs changes with development stage.
Moving from features to formulas to metabolites is challenging. Feature (m/z 205.097 at a given time in sec); chemical formula determination (C11H12N2O2); structural analysis
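The link between the m/z value and the formula on this slide can be checked with a short calculation. The sketch assumes the observed ion is the protonated molecule [M+H]+, which the slide does not state; the helper names are hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton (u)


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C11H12N2O2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO[element] * (int(count) if count else 1)
    return total


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


theoretical_mz = monoisotopic_mass("C11H12N2O2") + PROTON
error = ppm_error(205.097, theoretical_mz)
```

Under the [M+H]+ assumption the theoretical m/z is about 205.0972, so the observed value agrees to within roughly 1 ppm. The formula alone still does not determine the structure, which is why the structural analysis step follows.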
Retention Time Correlation After 12 Observations
SIL Automatic Annotation Test the fit for all possible formulas for common ionization mechanisms Label Purity and Percent Incorporation are Parameters
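A minimal sketch of fitting a labeling pattern: for a molecule with n labelable atoms and a per-atom incorporation probability p, the isotopologue intensities follow a binomial distribution, and the best n is the one whose prediction is closest to the observed pattern. This collapses label purity and percent incorporation into a single parameter and ignores natural abundance in the unlabeled fraction; both are simplifying assumptions, and the function names are hypothetical.

```python
from math import comb


def label_distribution(n_atoms, incorporation):
    """Binomial probabilities of carrying k = 0..n heavy atoms."""
    return [
        comb(n_atoms, k) * incorporation**k * (1 - incorporation) ** (n_atoms - k)
        for k in range(n_atoms + 1)
    ]


def best_atom_count(observed, incorporation, max_atoms=40):
    """Return the atom count whose predicted pattern best matches `observed`."""
    total = sum(observed)
    obs = [x / total for x in observed]
    best_n, best_sse = None, float("inf")
    for n in range(1, max_atoms + 1):
        pred = label_distribution(n, incorporation)
        width = max(len(pred), len(obs))
        p = pred + [0.0] * (width - len(pred))   # pad the shorter pattern
        o = obs + [0.0] * (width - len(obs))
        sse = sum((a - b) ** 2 for a, b in zip(p, o))
        if sse < best_sse:
            best_n, best_sse = n, sse
    return best_n


# Simulated pattern for 11 carbons at 98 % incorporation.
pattern = label_distribution(11, 0.98)
fitted = best_atom_count(pattern, 0.98)
```

In practice the same comparison would be repeated for each candidate formula and ionization mechanism, keeping the combination with the best fit.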
Correlation and mass defect analysis (C2H4)
Modular Metabolome Autocorrelation Spectra of unprocessed data H2O Find the dominant mass differences in data
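Finding dominant mass differences can be approximated by histogramming all pairwise differences between observed masses. This is a simple stand-in for the autocorrelation spectra named on the slide, not the same computation; the bin width and the function name are assumptions.

```python
from collections import Counter
from itertools import combinations


def dominant_differences(masses, bin_width=0.001, top=3):
    """Return the most frequent pairwise mass differences as (difference, count)."""
    counts = Counter()
    for low, high in combinations(sorted(masses), 2):
        counts[round((high - low) / bin_width)] += 1   # bin the difference
    return [(index * bin_width, n) for index, n in counts.most_common(top)]


# Synthetic series spaced by the mass of C2H4 (28.0313 u).
series = [200.0, 228.0313, 256.0626, 284.0939]
top_difference, top_count = dominant_differences(series)[0]
```

Differences that recur far more often than chance, such as C2H4 or H2O, point to families of related compounds even before any single member is identified.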
Estimate the likelihood of all possible chemical differences How can you know that this is CH2?
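One simple way to rank candidate chemical differences is to score each against the observed mass difference with a Gaussian error model and normalize the scores. The candidate list, the equal prior weights, and the measurement error `sigma` below are all assumptions made for illustration.

```python
from math import exp

# Candidate differences with nominal mass 14 (monoisotopic, u).
CANDIDATES = {"CH2": 14.01565, "N": 14.00307, "O - H2": 13.97926}


def difference_likelihoods(observed, sigma=0.002):
    """Normalized Gaussian likelihood of each candidate given the observed difference."""
    weights = {
        name: exp(-0.5 * ((observed - mass) / sigma) ** 2)
        for name, mass in CANDIDATES.items()
    }
    total = sum(weights.values())
    if total == 0.0:                       # observed value matches no candidate
        return {name: 0.0 for name in weights}
    return {name: w / total for name, w in weights.items()}


probabilities = difference_likelihoods(14.0157)
best = max(probabilities, key=probabilities.get)
```

CH2 and N differ by about 0.0126 u, so the call is only reliable when the measurement error is well below that gap.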
What can be resolved Mass of an electron shown for scale
Time and Mass Correlation • C2H4: positive time correlation • Neutron: zero time correlation • H2O: mixture of zero and negative time correlation
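The relationship between a mass difference and retention time can be sketched by averaging the retention-time shift over all feature pairs separated by that mass difference. This is an illustrative simplification of the correlation analysis on the slide; the tolerance, the `(mz, rt)` layout, and the synthetic data are assumptions.

```python
from itertools import combinations
from statistics import mean


def mean_rt_shift(features, delta_mass, tol=0.002):
    """Mean RT change (heavier minus lighter) for pairs separated by delta_mass."""
    shifts = []
    for (m1, t1), (m2, t2) in combinations(sorted(features), 2):
        if abs((m2 - m1) - delta_mass) <= tol:
            shifts.append(t2 - t1)
    return mean(shifts) if shifts else None


# Synthetic (mz, rt) features: a C2H4 series plus co-eluting 13C isotopologues.
data = [
    (200.0, 5.0), (201.00335, 5.0),
    (228.0313, 7.0), (229.03465, 7.0),
    (256.0626, 9.0),
]
c2h4_shift = mean_rt_shift(data, 28.0313)
neutron_shift = mean_rt_shift(data, 1.00335)
```

In this synthetic data the C2H4 pairs elute later (positive shift) while isotopologue pairs co-elute (zero shift), mirroring the pattern described on the slide.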
Microbial Metabolite Atlases From Features to Pure Spectra Within one experiment: 1000s of features from 100s of metabolites