EMBO Practical Course on Metabolomics Bioinformatics for Life Scientists. “Dissecting an untargeted metabolomic workflow”. Oscar Yanes, PhD
Untargeted metabolomics workflow: Sample preparation • Experimental design • Sample analysis by MS and NMR • Pre-processing data analysis • Metabolite identification • Experimental validation • Hypothesis
Ultimate goal of metabolomics: List of metabolites differentially regulated • Biomarker discovery • Pathway analysis • Model construction • Scientific literature • Disease vs. control • Mechanism • Hypothesis • Validation
THE IMPORTANCE OF EXPERIMENTAL DESIGN
COLLABORATOR: I want to do metabolomics.
ME: …
COLLABORATOR: I have many samples at -80°C. Could you do metabolomics and find out something?
ME: !!
BASIC DIAGRAM OF A MASS SPECTROMETER
• Gas-phase: Gas chromatography
• Liquid-phase: Liquid chromatography, Capillary electrophoresis
• Solid-phase: Surface-based
BASIC DIAGRAM OF A MASS SPECTROMETER
• Electron ionization (EI)
• Chemical ionization (CI)
• Atmospheric pressure chemical ionization (APCI)
• Electrospray ionization (ESI)
• Laser desorption ionization (LDI)
Untargeted metabolomics workflow: Sample preparation • Experimental design • Sample analysis by MS • Pre-processing data analysis • Metabolite identification • Experimental validation • Hypothesis
Requisite for untargeted metabolomics
Maximize ionization efficiency over the whole mass range (e.g., m/z 80-1500)
• Number of features
• Intensity of the features
• Coverage of the metabolome
• Accurate quantification and identification of metabolites
How do we increase the number of features and their intensity?
[Figure: LC-MS data plot; axes: intensity, mass, time]
Feature: a molecular entity with a unique m/z and retention time value
• Sample preparation:
  - Extraction method
• Chromatography:
  - Stationary phase
  - Mobile phase
• Ion funnel technology, etc.
Extraction method
• Hot EtOH/ammonium acetate
• Cold acetone/MeOH
Only 45% of the metabolites are detected with acetone/MeOH
[Figure label: MS/MS threshold]
Yanes O., et al. Anal. Chem. 2011; 83(6):2152-61
Liquid chromatography: mobile phase
• Ammonium fluoride
• Ammonium acetate
• Formic acid
[Figure: ammonium fluoride compared with ammonium acetate; label: F-]
Yanes O., et al. Anal. Chem. 2011; 83(6):2152-61
Chromatography: stationary phase
• HILIC
• RP C18/C8
• Effect of pH; ammonium salts; ion pairs (e.g. TBA)
• LC flow rate and pressure: UPLC vs. HPLC vs. nanoLC (vs. GC!)
[Figure: HPLC and UPLC chromatograms; x-axis: minutes]
PRACTICAL ASPECTS
1. Number of scans/second
   • Implications in LC/MS and GC/MS:
     - Quantification
     - Maximum intensity or integrated area
2. Instrument resolution
   • Implications:
     - Detector saturation
     - Quantification
3. Sample amount injected
   • Implications:
     - Detector saturation
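The link between scans/second and quantification can be illustrated with a small simulation. This is only a sketch under made-up assumptions: an idealized Gaussian chromatographic peak (the width, height, time window and scan rates are all hypothetical) sampled at different scan rates. It suggests why the apex intensity read from the scans depends more strongly on the scan rate than the integrated area does.

```python
import math

def gaussian_peak(t, apex_time=60.0, height=1000.0, sigma=2.0):
    """Intensity of an idealized chromatographic peak at time t (seconds)."""
    return height * math.exp(-0.5 * ((t - apex_time) / sigma) ** 2)

def sample_peak(scans_per_second, start=50.3, stop=70.0):
    """Return (times, intensities) as recorded at a given scan rate."""
    step = 1.0 / scans_per_second
    n = int((stop - start) / step) + 1
    times = [start + i * step for i in range(n)]
    return times, [gaussian_peak(t) for t in times]

def trapezoid_area(times, intensities):
    """Integrated peak area by the trapezoidal rule."""
    return sum(
        0.5 * (intensities[i] + intensities[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )

# Analytic area of the Gaussian: height * sigma * sqrt(2 * pi)
true_area = 1000.0 * 2.0 * math.sqrt(2 * math.pi)

for rate in (0.25, 1.0, 5.0):  # hypothetical scan rates (scans/second)
    times, intensities = sample_peak(rate)
    print(rate, round(max(intensities), 1), round(trapezoid_area(times, intensities), 1))
```

With few scans across the peak, the highest recorded point can miss the true apex by a wide margin, while the trapezoidal area stays close to the analytic value.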
Untargeted metabolomics workflow: Sample preparation • Experimental design • Sample analysis by MS and NMR • Pre-processing data analysis [EMBO Course] • Metabolite identification • Experimental validation • Hypothesis
FROM RAW DATA TO METABOLITE IDs (LC/MS and GC/MS)
RAW DATA CONVERSION → PRE-PROCESSING → STATISTICAL ANALYSIS → METABOLITE IDENTIFICATIONS → PATHWAY ANALYSIS
LC-MS WORKFLOW
LC-MS RAW DATA → (PROTEOWIZARD) → mzData → (PREPROCESSING) → mzRT features table → STATISTICAL ANALYSIS → IDENTIFICATION
Feature: an individual ion with a unique mass-to-charge ratio and a unique retention time
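One way to picture the mzRT features table is as one row per feature (m/z, retention time) and one intensity column per sample. The sketch below is only illustrative: the `Feature` class, the `M...T...` label format, the sample names and all numbers are hypothetical, not output from any tool named on the slides.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    mz: float   # mass-to-charge ratio
    rt: float   # retention time in seconds
    intensities: dict = field(default_factory=dict)  # sample name -> intensity

    @property
    def label(self):
        # Hypothetical naming scheme combining m/z and retention time
        return f"M{self.mz:.4f}T{self.rt:.1f}"

def to_table(features, samples):
    """Flatten features into a header row plus one row per feature."""
    header = ["feature", "mz", "rt"] + samples
    rows = [
        [f.label, f.mz, f.rt] + [f.intensities.get(s, 0.0) for s in samples]
        for f in features
    ]
    return [header] + rows

samples = ["ctrl_1", "ctrl_2", "case_1"]  # made-up sample names
features = [
    Feature(181.0707, 312.4, {"ctrl_1": 1.2e5, "ctrl_2": 1.1e5, "case_1": 3.9e5}),
    Feature(132.1019, 95.0, {"ctrl_1": 8.0e4, "case_1": 7.5e4}),
]

for row in to_table(features, samples):
    print(row)
```

A feature missing from a sample is filled with 0.0 here purely for simplicity; how missing values should be handled is a separate decision.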
LC-MS WORKFLOW: RAW LC-MS DATA TO mzXML WITH PROTEOWIZARD [Nature Biotechnology, 30 (918–920) (2012)]
LC-MS WORKFLOW: XCMS PRE-PROCESSING
• http://metlin.scripps.edu/download/
• Free & open source
• Based on R
• Online version
• Suitable for:
  - GC-MS
  - LC-MS
Analytical Chemistry, 78(3), 779–787, 2006
Analytical Chemistry, 84(11), 5035-5039, 2012
LC-MS WORKFLOW: XCMS PRE-PROCESSING
1. FEATURE DETECTION [BMC Bioinformatics, 2008 9:504]
   1. Dense regions in m/z space
   2. Gaussian peak shape in chromatogram
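The two detection steps can be sketched in a few lines. This is not the algorithm from the cited paper and not XCMS code; it is a simplified stand-in under stated assumptions: "dense regions" are taken to be centroids whose m/z stays within a ppm tolerance over consecutive scans, and "Gaussian peak shape" is scored as the correlation between the trace and a Gaussian template. The tolerance, the minimum scan count, the 0.8 score cut-off and the toy scans are all made up.

```python
import math

def find_rois(scans, ppm=10.0, min_scans=4):
    """Group centroids whose m/z stays within `ppm` across consecutive scans.

    scans: list (in acquisition order) of lists of (mz, intensity) centroids.
    Returns ROIs, each a list of (scan_index, mz, intensity).
    """
    open_rois, finished = [], []
    for scan_index, scan in enumerate(scans):
        used, still_open = set(), []
        for roi in open_rois:
            mean_mz = sum(point[1] for point in roi) / len(roi)
            tolerance = mean_mz * ppm * 1e-6
            candidates = [j for j, (mz, _) in enumerate(scan)
                          if j not in used and abs(mz - mean_mz) <= tolerance]
            if candidates:
                j = min(candidates, key=lambda k: abs(scan[k][0] - mean_mz))
                used.add(j)
                roi.append((scan_index, scan[j][0], scan[j][1]))
                still_open.append(roi)
            else:
                finished.append(roi)  # the m/z trace ended in this scan
        # Centroids that extended no existing ROI start new ones
        still_open.extend([(scan_index, mz, inten)]
                          for j, (mz, inten) in enumerate(scan) if j not in used)
        open_rois = still_open
    finished.extend(open_rois)
    return [roi for roi in finished if len(roi) >= min_scans]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)

def gaussian_score(intensities):
    """Correlation between a trace and a Gaussian centred on its apex."""
    n = len(intensities)
    apex = max(range(n), key=intensities.__getitem__)
    sigma = max(n / 6.0, 1.0)  # assumption: the ROI spans roughly +/- 3 sigma
    template = [math.exp(-0.5 * ((i - apex) / sigma) ** 2) for i in range(n)]
    return pearson(intensities, template)

# Toy data: one eluting ion, one flat background ion, one random ion per scan
peak = [50, 200, 600, 1000, 600, 200, 50]
jitter = [0.0002, -0.0003, 0.0001, 0.0, -0.0002, 0.0003, -0.0001]
background = [100, 101, 99, 100, 102, 100, 99, 101, 100]
scans = []
for i in range(9):
    scan = [(200.0000, background[i]), (300.0 + 0.5 * i, 80)]
    if 1 <= i <= 7:
        scan.append((180.0634 + jitter[i - 1], peak[i - 1]))
    scans.append(scan)

features = []
for roi in find_rois(scans):
    if gaussian_score([p[2] for p in roi]) >= 0.8:
        apex = max(roi, key=lambda p: p[2])
        features.append((round(apex[1], 4), apex[0], apex[2]))  # (m/z, scan, intensity)
print(features)
```

The flat background ion forms a dense m/z region but fails the peak-shape test, and the scattered ions never form a region at all, so only the eluting ion is reported as a feature.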
LC-MS WORKFLOW: XCMS PRE-PROCESSING
2. RETENTION TIME CORRECTION
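The idea behind retention time correction can be shown with a minimal sketch. It assumes a set of "landmark" compounds detected in every sample, takes the median retention time across samples as the reference, and removes each sample's deviation by piecewise-linear interpolation between landmarks. This is an illustrative simplification, not the XCMS implementation; the landmark values and sample names are invented.

```python
from statistics import median

def deviation_profile(landmarks, sample):
    """Sorted (observed_rt, deviation_from_median) pairs for one sample.

    landmarks: list of dicts mapping sample name -> observed RT (seconds)
    for the same compound in every sample.
    """
    pairs = []
    for group in landmarks:
        reference = median(group.values())
        pairs.append((group[sample], group[sample] - reference))
    return sorted(pairs)

def interpolate(profile, rt):
    """Piecewise-linear deviation at rt; constant beyond the outer landmarks."""
    if rt <= profile[0][0]:
        return profile[0][1]
    if rt >= profile[-1][0]:
        return profile[-1][1]
    for (x0, y0), (x1, y1) in zip(profile, profile[1:]):
        if x0 <= rt <= x1:
            if x1 == x0:
                return y0
            return y0 + (y1 - y0) * (rt - x0) / (x1 - x0)

def correct_rt(profile, rt):
    """Subtract the estimated drift from an observed retention time."""
    return rt - interpolate(profile, rt)

# Hypothetical landmarks: sample s2 drifts increasingly late during the run
landmarks = [
    {"s1": 60.0, "s2": 62.0, "s3": 60.5},
    {"s1": 300.0, "s2": 306.0, "s3": 300.0},
    {"s1": 600.0, "s2": 610.0, "s3": 601.0},
]
profile_s2 = deviation_profile(landmarks, "s2")
print(correct_rt(profile_s2, 306.0), correct_rt(profile_s2, 184.0))
```

A smooth fit through the deviations would behave better than straight segments when landmarks are noisy; linear interpolation is used here only to keep the sketch short.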
LC-MS WORKFLOW
• 10^3-10^4 mzRT features: IDENTIFICATION NOT FEASIBLE!
• Feature redundancy:
  - adducts: [M+H]+, [M+Na]+, [M+NH4]+, [M+H-H2O]+…
  - isotopes: [M+1], [M+2], [M+3]
• Many mzRT features are noisy in nature and irrelevant to our phenomena
STATISTICAL ANALYSIS: FEATURE RANKING
Features that vary according to our phenomena are retained for further identification experiments
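The redundancy listed above is easy to see numerically: a single metabolite can account for several features. The sketch below uses commonly tabulated mass shifts for the positive-mode adducts on the slide and the 13C spacing for isotopes. The observed m/z list, the 5 ppm tolerance and the function names are assumptions made for illustration.

```python
# Commonly tabulated mass shifts (Da) for the adducts named on the slide
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+H-H2O]+": 1.007276 - 18.010565,
}
C13_SPACING = 1.003355  # mass difference between 13C and 12C

def adduct_mz(neutral_mass):
    """Expected m/z of each singly charged adduct of a neutral mass."""
    return {name: neutral_mass + shift for name, shift in ADDUCT_SHIFTS.items()}

def annotate(observed_mz, neutral_mass, ppm=5.0):
    """Label observed m/z values that match an adduct or one of its isotopes."""
    labels = {}
    for name, mz in adduct_mz(neutral_mass).items():
        for k in (0, 1, 2):  # monoisotopic peak, M+1, M+2
            target = mz + k * C13_SPACING
            for obs in observed_mz:
                if abs(obs - target) / target * 1e6 <= ppm:
                    labels[obs] = name if k == 0 else f"{name} (M+{k})"
    return labels

glucose = 180.063388  # monoisotopic mass of C6H12O6
observed = [181.0707, 182.0740, 203.0526, 163.0601, 250.1234]  # made-up feature list
labels = annotate(observed, glucose)
for mz in observed:
    print(mz, labels.get(mz, "unexplained"))
```

In this toy list, four of the five features trace back to one compound, which is why feature counts overstate the number of metabolites.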
LC-MS WORKFLOW: FEATURE RANKING CRITERIA (I) ANALYTICAL VARIABILITY
• Randomize the worklist
• Use QCs to check analytical variation
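Both recommendations can be sketched briefly: a randomized worklist with QC injections interleaved, and a per-feature coefficient of variation (CV) computed across the QC injections. The slides do not give a QC spacing or a CV cut-off, so `qc_every=5` and `max_cv=30.0` below are placeholder assumptions, as are the sample names and intensities.

```python
import random
from statistics import mean, stdev

def build_worklist(samples, qc_every=5, seed=0):
    """Shuffle the injection order and insert a QC every `qc_every` samples."""
    order = list(samples)
    random.Random(seed).shuffle(order)  # fixed seed so the order is reproducible
    worklist = ["QC"]
    for position, sample in enumerate(order, start=1):
        worklist.append(sample)
        if position % qc_every == 0:
            worklist.append("QC")
    if worklist[-1] != "QC":
        worklist.append("QC")
    return worklist

def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    m = mean(values)
    return float("inf") if m == 0 else 100.0 * stdev(values) / m

def passes_qc(qc_intensities, max_cv=30.0):
    """Keep a feature only if it is reproducible across the QC injections."""
    return cv_percent(qc_intensities) <= max_cv

samples = [f"KO_{i}" for i in range(1, 7)] + [f"WT_{i}" for i in range(1, 7)]
print(build_worklist(samples))

stable = [100.0, 105.0, 95.0, 102.0]    # hypothetical QC intensities of one feature
unstable = [100.0, 300.0, 20.0, 150.0]  # hypothetical QC intensities of another
print(round(cv_percent(stable), 1), passes_qc(stable))
print(round(cv_percent(unstable), 1), passes_qc(unstable))
```

Because the QC injections are the same pooled material each time, variation among them reflects the analytical platform rather than biology.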
USEFUL PLOTS IN EXPLORATORY DATA ANALYSIS
• NEURONAL CELL CULTURES: KO (N=15) vs WT (N=11), #mzRT=6831
• RETINAS: Hypoxia (N=12) vs Normoxia (N=13), #mzRT=7654
LC-MS WORKFLOW: FEATURE RANKING CRITERIA (IV) HYPOTHESIS TESTING + FDR
#features = 4704
• Significance threshold = 0.05: 235 features significantly varied by chance (26% out of 900)
• FDR = 0.0074: 20 features varied by chance (5% out of 404)
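The arithmetic behind the first figure is the expected number of false positives, 4704 × 0.05 ≈ 235, which is about 26% of 900 significant features. The slide does not say which FDR procedure was used; the sketch below assumes the Benjamini-Hochberg adjustment as one common choice, and the ten p-values are invented for illustration.

```python
def expected_false_positives(n_tests, alpha):
    """Tests expected to pass the threshold by chance if no feature truly varies."""
    return n_tests * alpha

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

by_chance = expected_false_positives(4704, 0.05)
print(round(by_chance), round(100 * by_chance / 900))

# Hypothetical p-values for ten features
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
q_values = benjamini_hochberg(p_values)
print(sum(p <= 0.05 for p in p_values), "features pass the raw threshold")
print(sum(q <= 0.05 for q in q_values), "features pass after FDR adjustment")
```

Controlling the FDR trades a shorter list of significant features for a smaller share of chance findings among them, which is the pattern in the slide's numbers.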
LC-MS WORKFLOW
10M data points
→ # mzRT = 51908
→ (i) analytical variability: # mzRT = 38377
→ (ii) feature intensity: # mzRT = 4704
→ (iii) hypothesis testing + fold change: # mzRT = 250
→ Annotation, database look-up, identification experiments
→ 10-50 differential metabolites
Workflow for Metabolite Identification
Step 1: Select interesting features
Step 2: Search databases for accurate mass
Step 3: Filter “putative” identification list
Step 4: Compare RT and MS/MS of standards
Step 2: Search databases for accurate mass (Metlin, HMDB)
Each feature returns many hits.
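Why an accurate-mass search returns many hits can be seen with a tiny stand-in database. The sketch below does not query Metlin or HMDB; `DATABASE` is a hypothetical local table of monoisotopic masses, the feature is assumed to be an [M+H]+ ion, and the 5 ppm tolerance is an assumption. Isomers share an exact mass, so no tolerance can separate them.

```python
PROTON = 1.007276  # mass shift of [M+H]+ (Da)

# Hypothetical local table: (name, monoisotopic neutral mass in Da)
DATABASE = [
    ("D-Glucose", 180.063388),
    ("D-Fructose", 180.063388),
    ("myo-Inositol", 180.063388),
    ("L-Leucine", 131.094629),
    ("L-Isoleucine", 131.094629),
    ("Creatine", 131.069477),
]

def search(mz, ppm=5.0, adduct_shift=PROTON):
    """Names of database entries whose mass matches the feature within `ppm`."""
    neutral = mz - adduct_shift
    return [
        name for name, mass in DATABASE
        if abs(neutral - mass) / mass * 1e6 <= ppm
    ]

print(search(132.1019))           # isomers with identical mass both match
print(search(181.0707))           # three C6H12O6 isomers match
print(search(132.1019, ppm=250))  # a loose tolerance adds a non-isomer
```

This is why Steps 3 and 4 follow: the putative list has to be filtered and then confirmed against the RT and MS/MS of standards.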