FooDB & The Construction of an International Consortium

David Wishart
University of Alberta, Edmonton, Canada
The 3rd NUGO Workshop on Nutritional Metabolomics, July 1-2, Vlaardingen, The Netherlands

You Are What You Eat

150+ Food Composition DBs

The Problem with Today’s FCDBs
• Highly distributed; not uniform in content, search capabilities or user presentation
• Breadth is good (lots of foods) but depth is not (relatively few compounds)
• Primary focus on amino acids, sugars, vitamins and general compound classes (e.g. “fats” or “lipids”), not on phytonutrients, micronutrients, aroma or flavour components
• Assembled using “old” technologies
• Don’t relate composition data to chemistry, biology or physiology
• Don’t provide data on nutrient metabolites or food consumption biomarkers

The Power of Metabolomics
[Figure: number of metabolites or features detected (log10 scale, 0 to 4) versus sensitivity or LDL (M, mM, µM, nM, pM, fM) for NMR, GC-MS Quad, GC-MS TOF, and LC-MS or DI-MS, with regions marked “Knowns” and “Unknowns”]

The Power of Metabolomics
[Figure: response versus time for metabolomics, proteomics and genomics]

Human Metabolomes
• Toxins/Env. Chemicals: 2900 (T3DB)
• Drug metabolites: 1500 (DrugMet)
• Food additives/Phytochemicals: 30000 (FooDB)
• Drugs: 1450 (DrugBank)
• Endogenous metabolites: 8000 (HMDB)
[Figure axis: M, mM, µM, nM, pM, fM]

The Food Metabolome Project
• Experimentally characterize 3000+ metabolites in 30-40 representative raw, fermented and partially processed foods
• Experimentally characterize 300+ food-derived metabolites in human blood & urine under controlled feeding conditions
• Use data mining techniques to consolidate known food composition data, food metabolite data and food/health effects into a single “deep” web-accessible database
• Combine all experimental & literature data into a database called FooDB
$5 million over 3-5 years

Ideal Food Database Content

The Ideal Food Database
• Searchable by name, text string, molecular weight, concentration range or structure
• Given a compound name, what food products or plant species is it in, and at what concentrations?
• Given a food product or plant species, what compounds are in it, and at what concentrations?
• Given a compound, what are its (human) protein targets, transporters or mode of action?
• Given a compound, what are the (referenced) health claims and benefits?
• Given a compound, what are its metabolites?
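The two central lookups on this slide (compound to foods, and food to compounds) can be sketched with a small relational layout. This is only an illustration, not the FooDB schema: the table names, column names, units and every row below are hypothetical placeholders, not real measurements.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical three-table layout."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE compound (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE food (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE content (
            compound_id INTEGER REFERENCES compound(id),
            food_id INTEGER REFERENCES food(id),
            mg_per_100g REAL
        );
        """
    )
    # Placeholder rows only; the concentrations are invented for the sketch.
    con.executemany("INSERT INTO compound VALUES (?, ?)",
                    [(1, "compound_X"), (2, "compound_Y")])
    con.executemany("INSERT INTO food VALUES (?, ?)",
                    [(1, "food_A"), (2, "food_B")])
    con.executemany("INSERT INTO content VALUES (?, ?, ?)",
                    [(1, 1, 1.0), (1, 2, 5.0), (2, 1, 3.0)])
    return con


def foods_containing(con, compound_name):
    """Given a compound name, list (food, concentration), highest first."""
    return con.execute(
        """
        SELECT f.name, c.mg_per_100g
        FROM content c
        JOIN food f ON f.id = c.food_id
        JOIN compound k ON k.id = c.compound_id
        WHERE k.name = ?
        ORDER BY c.mg_per_100g DESC
        """,
        (compound_name,),
    ).fetchall()


def compounds_in(con, food_name, lo=0.0, hi=float("inf")):
    """Given a food, list (compound, concentration) within [lo, hi]."""
    return con.execute(
        """
        SELECT k.name, c.mg_per_100g
        FROM content c
        JOIN food f ON f.id = c.food_id
        JOIN compound k ON k.id = c.compound_id
        WHERE f.name = ? AND c.mg_per_100g BETWEEN ? AND ?
        ORDER BY c.mg_per_100g DESC
        """,
        (food_name, lo, hi),
    ).fetchall()
```

Keeping concentrations in a separate link table lets one compound appear in many foods and one food hold many compounds, so both directions of the lookup use the same data.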

A Blended Model http://www.phenol-explorer.eu http://www.hmdb.ca

Challenge #1: Finding Food Compounds

Counting Compounds
• 8000 “housekeeping” metabolites in animals and plants
• 2000+ synthetic food additives
• 400,000+ plant species
• 7500 edible plant species
• 250 “common” edible plants (use this as a filter on phytoDBs)
• 13708 cmpds in DFC
• 8461 cmpds in Dr. Duke’s databases
• 1726 cmpds in KnapSack + KEGG
• 677 cmpds in Phenol Explorer
• Non-redundant = 19,767 cmpds
• 20,000 + 2000 + 8000 = 30,000 cmpds
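The arithmetic on this slide can be checked directly. The four source counts sum to 24,572, so reaching 19,767 non-redundant compounds implies that 4,805 entries were dropped. The sketch below assumes that this gap comes from entries shared between sources and that duplicates are detected by a common structure key; the key values and the `nonredundant_union` helper are hypothetical, not the method the project used.

```python
# Per-source compound counts as quoted on the slide.
SOURCE_COUNTS = {
    "DFC": 13708,
    "Dr. Duke's databases": 8461,
    "KnapSack + KEGG": 1726,
    "Phenol Explorer": 677,
}
NON_REDUNDANT = 19767  # non-redundant figure quoted on the slide


def nonredundant_union(sources):
    """Merge per-source sets of structure keys into one duplicate-free set."""
    merged = set()
    for keys in sources.values():
        merged |= keys
    return merged


raw_total = sum(SOURCE_COUNTS.values())
entries_dropped = raw_total - NON_REDUNDANT

# The slide rounds 19,767 up to 20,000, then adds the synthetic food
# additives (2000) and the "housekeeping" metabolites (8000).
estimated_total = 20_000 + 2_000 + 8_000

# Placeholder keys standing in for real structure identifiers.
demo_sources = {
    "source_1": {"KEY-A", "KEY-B"},
    "source_2": {"KEY-B", "KEY-C"},
}
demo_merged = nonredundant_union(demo_sources)
```

A set union is enough here because each compound is reduced to a single key; any real merge would also need to reconcile names and synonyms that map to the same structure.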

Challenge #2: Finding Phytochemical Structures
[Diagram: sources (journals, web, books) and structure formats (Mol files, SMILES, InChI, InChI key, …, .png). Food compounds: 19767, of which 9450 have structures (8373, plus 1077 via OSRA) and 10317 are manual]

Challenge #3: Annotating Cmpds
• Use BioSpider to extract phys-chem data, formulas, names and/or synonyms, structures and descriptions
• Modify databases it searches to include food component resources to extract concentration or content data
• Calculate chemical class based on structure
[Diagram label: Prediction/Processing Engine]

Challenge #4: Finding Health Claims & Targets
• Text mining software called PolySearch
• Given X, find all associated Y’s
• Searches PubMed abstracts to extract and compile data about associations between compounds & disease or health claims as well as compounds and protein targets
http://wishart.biology.ualberta.ca/polysearch
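The “given X, find all associated Y’s” idea can be illustrated with a plain sentence-level co-occurrence count. This is a minimal sketch of the general technique, not PolySearch’s actual scoring. The `rank_associations` function, the substring matching and the example texts are all hypothetical.

```python
import re
from collections import Counter


def rank_associations(abstracts, query, candidates):
    """Count sentences in which `query` co-occurs with each candidate term.

    Matching is a case-insensitive substring test, a deliberate
    simplification; returns (candidate, count) pairs, highest count first.
    """
    query_l = query.lower()
    counts = Counter()
    for text in abstracts:
        # Naive sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            s = sentence.lower()
            if query_l not in s:
                continue
            for cand in candidates:
                if cand.lower() in s:
                    counts[cand] += 1
    return counts.most_common()


# Synthetic placeholder text, not real PubMed abstracts.
demo_abstracts = [
    "Compound_x lowered markers of disease_a. Disease_b was not examined.",
    "We measured compound_x in patients with disease_a and disease_b.",
    "Disease_b is common.",
]
demo_ranking = rank_associations(
    demo_abstracts, "compound_x", ["disease_a", "disease_b"]
)
```

Counting at sentence level rather than abstract level is one way to reduce spurious pairs, since two terms in the same sentence are more likely to be related than two terms that merely share an abstract.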

Sample Entry for Catechin

When Will It Be Done?
[Timeline: Apr. 2009, Aug. 2009, Dec. 2009, Apr. 2010, Aug. 2010, Dec. 2010]
• Database Selection
• BioSpider Modification
• Compound Cleanup
• Structure Generation & Annotation
• Health Claim & Content data
Release Jan. 1, 2011

Partnerships?

Industry Buy-in?

Acknowledgements
Augustin Scalbert, Vanessa Neveu, Craig Knox, Roman Eisner, Edison Dong, Paul Huang, Russ Greiner, Liang Li, Hans Vogel, The HMP & Panamp Team
