Sharing the knowledge of electrophysiology data Phillip Lord, Frank Gibson and the CARMEN Consortium
“In the standard model, one collects data, publishes a paper or papers and then gradually loses the original dataset.” Geoffrey Bowker (University of California, San Diego), The New Knowledge Economy and Science and Technology Policy
The need for clear metadata • Most neuroscience data is relatively simple in structure • But often contextually complex • Sometimes associated with behavioural features
Neuroscience spike data • The raw data is normally a waveform • But instrumentation has advanced • High-throughput methods • But what is the experiment for? • What stimulus is the organism/tissue receiving? • Which channel is which? • The data sets being produced are (reasonably) large (tens of GB, or 1 TB in three months)
Information Extraction • How do we extract the information? • Image credits: http://en.wikipedia.org/wiki/Image:Brain_090407.jpg, istockphoto.com, http://en.wikipedia.org/wiki/Image:ATTtelephone-large.jpg
Multi-Author data From Katherine James, NCL
 #  Author                PMID      Type                       Size
 1  Davierwala et al      16155567  Synthetic_Lethality         627
 2  Krogan et al          14759368  Affinity_Capture-MS         164
 3  Hazbun et al          14690591  Affinity_Capture-MS        3210
 4  Gavin et al           11805826  Affinity_Capture-MS        3596
 5  Ho et al              11805837  Affinity_Capture-MS         733
 6  Ito et al             11283351  Two-hybrid                  275
 7  Tong et al            11743205  Synthetic_Lethality        3411
 8  Tong et al            14764870  Synthetic_Lethality         823
 9  Uetz et al            10688190  Two-hybrid                 1941
10  Miller et al          16093310  Two-hybrid                  104
11  Lindstrom et al       12556496  Affinity_Capture-MS         134
12  Nissan et al          12374754  Affinity_Capture-MS         456
13  Grandi et al          12150911  Affinity_Capture-MS         150
14  Ohi et al             11884590  Affinity_Capture-MS         630
15  Krogan et al          14690608  Affinity_Capture-MS         370
16  Sanders et al         12052880  Affinity_Capture-MS         102
17  Baetz et al           14729968  Affinity_Capture-MS         258
18  Fromont-Racine et al  10900456  Two-hybrid                  160
19  Fromont-Racine et al  9207794   Two-hybrid                  182
20  Drees et al           11489916  Two-hybrid                  232
21  Tong et al            11743162  Affinity_Capture-Western    125
22  Allen et al           11387327  Affinity_Capture-MS         116
23  Panse et al           15292183  Affinity_Capture-MS         181
24  Krogan et al          15353583  Affinity_Capture-MS         113
25  Kong et al            15563457  Protein-peptide             157
26  Hannich et al         15590687  Two-hybrid                  134
27  Newman et al          11087867  Two-hybrid                  464
28  Zhao et al            15766533  Reconstituted_Complex       125
29  Millson et al         15879519  Affinity_Capture-Western    369
30  Ubersax et al         14574415  Biochemical_Activity        138
31  Ingvarsdottir et al   15657441  Affinity_Capture-Western    175
32  Lesage et al          15166135  Synthetic_Lethality         292
33  Lesage et al          15715908  Synthetic_Lethality         323
34  Pan et al             15525520  Synthetic_Lethality         124
35  Loeillet et al        15725626  Synthetic_Lethality         214
36  Daniel et al          16157669  Synthetic_Lethality        4535
37  Pan et al             16487579  Synthetic_Growth_Defect    7076
38  Krogan et al          16554755  Affinity_Capture-MS        6531
39  Gavin et al           16429126  Affinity_Capture-MS         107
40  Milgrom et al         16118188  Synthetic_Lethality         215
41  Measday et al         16172405  Two-hybrid                  477
42  Graumann et al        14660704  Affinity_Capture-MS        4179
43  Ptacek et al          16319894  Biochemical_Activity        103
44  Frazier et al         16476776  Co-fractionation            290
45  Ye et al              16729061  Synthetic_Rescue           3416
46  Schuldiner et al      16269340  Phenotypic_Enhancement    14421
47  Collins et al         17314980  Phenotypic_Enhancement     9064
48  Collins et al         17200106  Affinity_Capture-MS         117
49  Aronova et al         17507646  Co-fractionation            576
50  Wong et al            17634282  Two-hybrid                  234
How do we represent… • In silico analysis • Derived data • Laboratory experiments
View from microarrays • Content Standard – Minimal Information • MO -- Terminology • MAGE -- Structure • From the MGED Society
The CARMEN approach • Content Standard – Minimal Information about a Neuroscience Investigation (MINI) • OBI – Ontology for Biomedical Investigations • FuGE -- Structure
Minimal Information About a Neuroscience Investigation: What do I have to tell you, for you to understand what I did? • Subdivided as: • Contact and context • Study subject • Recording location • Task • Stimulus • Behavioural event • Recording • Time series data • Grouped under: Study inputs, Assay inputs, Assay procedures, Data
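The eight MINI sections listed on this slide can be pictured as the fields of a single record. The following is a minimal sketch in Python, not the MINI specification itself: the class name `MiniRecord`, the function `missing_sections`, the free-text field types, the placeholder values and the rule that every section must be filled in are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class MiniRecord:
    """Hypothetical container: one free-text field per MINI section."""
    contact_and_context: str = ""
    study_subject: str = ""
    recording_location: str = ""
    task: str = ""
    stimulus: str = ""
    behavioural_event: str = ""
    recording: str = ""
    time_series_data: str = ""


def missing_sections(record: MiniRecord) -> List[str]:
    """Return the names of sections left empty (assumes all are required)."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Placeholder values only; a real submission would carry real descriptions.
record = MiniRecord(
    contact_and_context="example lab and contact",
    study_subject="example subject",
    stimulus="example stimulus",
)
print(missing_sections(record))
```

A check of this kind is one way a repository could tell a depositor which parts of the checklist are still unanswered before a dataset is shared.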
MIAOWS • Describe essential metadata for your analysis code • What does it do? (Objective) • What type of input does it need? • What type of output does it produce? • If this information is not described, your code is of most value to yourself and of much less value to the community
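The three questions on this slide (objective, input type, output type) map naturally onto a small descriptor attached to each piece of analysis code. The sketch below is an illustration under stated assumptions, not the MIAOWS format: `AnalysisCodeDescription`, `can_chain`, and the example type labels such as "raw waveform" and "spike times" are hypothetical names chosen here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisCodeDescription:
    """Hypothetical descriptor answering the three MIAOWS questions."""
    objective: str    # what does the code do?
    input_type: str   # what type of input does it need?
    output_type: str  # what type of output does it produce?


def can_chain(first: AnalysisCodeDescription,
              second: AnalysisCodeDescription) -> bool:
    """True if the output of `first` is the kind of input `second` needs."""
    return first.output_type == second.input_type


# Illustrative descriptions; the type labels are assumptions, not a vocabulary.
detect = AnalysisCodeDescription(
    objective="detect spikes in a recording",
    input_type="raw waveform",
    output_type="spike times",
)
rate = AnalysisCodeDescription(
    objective="estimate firing rate",
    input_type="spike times",
    output_type="firing rate",
)
print(can_chain(detect, rate))
```

With descriptions like these, someone other than the author can work out whether two pieces of code fit together without reading either implementation, which is the community value the slide refers to.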
Functional Genomics Experiment (FuGE): How do I tell you, for you to understand? • Model of common components in science investigations, such as materials, data, protocols, equipment and software. • Provides a framework for capturing complete laboratory workflows, enabling the integration of pre-existing data formats.
Model Driven Architecture -- SyMBA • UML • XML • Java objects • database
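The idea behind the chain on this slide is that one model definition drives every representation, so an object, its XML form and its stored form stay in step. The sketch below illustrates only the object-to-XML-and-back step, in Python rather than the Java named on the slide. It is not SyMBA code and does not follow the FuGE XML schema: the `Protocol` class, its two attributes and the element name are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Protocol:
    """Hypothetical model class standing in for one modelled component."""
    identifier: str
    name: str


def to_xml(protocol: Protocol) -> str:
    """Serialise the object to an XML string (element name is an assumption)."""
    elem = ET.Element(
        "Protocol",
        {"identifier": protocol.identifier, "name": protocol.name},
    )
    return ET.tostring(elem, encoding="unicode")


def from_xml(text: str) -> Protocol:
    """Rebuild the object from the XML produced by to_xml."""
    elem = ET.fromstring(text)
    return Protocol(identifier=elem.attrib["identifier"], name=elem.attrib["name"])


original = Protocol(identifier="p1", name="example recording protocol")
restored = from_xml(to_xml(original))
print(restored == original)
```

In a model-driven setup the two conversion functions would be generated from the model rather than written by hand; they are hand-written here only to keep the example self-contained.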
FuGE community of users • MGED (transcriptomics) • Proteomics Standards Initiative • Metabolomics Standards Initiative (NMR and sample processing groups) • Genomics Standards Consortium (MIGS) • CARMEN (Code Analysis, Repository and Modelling for e-Neuroscience) • Flow Informatics and Computational Cytometry Society • MIARE (Minimum Information About an RNAi Experiment)
OBI – Ontology for Biomedical Investigations • Diverse communities: from nutrition to metabolomics, from environmental to genomics to immunology, imaging and data analysis • OBI branches (development work): Protocol application branch, Data Transformation branch, Instrument branch, Biomaterial branch, Role branch, Function branch, Molecular entities, Digital entity branch • Adapted from Philippe Rocca-Serra, 2008
Summary • We are generating metadata “standards” for neurosciences • We are following a well-trodden path from bioinformatics • We adopted FuGE and have built MINI
Future Work • More neurosciences experimental datatypes. • Minimal Information about a Service • Describe analysis software as well as lab experiments. • Outreach!
Acknowledgements • MINI: Frank Gibson, Paul G Overton, Tom V Smulders, Simon R Schultz, Stephen J Eglen, Colin D Ingram, Stefano Panzeri, Phil Bream, Evelyne Sernagor, Mark Cunningham, Christopher Adams, Christoph Echtermeyer, Jennifer Simonotto, Marcus Kaiser, Daniel C Swan, Martyn Fletcher, Phillip Lord • CISBAN: Anil Wipat (PI), Allyson Lister (Research Associate) • FuGE: The FuGE consortium • OBI: The OBI consortium • CARMEN: http://www.carmen.org.uk • SyMBA: http://symba.sourceforge.net • FuGE: http://fuge.sourceforge.net • OBI: http://obi.sourceforge.net