Metadata For CARMEN Phillip Lord and Frank Gibson
Problems • “In the standard model, one collects data, publishes a paper or papers and then gradually loses the original dataset.” • Geoffrey Bowker, University of California, San Diego, “The New Knowledge Economy and Science and Technology Policy”
The need for clear metadata • Most neuroscience data is relatively simple in structure • But often contextually complex • Sometimes associated with behavioural features
Information Extraction • How do we extract the information? • Image sources: http://en.wikipedia.org/wiki/Image:Brain_090407.jpg, istockphoto.com, http://en.wikipedia.org/wiki/Image:ATTtelephone-large.jpg
Multi-Author data From Katherine James, NCL
 #  Author                PMID      Type                      Size
 1  Davierwala et al      16155567  Synthetic_Lethality       627
 2  Krogan et al          14759368  Affinity_Capture-MS       164
 3  Hazbun et al          14690591  Affinity_Capture-MS       3210
 4  Gavin et al           11805826  Affinity_Capture-MS       3596
 5  Ho et al              11805837  Affinity_Capture-MS       733
 6  Ito et al             11283351  Two-hybrid                275
 7  Tong et al            11743205  Synthetic_Lethality       3411
 8  Tong et al            14764870  Synthetic_Lethality       823
 9  Uetz et al            10688190  Two-hybrid                1941
10  Miller et al          16093310  Two-hybrid                104
11  Lindstrom et al       12556496  Affinity_Capture-MS       134
12  Nissan et al          12374754  Affinity_Capture-MS       456
13  Grandi et al          12150911  Affinity_Capture-MS       150
14  Ohi et al             11884590  Affinity_Capture-MS       630
15  Krogan et al          14690608  Affinity_Capture-MS       370
16  Sanders et al         12052880  Affinity_Capture-MS       102
17  Baetz et al           14729968  Affinity_Capture-MS       258
18  Fromont-Racine et al  10900456  Two-hybrid                160
19  Fromont-Racine et al  9207794   Two-hybrid                182
20  Drees et al           11489916  Two-hybrid                232
21  Tong et al            11743162  Affinity_Capture-Western  125
22  Allen et al           11387327  Affinity_Capture-MS       116
23  Panse et al           15292183  Affinity_Capture-MS       181
24  Krogan et al          15353583  Affinity_Capture-MS       113
25  Kong et al            15563457  Protein-peptide           157
26  Hannich et al         15590687  Two-hybrid                134
27  Newman et al          11087867  Two-hybrid                464
28  Zhao et al            15766533  Reconstituted_Complex     125
29  Millson et al         15879519  Affinity_Capture-Western  369
30  Ubersax et al         14574415  Biochemical_Activity      138
31  Ingvarsdottir et al   15657441  Affinity_Capture-Western  175
32  Lesage et al          15166135  Synthetic_Lethality       292
33  Lesage et al          15715908  Synthetic_Lethality       323
34  Pan et al             15525520  Synthetic_Lethality       124
35  Loeillet et al        15725626  Synthetic_Lethality       214
36  Daniel et al          16157669  Synthetic_Lethality       4535
37  Pan et al             16487579  Synthetic_Growth_Defect   7076
38  Krogan et al          16554755  Affinity_Capture-MS       6531
39  Gavin et al           16429126  Affinity_Capture-MS       107
40  Milgrom et al         16118188  Synthetic_Lethality       215
41  Measday et al         16172405  Two-hybrid                477
42  Graumann et al        14660704  Affinity_Capture-MS       4179
43  Ptacek et al          16319894  Biochemical_Activity      103
44  Frazier et al         16476776  Co-fractionation          290
45  Ye et al              16729061  Synthetic_Rescue          3416
46  Schuldiner et al      16269340  Phenotypic_Enhancement    14421
47  Collins et al         17314980  Phenotypic_Enhancement    9064
48  Collins et al         17200106  Affinity_Capture-MS       117
49  Aronova et al         17507646  Co-fractionation          576
50  Wong et al            17634282  Two-hybrid                234
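A listing like this mixes datasets from many authors and many experiment types, so even a simple summary depends on the Type column being filled in consistently. As a minimal sketch, the rows could be grouped by experiment type as follows. The five rows are copied from the listing; the tuple layout and the function name `summarise_by_type` are illustrative assumptions, not part of the original slides.

```python
from collections import defaultdict

# A few rows copied from the listing: (author, pmid, type, size).
# The tuple layout is an assumption made for this sketch.
ROWS = [
    ("Davierwala et al", 16155567, "Synthetic_Lethality", 627),
    ("Krogan et al", 14759368, "Affinity_Capture-MS", 164),
    ("Ito et al", 11283351, "Two-hybrid", 275),
    ("Tong et al", 11743205, "Synthetic_Lethality", 3411),
    ("Uetz et al", 10688190, "Two-hybrid", 1941),
]


def summarise_by_type(rows):
    """Return {experiment_type: (number_of_datasets, total_size)}."""
    summary = defaultdict(lambda: [0, 0])
    for _author, _pmid, exp_type, size in rows:
        summary[exp_type][0] += 1      # one more dataset of this type
        summary[exp_type][1] += size   # accumulate its reported size
    return {exp_type: tuple(counts) for exp_type, counts in summary.items()}


if __name__ == "__main__":
    for exp_type, (count, total) in sorted(summarise_by_type(ROWS).items()):
        print(f"{exp_type}: {count} dataset(s), total size {total}")
```

The grouping only works because every row uses the same spelling for each type, which is the kind of consistency a terminology standard is meant to guarantee.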
How do we represent… • In silico analysis • Derived data • Laboratory experiments
Joseph Whitworth • Image sources: http://en.wikipedia.org/wiki/Image:Screw_thread_Z%C3%A1vit_M16.jpg, http://en.wikipedia.org/wiki/Image:Joseph_whitworth.jpg
The need for standards! • “established by consensus and approved by a recognized body, that provides, […] rules, […] for […] the optimum degree of order in a given context” • BSI: http://www.bsi-global.com/en/Standards-and-Publications/About-standards/Glossary/
View from microarrays • Content standard: Minimal Information • Terminology: MO • Structure: MAGE • From the MGED Society
MINI – electrophysiology • General Features • Study Subject • Recording Location • Task • Stimulus • Recording • Time Series Data
Recording Location • Recording Location Structure • Brain Area • Slice Thickness • Slice Orientation • Cell Type • Cell Type Coordinates • Location Confirmation
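A minimal-information checklist of this kind can be treated as a record with one slot per item, plus a check for which items are still unreported. The sketch below assumes this reading; the field names are snake_case renderings of the slide's items, and the string types, the example values, and the helper `missing_items` are hypothetical and not taken from the MINI document itself.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RecordingLocation:
    """One slot per checklist item; free-text strings are an assumption."""
    structure: Optional[str] = None
    brain_area: Optional[str] = None
    slice_thickness: Optional[str] = None
    slice_orientation: Optional[str] = None
    cell_type: Optional[str] = None
    cell_type_coordinates: Optional[str] = None
    location_confirmation: Optional[str] = None


def missing_items(record: RecordingLocation) -> List[str]:
    """List the checklist items that have not been filled in yet."""
    return [
        f.name for f in fields(record)
        if getattr(record, f.name) in (None, "")
    ]


if __name__ == "__main__":
    # Illustrative values only, not real experimental metadata.
    rec = RecordingLocation(brain_area="hippocampus", cell_type="pyramidal")
    print("Still to report:", ", ".join(missing_items(rec)))
```

A check like this captures the "minimal information" idea: the record is acceptable for submission only once `missing_items` returns an empty list.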
View from microarrays • Content standard: Minimal Information • Terminology: MO • Structure: MAGE • From the MGED Society
Functional Genomics Experiment (FuGE) • Model of common components in scientific investigations, such as materials, data, protocols, equipment and software. • Provides a framework for capturing complete laboratory workflows, enabling the integration of pre-existing data formats.
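To make the idea of common components concrete, here is a small sketch of a workflow in which protocols turn materials into data and data into derived data. It is not the FuGE object model: the class names only echo the component list on the slide, and `Step`, `trace_sources`, and the example names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Material:
    name: str


@dataclass(frozen=True)
class Data:
    name: str


@dataclass(frozen=True)
class Protocol:
    name: str
    equipment: Tuple[str, ...] = ()
    software: Tuple[str, ...] = ()


@dataclass
class Step:
    """One application of a protocol: inputs go in, outputs come out."""
    protocol: Protocol
    inputs: list
    outputs: list


def trace_sources(target, steps: List[Step]) -> list:
    """Walk back from an item to the inputs that no step produced."""
    produced_by = {out: step for step in steps for out in step.outputs}
    sources, stack, seen = [], [target], set()
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        step = produced_by.get(item)
        if step is None:
            sources.append(item)       # nothing produced it: an original input
        else:
            stack.extend(step.inputs)  # keep walking upstream
    return sources


if __name__ == "__main__":
    tissue = Material("brain slice")
    raw = Data("raw recording")
    spikes = Data("spike times")
    workflow = [
        Step(Protocol("recording", equipment=("amplifier",)), [tissue], [raw]),
        Step(Protocol("spike sorting", software=("sorter",)), [raw], [spikes]),
    ]
    print(trace_sources(spikes, workflow))
```

Recording every step this way means a derived dataset can always be traced back to the laboratory material it came from, which is the loss the opening quotation describes.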
Part of CISBAN in a nutshell • Reference set of 5,000 mutant strains • Screen mutants for sensitivity to damage/nutrition (robot-assisted; conditions: ‘Folate’ + - + -, ‘MMS’ - - + +) • Data curation. • Functional analysis. • Interactions with in silico programme.
CISBAN dataflow Neil Wipat, Newcastle University
Data Entry with SYMBA http://symba.sourceforge.net/ Allyson Lister, Newcastle University
Summary • We are generating metadata “standards” for neurosciences • We are following a well-trodden path from bioinformatics • We adopted FuGE and have built MINI
Future Work • More neurosciences experimental datatypes. • Minimal Information about a Service • Describe analysis software as well as lab experiments. • Outreach!
Acknowledgements • MINI: Frank Gibson, Paul G Overton, Tom V Smulders, Simon R Schultz, Stephen J Eglen, Colin D Ingram, Stefano Panzeri, Phil Bream, Evelyne Sernagor, Mark Cunningham, Christopher Adams, Christoph Echtermeyer, Jennifer Simonotto, Marcus Kaiser, Daniel C Swan, Martyn Fletcher, Phillip Lord • CISBAN: Anil Wipat (PI), Allyson Lister (Research Associate),