RightField: The Semantic Annotation of Experimental Data using Spreadsheets
Katy Wolstencroft, Stuart Owen, Matthew Horridge, Olga Krebs, Wolfgang Mueller, Carole Goble
RightField
• A tool for embedding ranges of ontology terms into spreadsheets, so that users of those spreadsheets can add semantic annotations from simple drop-down lists
Why?
• Makes annotation quicker and more efficient
• Standardises annotation
• Hides the ontology complexity from the users
Managing Biological Data
• Describe experiments and results of experiments
• Minimal Information Models: guidelines, checklists, vocabularies
• Necessary for publication, submission to public databases and sharing
Managing Biological Data
• Describe experiments and results of experiments
• Minimal Information Models: guidelines, checklists
• MIACA: Minimum Information About a Cellular Assay
• MIAME: Minimum Information About a Microarray Experiment
• MIAPE: Minimum Information About a Proteomics Experiment
• MIARE: Minimum Information About an RNAi Experiment
• MIASE: Minimum Information About a Simulation Experiment
• MIBBI: >30 checklists
Managing Biological Data
• Describe experiments and results of experiments
• Ontologies and vocabularies for annotation: Gene Ontology, ChEBI, MGED, SBO
• BioPortal: >270 biomedical ontologies
SysMO: Systems Biology of Micro-Organisms
SysMO Consortium:
• Pan-European consortium
• > 100 research groups
• > 320 scientists
• Distributed, interdisciplinary projects
• Expected to pool data and results and disseminate
• Microbiologists, molecular biologists, biochemists, mathematicians... not many informaticians
SysMO-DB:
• SysMO-SEEK: a platform for systems biology data sharing
• Web-based environment for sharing in the consortium and disseminating to the community
• Used in other consortia: Virtual Liver, EraSysBio+, UNICELLSYS and more
Associating Experiments
[Diagram labels: SOP; Investigation, Study, Assay; Construction, Validation]
http://isatab.sourceforge.net/
Data Templates and Vocabularies
[Diagram labels: SOP; Metabolomics, Proteomics, Metabolomics Mass Spec, Fluxomics, Transcriptomics; Construction, Validation]
Fitting in with Laboratory Practices
• Scientists can continue to do what they have always done
• Embedding semantics into the tools already in use
• Excel, Excel, Excel...
The End Result
• Ontology terms for marked-up cells in drop-down boxes
How it Works
[Diagram: informaticians/ontologists load an Excel workbook and an ontology into the RightField client; a "portion" of ontology terms is embedded into the Excel workbook; the marked-up workbook is saved in plain Excel for end users]
Loading Ontologies
• Published ontologies
• Multiple versions
• You can also load local ontologies from file or URL
Excel workbook loaded into RightField with multiple worksheets
Class hierarchies of loaded ontologies
• Selected parent term from the ontology
• Methods for specifying ontology terms
• Term lists for selected cells
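
To illustrate how a term list might be derived from a selected parent term, here is a minimal Python sketch. It is not RightField's own code (RightField is a Java tool built on the OWL API). The toy hierarchy and the function names `direct_subclasses` and `all_subclasses` are assumptions made for illustration, and the slides do not say which selection methods RightField actually offers.

```python
from collections import deque

# Hypothetical toy class hierarchy: parent label -> direct child labels.
HIERARCHY = {
    "assay": ["omics assay", "imaging assay"],
    "omics assay": ["transcriptomics", "proteomics", "metabolomics"],
    "imaging assay": [],
}


def direct_subclasses(parent, hierarchy):
    """Return only the immediate children of the parent term."""
    return sorted(hierarchy.get(parent, []))


def all_subclasses(parent, hierarchy):
    """Return every descendant of the parent term (breadth-first walk)."""
    seen = set()
    queue = deque(hierarchy.get(parent, []))
    while queue:
        term = queue.popleft()
        if term in seen:
            continue  # guard against terms reachable via several parents
        seen.add(term)
        queue.extend(hierarchy.get(term, []))
    return sorted(seen)


if __name__ == "__main__":
    # The resulting list is what a drop-down for a marked-up cell could offer.
    print(all_subclasses("assay", HIERARCHY))
```

Choosing between the two functions changes how broad the drop-down list is: direct children give a short list, all descendants give the full branch.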
The User View
• Ontology terms for marked-up cells in drop-down boxes
Ontology Information
• Ontologies encapsulated
• Scientists can work offline
• Ensures the same versions of ontologies are used for a series of experiments
• No special macros or plugins required, just Excel or OpenOffice
• Versions and URIs captured in hidden worksheets
• Provenance
• Comparisons between sheets
• Linking back to the vocabularies
Provenance
• Term Label: the human-readable term label
• Term IRI: the (unique) term identifier
• Ontology IRI: the ontology that defines the term
• Ontology Version: the version of the ontology
• Physical Location: the (web) location of the ontology
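
The five provenance fields can be pictured as one record per embedded term. The Python sketch below is illustrative only: the class name `TermProvenance`, the helper `to_rows` and the row layout are assumptions, not the layout RightField uses in its hidden worksheets.

```python
from dataclasses import astuple, dataclass, fields


@dataclass(frozen=True)
class TermProvenance:
    """One provenance record per embedded term (hypothetical layout)."""
    term_label: str         # the human-readable term label
    term_iri: str           # the (unique) term identifier
    ontology_iri: str       # the ontology that defines the term
    ontology_version: str   # the version of the ontology
    physical_location: str  # the (web) location of the ontology


def to_rows(records):
    """Flatten records into a header row plus one row per term."""
    header = tuple(f.name for f in fields(TermProvenance))
    return [header] + [astuple(record) for record in records]


if __name__ == "__main__":
    # Placeholder values under example.org; not real ontology identifiers.
    record = TermProvenance(
        term_label="proteomics",
        term_iri="http://example.org/onto#proteomics",
        ontology_iri="http://example.org/onto",
        ontology_version="http://example.org/onto/1.0",
        physical_location="http://example.org/downloads/onto.owl",
    )
    for row in to_rows([record]):
        print(row)
```

Keeping the IRI and version alongside the label is what allows annotations in two sheets to be compared, or traced back to the vocabulary, without relying on labels alone.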
RightField Technologies
• Java: platform independent
• OWL API: loading ontologies and reasoning
• Apache POI HSSF libraries: loading and saving of Excel spreadsheets
Ontology Languages
• RDFS: RDF Schema
• OWL: Web Ontology Language
• OBO: Open Biomedical Ontologies
RightField in Use
• SysMO: Systems Biology of Micro-Organisms
• E-Lico: a virtual laboratory for interdisciplinary collaborative research in data mining and data-intensive sciences. Case studies in kidney research
• BioBanking in the Netherlands
Outside Biology:
• Oil and gas industry
• Egyptology specimen classification
Using RightField Spreadsheets
[Diagram labels: Populate; RDF Graph; Extract; Store / Reuse]
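
As a rough illustration of the extraction step, annotated cells could be turned into RDF statements in N-Triples syntax. This Python sketch rests on assumptions: the sheet IRI, the predicate `annotatedWith`, the cell-to-subject naming scheme and both function names are hypothetical, and the IRIs are assumed to contain no characters that need escaping.

```python
def ntriple(subject, predicate, obj):
    """Format one triple of three IRIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def annotations_to_ntriples(sheet_iri, predicate, annotations):
    """Turn a mapping of cell reference -> term IRI into N-Triples text.

    Each cell becomes a subject IRI of the form <sheet_iri>#<cell>,
    which is an illustrative naming scheme only.
    """
    lines = []
    for cell in sorted(annotations):
        subject = f"{sheet_iri}#{cell}"
        lines.append(ntriple(subject, predicate, annotations[cell]))
    return "\n".join(lines)


if __name__ == "__main__":
    print(annotations_to_ntriples(
        "http://example.org/data/sheet1",
        "http://example.org/vocab#annotatedWith",
        {"B2": "http://example.org/onto#proteomics"},
    ))
```

The output can be loaded into any RDF store that reads N-Triples, which corresponds to the "Store / Reuse" step on the slide.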
Future Developments
• Auto-complete
• Validation of annotation
• Populating ontology content: Populous
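
Validation of annotation is listed as future work, so the following Python sketch is purely speculative. It shows one simple form the check could take: comparing each cell value against the term list allowed for that cell. The function name `validate_annotations` and the exact-match rule (after trimming whitespace) are assumptions.

```python
def validate_annotations(cell_values, allowed_terms):
    """Return the cells whose value is not one of the allowed term labels.

    cell_values: mapping of cell reference -> entered text
    allowed_terms: iterable of permitted term labels for those cells
    Matching is exact after trimming surrounding whitespace.
    """
    allowed = {term.strip() for term in allowed_terms}
    return {
        cell: value
        for cell, value in cell_values.items()
        if value.strip() not in allowed
    }


if __name__ == "__main__":
    entered = {"B2": "proteomics", "B3": "proteomic"}
    # Reports B3, whose value is a near-miss of an allowed label.
    print(validate_annotations(entered, ["proteomics", "metabolomics"]))
```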
Populous
• Generic tool for populating ontology templates
• Supports validation at the point of data entry
• Expressive pattern language for OWL ontology generation
• Helps biologists with ontology design patterns
http://www.e-lico.eu/populous
Simon Jupp, Robert Stevens, University of Manchester
Availability
• Open source
• http://www.rightfield.org.uk
Acknowledgements
Stuart Owen, Katy Wolstencroft, Carole Goble, Wolfgang Mueller, Olga Krebs, Matthew Horridge