LIMS – laboratory information management system, AKA study capturing framework, AKA sample treatment tracker, AKA investigation metadata annotator.
Warm-up for a discussion on harmonizing similar initiatives in the NBIC sequencing, metabolomics, proteomics and biobanking task forces (+ friends like NuGO, EBI, GEN2PHEN, BBMRI-NL, SysMO, EU-PANACEA, Groningen Genomics Coordination Center).
Outline • What do we mean by LIMS / SCF? • Ingredients for collaboration • Suggested discussion topics
A LIMS/SCF is a portal for: data generation • Sequencing • Genotyping • Microarrays • Mass spec • … data annotation • Individuals • Samples • Protocols • Results • Background info data analysis • Peak finding • SNP analysis • GWAS • xQTL • ...
DSP/NuGO Examples Courtesy Kees van Bochove & team, NMC & NuGO
Proteomics Examples • CilairDB • Corra • OpenBIS • OpenMS • … http://www.cisd.ethz.ch/software/openBIS/HCS
Sequencing/Genotyping Examples • SequenceLIMS • ChIPLIMS Nijmegen • GenotypeLIMS • IBIDAS? • iSeq • … Courtesy Joris Lops, GCC & LifeLines
Biobanking Examples • QTL XGAP/EU-PANACEA • GWAS XGAP/LifeLines • HGVbaseG2P 2.0 • … Courtesy Joeri van der Velde & friends, GCC & LifeLines
Working hypotheses • Each platform has one or more study ‘portals’ • Captures all wet-lab and dry-lab flows • Links to (or copies from) public annotations • Provides value and data inputs for pipelines • Stores provenance and results of all pipeline runs (as result files) • All tools developed in BioAssist will be connected to them • Need to think about user interaction • Need to think about data exchange (formats) • i.e. what does the biologist want? • We can benefit greatly if we harmonize and share work • Each domain has specific needs, but we can still share • Data models, user interfaces, back-ends, … • Is coordinating this a task for CET?
Ingredients for collaboration • Conceptual model • To capture all data, including variation/extension mechanisms • Exchange formats • To exchange between public and private databases • User interfaces • Data import wizards • Extraction / query modules • Platforms for analysis!!! • Backend engines • Large scale binary data • Automatic generation of services/pipelines
1. Conceptual model • Targets: the thing being followed. AKA: Individuals, Samples, Panels/Groups, Material • Features: an abstract property of a target. AKA: Characteristics, Comments • Values: a concrete property of a target (at a certain time). AKA: Data • Protocols: description of an activity. AKA: EventType, Template • ProtocolApplications: use of a protocol that produced (a) value. AKA: Events, Activity, Assay • Investigation: a container for the above + contacts/publications. AKA: Study, Project, Laboratory, Partner
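The six concepts above can be sketched as plain data classes. This is only an illustration of how they relate to each other, not the schema of any of the tools mentioned in these slides; all class and field names, and the example date, are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Target:
    """The thing being followed (individual, sample, panel, ...)."""
    name: str
    kind: str


@dataclass(frozen=True)
class Feature:
    """An abstract property that a target can have."""
    name: str
    unit: Optional[str] = None


@dataclass(frozen=True)
class Protocol:
    """Description of an activity."""
    name: str


@dataclass(frozen=True)
class ProtocolApplication:
    """One concrete use of a protocol at a certain time."""
    protocol: Protocol
    performed_at: datetime


@dataclass(frozen=True)
class Value:
    """A concrete property of a target, produced by a protocol application."""
    target: Target
    feature: Feature
    value: str
    application: Optional[ProtocolApplication] = None


@dataclass
class Investigation:
    """Container for all of the above plus contacts."""
    name: str
    contacts: List[str] = field(default_factory=list)
    values: List[Value] = field(default_factory=list)


# Illustrative data only; the date is made up for the example.
ind1 = Target("Ind1", "Individual")
height = Feature("Height", unit="cm")
run = ProtocolApplication(Protocol("Height measurement"), datetime(2010, 1, 1))
study = Investigation("Example study")
study.values.append(Value(ind1, height, "179", run))
```

Keeping Feature separate from Value is what lets a new kind of measurement be added as data rather than as a schema change.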
‘Pheno-OM’ (generic variation mechanism) Flexible: any feature, value, and target combo [Diagram: Observable feature (e.g. Height), Observation target (Panel or Individual, e.g. Ind1) and Observed value (e.g. 179cm, at a time), linked to Protocol and Protocol application; also Observed Relation and Inferred Value]
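A minimal sketch of what "any feature, value, and target combo" means in practice: observations are stored as generic (target, feature, value) rows, so a new feature needs no new columns. The row layout and the helper name `by_target` are assumptions for this illustration, and every row except Ind1/Height/179cm is made-up example data.

```python
from collections import defaultdict

# Generic observed values: one row per (target, feature, value).
observed_values = [
    ("Ind1", "Height", "179cm"),   # example from the slide
    ("Ind1", "Sex", "female"),     # made-up example row
    ("Ind2", "Height", "165cm"),   # made-up example row
]


def by_target(rows):
    """Pivot generic rows into one {feature: value} dict per target."""
    table = defaultdict(dict)
    for target, feature, value in rows:
        table[target][feature] = value
    return dict(table)


pivoted = by_target(observed_values)
```

The cost of this flexibility is that type-specific checks (for example, that Height is numeric) have to live outside the storage layout.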
XGAP (extension-based variation mechanism) DATA ELEMENT: rows and columns are dimension ELEMENTs. SUBJECT • PANEL: Name, Type (CSS, RIL..), Parent Panels • INDIVIDUAL: Name, Strain, Mother, Father, Sex • SAMPLE: Name, Individual, Tissue • And so on … TRAIT • PROBE: Name, Gene, Chromosome, Locus • MARKER: Name, Allele, Chromosome, Locus • MASSPEAK: Name, MZ, RetentionTime • And so on … Swertz et al (2010) Genome Biology 11(3).
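In contrast to the generic approach, the extension-based approach adds a typed subclass per kind of subject or trait, and stores data in matrices whose rows and columns are such elements. The sketch below only illustrates that idea; it is not the XGAP implementation, and the `DataMatrix` class, its methods, and the example values are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DimensionElement:
    """Anything that can label a row or a column of a data matrix."""
    name: str


@dataclass
class Subject(DimensionElement):
    pass


@dataclass
class Trait(DimensionElement):
    pass


@dataclass
class Individual(Subject):
    """Extension of Subject with its own typed fields."""
    strain: Optional[str] = None
    sex: Optional[str] = None


@dataclass
class Marker(Trait):
    """Extension of Trait with its own typed fields."""
    chromosome: Optional[str] = None
    locus: Optional[float] = None


class DataMatrix:
    """Values indexed by a (row element, column element) pair."""

    def __init__(self, rows, columns):
        self.rows = {e.name: e for e in rows}
        self.columns = {e.name: e for e in columns}
        self._cells = {}

    def set(self, row_name, column_name, value):
        if row_name not in self.rows or column_name not in self.columns:
            raise KeyError((row_name, column_name))
        self._cells[(row_name, column_name)] = value

    def get(self, row_name, column_name):
        return self._cells.get((row_name, column_name))


# Made-up example: one genotype for one individual at one marker.
matrix = DataMatrix(rows=[Marker("m1", chromosome="1", locus=12.5)],
                    columns=[Individual("ind1", sex="female")])
matrix.set("m1", "ind1", "A")
```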
ISA-TAB (generic model) Differs from MAGE-TAB • Nested investigations (as studies) • Supports template assays • More aligned to FuGE • But some find it too difficult ISA = • Investigation • Study (a component of Investigation) • Assay (a component of Study) • Data files Still in testing phase though… http://isatab.sf.net
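The Investigation / Study / Assay / data file nesting can be sketched as follows. This shows only the containment hierarchy described above, not the ISA-TAB file format itself; the class names, the helper `all_data_files`, and the file names are assumptions for the illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    name: str
    data_files: List[str] = field(default_factory=list)


@dataclass
class Study:
    name: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    name: str
    studies: List[Study] = field(default_factory=list)


def all_data_files(investigation):
    """Walk Investigation -> Study -> Assay and collect data file names."""
    return [
        data_file
        for study in investigation.studies
        for assay in study.assays
        for data_file in assay.data_files
    ]


# Made-up example content.
inv = Investigation("inv1", [
    Study("study1", [Assay("assay1", ["a.txt", "b.txt"])]),
    Study("study2", [Assay("assay2", ["c.txt"]), Assay("assay3")]),
])
files = all_data_files(inv)
```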
MIBBI • Minimum Information for Biological and Biomedical Investigations (31 areas in total) http://mibbi.sourceforge.net Taylor et al 2008 Nature Biotechnology 26(8), p 889
2. Data formats Basic • CSV • XML • RDF/Atom Specific • MAGE-TAB • MOLGENIS • APML • …
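As a small illustration of the most basic format in the list, the sketch below writes observed values to CSV and reads them back using only the Python standard library. The column names (`target`, `feature`, `value`) are an assumption for this example and do not correspond to any of the specific formats named above.

```python
import csv
import io

FIELDS = ["target", "feature", "value"]  # assumed column names


def to_csv(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text (with header row) back into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


rows = [{"target": "Ind1", "feature": "Height", "value": "179cm"}]
text = to_csv(rows)
round_tripped = from_csv(text)
```

Note that everything comes back as a string; units and types have to be agreed on separately, which is one reason the more specific formats exist.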
3. User interfaces • Edit & trace your data • UML documentation of your model • Connect to R statistics: find.investigation() (102 downloaded), obs<-find.observedvalue( (43,920 downloaded), #some calculation, add.inferredvalue(res) (36 added) • Workflow-ready web services • Import/export to Excel • Plug in your own scripts (OntBrowse) Tech keywords: object-oriented data models, multi-platform Java, Tomcat/GlassFish web server, MySQL/PostgreSQL database, Eclipse/NetBeans IDE, Java API, WSDL/SOAP API, R-project API, MVC, FreeMarker templates and CSS for custom layout, open source.
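The scripting flow shown on this slide (find observed values, run some calculation, add the result back as inferred values) can be mimicked with a tiny in-memory stand-in. This is not the R API from the slide: the class `InMemoryStore`, its method names, and the data are all hypothetical and exist only to show the shape of the round trip.

```python
class InMemoryStore:
    """Hypothetical stand-in for a data portal, held entirely in memory."""

    def __init__(self, observed):
        self.observed = list(observed)
        self.inferred = []

    def find_observed_values(self, feature):
        """Return all observed values for one feature."""
        return [v for v in self.observed if v["feature"] == feature]

    def add_inferred_values(self, values):
        """Store calculated values and report how many were added."""
        values = list(values)
        self.inferred.extend(values)
        return len(values)


# Made-up example data.
store = InMemoryStore([
    {"target": "Ind1", "feature": "Height", "value": 179.0},
    {"target": "Ind2", "feature": "Height", "value": 165.0},
])

obs = store.find_observed_values("Height")
mean_height = sum(v["value"] for v in obs) / len(obs)   # some calculation
added = store.add_inferred_values(
    [{"target": "all", "feature": "MeanHeight", "value": mean_height}]
)
```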
3. User interfaces (import wizards) [Picture of GSCF to be added] • http://www.obofoundry.org/ • http://bioportal.bioontology.org/ (REST services) • http://www.ebi.ac.uk/ontology-lookup/ (SOAP services) • http://ontocat.sf.net (simple API around BioPortal)
3. User interfaces (compute platform) Courtesy Arends & van der Velde
Things to discuss as next steps? Put all people/tools in this room on the table: • Agree on exchange formats & models (generic/specific) • Test-drive data exchange or even federation Share the work: • Communicate requirements and plans • Reuse each other's user interface components • Share scalable back-ends (for high-throughput data) Invest in technology interoperation: • Invest in a Galaxy callback to MOLGENIS/Grails (data chooser)? • Invest in a MOLGENIS-to-Grails generator (must be easy)? Something for the NBIC management team to think about.
Acknowledgements • Morris Swertz, Kees van Bochove, Erik Roos, Joris Lops, Joeri van der Velde, GEN2PHEN, MAGE-TAB, XGAP, ISA-TAB, FuGE, GSCF teams