Pheno-OM implement, evaluate, refine. Morris A. Swertz , Tomasz Adamusiak, Juha Muilu, members of GEN2PHEN and Geneva workshop, Helen Parkinson P3G data modelling workshop October 1st, Luxembourg. Use cases. Use case: Give overview of equal/partial matching features between studies
Pheno-OM: implement, evaluate, refine • Morris A. Swertz, Tomasz Adamusiak, Juha Muilu, members of GEN2PHEN and Geneva workshop, Helen Parkinson • P3G data modelling workshop, October 1st, Luxembourg
Use cases • Use case: • Give overview of equal/partial matching features between studies • Need to group variables for this (for inferred features) => • Alternative coding schemes • So mappings between codes • How about complicated mappings?
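The overview of matching features can be pictured as grouping each study's variables by the term they map to and then comparing the groups. The following is a minimal Python sketch of that idea; the study contents, variable names and term mappings are hypothetical and not taken from any loaded data set.

```python
from collections import defaultdict

# Hypothetical variable -> term mappings for two studies.
study_a = {"height_cm": "Length", "sbp": "Systolic blood pressure", "sex": "Sex"}
study_b = {"length_inch": "Length", "gender": "Sex", "chol": "Cholesterol"}


def group_by_term(variables):
    """Group local variable names by the term they are mapped to."""
    groups = defaultdict(list)
    for name, term in variables.items():
        groups[term].append(name)
    return dict(groups)


def overlap(a, b):
    """Return terms shared by both studies and terms found in only one."""
    ga, gb = group_by_term(a), group_by_term(b)
    shared = {term: (ga[term], gb[term]) for term in ga.keys() & gb.keys()}
    only_a = sorted(ga.keys() - gb.keys())
    only_b = sorted(gb.keys() - ga.keys())
    return shared, only_a, only_b


shared, only_a, only_b = overlap(study_a, study_b)
for term in sorted(shared):
    print(term, shared[term])
```

Terms that appear in only one study are the candidates for the more complicated mappings the slide asks about.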
Pheno-OM • Simple system for phenotype representation • minimal model, but not too minimal • unambiguous entity naming (so not ‘phenotype’) • work on real data • multiple investigations in one envelope • easy to create/parse/convert-into format • interface/modules that can be adopted by others • ontology enabled for querying/integration “Give me all individuals that have deformed hand … across studies … across species”
Outline • Implementation procedure • Software generation • Exchange format, db, UI, tools • Evaluation and model refinement • Data loaded • Model details (and limitations?) • Future • Mapping to FuGE, XGAP, MAGE-TAB, PaGE • As a module to complementary models • Semantic layers • Barriers to progress
Implementation procedure Incremental steps using MOLGENIS toolbox.
MOLGENIS concept • Complex research, bespoke infrastructure • Model specifics (biology) + automate common coding (informatics) • Reusable software code framework and generators • Repeat to produce a family of research software
Model of a variant:
<!-- entity organization -->
<entity name="Experiment" label="Experiment">
<field name="ExperimentID" key="1" readonly="true" label="ExperimentID(autonum)"/>
<field name="Medium" type="xref" xref_field="Medium.name"/>
<field name="Protocol" label="Experiment Protocol"/>
<field name="Temperature" type="int"
[Figure: example research pipeline. Items: strains, genome markers, individuals, genotypes, microarrays, probes, expressions. Steps: inbreed, genotype, map, hybridize, preprocess, correlate. Results: QTL profiles, normalized expressions, network.]
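The model-to-code idea can be sketched in a few lines of Python: read entity and field definitions from XML and emit one SQL table per entity. This is an illustration of the general technique, not the MOLGENIS generator itself. The `<molgenis>` root element, the closing tags, the default `string` type and the SQL type mapping are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the slide's model fragment. The root element, the
# closing tags and the default "string" type are assumptions for this sketch.
MODEL = """
<molgenis>
  <entity name="Experiment" label="Experiment">
    <field name="ExperimentID" key="1" readonly="true" label="ExperimentID(autonum)"/>
    <field name="Medium" type="xref" xref_field="Medium.name"/>
    <field name="Protocol" label="Experiment Protocol"/>
    <field name="Temperature" type="int"/>
  </entity>
</molgenis>
"""

# Hypothetical mapping from model field types to SQL column types.
SQL_TYPES = {"int": "INTEGER", "string": "VARCHAR(255)", "xref": "INTEGER"}


def generate_sql(model_xml):
    """Emit one CREATE TABLE statement per <entity> in the model."""
    statements = []
    for entity in ET.fromstring(model_xml).iter("entity"):
        columns = []
        for field in entity.iter("field"):
            sql_type = SQL_TYPES[field.get("type", "string")]
            columns.append(f"  {field.get('name')} {sql_type}")
        body = ",\n".join(columns)
        statements.append(f"CREATE TABLE {entity.get('name')} (\n{body}\n);")
    return statements


ddl = generate_sql(MODEL)
print(ddl[0])
```

The same loop over entities and fields could feed other templates (forms, CSV readers, R wrappers), which is how a small model can expand into a large generated code base.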
Growing family, sharing the work Locus Specific database QTL/GWA database NextGen sequencing Proteo/Metabolomics Animal Observations
Why? Flexible, Reuse, etc. http://www.molgenis.org Swertz & Jansen (2007) Nature Reviews Genetics 8, 235-243
Step 1a: model 1: molgenis_db.xml 506 lines of XML code 16 entities, 67 fields
0 INFO [myFactory] working dir: D:\Development\Molgenis33Workspace\molgenis4phenotype
78 INFO [myFactory] MOLGENIS version 3.3.0-testing
94 INFO [myFactory] Using options:
model_database = [pheno_db.xml] #File with data structure specification (in MOLGENIS DSL).
model_userinterface = pheno_ui.xml #File with user interface specification (in MOLGENIS DSL). Can be same file as model_database
output_src = generated/java #Output-directory for the generated project.
output_hand = handwritten/java #Output-directory for the generated project.
output_sql = generated/sql #Output-directory for the generated sql files.
output_doc = WebContent/doc #Output-directory for the generated documentation.
output_type = #Output type of the project, either war (for use in tomcat) or jar (standalone).
output_web = WebContent #Output-directory for any generated web resources
db_driver = com.mysql.jdbc.Driver #Driver of database. Any JDBC compatible driver should work.
db_user = molgenis #Username for database.
db_password = xxxxxx #Password for database.
db_uri = jdbc:mysql://localhost/pheno #Uri of the database. Default: localhost
db_filepath = attachedfiles #Path where the database should store file attachements. Default: null
db_jndiname = jdbc/molgenisdb #Used to create a JDBC database resource for the application
object_relational_mapping = subclass_per_table #Expert option: Choosing OR strategy. Either 'class_per_table', 'subclass_per_table', 'hierarchy_per_table'. Default: class_per_table
mapper_implementation = multiquery #Expert option: Choosing wether multiquery is used instead of prepared statements. Default: false
exclude_system = true #Expert option: Whether system tables should be excluded from generation. Default: true
force_molgenis_package = false #Expert option. Whether the generated package should be 'molgenis' or the name specified in the model. Default: false
auth_loginclass = org.molgenis.framework.security.SimpleSecurity #Expert option.
verbose = true #This switch turns the verbose-mode on.
compile = false #This switch makes the factory also compile (usefull outside IDE).
mail_smtp_protocol = #Sets the email protocol, either smtp, smtps or null. Default: null meaning email disabled
mail_smtp_hostname = localhost #SMTP host server. Default: localhost
mail_smtp_port = 25 #SMTP host server port. Default: 25
mail_smtp_user = #SMTP user for authenticated emailing. Default: null.
mail_smtp_password = #SMTP user for authenticated emailing. Default: null.
110 INFO [MolgenisLanguage] parsing db-schema from [pheno_db.xml]
780 WARN [Entity] [WARNING]: missing key 0 for entity Nameable
780 WARN [Entity] [WARNING]: missing key 0 for entity Nameable
780 WARN [Entity] [WARNING]: missing key 0 for entity Nameable
780 WARN [Entity] [WARNING]: missing key 0 for entity Nameable
780 WARN [Entity] [WARNING]: missing key 0 for entity Nameable
797 WARN [Entity] [WARNING]: missing key 0 for entity Nameable
797 WARN [Entity] [WARNING]: missing key 0 for entity Nameable
844 INFO [MolgenisLanguage] parsing ui-schema
937 INFO [main] generating ....
1717 INFO [TableDocGen] generated WebContent\doc\tabledoc.html 2076 INFO [EntityDocGen] generated WebContent\doc\objectmodel.html 2436 INFO [DotDocGen] generated WebContent\doc\entity-uml-diagram.dot 2545 INFO [DotDocGen] generated WebContent\doc\entity-uml-diagram-pheno.system.dot 2748 INFO [DotDocGen] generated WebContent\doc\entity-uml-diagram-pheno.observation.dot 2842 INFO [DotDocGen] generated WebContent\doc\entity-uml-diagram-pheno.target.dot 2998 INFO [DotDocGen] generated WebContent\doc\entity-uml-diagram-pheno.variable.dot 3138 INFO [DotDocGen] generated WebContent\doc\entity-uml-diagram-pheno.protocol.dot 3997 INFO [DotDocMinimalGen] generated WebContent\doc\entity-uml-minimal-diagram.dot 4184 INFO [DotDocMinimalGen] generated WebContent\doc\entity-uml-diagram-minimal-pheno.system.dot 4388 INFO [DotDocMinimalGen] generated WebContent\doc\entity-uml-diagram-minimal-pheno.observation.dot 4606 INFO [DotDocMinimalGen] generated WebContent\doc\entity-uml-diagram-minimal-pheno.target.dot 4731 INFO [DotDocMinimalGen] generated WebContent\doc\entity-uml-diagram-minimal-pheno.variable.dot 4887 INFO [DotDocMinimalGen] generated WebContent\doc\entity-uml-diagram-minimal-pheno.protocol.dot 5184 INFO [ClassDocGen] generated WebContent\doc\classmodel.html 5293 INFO [InMemoryDatabaseGen] generated generated\java\ui\data\InMemoryDatabase.java 5609 INFO [MySqlCreateSubclassPerTableGen] generated generated\sql\create_tables.sql 5671 INFO [JDBCDatabaseGen] generated generated\java\ui\JDBCDatabase.java 5921 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\Identifiable.java 5921 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\Nameable.java 5968 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\OntologySource.java 6014 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\OntologyTerm.java 6030 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\Investigation.java 6061 INFO [DataTypeGen] generated 
generated\java\pheno\core\data\types\ObservableFeature.java 6124 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\ObservedValue.java 6170 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\ObservedRelationship.java 6217 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\InferredValue.java 6233 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\ObservationTarget.java 6280 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\Individual.java 6311 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\Panel.java 6326 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\CodeList.java 6327 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\Code.java 6374 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\Protocol.java 6390 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\ProtocolApplication.java 6405 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\ProtocolParameter.java 6437 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\ParameterValue.java 6452 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\InferredValue_derivedFrom.java 6468 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\Panel_individuals.java 6483 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\Protocol_observableFeatures.java 6499 INFO [DataTypeGen] generated generated\java\pheno\core\data\types\Protocol_protocolComponents.java 6624 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\OntologySourceMapper.java 6655 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\OntologyTermMapper.java 6671 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\InvestigationMapper.java 6702 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\ObservableFeatureMapper.java 6733 INFO [MultiqueryMapperGen] 
generated generated\java\pheno\core\data\mappers\ObservedValueMapper.java 6780 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\ObservedRelationshipMapper.java 6827 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\InferredValueMapper.java 6842 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\ObservationTargetMapper.java 6873 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\IndividualMapper.java 6889 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\PanelMapper.java 6905 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\CodeListMapper.java 6936 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\CodeMapper.java 6951 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\ProtocolMapper.java 6983 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\ProtocolApplicationMapper.java 6998 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\ProtocolParameterMapper.java 7029 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\ParameterValueMapper.java 7045 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\InferredValue_derivedFromMapper.java 7061 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\Panel_individualsMapper.java 7076 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\Protocol_observableFeaturesMapper.java 7092 INFO [MultiqueryMapperGen] generated generated\java\pheno\core\data\mappers\Protocol_protocolComponentsMapper.java 7217 INFO [JDBCMetaDatabaseGen] generated generated\java\ui\JDBCMetaDatabase.java 7263 INFO [CountPerEntityGen] generated generated\sql\count_per_entity.sql 7310 INFO [CountPerTableGen] generated generated\sql\count_per_table.sql 7341 INFO [FillMetadataTablesGen] generated 
generated\sql\insert_metadata.sql 7405 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\OntologySourceCsvReader.java 7420 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\OntologyTermCsvReader.java 7420 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\InvestigationCsvReader.java 7436 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\ObservableFeatureCsvReader.java 7452 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\ObservedValueCsvReader.java 7467 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\ObservedRelationshipCsvReader.java 7483 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\InferredValueCsvReader.java 7498 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\ObservationTargetCsvReader.java 7514 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\IndividualCsvReader.java 7514 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\PanelCsvReader.java 7530 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\CodeListCsvReader.java 7545 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\CodeCsvReader.java 7545 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\ProtocolCsvReader.java 7561 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\ProtocolApplicationCsvReader.java 7561 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\ProtocolParameterCsvReader.java 7576 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\ParameterValueCsvReader.java 7576 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\InferredValue_derivedFromCsvReader.java 7592 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\Panel_individualsCsvReader.java 7608 INFO [CsvReaderGen] generated generated\java\pheno\core\data\csv\Protocol_observableFeaturesCsvReader.java 7608 INFO [CsvReaderGen] generated 
generated\java\pheno\core\data\csv\Protocol_protocolComponentsCsvReader.java 7748 INFO [REntityGen] generated generated\java\pheno\core\R\OntologySource.R 7748 INFO [REntityGen] generated generated\java\pheno\core\R\OntologyTerm.R 7764 INFO [REntityGen] generated generated\java\pheno\core\R\Investigation.R 7779 INFO [REntityGen] generated generated\java\pheno\core\R\ObservableFeature.R 7779 INFO [REntityGen] generated generated\java\pheno\core\R\ObservedValue.R 7795 INFO [REntityGen] generated generated\java\pheno\core\R\ObservedRelationship.R 7795 INFO [REntityGen] generated generated\java\pheno\core\R\InferredValue.R 7810 INFO [REntityGen] generated generated\java\pheno\core\R\ObservationTarget.R 7810 INFO [REntityGen] generated generated\java\pheno\core\R\Individual.R 7826 INFO [REntityGen] generated generated\java\pheno\core\R\Panel.R 7826 INFO [REntityGen] generated generated\java\pheno\core\R\CodeList.R 7842 INFO [REntityGen] generated generated\java\pheno\core\R\Code.R 7857 INFO [REntityGen] generated generated\java\pheno\core\R\Protocol.R 7857 INFO [REntityGen] generated generated\java\pheno\core\R\ProtocolApplication.R 7873 INFO [REntityGen] generated generated\java\pheno\core\R\ProtocolParameter.R 7873 INFO [REntityGen] generated generated\java\pheno\core\R\ParameterValue.R 7888 INFO [REntityGen] generated generated\java\pheno\core\R\InferredValue_derivedFrom.R 7888 INFO [REntityGen] generated generated\java\pheno\core\R\Panel_individuals.R 7888 INFO [REntityGen] generated generated\java\pheno\core\R\Protocol_observableFeatures.R 7904 INFO [REntityGen] generated generated\java\pheno\core\R\Protocol_protocolComponents.R 7998 INFO [RApi] generated generated\java\source.R 8044 INFO [HtmlFormGen] generated generated\java\pheno\core\html\IdentifiableHtmlForm.java 8044 INFO [HtmlFormGen] generated generated\java\pheno\core\html\NameableHtmlForm.java 8044 INFO [HtmlFormGen] generated generated\java\pheno\core\html\OntologySourceHtmlForm.java 8044 INFO 
[HtmlFormGen] generated generated\java\pheno\core\html\OntologyTermHtmlForm.java 8060 INFO [HtmlFormGen] generated generated\java\pheno\core\html\InvestigationHtmlForm.java 8060 INFO [HtmlFormGen] generated generated\java\pheno\core\html\ObservableFeatureHtmlForm.java 8076 INFO [HtmlFormGen] generated generated\java\pheno\core\html\ObservedValueHtmlForm.java 8076 INFO [HtmlFormGen] generated generated\java\pheno\core\html\ObservedRelationshipHtmlForm.java 8076 INFO [HtmlFormGen] generated generated\java\pheno\core\html\InferredValueHtmlForm.java 8091 INFO [HtmlFormGen] generated generated\java\pheno\core\html\ObservationTargetHtmlForm.java 8091 INFO [HtmlFormGen] generated generated\java\pheno\core\html\IndividualHtmlForm.java 8091 INFO [HtmlFormGen] generated generated\java\pheno\core\html\PanelHtmlForm.java 8091 INFO [HtmlFormGen] generated generated\java\pheno\core\html\CodeListHtmlForm.java 8107 INFO [HtmlFormGen] generated generated\java\pheno\core\html\CodeHtmlForm.java 8107 INFO [HtmlFormGen] generated generated\java\pheno\core\html\ProtocolHtmlForm.java 8107 INFO [HtmlFormGen] generated generated\java\pheno\core\html\ProtocolApplicationHtmlForm.java 8107 INFO [HtmlFormGen] generated generated\java\pheno\core\html\ProtocolParameterHtmlForm.java 8122 INFO [HtmlFormGen] generated generated\java\pheno\core\html\ParameterValueHtmlForm.java 8122 INFO [HtmlFormGen] generated generated\java\pheno\core\html\InferredValue_derivedFromHtmlForm.java 8122 INFO [HtmlFormGen] generated generated\java\pheno\core\html\Panel_individualsHtmlForm.java 8122 INFO [HtmlFormGen] generated generated\java\pheno\core\html\Protocol_observableFeaturesHtmlForm.java 8138 INFO [HtmlFormGen] generated generated\java\pheno\core\html\Protocol_protocolComponentsHtmlForm.java 8138 INFO [MolgenisServletContextGen] generated WebContent\META-INF\context.xml 8169 INFO [MolgenisContextListenerGen] generated generated\java\servlet\ContextListener.java 8232 INFO [MolgenisServletGen] generated 
generated\java\MolgenisServlet.java 8403 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\InvestigationsForm.java 8560 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenu\ObservableFeaturesForm.java 8591 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenu\PanelsForm.java 8654 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenu\Panels\IndividualsForm.java 8701 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenu\ObservedValuesForm.java 8732 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenu\ProtocolApplicationsForm.java 8825 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenu\ProtocolApplications\ProtocolApplicationMenu\ParameterValuesForm.java 8857 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenu\ProtocolApplications\ProtocolApplicationMenu\ObservedValuesForm.java 8888 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenu\ProtocolApplications\ProtocolApplicationMenu\InferredValuesForm.java 9013 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenu\InferredValuesForm.java 9044 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\ObservableFeaturesForm.java 9137 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\ObservationTargets\IndividualsForm.java 9169 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\ObservationTargets\PanelsForm.java 9200 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\ProtocolsForm.java 9293 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Protocols\ProtocolMenu\ParametersForm.java 9325 INFO 
[FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Protocols\ProtocolMenu\ProtocolComponentsForm.java 9496 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Ontologies\OntologyTermsForm.java 9528 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Ontologies\OntologySourcesForm.java 9606 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Ontologies\OntologySources\OntologyTermsForm.java 9638 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Ontologies\CodeListsForm.java 9700 INFO [FormScreenGen] generated generated\java\ui\screen\TopMenu\Main\Ontologies\CodeLists\CodesForm.java 9965 INFO [MenuScreenGen] generated generated\java\ui\screen\TopMenuMenu.java 10012 INFO [MenuScreenGen] generated generated\java\ui\screen\TopMenu\MainMenu.java 10059 INFO [MenuScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenuMenu.java 10152 INFO [MenuScreenGen] generated generated\java\ui\screen\TopMenu\Main\Investigations\InvestigationMenu\ProtocolApplications\ProtocolApplicationMenuMenu.java 10230 INFO [MenuScreenGen] generated generated\java\ui\screen\TopMenu\Main\ObservationTargetsMenu.java 10293 INFO [MenuScreenGen] generated generated\java\ui\screen\TopMenu\Main\Protocols\ProtocolMenuMenu.java 10324 INFO [MenuScreenGen] generated generated\java\ui\screen\TopMenu\Main\OntologiesMenu.java 11354 INFO [PluginScreenGen] generated Molgenis33Workspace\molgenis4phenotype\generated\java\ui\screen\TopMenu\Main\ReportPlugin.java 11557 INFO [PluginScreenGen] generated Molgenis33Workspace\molgenis4phenotype\generated\java\ui\screen\TopMenu\Main\Ontologies\OntologyManagerPlugin.java 11604 INFO [PluginScreenGen] generated Molgenis33Workspace\molgenis4phenotype\generated\java\ui\screen\TopMenu\Model_documentationPlugin.java 11604 INFO [PluginScreenGen] generated Molgenis33Workspace\molgenis4phenotype\generated\java\ui\screen\TopMenu\RprojectApiPlugin.java 11620 INFO 
[PluginScreenGen] generated Molgenis33Workspace\molgenis4phenotype\generated\java\ui\screen\TopMenu\HttpApiPlugin.java 11635 INFO [PluginScreenGen] generated Molgenis33Workspace\molgenis4phenotype\generated\java\ui\screen\TopMenu\WebServicesApiPlugin.java 11651 WARN [PluginScreenFTLTemplateGen] Skipped because exists: handwritten\java\plugin\report\InvestigationOverview.ftl 11807 WARN [PluginScreenFTLTemplateGen] Skipped because exists: handwritten\java\plugin\OntologyBrowser\OntologyBrowserPlugin.ftl 11807 WARN [PluginScreenFTLTemplateGen] Skipped because exists: handwritten\java\plugin\topmenu\DocumentationScreen.ftl 11807 WARN [PluginScreenFTLTemplateGen] Skipped because exists: handwritten\java\plugin\topmenu\RprojectApiScreen.ftl 11823 WARN [PluginScreenFTLTemplateGen] Skipped because exists: handwritten\java\plugin\topmenu\HttpAPiScreen.ftl 11823 WARN [PluginScreenFTLTemplateGen] Skipped because exists: handwritten\java\plugin\topmenu\SoapApiScreen.ftl 11854 WARN [PluginScreenJavaTemplateGen] Skipped because exists: handwritten\java\plugin\report\InvestigationOverview.java 12057 WARN [PluginScreenJavaTemplateGen] Skipped because exists: handwritten\java\plugin\OntologyBrowser\OntologyBrowserPlugin.java 12072 WARN [PluginScreenJavaTemplateGen] Skipped because exists: handwritten\java\plugin\topmenu\DocumentationScreen.java 12088 WARN [PluginScreenJavaTemplateGen] Skipped because exists: handwritten\java\plugin\topmenu\RprojectApiScreen.java 12088 WARN [PluginScreenJavaTemplateGen] Skipped because exists: handwritten\java\plugin\topmenu\HttpAPiScreen.java 12088 WARN [PluginScreenJavaTemplateGen] Skipped because exists: handwritten\java\plugin\topmenu\SoapApiScreen.java 12103 INFO [MolgenisServletContextGen] generated WebContent\META-INF\context.xml 12259 INFO [SoapApiGen] generated generated\java\ui\SoapApi.java 12353 INFO [CsvExportGen] generated generated\java\tools\CsvExport.java 12431 INFO [CsvImportByNameGen] generated 
generated\java\tools\CsvImportByName.java
12636 INFO [CopyMemoryToDatabaseGen] generated generated\java\ui\tools\CopyMemoryToDatabase.java
Generate (MOLGENIS + Eclipse) • Generates 150 files, 30k lines of Java, SQL and R code + docs (tomcat/mysql; hsqldb, psql, jpa/hibernate, jetty in alpha)
Step 2: evaluate on paper (goto 1) 1: molgenis_db.xml 2: documentation* *autogenerated • Building on: • FuGE • MAGE-TAB • XGAP • METABASE • PaGE
Step 3: evaluate on data (goto 1) 1: molgenis_db.xml 2: documentation* 3: exchange format* *autogenerated • Tools • Db • CsvImport • CsvExport • constants
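Evaluating the model on data relies on a tabular exchange format with generated import and export tools. The round trip can be sketched with Python's standard `csv` module. The column names and the example rows below are hypothetical; the actual layout of the generated files is not shown on the slides.

```python
import csv
import io

# Hypothetical column layout for a sheet of observed values.
COLUMNS = ["investigation", "target", "feature", "value"]

rows = [
    {"investigation": "Study1", "target": "Mouse1", "feature": "Length(cm)", "value": "9.5"},
    {"investigation": "Study1", "target": "Mouse2", "feature": "Length(cm)", "value": "10.1"},
]


def export_tab(records):
    """Write records as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def import_tab(text):
    """Read tab-delimited text back into a list of dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text), delimiter="\t")]


text = export_tab(rows)
round_trip = import_tab(text)
```

A lossless round trip like this is a cheap check that an exchange format carries everything the model needs.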
Step 1b: add ui model 1: molgenis_db.xml 50 lines of XML code (forms, menus and plugins)
Result = db + ui + services + tools 1: molgenis_db.xml 2: brainstorm doc* 3: exchange format* 4: back and frontend* *autogenerated Demo: http://wwwdev.ebi.ac.uk/microarray-srv/pheno/ Source: https://svn.gene.le.ac.uk/gen2phen/pheno-model
Edit & trace your data • UML documentation of your model • Workflow ready web-services • Import/export to Excel • plugin your own scripts (OntBrowse)
Connect to R statistics: find.investigation() (102 downloaded) • obs<-find.observedvalue( (43,920 downloaded) • #some calculation • add.inferredvalue(res) (36 added)
Tech keywords: object oriented data models, multi-platform java, tomcat/glassfish web server, mysql/postgresql database, Eclipse/Netbeans IDE, Java API, WSDL/SOAP API, R-project API, MVC, freemarker templates and css for custom layout, open source.
Todo: semantic layer 1: molgenis_db.xml 2: brainstorm doc* 3: exchange format* 4: back and frontend* 5: ontology browser *autogenerated • And: • UI for dealing with composite keys • Extend CSV parsers for other formats • Import/Export/Federate wizards to/from data repositories
Model evaluation and refinement Model details after (1) implementation and (2) data loading
Data model overview Core • Investigation • ObservableFeature • ObservationTarget • ObservedValue Feature/Value coding • OntologySource • OntologyTerm • Code • CodeList ObservationTarget subclasses • Individual • Panel ObservedValue subclasses • ObservedRelationship • InferredValue Protocol • Protocol • ProtocolApplication • ProtocolParameter • ParameterValue
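The core entities and the ObservationTarget subclasses can be sketched as Python dataclasses to make the relationships concrete. The class names follow the overview; the attribute names and the example objects are assumptions for illustration and do not reproduce the generated Java types.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Investigation:
    name: str


@dataclass
class ObservableFeature:
    name: str
    investigation: Investigation
    ontology_reference: Optional[str] = None  # optional semantic annotation
    unit: Optional[str] = None


@dataclass
class ObservationTarget:
    name: str
    investigation: Investigation


@dataclass
class Individual(ObservationTarget):
    pass


@dataclass
class Panel(ObservationTarget):
    individuals: List[Individual] = field(default_factory=list)


@dataclass
class ObservedValue:
    investigation: Investigation
    target: ObservationTarget
    feature: ObservableFeature
    value: str


# Hypothetical example objects.
study = Investigation("Study1")
mouse = Individual("Mouse1", study)
panel = Panel("Strain1", study, individuals=[mouse])
length = ObservableFeature("Length(cm)", study, ontology_reference="Length")
obs = ObservedValue(study, mouse, length, "9.5")
```

Because Individual and Panel both subclass ObservationTarget, a value can be attached to a single animal or to a whole group through the same field.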
Evaluation process • Loading: • 102 studies (MPD, Europhenome, Molpage) • 2,042 observable features • 42,939 Individuals • 287 Panels • 196 protocols • 140 ontology terms • Most on mouse, limited on human • Getting descriptions possible, data is hard • Getting ontology annotated data is harder • Recoding to Pheno-OM TAB format is easy
Core model (required) • Targets, Features, Values have ‘names’ (Nameable) • Names are unique per investigation • Optional: use ontologyterm for semantic harmonization
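The rule that names are unique per investigation can be checked with a simple count over (investigation, name) pairs. A minimal sketch, assuming records are plain dictionaries with hypothetical `investigation` and `name` keys:

```python
from collections import Counter


def duplicate_names(records):
    """Return (investigation, name) pairs that occur more than once."""
    counts = Counter((r["investigation"], r["name"]) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)


features = [
    {"investigation": "Study1", "name": "Length(cm)"},
    {"investigation": "Study2", "name": "Length(cm)"},  # fine: other investigation
    {"investigation": "Study1", "name": "Length(cm)"},  # clash within Study1
]

clashes = duplicate_names(features)
print(clashes)
```

The same name may therefore appear in several investigations without conflict; only a repeat inside one investigation is reported.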
ObservableFeature • References ontology terms for unambiguous: • Feature definition • Feature unit
Example: OntologyTerm (Term=Length) • OntologyTerm (Term=UCU/cm) • OntologyTerm (Term=UCU/inch) • ObservableFeature (Name=Length(cm); ontologyReference=Length; Unit=UCU/cm) • ObservableFeature (Name=Length(inch); ontologyReference=Length; Unit=UCU/inch)
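The two Length features show why the ontology reference and the unit are kept separate: the shared reference says the features measure the same thing, and the unit says how to convert. A minimal sketch of that harmonization, using the feature names and units from the example; the observations and the dictionary layout are hypothetical.

```python
# Features as in the example: same ontologyReference, different units.
FEATURES = {
    "Length(cm)": {"ontologyReference": "Length", "unit": "UCU/cm"},
    "Length(inch)": {"ontologyReference": "Length", "unit": "UCU/inch"},
}

# Conversion factors to centimetres (1 inch is exactly 2.54 cm).
TO_CM = {"UCU/cm": 1.0, "UCU/inch": 2.54}


def harmonize(observations, reference="Length"):
    """Collect values for one ontology term, converted to centimetres."""
    result = []
    for target, feature_name, value in observations:
        feature = FEATURES[feature_name]
        if feature["ontologyReference"] == reference:
            result.append((target, round(value * TO_CM[feature["unit"]], 2)))
    return result


# Hypothetical observations from two studies.
observations = [("Mouse1", "Length(cm)", 9.5), ("Mouse2", "Length(inch)", 4.0)]
merged = harmonize(observations)
print(merged)
```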
ObservedValue • time series data • repeated observations • time relative data (‘age’) • Standard codes/values
5x Investigation: redundant? Investigation Trial coordination center use case: Investigation 5: • Reuses ‘standard’ Protocols from Investigation 1 • Reuses ‘standard’ Features defined in Investigation 2 • Reuses Individuals from Investigation 3 • Outsourced ProtocolApplication to Investigation 4 • Has values linked to Investigation 5 (=self) Protocol ObservableFeature ProtocolApplication ObservationTarget ObservedValue
Protocol (optional) Alternative protocols for features • FuGE/MAGE When? How? (by whom?) Groups observations (medicine) CRF sections and forms Possible Variations Deviations (ontologyRef?) Actual Variations Deviations (ontologyRef?)
Value package (extension) Defined pheno type Defined pheno value How collected • - Systolic • Diastolic • Last dinner • Cholesterol Sibling Household High blood pressure High Cholesterol
Target package (extensions) Pedigree (relationship) Cohorts Other groups (can have values too: non-smoking, 65yr males)
Coding package (extension) • CodeList = categorical type of unit = term • Code = ontology term, part of list
PhenX example: CodeList < OntologyTerm (Term=codelist/sex; ontologyReference= …) • ObservableFeature (Name=Sex; ontologyReference=Sex; Unit=codelist/sex) • Code < OntologyTerm (Code=1; Term=Transsexual, m-> f; ontologyReference= …) • ObservedValue (Value=1; ontologyReference=Transsexual, m-> f)
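Decoding a value through its code list can be sketched as two lookups: feature to code list (stored as the feature's unit), then code to term. Only code 1 and its term come from the PhenX example; the dictionary layout and function name are assumptions for illustration.

```python
# Code lists keyed by the term used as the feature's unit.
# Only code "1" appears in the example; other codes are left out on purpose.
CODE_LISTS = {
    "codelist/sex": {"1": "Transsexual, m-> f"},
}

# Feature name -> unit (here: the code list the feature is coded with).
FEATURE_UNITS = {"Sex": "codelist/sex"}


def decode(feature, value):
    """Translate a coded value into its term; return the value if uncoded."""
    code_list = CODE_LISTS.get(FEATURE_UNITS.get(feature, ""), {})
    return code_list.get(str(value), value)


print(decode("Sex", 1))
print(decode("Length(cm)", 9.5))
```

Values of features without a code list pass through unchanged, so the same call works for coded and numeric features.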
Principle is to enable, not constrain • Partial usage • Features without Observations (just exchange designs) • Observations without Protocols (just exchange results) • Harmonization optional • Use locally unique names • Use external application ontology for harmonization • (but harmonization is also possible via reuse of features between investigations) • Local and external ontologies • Only one term per item to prevent ambiguity • Don’t try to be everything to everybody • Semantic maps (synonyms, part of, subclass) outside
Future Integrate, integrate, integrate
Integration on data structure • Did import of BioMart, MPD, MAGE-TAB • Need import/export to main repositories
Integration on components • Other omics phenotypes (XGAP, MAGE, PaGE, Gen2Phen models for genotype/lsdb etc.) • Other formats • MAGE-TAB extension • XGAP extension • Gen2Phen module assembly • dbGaP/EGA pipelines
Integration on components • Other omics phenotypes (XGAP, MAGE, PaGE, Gen2Phen models for genotype/lsdb etc.)
DATA ELEMENT • dimension: rows, columns
SUBJECT • Panel: Name, Type (CSS, RIL..), Parent Panels • INDIVIDUAL: Name, Strain, Mother, Father, Sex • SAMPLE: Name, Individual, Tissue • And so on …
TRAIT • PROBE: Name, Gene, Chromosome, Locus • MARKER: Name, Allele, Chromosome, Locus • MASSPEAK: Name, MZ, RetentionTime • And so on …
1. Data model • Looking at standards and existing data sets • Simple enough for everybody to create • Genotype data: Subjects: STRAINS; Traits: MARKERS; DATA ELEMENTS • TRAIT • SUBJECT
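The subject, trait and data element structure amounts to an annotated matrix: one axis lists subjects, the other lists traits, and each cell is a data element. A minimal sketch for genotype data follows. Placing subjects on rows and traits on columns is an assumption, as are the strain names, marker names and genotype codes.

```python
# Hypothetical genotype matrix: strains (subjects) by markers (traits).
subjects = ["Strain1", "Strain2"]
traits = ["MarkerA", "MarkerB", "MarkerC"]
data = [
    ["A", "B", "A"],  # data elements for Strain1
    ["B", "B", "A"],  # data elements for Strain2
]


def element(subject, trait):
    """Look up one data element by subject and trait name."""
    return data[subjects.index(subject)][traits.index(trait)]


def by_trait(trait):
    """Return all data elements for one trait, keyed by subject."""
    column = traits.index(trait)
    return {subject: row[column] for subject, row in zip(subjects, data)}


print(element("Strain2", "MarkerA"))
print(by_trait("MarkerC"))
```

Swapping the trait list for probes or mass peaks reuses the same structure for expression or mass spectrometry data.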
QTL • GWAS • Genotype • Expression • MassSpec • NMR • Etc. • Extends • FuGE • fuge.sourceforge.net
Edit & trace your data Connect to R statistics Workflow ready web-services UML documentation of your model Import/export to Excel plugin your own scripts (R/QTL) Tech keywords: object oriented data models, multi-platform java, tomcat/glassfish web server, mysql/postgresql database, Eclipse/Netbeans IDE, Java API, WSDL/SOAP API, R-project API, MVC, freemarker templates and css for custom layout, open source. eXtensible Genotype And Phenotype platform http://www.xgap.org
Integration on semantics • Man/mouse mapping ontologies • Integrate data on equivalent features • Usage: R2D server + sparql? • Semantic search/query expansion • Exploit synonyms, part-of, subclass, equivalence • Usage: super search box • Barriers to progress • Lack of cross species mapping ontologies • Lack of public, ontology annotated, human data “Give me all individuals that have deformed hand … across studies … across species”
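Query expansion over synonyms and subclasses can be sketched as a graph walk from the search term, followed by a match against each individual's annotations. The mini ontology, the individuals and their annotations below are entirely hypothetical; a real deployment would draw these relations from external ontologies.

```python
# Hypothetical mini ontology: term -> synonyms and term -> direct subclasses.
SYNONYMS = {"deformed hand": ["abnormal hand morphology"]}
SUBCLASSES = {
    "deformed hand": ["polydactyly", "syndactyly"],
    "polydactyly": ["preaxial polydactyly"],
}


def expand(term):
    """Return the term plus its synonyms and all transitive subclasses."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(SYNONYMS.get(current, []))
        stack.extend(SUBCLASSES.get(current, []))
    return seen


def search(annotations, term):
    """Find individuals annotated with the term or any expanded term."""
    wanted = expand(term)
    return sorted(name for name, terms in annotations.items() if wanted & set(terms))


# Hypothetical annotations across studies and species.
annotations = {
    "Patient7": ["syndactyly"],
    "Mouse12": ["preaxial polydactyly"],
    "Mouse13": ["long tail"],
}
hits = search(annotations, "deformed hand")
print(hits)
```

Cross-species results like these depend on mapping ontologies that relate human and mouse terms, which the slide lists as a barrier.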
Pheno-OM summary • Simple system for phenotype representation • minimal conceptual model, but not too minimal • unambiguous entity naming (so not ‘phenotype’) • works on real data • exchange multiple investigations in one envelope • easy to create/parse/convert-into format • interface/modules that can be adopted by others ~ontology enabled for querying/integration (NOW) Demo and docs: http://wwwdev.ebi.ac.uk/microarray-srv/pheno/ Source: https://svn.gene.le.ac.uk/gen2phen/pheno-model
Example • Questionnaire Q1 (protocol) • Observes low/high blood pressures (features) • Using X blood pressure device (protocol) • On patients 200-300 (persons) • On 14 Sep 2009 (protocol application) • Resulting in values (125/90, 120/80, 120/75) • In mm Hg (unit) • At time = x+5, 10, 15 mins (value.timestamp)
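The blood pressure example can be written out as observed values to show how each reading becomes two rows (high and low pressure) sharing one timestamp. This sketch reads the third reading as 120/75 and the unit as mm Hg. The start time "x", the patient identifier, the feature labels and the record keys are hypothetical.

```python
from datetime import datetime, timedelta

start = datetime(2009, 9, 14, 9, 0)  # "x": hypothetical start time on 14 Sep 2009
readings = ["125/90", "120/80", "120/75"]
offsets_min = [5, 10, 15]

observed_values = []
for reading, minutes in zip(readings, offsets_min):
    systolic, diastolic = (int(part) for part in reading.split("/"))
    timestamp = start + timedelta(minutes=minutes)
    for feature, value in (("Systolic", systolic), ("Diastolic", diastolic)):
        observed_values.append(
            {
                "protocol": "Q1",
                "target": "Patient200",  # one of the patients 200-300
                "feature": feature,
                "value": value,
                "unit": "mm Hg",
                "timestamp": timestamp,
            }
        )

for row in observed_values:
    print(row["timestamp"].strftime("%H:%M"), row["feature"], row["value"], row["unit"])
```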
Implementation • Model file (XML) • Generate: FormGen, TreeGen, MenuGen, PluginGen, MatrixGen, JDBCMapGen, JTypeGen, JReadCsvGen, JListGen, RListGen, JDatabaseGen, RMatrixGen, HSQLGen, WSGen, MySQLGen • Layers: user interaction infrastructure, communication infrastructure, data infrastructure • APIs in Java, R, Web services and HTTP • Customize: MyScript, Plugins