WP4: Data Interoperability and Management
WP3.2: Mobility Brokerage Service
Presentation: Victor Maojo, UPM
GOALS
• To detect existing gaps and opportunities for BMI research on:
  • Ontologies and data models
  • Heterogeneous database integration
  • Security
• To find adequate methods and tools to solve specific issues that might appear in the rest of the WPs.
STATE OF THE ART
WP3.2: Mobility Brokerage Service
SoA document publicly available
ON-GOING WORK FOR WP4
• Phenotype data model
• Development of PML (Polymorphism Markup Language)
• Involvement in developing standards for the systems biology community (graphical notation and ontologies for pathways, at www.sbgn.org)
• Updating ontology language representations to OWL
• Using ontologies for database homogenization
• Privacy-enhanced data storage for the pilots
PHENOTYPE DATA MODEL
HGVbase-G2P 2006-...
[Figure: PHENOTYPE entity. Labels: Unique entry ID; General; Specific; Name; Category; Disease Area; Method; Sub-Phenotype IDs; Free Text; Citations; Keywords; Name; Attribute; Value; Units; Qualifiers]
• ...based upon a core entity-attribute-value (EAV) triplet model, adapted to enable any depth, breadth, or type of phenotype to be represented
• ...envisage a 'dynamic' ontology system
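The EAV idea on this slide can be illustrated with a minimal Python sketch. It is not the HGVbase-G2P schema: the class names, field selection, and sample values below are assumptions made for illustration, borrowing only the field labels shown on the slide (Unique entry ID, Name, Category, Attribute, Value, Units, Qualifiers, Sub-Phenotype IDs).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    """One attribute-value pair attached to a phenotype (the 'AV' of EAV)."""
    name: str
    value: str
    units: Optional[str] = None
    qualifiers: List[str] = field(default_factory=list)


@dataclass
class Phenotype:
    """The entity: a phenotype entry that may nest sub-phenotypes to any depth."""
    entry_id: str
    name: str
    category: str = ""
    attributes: List[Attribute] = field(default_factory=list)
    sub_phenotypes: List["Phenotype"] = field(default_factory=list)

    def triplets(self):
        """Yield (entity ID, attribute name, value) for this entry and all descendants."""
        for attr in self.attributes:
            yield (self.entry_id, attr.name, attr.value)
        for sub in self.sub_phenotypes:
            yield from sub.triplets()


# Illustrative data only; the IDs and values are made up for this example.
blood_pressure = Phenotype("P2", "Blood pressure", attributes=[
    Attribute("systolic", "140", units="mmHg"),
])
hypertension = Phenotype("P1", "Hypertension", category="Disease",
                         attributes=[Attribute("onset", "adult")],
                         sub_phenotypes=[blood_pressure])
rows = list(hypertension.triplets())
```

Because every fact is stored as a triplet, a new kind of measurement needs a new attribute name rather than a new column, which is what lets the model cover any depth or breadth of phenotype.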
ONTOFUSION + Data mining
[Figure: OntoFusion + Preprocessing. Labels: Results; Inconsistencies]
Ontology-based model for database integration and mining
OntoDataClean
Integration at an instance level
• An ontology as a framework to identify inconsistencies:
  • Terminology
  • Scale
  • Format
  • Patterns
  • Missing values
  • …
• Automatic homogenization of databases
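The kinds of inconsistency listed above can be sketched as instance-level cleaning rules. The following Python example is only an illustration and does not reproduce the OntoDataClean implementation: the rule tables, field names (`sex`, `height_cm`, `visit`, `age`), and the choice of column average for missing values are all assumptions.

```python
import re
from statistics import mean

# Hypothetical rules; in an ontology-driven tool these would come from the ontology.
SYNONYMS = {"m": "male", "masc": "male", "f": "female"}   # terminology
DATE_PATTERN = re.compile(r"^(\d{2})/(\d{2})/(\d{4})$")   # format: DD/MM/YYYY


def homogenize(records):
    """Return cleaned copies of the input records."""
    known_ages = [r["age"] for r in records if r.get("age") is not None]
    avg_age = mean(known_ages) if known_ages else None

    cleaned = []
    for rec in records:
        out = dict(rec)
        # Terminology: map synonyms to a preferred name.
        sex = str(out.get("sex", "")).strip().lower()
        out["sex"] = SYNONYMS.get(sex, sex)
        # Scale: convert centimetres to metres.
        if "height_cm" in out:
            out["height_m"] = round(out.pop("height_cm") / 100, 2)
        # Format: rewrite DD/MM/YYYY dates as ISO YYYY-MM-DD.
        match = DATE_PATTERN.match(out.get("visit", ""))
        if match:
            day, month, year = match.groups()
            out["visit"] = f"{year}-{month}-{day}"
        # Missing values: replace with the column average.
        if out.get("age") is None:
            out["age"] = avg_age
        cleaned.append(out)
    return cleaned


sample = [
    {"sex": "M", "height_cm": 180, "visit": "03/02/2006", "age": 40},
    {"sex": "female", "height_cm": 165, "visit": "2006-02-04", "age": None},
]
result = homogenize(sample)
```

Keeping the rules in data tables rather than in code is the point of the sketch: swapping the tables changes the cleaning behaviour without touching the loop.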
PREPROCESSING ONTOLOGY
[Figure: Schema of the OntoDataClean Preprocessing Ontology. Labels as extracted: OntoDataClean Order Source DB Fields Data Source Cleaning Model URL Missing Values Format Scale Pattern Duplicate Rule Data Type Synonym Expression String Regular Expression Synonym Database Values Replacement URL Name Condition Detection Transformation Preferred Name Missing Value Ranges Condition Representative Values Average Column Replacement Representative Values Most Frequently Replacement Condition Value Ranges Row Removal String Replacement]
HOMOGENIZATION RESULTS
• XML
• Tabulated text format
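As a rough illustration of the two output formats named on this slide, the Python sketch below writes the same records as XML and as tab-delimited text using only the standard library. The element names (`records`, `record`) and the sample record are assumptions; the slide does not show the actual output schema.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_xml(records):
    """Serialize a list of flat dicts as <records><record>...</record></records>."""
    root = ET.Element("records")
    for rec in records:
        node = ET.SubElement(root, "record")
        for key, value in rec.items():
            ET.SubElement(node, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_tabulated(records):
    """Serialize the same records as tab-delimited text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]),
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Illustrative record only.
homogenized = [{"sex": "male", "age": 40}]
xml_text = to_xml(homogenized)
tab_text = to_tabulated(homogenized)
```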
RESULTS: BMI papers
Three papers “in press” in major Biomedical Informatics journals (Computers in Biology and Medicine, Journal of Biomedical Informatics, Methods of Information in Medicine)
LINKS TO WP 6.2
• Synonym tool
• Workflow with XML input and output files for supporting systems biology research
• Pathway representation
INFORMA PROPOSAL
[Slide labels: Current; Pending]
Roberto Ricci - Informa
INFOBIOMED: WP4 (DATA INTEROPERABILITY AND MANAGEMENT)
• Proposal of a BMI demonstrator focused on the integration of viral genomics with clinical data in HIV infection
• Create a new public DB by integrating ARCA with genotype-response DBs, host data and other publicly available resources
• Develop a dedicated web site and a platform for data flow and linking
LINKS WITH 6.3 (ACTA)
[Figure: data flow with numbered steps (1) to (6). Labels: Data; Information (Clinical, Genetic, Environment); DWH; XML files; DATA MINING (Results); Visualization; External Databases (SNPs, etc.)]
SUMMARY: EFFORTS WITH PILOTS
[Figure: WP4 and pilots 6.2, 6.3, 6.4. Labels: Proposed after 1st review; End of 2006 (2007?)]