Introduction to Interoperability in the Cancer Biomedical Informatics Grid™ Peter A. Covitz, Ph.D. Chief Operating Officer National Cancer Institute Center for Bioinformatics Filename/RPS Number
caBIG™ Introductory Seminars: March 2006
• Topics
• caBIG™ Overview – March 13
• Overview of caBIG™ Activities for Clinical Trials and Tissue Banking – March 15
• Overview of caBIG™ Activities for Integrated Cancer Research – March 16
• caBIG™ Interoperability and Compatibility Basics – March 17
• https://cabig.nci.nih.gov/seminars
• http://videocast.nih.gov/
National Cancer Institute 2015 Goal: Eliminate suffering and death due to cancer by the year 2015
Origins of caBIG™
• Need: Enable investigators and research teams nationwide to combine and leverage their findings and expertise in order to meet the NCI 2015 Goal.
• Strategy: Create a scalable, actively managed organization that will connect members of the NCI-supported cancer enterprise by building a biomedical informatics network.
Scenario from caBIG™ Strategic Plan
A researcher involved in a phase II clinical trial of a new targeted therapeutic for brain tumors observes that cancers derived from one specific tissue progenitor appear to be strongly affected. The trial has been generating proteomic and microarray data. The researcher would like to identify potential biochemical and signaling pathways that might be different between this cell type and other potential progenitors in cancer, deduce whether anything similar has been observed in other clinical trials involving agents known to affect these specific pathways, and identify any studies in model organisms involving tissues with similar pathway activity.
Interoperability: the ability of a system to access and use the parts or equipment of another system
• Syntactic interoperability
• Semantic interoperability
caBIG™ Compatibility Guidelines [Diagram: three areas labeled SEMANTIC and one labeled SYNTACTIC]
Model-Driven Architecture
MDA Approach
• Analyze the problem space and develop the artifacts for each scenario
• Use Cases
• Use Unified Modeling Language (UML) to standardize model representations and artifacts. Design the system by developing artifacts based on the use cases
• Class Diagram – Information Model
• Sequence Diagram – Temporal Behavior
• Use meta-model tools to generate the code
Limitations of MDA
• Limited expressivity for semantics
• No facility for runtime semantic metadata management
MDA plus a whole lot more! caCORE
caCORE [Diagram components: Bioinformatics Objects, Common Data Elements, Enterprise Vocabulary, Security]
Bioinformatics Objects
caBIG™ UML Models Completed and in the Works at Cancer Centers for Silver Systems
• mzXML: mass spec proteomics data
• scanFeatures: Proteomics
• AML: Proteomics
• statml: Statistical markup model
• CAP: College of American Pathologists protocols for Breast, Lung, Prostate
• GoMiner: Text mining tool for GO
• caTISSUE: Tissue banking
• protLIMS: Laboratory Information Management System for proteomics
• BRIDG: Clinical Trials
• caBIO: General bioinformatics
• caDSR: ISO11179 metadata
• EVS: Vocabulary
• caMOD: Cancer Models
• MAGE 1.2: Microarray data
• CSM: Security
• Common: Provenance, DBxrefs
• caTIES: Pathology reports
• gridPIR: Protein Information
Cancer Data Standards Repository (caDSR)
• ISO/IEC 11179 Registry for Common Data Elements (CDEs) – units of semantic metadata
• Client for Enterprise Vocabulary: metadata constructed from controlled terminology and annotated with concept codes
• Precise specification of Classes, Attributes, Data Types, Permissible Values: strong typing of data objects
• Tools:
• UML Loader: automatically registers UML models as metadata components
• CDE Curation: fine-tune metadata and constrain permissible values with data standards
• Form Builder: create standards-based data collection forms
• CDE Browser: search and export metadata components
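To make "strong typing of data objects" concrete, here is a minimal Python sketch of a data element that carries a concept code, a data type, and a set of permissible values. It is not the caDSR data model or API: the class `CommonDataElement`, the element `tumorGrade`, its values, and the placeholder code `X0001` are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    """Hypothetical, simplified stand-in for a registered data element."""
    name: str
    concept_code: str                            # code from a controlled terminology
    data_type: type                              # expected Python type of a value
    permissible_values: frozenset = frozenset()  # empty set means "unconstrained"

    def validate(self, value) -> bool:
        """Return True if the value has the declared type and is permitted."""
        if not isinstance(value, self.data_type):
            return False
        return not self.permissible_values or value in self.permissible_values


# Illustrative element; the name, code, and values are assumptions.
tumor_grade = CommonDataElement(
    name="tumorGrade",
    concept_code="X0001",  # placeholder, not a real concept code
    data_type=str,
    permissible_values=frozenset({"G1", "G2", "G3"}),
)

print(tumor_grade.validate("G2"))  # True: right type, permitted value
print(tumor_grade.validate("G9"))  # False: value not in the permitted set
print(tumor_grade.validate(2))     # False: wrong data type
```

Because the constraints live in the metadata rather than in each application, any tool that reads the same element definition can apply the same checks.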
Enterprise Vocabulary (Description Logic)
• Concept Code
• Relationships
• Preferred Name
• Definition
• Synonyms
Semantic metadata example: Agent
<Agent>
  <name>Taxol</name>
  <nSCNumber>007</nSCNumber>
</Agent>
Why do you need metadata?
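Computable Interoperability
• My model: class Agent (concept code C1708) with attributes name, nSCNumber, CTEPName, FDAIndID, IUPACName
• Your model: class Drug (concept code C1708) with attributes id, NDCCode, approvalDate, approver, fdaCode
• Concept code C1708:C41243 appears once in each model (in the diagram, next to nSCNumber and fdaCode)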
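The idea on this slide can be sketched in a few lines of Python: two models use different names, but a program can line them up through shared concept codes. This is only an illustration under stated assumptions. The slide's layout suggests that C1708:C41243 annotates `nSCNumber` in one model and `fdaCode` in the other, but it does not state this outright; the functions `shared_attributes` and `translate` are hypothetical helpers, not part of caCORE.

```python
# Attribute name -> concept code (None where the slide shows no annotation).
# The assignment of C1708:C41243 to nSCNumber and fdaCode is an assumption
# read off the diagram layout.
AGENT_ATTRIBUTES = {
    "name": None,
    "nSCNumber": "C1708:C41243",
    "CTEPName": None,
    "FDAIndID": None,
    "IUPACName": None,
}
DRUG_ATTRIBUTES = {
    "id": None,
    "NDCCode": None,
    "approvalDate": None,
    "approver": None,
    "fdaCode": "C1708:C41243",
}


def shared_attributes(model_a, model_b):
    """Pair attributes of two models that carry the same concept code."""
    by_code = {code: attr for attr, code in model_b.items() if code is not None}
    return {attr: by_code[code] for attr, code in model_a.items() if code in by_code}


def translate(record, mapping):
    """Rename the fields of a record that have a counterpart in the other model."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


mapping = shared_attributes(AGENT_ATTRIBUTES, DRUG_ATTRIBUTES)
print(mapping)                                                   # {'nSCNumber': 'fdaCode'}
print(translate({"name": "Taxol", "nSCNumber": "007"}, mapping)) # {'fdaCode': '007'}
```

Attributes without a shared code are dropped rather than guessed at, which is the point of the slide: only what is annotated is computably interoperable.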
caCORE Architecture
• Clients: HTTP Clients, SOAP Clients, Perl Clients, Java Applications
• Middleware: Web Application Server; Interfaces (Java, SOAP, and XML APIs); Domain Objects [Gene, Disease, Agent, etc.]; Data Access Objects; Authorization
• Data: Biomedical Data, Common Data Elements, Enterprise Vocabulary
caCORE Software Development Kit
caCORE SDK Components
• UML Modeling Tool (any with XMI export)
• Semantic Connector (concept binding utility)
• UML Loader (model registration in caDSR)
• Codegen (middleware code generator)
• Security Adaptor (Common Security Module)
caCORE SDK Generates a caBIG Silver-Compliant System
From Silver to Gold: caGrid
[Diagram labels: Silver (seven systems), Gold, NCI, Cancer Center (five), Other caBIG Service Providers, Other Toolkits]
Use cases not satisfied by caCORE alone
• Advertisement
• Service provider composes service metadata describing the service and publishes it to the grid.
• Discovery
• Researcher (or application developer) specifies search criteria describing a service of interest.
• The researcher submits the discovery request to a discovery service, which identifies the services matching the criteria and returns the list.
• Invocation
• Researcher (or application developer) instantiates the grid service and accesses its resources.
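The three use cases above can be sketched as a small in-memory registry in Python. This is a toy model of the advertise, discover, invoke flow, not the caGrid or Globus API: `ServiceMetadata`, `DiscoveryService`, `lookup_agents`, and the use of a concept code as the search criterion are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ServiceMetadata:
    """Hypothetical description a provider publishes about its service."""
    name: str
    concept_codes: frozenset          # concepts the service's data objects cover
    invoke: Callable[[str], list]     # stand-in for a remote service endpoint


class DiscoveryService:
    """Toy registry: holds advertisements and answers discovery requests."""

    def __init__(self):
        self._advertisements = []

    def advertise(self, metadata: ServiceMetadata) -> None:
        # Advertisement: the provider publishes its service metadata.
        self._advertisements.append(metadata)

    def discover(self, concept_code: str) -> list:
        # Discovery: return every advertised service matching the criterion.
        return [m for m in self._advertisements if concept_code in m.concept_codes]


def lookup_agents(name: str) -> list:
    """Canned stand-in for a real data service; returns illustrative data."""
    return [{"name": name, "nSCNumber": "007"}]


registry = DiscoveryService()
registry.advertise(ServiceMetadata("AgentLookup", frozenset({"C1708"}), lookup_agents))

matches = registry.discover("C1708")
result = matches[0].invoke("Taxol")   # Invocation: call the discovered service
print(matches[0].name, result)
```

In a real grid the registry and the services run on different hosts, so `invoke` would be a network call and discovery would be a query against published metadata rather than a list scan.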
caGrid Service-Oriented Architecture (OGSA compliant)
• Function labels in the diagram: Management, Schema Management, Metadata Management, ID Resolution, Workflow, Security, Resource Management, Service Registry, Service, Service Description, Grid Communication Protocol, Transport
• Technology labels in the diagram: Mobius, Globus, BPEL, GRAM, myProxy, OGSA-DAI, Globus Toolkit, GSI, CAS, caCORE
Grid Services
• Two types of top-level grid services defined
• Data Services: Respond to queries and return caBIG-compatible data objects
• Analytical Services: Accept and process caBIG data objects, then return results that are also caBIG data objects.
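The difference between the two service types is mainly in their inputs: a data service takes a query, while an analytical service takes data objects. A minimal Python sketch of that contract follows. The classes `Agent`, `AgentCount`, `AgentDataService`, and `CountingAnalyticalService` are hypothetical and stand in for caBIG-compatible objects and grid services; they do not reproduce any caGrid interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Stand-in for a caBIG-compatible data object (illustrative fields)."""
    name: str
    nsc_number: str


@dataclass(frozen=True)
class AgentCount:
    """Stand-in for an analysis result that is itself a data object."""
    count: int


class AgentDataService:
    """Data service: responds to a query and returns data objects."""

    def __init__(self, agents):
        self._agents = list(agents)

    def query(self, name_prefix: str) -> list:
        return [a for a in self._agents if a.name.startswith(name_prefix)]


class CountingAnalyticalService:
    """Analytical service: accepts data objects and returns a data object."""

    def process(self, agents: list) -> AgentCount:
        return AgentCount(count=len(agents))


data_service = AgentDataService([Agent("Taxol", "007"), Agent("Tamoxifen", "008")])
hits = data_service.query("Ta")
summary = CountingAnalyticalService().process(hits)
print(summary)  # AgentCount(count=2)
```

Because both kinds of service exchange the same object types, the output of a data service can be passed straight into an analytical service, as the last three lines show.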
Test Bed Infrastructure: caGrid 0.5 Test Bed
How can my research benefit from caBIG™ Tools?
• Everything developed by the program is open source and freely available
• Training is available at https://cabig.nci.nih.gov/training
• The latest versions of all the software developed as part of the project can be obtained from the caBIG™ CVS site: http://cabigcvs.nci.nih.gov/viewcvs/viewcvs.cgi/
• Commercial-grade documentation is provided as part of the project and will be located at the project GForge site: http://gforge.nci.nih.gov
How can I get support for these tools?
NCICB Applications Support will coordinate support for caBIG™ tools:
• Live Support: Monday – Friday 8 am – 8 pm Eastern Time
• Telephone support is available Monday to Friday, 8 am – 8 pm Eastern Time, excluding government holidays.
• You may leave a message, send an email or submit a support request via the Web at any time.
• Email: ncicb@pop.nci.nih.gov
• Phone: 301-451-4384
• Toll-free: 888-478-4423
• Web: http://ncicbsupport.nci.nih.gov
caBIG™: Getting Involved
• To get involved with caBIG™:
• Track caBIG™ activities on the caBIG™ website, https://cabig.nci.nih.gov/
• Attend the caBIG™ Annual Meeting, April 9-11, 2006, Hyatt Regency Crystal City, Arlington, Virginia
• Learn about the existing bioinformatics infrastructure, caCORE, at http://ncicb.nci.nih.gov/NCICB/infrastructure
• Download currently available caBIG™ tools from the caBIG™ website at https://cabig.nci.nih.gov/inventory
• Sign up for the caBIG™ mailing list at http://list.nih.gov/archives/cabig_announce.html
From Village to City
Acknowledgements
• NCI
• The whole institute is behind caBIG!
• NCI Center for Bioinformatics
• The staff and contractors at the NCICB are managing the daily program and creating many of the tools described here.
• http://ncicb.nci.nih.gov
Acknowledgements – Current caGrid Team
• Ohio State University Cancer Center
• Providing Primary Technical Leadership, under Joel Saltz
• SAIC
• Booz-Allen-Hamilton
• University of Chicago Cancer Center/Argonne NL
• Georgetown/Lombardi Cancer Center
• caBIG Architecture Workspace Members
For more information, contact
• Peter Covitz, Ph.D.
• Chief Operating Officer
• NCI Center for Bioinformatics
• National Cancer Institute
• National Institutes of Health, US DHHS
• 6116 Executive Blvd. - # 705
• Rockville, MD 20852
• 301-451-4385
• covitzp@mail.nih.gov