Gold Compatibility Criteria and Review Process

Gold Compatibility Criteria and Review Process. Robert Freimuth, Salvatore Mungal, Scott Oster, Lynne Wilkens, Daniela Smith, Michael Keller. caBIG Annual Meeting, Washington, D.C., June 25, 2008.


Presentation Transcript


  1. Gold Compatibility Criteria and Review Process • Robert Freimuth, Salvatore Mungal, Scott Oster, Lynne Wilkens, Daniela Smith, Michael Keller • caBIG Annual Meeting, Washington, D.C. • June 25, 2008

  2. Agenda • Introduction • Overview of Gold Criteria • Information Models • Common Data Elements • Vocabularies • Program Interfaces • Summary • Q&A

  3. Working Groups • Programming & Messaging Interfaces: Tahsin Kurc, OSU; Patrick McConnell, Duke University; Scott Oster, OSU; Andy Pople, University of Pittsburgh • Common Data Elements: Dianne Reeves, NCI CBIIT; Baris Suzek, Georgetown; Lynne Wilkens, University of Hawaii • Information Models: Bob Freimuth, Mayo Clinic; Lewis Frey, University of Utah; Rakesh Nagarajan, Washington University • Vocabularies: Jim Buntrock, Mayo Clinic; Sal Mungal, Duke University; Craig Stancl, Mayo Clinic; Stuart Turner, UC-Davis; Larry Wright, NCI/OC • Working Group Facilitators: Brian Davis, 3rd Millennium; Michael Keller, BAH; Daniela Smith, BAH • NCICB Facilitators: George Komatsoulis; Avinash Shanbhag

  4. Introduction: Levels of Maturity • Legacy • System does not meet any of the requirements for interoperability • Implies no interoperability with an external system or resource • Bronze • Minimum requirements for a basic degree of interoperability • Silver • Rigorous set of requirements that significantly reduce the barrier to use of a resource by a remote party who was not involved in the development of that resource • Gold • Full semantic interoperability of disparate systems • Formalized grid architecture and data standards • Advertising, discovery, and use of all federated caBIG resources

  5. Introduction • The need to incorporate additional criteria into the “Gold” maturity level of the caBIG™ Compatibility Guidelines is being driven by: • The release of the caGrid 1.0 Infrastructure • The experience of the caGrid 1.0 reference implementation projects • The experience of early adopters of caGrid 1.0 • The work done on UML model harmonization • The work done on vocabulary standards • The work done on CDE re-use

  6. Introduction: Compatibility Guidelines v3.0 • Released May 1, 2008 • Gold Compatibility requirements • Information Models • Data Elements • Vocabularies/Terminologies & Ontologies • Programming and Messaging Interfaces • Clarification/revision of Silver Compatibility requirements • Available through the caBIG web site: https://cabig.nci.nih.gov

  7. Compatibility Guidelines • https://cabig.nci.nih.gov

  8. Introduction: Development of Gold Criteria • Goal • Development and release of the Gold compatibility criteria and review process • Approach • Kick off Working Group composed of XCWS participants (Jan 2008) • Sub-groups focused on: Interfaces, CDEs, Vocabularies and Information Models • Each sub-group generated review criteria based on Compatibility Guidelines v3.0 • Sub-Group Leads are identifying overlap and harmonizing the checklists (June) • The Sub-Group Leads will also develop a review process (June) • The review process and review criteria will be piloted (July) • Lessons learned will be captured • Review criteria and process will be updated as needed

  9. Agenda • Introduction • Overview of Gold Criteria • Information Models • Common Data Elements • Vocabularies • Program Interfaces • Summary • Q&A

  10. Information Models: From Silver to Gold • Silver level • Criteria for UML modeling • Criteria for semantic annotation • Gold level • Criteria to meet infrastructure requirements • Criteria to promote interoperability

  11. Information Models: Criteria for Semantic Annotation • Criteria for semantic annotation are the same as for Silver level • UML names should accurately convey their meaning and be consistent with the definition. • This might be upgraded to "absolute" (TBD) • UML definitions should accurately describe what they represent and be sufficiently clear to enable accurate semantic annotation. • The concepts assigned to each class and attribute must be synonymous (consistent) with the developer-derived UML definition.

  12. Information Models: Overview of Criteria • Criteria to meet infrastructure requirements (caGrid) • The entire IM must be fully represented in the XML schema • Names for classes and attributes (TBD) • Criteria to promote interoperability • Model harmonization and reuse • Analytical services: emphasis on reuse of whole classes for input objects, output objects, and parameters • Data services: emphasis on reuse of components from the backbone model

  13. Information Models: Overview of Criteria • Criteria for reuse • Classes and attributes • Focus on the backbone model • Including associations, if applicable • Datatypes • Reused, or caBIG-approved • Enumerated value domains • Included in the model • Modeled same way, if reused • Reused components must be appropriate in the context of the information model and accurately capture the semantic meaning of the underlying data.

  14. Information Models: Overview of Criteria • Preferred order of reuse*: • CDEs from the Backbone Model • Standard CDEs that are not in the Backbone Model • CDEs from existing Gold level applications • CDEs from existing Silver level applications • CDEs registered in the caDSR * For the Information Models criteria, "CDE" refers to the corresponding class/attribute pair in the UML model
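
The preferred order of reuse on slide 14 can be read as a simple ranking over candidate sources. The following is a minimal sketch of that reading, not caBIG tooling: the source labels, the `CandidateCDE` type, and the `pick_preferred` function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Reuse sources in the preferred order given on the slide (earlier = preferred).
# The string labels are illustrative assumptions, not caDSR terminology.
REUSE_ORDER = [
    "backbone_model",
    "standard_cde",
    "gold_application",
    "silver_application",
    "cadsr_registered",
]


@dataclass(frozen=True)
class CandidateCDE:
    name: str    # hypothetical label for a class/attribute pair
    source: str  # expected to be one of REUSE_ORDER


def pick_preferred(candidates: Iterable[CandidateCDE]) -> Optional[CandidateCDE]:
    """Return the candidate from the most preferred source, or None if none qualify."""
    ranked = [c for c in candidates if c.source in REUSE_ORDER]
    if not ranked:
        return None
    return min(ranked, key=lambda c: REUSE_ORDER.index(c.source))


if __name__ == "__main__":
    options = [
        CandidateCDE("Person.birthDate (silver app)", "silver_application"),
        CandidateCDE("Person.birthDate (backbone)", "backbone_model"),
    ]
    print(pick_preferred(options))
```

A candidate whose source is not in the list is ignored rather than ranked last, which matches the slide only under the assumption that every reusable CDE falls into one of the five categories.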

  15. Information Models: Overview of Criteria • Full reuse of classes and attributes from the backbone model is required • Justification and examples of interoperability are required otherwise • If a new CDE is required, partial reuse should be maximized • See order of preference • Justification and examples of interoperability are required in some cases • Developers will contribute to the evolution of the backbone model* • Expansion with new attributes • Extension with new classes/attributes • Requires data to be a specialization of an existing class in the backbone model • Justification is required if similar, existing classes in the backbone model cannot be expanded or extended * Contributions will occur indirectly and through a controlled process

  16. Information Models: Changes to the Submission Package • CDE reuse report • Breakdown by the source of each component • UML model reuse report • Identifies reuse of and deviations from the backbone model • Includes a list of "whole class" reuse for analytical services • Examples of interoperability (grid joins) • Illustrates use cases for how the application will interoperate with existing applications • Link to the XML schema

  17. Information Models: Issues for Future Discussion • Review the need for additional criteria • Content of enumerated VDs (PVs) • Non-enumerated VDs (number ranges, string character limits, etc.) • Currently there is no way to represent this in the UML model • Semantics of associations • List of approved datatypes • Use of approved datatypes is required at Gold level • Evolution of the backbone model • When the backbone model is extended, do the new attributes belong in the base class or in a child class? • Process for versioning the backbone model • Clarify the requirements for existing Gold applications when the backbone model is revised

  18. Information Models: Resource Requirements • Tooling needs • Enumerated VDs • Should be included in the model under review, and should also map to an existing VD in the caDSR • This will require two models, one for review and one for loading • Non-enumerated VDs (number ranges, string character limits, etc.) • Include this information as tagged values? • Creation of reports for the submission package • CDE reuse report, UML model reuse report, vocabulary report, etc. • Correspondence checks • UML model = XMI = caDSR = API = API docs = XML schema • Process requirements • May require more engagement by mentors to ensure that the backbone model is considered for reuse early in development
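
The correspondence check listed under tooling needs (UML model = XMI = caDSR = API = API docs = XML schema) could be automated along the following lines. This is a minimal sketch that assumes each artifact has already been reduced to a set of "Class.attribute" names; how those names would be extracted from each artifact is out of scope here, and the function name and artifact labels are hypothetical.

```python
from typing import Dict, Iterable, List


def correspondence_report(artifacts: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Map each artifact to the sorted names it lacks relative to the union of all artifacts.

    Artifacts that contain every name are omitted, so an empty result
    means all artifacts describe the same set of classes and attributes.
    """
    name_sets = {label: set(names) for label, names in artifacts.items()}
    all_names = set().union(*name_sets.values())
    report = {}
    for label, names in name_sets.items():
        missing = sorted(all_names - names)
        if missing:
            report[label] = missing
    return report


if __name__ == "__main__":
    # Illustrative data only; the names below are invented for the example.
    example = {
        "uml_model": {"Gene.symbol", "Gene.name"},
        "xml_schema": {"Gene.symbol"},
        "api": {"Gene.symbol", "Gene.name"},
    }
    print(correspondence_report(example))
```

Comparing names alone is the weakest form of the check; datatypes, associations, and definitions would need their own comparisons.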

  19. Agenda • Introduction • Overview of Gold Criteria • Information Models • Common Data Elements • Vocabularies • Program Interfaces • Summary • Q&A

  20. Common Data Elements • Basis of CDE criteria • Elements must be well-formed and defined in order for data to be joined and shared • Silver level compatibility criteria • Ensured that ISO 11179 criteria are met • Pairing of data element concept (Object + Property) and Value Domain • Registration in caDSR, an ISO 11179 CDE repository • Required that all administered elements have good semantics • Names are appropriate • Definitions exist and are clear • NCI Thesaurus codes in caDSR • Consistency between UML model and caDSR definitions • Gold level compatibility criteria • Focus on re-use of CDE standards and existing CDEs to facilitate interoperability

  21. Common Data Elements • CDE Criteria in Gold Compatibility Matrix • CDEs designated as caBIG Standards by the VCDE workspace must be used as appropriate. • CDEs generated from the Backbone Model must be re-used as appropriate. • Existing validated CDEs in the caDSR must be re-used or otherwise justified before any new data elements are created. • Data elements must be expressed in caGrid standard metadata format • The data elements used by the service as part of its operations must be fully described in the caGrid metadata to facilitate effective discovery, advertisement and interoperability.

  22. Common Data Elements • caBIG™ Standards by the VCDE workspace • About 20 CDE packages including 140 CDEs have been approved by VCDE (https://gforge.nci.nih.gov/frs/?group_id=109) • Registration status on caDSR of Standard • Examples: sex/gender, age, address, performance status • Re-use is absolute requirement • Full re-use requires use of all administered elements: object/class, property/attribute, value domain, permissible values • Violation requires strong justification (regulatory requirements) • caBIG™ Backbone Model • CDEs in Backbone Model will be registered in caDSR • Will be promoted as standards • Re-use is absolute requirement • Validated CDEs • Work flow status of Released in caDSR • Provide list of those reviewed • Provide explanation of lack of re-use

  23. Common Data Elements • Other Re-use Issues • For CDEs that would logically have a list of permissible values (PVs), enumeration or enumeration by reference is required • List of PVs with definitions must be available to facilitate re-use • For newly created CDEs, partial re-use of administered elements of standards is encouraged to facilitate interoperability • Data Element Concept re-use (Object/class + Attribute/Property) • Value Domain re-use
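
The distinction between full and partial re-use drawn on slides 22 and 23 can be sketched as a comparison of administered elements. The sketch below assumes a simplified data element with four fields (object class, property, value domain, permissible values); the `DataElement` type, the `classify_reuse` function, and the returned labels are hypothetical and do not reflect the caDSR data model or API.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DataElement:
    object_class: str                   # e.g. the UML class
    property: str                       # e.g. the UML attribute
    value_domain: str                   # name of the value domain
    permissible_values: FrozenSet[str]  # enumerated values, empty if non-enumerated


def classify_reuse(candidate: DataElement, standard: DataElement) -> str:
    """Classify how much of a standard CDE a candidate re-uses."""
    # Data element concept = object class + property.
    same_dec = (
        candidate.object_class == standard.object_class
        and candidate.property == standard.property
    )
    # Value domain re-use also requires the same permissible values.
    same_vd = (
        candidate.value_domain == standard.value_domain
        and candidate.permissible_values == standard.permissible_values
    )
    if same_dec and same_vd:
        return "full"
    if same_dec:
        return "partial: data element concept"
    if same_vd:
        return "partial: value domain"
    return "none"
```

Under this reading, "full" corresponds to the absolute requirement for standards and backbone CDEs, while the two partial outcomes correspond to the Data Element Concept and Value Domain re-use encouraged for newly created CDEs.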

  24. Common Data Elements • caGrid service metadata • XML document must be provided • Concept codes for object, properties, value domain and permissible values in XML document must agree with caDSR mapping • Requirement is absolute
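
The agreement requirement on slide 24 amounts to comparing two sets of concept codes per data element. The sketch below illustrates that comparison under loud assumptions: the XML layout (`<element name="...">` containing `<concept code="..."/>`) is invented for the example and is not the caGrid service metadata schema, the registry side is a plain dictionary standing in for a caDSR lookup, and the codes are placeholders.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Set, Tuple


def concept_codes_from_metadata(xml_text: str) -> Dict[str, Set[str]]:
    """Collect concept codes per element from a hypothetical metadata document."""
    root = ET.fromstring(xml_text)
    codes: Dict[str, Set[str]] = {}
    for element in root.iter("element"):
        codes[element.get("name")] = {c.get("code") for c in element.iter("concept")}
    return codes


def find_mismatches(
    metadata_codes: Dict[str, Set[str]],
    registry_codes: Dict[str, Set[str]],
) -> Dict[str, Tuple[Set[str], Set[str]]]:
    """Return, per element name, the (metadata, registry) code sets that disagree."""
    mismatches = {}
    for name in sorted(set(metadata_codes) | set(registry_codes)):
        in_metadata = metadata_codes.get(name, set())
        in_registry = registry_codes.get(name, set())
        if in_metadata != in_registry:
            mismatches[name] = (in_metadata, in_registry)
    return mismatches


if __name__ == "__main__":
    sample = (
        '<metadata><element name="Person.gender">'
        '<concept code="CODE-A"/><concept code="CODE-B"/>'
        "</element></metadata>"
    )
    registry = {"Person.gender": {"CODE-A"}}  # stand-in for the caDSR mapping
    print(find_mismatches(concept_codes_from_metadata(sample), registry))
```

Because the requirement is absolute, an empty result is the only passing outcome; an element present on one side and absent on the other is reported as a mismatch against an empty set.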

  25. Common Data Elements • Tooling needs • Tooling to compare concept codes between service metadata, UML model and caDSR • Report of use of concept codes in application under review within caDSR • Report of use of similar concepts based on name • Helpful if caDSR could identify “like” CDEs • In particular, CDEs with the same attributes except that object class of person is substituted with patient or participant

  26. Agenda • Introduction • Overview of Gold Criteria • Information Models • Common Data Elements • Vocabularies • Program Interfaces • Summary • Q&A

  27. Vocabularies • Vocabulary Criteria will map to: • Full adoption of caBIG vocabulary standards as approved by the VCDE workspace. • Concept identification in systems must use the caBIG Identifier and Resolution Scheme • Metadata of vocabularies must be accessed through a standard caGrid Vocabulary API • Vocabularies must be discovered through a standard caGrid Vocabulary API

  28. Vocabularies • Questions/Challenges to Address: • Will many vocabularies really satisfy our vocabulary review criteria for caBIG™ Vocabulary standards? • How can we develop review criteria around Concept IDs and their resolution on caGrid as this is still under development? • What is the relationship of vocabulary standardization process to the Silver/Gold Compatibility Guidelines?

  29. Changes in the Compatibility Guidelines: Silver Vocabulary and Ontologies Matrix • Differentiated from bronze and silver where all data collection fields and attributes of data objects are approved by caBIG™ VCDE Workspace • Vocabularies used in data elements should be compatible with caBIG Identifier and Resolution Scheme • Approved vocabularies will provide a minimum set of core metadata • Approved vocabularies will be classified based on scope, intent, and purpose

  30. Changes in the Compatibility Guidelines: Silver Vocabulary and Ontologies Text • Updated to reflect review checklist • Vocabularies/Ontologies will be assessed via LexEVS on the grid (LexBIG and EVS will merge under the name of LexEVS) • Added description of caBIG Identifier Scheme for semantic classes • Added description of the caBIG Identifier Resolution Scheme for resolving identifiers

  31. Changes in the Compatibility Guidelines: Gold Vocabulary and Ontologies Matrix • Differentiated from Silver based on usage of common identifier scheme and common vocabulary API • Full adoption of approved caBIG vocabulary standards • Vocabularies will utilize the caBIG Identifier and Resolution Scheme • Vocabularies will be accessible through a standard vocabulary API • Compatible systems will reference standard vocabularies approved for use by Gold systems

  32. Changes in the Compatibility Guidelines: Gold Vocabulary and Ontologies Text • Added detail on Gold compatibility • Approved caBIG vocabulary standards are enabled • Registered terminologies approved as caBIG standards for caBIG usage are accessed via terminology metadata and discovered through a caGrid vocabulary service (caGrid Vocabulary API) • Vocabulary is accessible through a standard caGrid vocabulary API • The current caGrid Vocabulary API is EVS. LexBIG and EVS will merge under the name of LexEVS

  33. Gold Vocabularies • Tooling needs • Tooling to confirm correspondence between concept IDs, names, and/or definitions in the Information Model with the source terminology. • Tooling to confirm correspondence between concept IDs, names, and/or definitions in the Service Metadata with the source terminology. • These tools will not be implemented until the Resolution Scheme is fully implemented

  34. Changes in the Compatibility Guidelines: Gold Vocabulary and Ontologies Text • Recap: Gold vocabularies and ontologies are: • Accessed and discovered via caGrid services • Provided with a standard set of metadata • Mapped and implemented with caBIG Identifier and Resolution Scheme • Classified based on scope, purpose and intent • Creation of tools needed to help the vocabulary review process

  35. Agenda • Introduction • Overview of Gold Criteria • Information Models • Common Data Elements • Vocabularies • Program Interfaces • Summary • Q&A

  36. Interfaces • Interface Criteria to map to: • APIs are exposed as operations of a Grid service; Object-Oriented client APIs are available for invoking those operations • Service operations use XML as data exchange format, and are invoked using standardized protocols and communication channels • Services provide public access to caGrid standardized service metadata and have capability to register it with a caGrid Index Service • Data-oriented services provide query access using the caGrid standardized query interface and language • Secure services must use the caGrid standardized mechanisms for authentication, trust management, and communication channel protection • Questions/Challenges to Address: • What is the distinction between reviewing a Silver API vs. a Gold API? • Tooling for tedious “consistency checking” of various artifacts • How will schemas in the GME be mapped to UML models?

  37. Changes in the Compatibility Guidelines: Gold API (Grid Services) • The primary change for Gold APIs is the move to grid services and APIs, where data is transported over the grid as well-defined XML • Tooling exists to make the development experience very similar to any existing Silver API which is client/server based • However, Gold compliance additionally requires: • Adherence to several standards and specifications • Standardized approaches to metadata and security • Specific (additional) constraints for data query capabilities • Review process will focus on checking for standards compliance and consistency between existing artifacts (UML models, APIs, etc.) and new grid-specific artifacts (WSDLs, XSDs, service metadata, etc.)

  38. Changes in the Compatibility Guidelines: Gold API (Metadata) • Gold compliance introduces the concept of “service metadata” to all systems which are exposed as grid services • Provides programmatic runtime access to metadata about the API, information model, CDEs, and vocabulary • Tooling exists to automate the development experience such that most information is extracted from existing systems (e.g. caDSR) and metadata is created automatically • Review process will focus on checking for existence, syntax compliance, accuracy, registration, and consistency of metadata

  39. Changes in the Compatibility Guidelines: Gold API (Security) • Gold compliance unifies standards, technologies, and methodologies for authentication, authorization, message transport, and trust • Built upon X.509, HTTPS, and web/grid service standards • Tooling exists to simplify accrual and use of credentials, management of trust, and service security configuration • Review process will focus on appropriate use of the authentication process, integration with the caBIG trust fabric, and use of standards and technologies for transport

  40. Changes in the Compatibility Guidelines: Gold API (Applications) • Gold compliant applications are expected to correctly leverage (secure) grid services, make use of the discoverable nature of the grid, present data using registered semantics, and build on existing tooling/languages/APIs when possible • Many high-level APIs and frameworks exist for application developers to leverage • Review process will focus on use of the grid APIs and tools, presentation of data, and integration with security infrastructure

  41. Changes in the Compatibility Guidelines: Gold API Review Challenges • Gold compliance introduces many new artifacts (grid service, metadata, WSDL, XSDs, etc.) • Must be checked for consistency with each other and existing Silver artifacts (UML models, APIs, etc.) • Magnitude of items to check practically requires automation • Review process is complicated if existing Silver review has been performed, or if both Silver and Gold compliance are eventually sought • Large systems with many components seeking reviews may have fuzzy boundaries (e.g. an application consisting of a UI and 1 or more grid services) • Some criteria are most ideally realized by emerging technology or software in development (e.g. caDSR/GME binding, identifiers, distributed vocabulary services, etc.) • Gold criteria groups have more overlap/dependencies than Silver criteria

  42. Agenda • Introduction • Overview of Gold Criteria • Information Models • Common Data Elements • Vocabularies • Program Interfaces • Summary • Q&A

  43. Summary: Gold Criteria • Information Models • Model harmonization and reuse • UML model represented in the XML schema • CDEs • Reuse of standards • Concept codes in the XML document • Vocabularies • Use of vocabulary standards • caBIG Identifier and Resolution Scheme • Program Interfaces • Grid services and security • Service metadata

  44. Next Steps • Kick-off Working Groups • Finalize individual checklists • Sub-Group Leads harmonize checklists • Sub-Group Leads develop review process (based on Silver) • Working Group signs off on harmonized checklists and process • Pilot review process with gridPIR (July 2008) • Develop lessons learned • Modify review criteria and process as needed • Present to Architecture and VCDE WS for review and approval

  45. Agenda • Introduction • Overview of Gold Criteria • Information Models • Common Data Elements • Vocabularies • Program Interfaces • Summary • Q&A
