FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot. Brand Niemann (US EPA), Chair, Semantic Interoperability Community of Practice (SICoP) Best Practices Committee (BPC), CIO Council December 28, 2005 http://web-services.gov/ and http://colab.cim3.net/cgi-bin/wiki.pl?SICoP
FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot Brand Niemann (US EPA), Chair, Semantic Interoperability Community of Practice (SICoP) Best Practices Committee (BPC), CIO Council December 28, 2005 http://web-services.gov/ and http://colab.cim3.net/cgi-bin/wiki.pl?SICoP http://colab.cim3.net/cgi-bin/wiki.pl?DRMImplementationThroughIterationandTestingPilotProjects
Preface • Conceptual Data Model – a model to guide data architecture and not a model to guide database development. • But an ontology provides both and the pilot is both a CDM and an executable application based on DRM 2.0! • So Data Architecture can be implemented in ontology-driven information systems. • See next slide.
Preface • Ontology-Driven Information Systems: • Methodology Side – the adoption of a highly interdisciplinary approach: • Analyze the structure at a high level of generality. • Formulate a clear and rigorous vocabulary. • Architectural Side – the central role in the main components of an information system: • Information resources. • User interfaces. • Application programs. See for example: Nicola Guarino, Formal Ontology and Information Systems, Proceedings of FOIS ’98, Trento, Italy, 6-8 June 1998.
Preface • Health Information Technology CoP’s Health Information Technology Ontology Project (HITOP)* Major Objectives and Examples: • 1. Ontology-graph assisted search of medical literature: SemanTx Life Sciences Pilot. • 2. Ontology in major health standards development: Barry Smith - HL7 RIM. • 3. Ontology in the FHA Data Architecture Work Group: Brand Niemann – DRM 2.0 Pilot. • 4. Ontologies in bioinformatics: Ken Baclawski – Book and Keynote Presentation. • 5. Ontologies in operational clinical systems: Mark Musen – Stanford Medical Informatics. • 6. Ontologies in large-scale medical research systems: Connor Skankey – Visual Knowledge BioCAD. * Marc Wine, GSA Office of Intergovernmental Solutions, CoP Lead
Preface • Question: Should the FHA DAWG be overly focused on metadata? • Metadata and data are integrated together in DRM 2.0 and the pilot. • Question: Should FHA DAWG work with unstructured or semi-structured data or defer this task to partners/agencies? • All three types of data are integrated together in DRM 2.0 and the pilot. • Question: Should FHA DAWG also add physical data modeling to methodology? • The DRM ITIT Pilot shows how both conceptual and physical data are done together with ontologies. • Question: Should educational material on metadata and data modeling be present in the Data Strategy? • DRM 2.0 put educational material in the DRM Reference Model and ITIT Wiki Pages, not the Reference Model Document itself.
Preface • Question: Should we align more closely to FEA DRM? • Aligning with DRM 2.0 adds credibility to the work, and the pilot specifically demonstrates the three components of DRM 2.0. • Question: How detailed a level of analysis can be performed by the FHA DAWG? • This depends on the level of detailed data and information that the FHA partners are willing to expose, e.g. the pilot uses summary data that is in the public domain. • Question: Does the FHA DAWG analyze only (discover) or does it prescribe a solution (recommendation) like semantic harmonization scenarios? • SICoP and DRM ITIT are concerned with achieving semantic harmonization and interoperability. E.g., the suggestion to include the CHI vocabularies in the pilot should be implemented.
Overview • 1. The New Data Reference Model 2.0 • 2. Health, United States, 2005 • 3. Data Architecture Working Group • 4. Pilot Project • Appendix. Other Related Work
1. The New Data Reference Model 2.0 • The FEA framework and its five supporting reference models (Performance, Business, Service, Technical and Data) are now used by departments and agencies in developing their budgets and setting strategic goals. With the recent release of the Data Reference Model (DRM), the FEA will be the “common language” for diverse agencies to use while communicating with each other and with state and local governments seeking to collaborate on common solutions and sharing information for improved services. Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf
1. The New Data Reference Model 2.0 • The following chart illustrates the potential uses of the newly released DRM Version 2.0: • The FEA mechanism for identifying what data the Federal government has and how it can be shared in response to a business/mission requirement. • The frame of reference to facilitate Communities of Interest (which will be aligned with the Lines of Business) toward common ground and common language to facilitate improved information sharing. • Guidance for implementing repeatable processes for sharing data Government-wide. Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf
1. The New Data Reference Model 2.0 Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf
1. The New Data Reference Model 2.0 Paradigm Shifts • From FEA Reference Model Taxonomies to FEA Reference Model Ontology. • From FEA “Common Language” to FEA Semantic Model. • From DRM 1.0 by committee to DRM 2.0 by open, collaborative process. • From implementation after development to implementation through iteration and testing during development.
1. The New Data Reference Model 2.0 • Original FEA Lines of Business (6): • Data and Statistics: • Opted out because of FedStats, Federal Committee on Statistical Methodology, etc. (it had its act together for statistical data management) • Now it’s back with: • The new Data Reference Model 2.0 because statistical programs generally have the best data and metadata and data management practices. • The National Infrastructure for Community Statistics Community of Practice (NICS CoP) • The Federal Health Architecture Data Architecture Working Group because FHA agencies are statistical agencies: • See for example Health, United States, 2005 from the National Center for Health Statistics!
1. The New Data Reference Model 2.0 Relationships and associations • Metamodel: Precise definitions of constructs and rules needed for abstraction, generalization, and semantic models. • Model: Relationships between the data and its metadata. • Metadata: Data about the data. • Data: Facts or figures from which conclusions can be inferred. Source: Professor Andreas Tolk, August 16, 2005. The purpose of this schematic is to show that we need to describe information model relationships and associations in a way that can be accessed and searched.
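The four layers on this slide can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class names, the `relate` and `search` methods, and the sample strings are hypothetical and are not part of DRM 2.0 or the pilot. It shows a metamodel that restricts which constructs a model may use, and a model whose data-to-metadata relationships can be searched.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Metamodel:
    """Constructs (relationship kinds) that any model is allowed to use."""
    constructs: set[str]


@dataclass
class Model:
    """Relationships between data items and their metadata."""
    metamodel: Metamodel
    relationships: list[tuple[str, str, str]] = field(default_factory=list)

    def relate(self, subject: str, construct: str, obj: str) -> None:
        # The metamodel decides which relationship kinds are legal.
        if construct not in self.metamodel.constructs:
            raise ValueError(f"construct not defined in metamodel: {construct}")
        self.relationships.append((subject, construct, obj))

    def search(self, term: str) -> list[tuple[str, str, str]]:
        # Case-insensitive substring match over every part of each relationship.
        term = term.lower()
        return [r for r in self.relationships
                if any(term in part.lower() for part in r)]


# Hypothetical example values, used only to exercise the sketch.
model = Model(Metamodel({"has_source", "has_definition"}))
model.relate("example table", "has_source", "example survey")
model.relate("example table", "has_definition", "example term")
print(model.search("survey"))
```

Because relationships are stored as explicit records rather than implied by layout, they can be accessed and searched, which is the point the schematic makes.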
1. The New Data Reference Model 2.0 The point of this graph is that Increasing Metadata (from glossaries to ontologies) is highly correlated with Increasing Search Capability (from discovery to reasoning).
1. The New Data Reference Model 2.0 • Five Key Activities Over the Next Year: • 1. Education and Training in DRM Version 2.0 and use in FEA – DRM-based Information Sharing Pilots (started June 13th). • 2. Testing of XML Schemas and OWL Ontologies by NIST and the National Center for Ontological Research, respectively, among others (began October 27th). • 3. Inventory/Repository of Semantic Interoperability Assets and Development of a Common Semantic Model (COSMO) by the new Ontology and Taxonomy Coordinating Work Group (ONTACWG) (started October 5th). • 4. Continued early implementation of DRM 2.0 concepts and artifacts by industry in “open collaboration with open standards” pilot projects and workshops (started July 19th). E.g. FHA/DAWG. • 5. Fostering champions of DRM Best Practices to improve (1) data architectures within agencies and (2) data sharing across agencies in funded projects (started June 13th).
1. The New Data Reference Model 2.0 Where is SICoP DRM Implementation Going? Super Pilot: Address as Many Boxes as Possible! • CoP: Community of Practice • LoB: Line of Business • EAAF: OMB Enterprise Architecture Assessment Framework 2.0 • FHA/DAWG: Federal Health Architecture – Data Architecture Working Group
1. The New Data Reference Model 2.0 • December 5-7, 2005, Knowledge Management Collaboration & Knowledge Sharing Conference, Orlando, Florida: • Using CoPs To Simplify Processes and Unify Work Across Agencies: Cross-Industry Applications: • Semantically Enabled Content (Wiki Purple Numbers, Ontology Modeling Before Content is Created-e.g., SiberLogic, Repurposed Content, etc.) • December 13, 2005, Invited Presentation to the Federal Metadata Management Consortium (FMMC): • SICoP and DRM Implementation Through Iteration and Testing Work: Making It Real: • Semantic Knowledge Modeling and a Knowledge Reference Model for Implementing the Semantic Web in the Federal Government.
2. Health, United States, 2005 • 156 tables in Excel plus 37 tables in Excel for figures • Metadata (multiple levels and types) • For tables • Sources of data • Data stories • Definitions - 194 • Repurpose this excellent content and model and map it to the DRM 2.0.
2. Health, United States, 2005 http://www.cdc.gov/nchs/hus.htm
3. Data Architecture Working Group Source: FHA Data Strategy, DRAFT V1.0, December 28, 2005. CDM: Conceptual Data Model.
3. Data Architecture Working Group • The scope of the Data Architecture Working Group is to help partner agencies to ensure that the FHA and its partners have a comprehensive and accurate view of the data needs of the FHA and to collect, store, and access the metadata in a consistent way. This charter extends to all Federal Departments whose mission is to provide and/or support the delivery of health care services that have been recommended and accepted. The Data Architecture Work Group also focuses on health metadata collection, analysis, and planning activities that are supported by the FHA Partner Council. The DAWG, as it pursues its data architecture objectives, will coordinate these activities with the other established workgroups of the FHA. Source: FHA Data Architecture Working Group Initial Kickoff Meeting, December 13, 2005.
4. Pilot Project • DRM 2.0: • Description (slides 24 and 27): • Metadata (Title, Data, Notes, and Sources) • Data Story • Definitions and Methods • Context: • Taxonomy and Search (slides 25-26) • Sharing: • Separation of Presentation and Data (slides 28-29)
4. Pilot Project This Data Architecture Provides the Three S’s: Structure, Searchability, and Semantics. See http://web-services.gov and Dynamic Knowledge Repositories
4. Pilot Project • Query of HUS 2005 Taxonomy Nodes • Federated Search of All FHA Taxonomy Nodes • See next slide for explanation.
4. Pilot Project • Query of HUS 2005 Taxonomy Nodes: • This is the Expert Search Form interface in the web browser. The (1) left pane shows the hierarchical table of contents, where the document(s) and subsections to search are selected. The (2) right pane holds the box for the search query terms (“IDC codes”), the number of words of context desired around the highlighted search terms (none), the search execution button, and the query syntax explanation. • Federated Search of All FHA Taxonomy Nodes: • This uses the same form as the query above, except that a different set of boxes is checked in the (1) left pane (the entire FHA Node), and a different query (“data architecture”) and number of context words around the highlighted search terms (five) are used in the (2) right pane.
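The two searches described on this slide differ only in which taxonomy nodes are checked, the query, and the number of context words. A minimal sketch of that behavior follows; it is not the pilot's search engine. The function name `search_nodes`, the node paths, and the sample text are all hypothetical, and the phrase matching is deliberately simple.

```python
def search_nodes(nodes, selected, query, context_words=0):
    """Return (node_path, snippet) pairs for each phrase match.

    nodes: dict mapping a taxonomy node path to its text.
    selected: node paths checked in the left pane; a checked node
        includes all of its subsections.
    context_words: words to show on each side of the matched phrase.
    """
    query_tokens = query.lower().split()
    if not query_tokens:
        return []
    n = len(query_tokens)
    hits = []
    for path, text in nodes.items():
        # Keep only nodes at or below one of the checked nodes.
        if not any(path == s or path.startswith(s + "/") for s in selected):
            continue
        words = text.split()
        # Compare on lowercased words with surrounding punctuation removed.
        lowered = [w.lower().strip('.,;:()"') for w in words]
        for i in range(len(words) - n + 1):
            if lowered[i:i + n] == query_tokens:
                start = max(0, i - context_words)
                end = min(len(words), i + n + context_words)
                hits.append((path, " ".join(words[start:end])))
    return hits


# Hypothetical node paths and text, for illustration only.
nodes = {
    "FHA/HUS2005/Appendix": "Definitions and methods are listed in this appendix.",
    "FHA/DataStrategy": "The data architecture guides partner agencies.",
}
# Federated search: the whole FHA node is checked.
print(search_nodes(nodes, ["FHA"], "data architecture", context_words=1))
# Narrow search: only the HUS 2005 node is checked.
print(search_nodes(nodes, ["FHA/HUS2005"], "data architecture"))
```

Treating a checked node as a path prefix is one simple way to make "the entire FHA Node" include every document beneath it.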
4. Pilot Project • Recall Slide 8. Figure labels: Data Story, Metamodel, Model, Metadata, Data. • Note: The table can be highlighted, copied, and pasted into a spreadsheet because of the XML markup.
4. Pilot Project Separation of the Data Presentation from the Data & Metadata. Data & Metadata (see next slide) Data Presentation/ Visualization http://web-services.gov/statabs2003no1.htm
4. Pilot Project The Data & Metadata Travel Together in XML Format! Data & Metadata in XML http://web-services.gov/statabs2003no1.htm
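One way to picture data and metadata traveling together in XML, with presentation kept separate, is sketched below. This is an illustration under stated assumptions, not the pilot's actual markup: the element names (`table`, `metadata`, `data`, `row`, `cell`), the helper functions, and the sample values are hypothetical. Only Python's standard `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET


def build_table(title, source, notes, columns, rows):
    """Build one XML element that carries both the metadata and the data."""
    table = ET.Element("table")
    meta = ET.SubElement(table, "metadata")
    ET.SubElement(meta, "title").text = title
    ET.SubElement(meta, "source").text = source
    ET.SubElement(meta, "notes").text = notes
    data = ET.SubElement(table, "data")
    for row in rows:
        row_el = ET.SubElement(data, "row")
        for name, value in zip(columns, row):
            ET.SubElement(row_el, "cell", {"column": name}).text = str(value)
    return table


def to_tsv(table):
    """One possible presentation: tab-separated text for a spreadsheet."""
    rows = table.findall("./data/row")
    if not rows:
        return ""
    header = [cell.get("column") for cell in rows[0]]
    lines = ["\t".join(header)]
    for row in rows:
        lines.append("\t".join(cell.text or "" for cell in row))
    return "\n".join(lines)


# Placeholder values, not figures from any real table.
table = build_table("Example table", "Example source", "Placeholder note",
                    ["Year", "Value"], [[2001, 1], [2002, 2]])
xml_text = ET.tostring(table, encoding="unicode")

# Any consumer of the XML receives the metadata along with the data.
received = ET.fromstring(xml_text)
print(received.findtext("metadata/title"))
print(to_tsv(received))
```

The tab-separated view is just one rendering; a chart or an HTML table could be generated from the same XML without touching the stored data or metadata.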
4. Pilot Project • Federal Health Architecture Data Architecture Ontology Metamodel: • Use Lines of Business/Business Reference Model to Define the “Upper Ontology”: • See slide 31. • Use Data Elements to Define the Domain-Specific Ontology: • See slide 32.
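The two-layer approach on this slide can be sketched as a small hierarchy: upper concepts taken from Lines of Business, with data elements attached beneath them as domain-specific concepts. The sketch below is a plain Python illustration, not OWL and not the pilot's ontology. The `Ontology` class, its methods, and the data element names are hypothetical; "Services for Citizens" and "Health" are used only as example upper concepts.

```python
class Ontology:
    """Upper concepts from Lines of Business, domain concepts from data elements."""

    def __init__(self):
        self.parent = {}  # concept -> parent concept (None for a root)
        self.layer = {}   # concept -> "upper" or "domain"

    def add_upper(self, concept, parent=None):
        # Upper concepts may only sit beneath other upper concepts.
        if parent is not None and self.layer.get(parent) != "upper":
            raise KeyError(f"unknown upper concept: {parent}")
        self.parent[concept] = parent
        self.layer[concept] = "upper"

    def add_domain(self, data_element, upper_concept):
        # Every data element must attach to an existing upper concept.
        if self.layer.get(upper_concept) != "upper":
            raise KeyError(f"unknown upper concept: {upper_concept}")
        self.parent[data_element] = upper_concept
        self.layer[data_element] = "domain"

    def ancestors(self, concept):
        """Concepts above `concept`, nearest first."""
        chain = []
        current = self.parent[concept]
        while current is not None:
            chain.append(current)
            current = self.parent[current]
        return chain

    def data_elements_under(self, upper_concept):
        """All data elements that fall anywhere beneath an upper concept."""
        return sorted(c for c, layer in self.layer.items()
                      if layer == "domain" and upper_concept in self.ancestors(c))


# Illustrative content; the data element names are made up.
onto = Ontology()
onto.add_upper("Services for Citizens")
onto.add_upper("Health", parent="Services for Citizens")
onto.add_domain("ExampleDataElementA", "Health")
onto.add_domain("ExampleDataElementB", "Health")
print(onto.ancestors("ExampleDataElementA"))
print(onto.data_elements_under("Services for Citizens"))
```

Requiring each data element to attach to an existing upper concept keeps the domain-specific layer anchored to the business-level structure, which is the design idea the slide describes.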
Appendix. Other Related Work • Building an Ontology of the National Health Information Network (NHIN): Status Report: • http://web-services.gov/nhinrfiontology04052005.ppt • http://ontolog.cim3.net/cgi-bin/wiki.pl?NhinRfi • http://ontolog.cim3.net/cgi-bin/wiki.pl?HealthOntologyMapping • Collaborative Expedition Workshops (examples): • December 9, 2004, Standard Vocabularies in Health Care, Kathy Lesh, Kevric. • July 19, 2005, Building a Hospital Incident Reporting Ontology (HIRO) in the Web Ontology Language (OWL) Using the JCAHO Patient Safety Event Taxonomy (PSET), Liju Fan, Kevric, et al. • December 6, 2005, Introduction to the Semantic Web for Bioinformatics, Ken Baclawski, Northeastern University. • December 6, 2005, Boston Children's Hospital "smart search" and Semantic UMLS Ontology-based Professional Language Processing PubMed Search, Michael Belanger, President, SemanTx Life Sciences. • See http://colab.cim3.net/cgi-bin/wiki.pl?ExpeditionWorkshop
Appendix. Other Related Work • Health Information Technology Ontology Project (HITOP): • New CoP Led by Marc Wine, GSA Office of Intergovernmental Solutions: • Develop a roadmap on the state-of-the-art use of ontology tools to achieve semantic interoperability for high priority health IT applications involving clinical decision support systems (DSS) and electronic health records (EHRs). • See http://colab.cim3.net/cgi-bin/wiki.pl?HealthInformationTechnologyCommunityofPractice • Fourth Semantic Interoperability for E-Government Conference, February 9-10, 2006, Work Group Reports: • Featured Presentation: Barry Smith, HL7 RIM • See http://colab.cim3.net/cgi-bin/wiki.pl?FourthSemanticInteroperabilityforEGovernmentConference_2006_2_0910