
Tong Yu HCLS 2008

Semantic Graph Mining for Biomedical Network Analysis: A Case Study in Traditional Chinese Medicine. Tong Yu HCLS 2008. Content. The TCM Semantic Web Semantic Query and Search Portal Semantic Graph Mining Methods TCM Use Cases


Presentation Transcript


  1. Semantic Graph Mining for Biomedical Network Analysis: A Case Study in Traditional Chinese Medicine Tong Yu HCLS 2008

  2. Content • The TCM Semantic Web • Semantic Query and Search Portal • Semantic Graph Mining Methods • TCM Use Cases • Key Benefits of Using Semantic Web Technologies for TCM domain

  3. TCM Informatics: A cross-cultural and interdisciplinary endeavor • TCM Informatics aims at the computerization of TCM information and knowledge to provide intelligent resources for clinical decision-making, drug discovery, and education. • TCM Informatics is essentially an interdisciplinary endeavor involving Chinese Culture, Healthcare and Life Sciences, and Information Technology. • The cross-cultural and interdisciplinary nature of TCM Informatics requires data from interrelated domains to be connected and shared.

  4. Approach Overview • We intend to connect the knowledge systems of TCM and biomedicine to facilitate cross-cultural information retrieval and data analysis. • Engineer an ontology for the TCM domain. • Make associations between the TCM Ontology and the Western Medicine Ontology. • The Semantic Web integrates structured data from the territories of TCM (left) and Western Medicine (right). • The Semantic Mediator maps relational schemas into the domain ontology by defining Semantic Views. • The Query-Rewriting Engine translates a SPARQL query into a series of SQL queries based on mapping rules. • Supports a variety of Web-based applications.
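The query-rewriting step can be illustrated with a minimal sketch. It assumes mapping rules that tie one ontology property to one relational table; the property names (`tcm:contains`, `tcm:treats`), table names, and column names are hypothetical and do not come from the slides, and the actual engine is likely far richer.

```python
import sqlite3

# Hypothetical mapping rules: ontology property -> (table, subject column, object column).
MAPPING = {
    "tcm:contains": ("formula_herb", "formula", "herb"),
    "tcm:treats": ("formula_syndrome", "formula", "syndrome"),
}


def rewrite(pattern):
    """Translate one triple pattern (s, p, o) into SQL text plus parameters.

    Terms starting with '?' are variables; other terms become WHERE filters.
    """
    s, p, o = pattern
    table, s_col, o_col = MAPPING[p]
    where, params = [], []
    for col, term in ((s_col, s), (o_col, o)):
        if not term.startswith("?"):
            where.append(f"{col} = ?")
            params.append(term)
    sql = f"SELECT {s_col}, {o_col} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


# Toy relational source standing in for a legacy TCM database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE formula_herb (formula TEXT, herb TEXT)")
conn.executemany(
    "INSERT INTO formula_herb VALUES (?, ?)",
    [("FGD", "Glycyrrhizae"), ("FGD", "Atractylodis")],
)

sql, params = rewrite(("FGD", "tcm:contains", "?herb"))
rows = conn.execute(sql, params).fetchall()
```

A full SPARQL query would decompose into several such patterns, each rewritten against its own table and then joined on shared variables.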

  5. The Architecture of The TCM Semantic Web

  6. The TCM Semantic Web portal

  7. A Semantic Graph Model for TCM Domain • The Semantic Graph Model can connect data from different TCM data sources while preserving the provenance of the data. • We use the TCM Ontology to integrate data about EMR, Formulae & Drugs, and Diseases, and to connect the TCM data with orthodox medicine data, e.g. UMLS and Gene Ontology.

  8. Interactive Mining of TCM Knowledge • The Spora System performs interactive knowledge discovery experiments on the Semantic Web. • Semantic Graph Mining: the semantic graph mining algorithms (importance calculation, frequent pattern discovery, clustering, etc.) are implemented as generic operators that work on top of the Semantic Web layer and query semantic graph models in SPARQL. • KDD Experiments: users create an Experiment by specifying a knowledge discovery process as a tree of operators with customizable properties, then execute the process and review the results rendered as interactive tables, histograms, etc.
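The "tree of operators" idea can be sketched as follows. This is an illustration only, not the Spora API: the `Operator` class, its fields, and the three example operators are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Operator:
    """Hypothetical operator node: runs its children first, then itself."""
    name: str
    run: Callable[[List[Any], Dict[str, Any]], Any]
    properties: Dict[str, Any] = field(default_factory=dict)
    children: List["Operator"] = field(default_factory=list)

    def execute(self) -> Any:
        inputs = [child.execute() for child in self.children]
        return self.run(inputs, self.properties)


def load_triples(_inputs, props):
    # Leaf operator: returns the statements given as a property.
    return props["triples"]


def count_in_degree(inputs, _props):
    # Counts how often each resource appears as the object of a statement.
    counts = {}
    for _s, _p, o in inputs[0]:
        counts[o] = counts.get(o, 0) + 1
    return counts


def top_k(inputs, props):
    # Ranks resources by count (ties broken by name) and keeps the first k.
    ranked = sorted(inputs[0].items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[: props["k"]]


load = Operator("load", load_triples, {"triples": [
    ("prescription1", "prescribes", "FGD"),
    ("prescription2", "prescribes", "FGD"),
    ("FGD", "contains", "Glycyrrhizae"),
]})
importance = Operator("importance", count_in_degree, children=[load])
experiment = Operator("top", top_k, {"k": 1}, [importance])

result = experiment.execute()
```

Changing a property such as `k` and re-running `execute()` mirrors the customizable, repeatable experiments the slide describes.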

  9. Semantic Graph Resource Importance • The in-degree centrality CI of a resource is the weighted sum of the statements that have the resource as object; the out-degree centrality is the weighted sum of the statements that have the resource as subject.
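A minimal sketch of the weighted degree centralities, assuming each statement is weighted by its predicate. The triples and the weight values below are invented for illustration; the slides do not say how weights are assigned.

```python
from collections import defaultdict

# Hypothetical statements (subject, predicate, object).
TRIPLES = [
    ("prescription1", "prescribes", "FGD"),
    ("prescription2", "prescribes", "FGD"),
    ("FGD", "contains", "Glycyrrhizae"),
    ("FGD", "treats", "KYD"),
]

# Assumed per-predicate weights; unknown predicates default to 1.0.
WEIGHTS = {"prescribes": 1.0, "contains": 0.5, "treats": 2.0}


def degree_centrality(triples, weights):
    """Return (in_degree, out_degree) dicts of weighted statement sums."""
    in_c = defaultdict(float)
    out_c = defaultdict(float)
    for s, p, o in triples:
        w = weights.get(p, 1.0)
        out_c[s] += w  # resource appears as subject
        in_c[o] += w   # resource appears as object
    return dict(in_c), dict(out_c)


in_c, out_c = degree_centrality(TRIPLES, WEIGHTS)
```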

  10. Semantic Graph Resource Importance • The Closeness Centrality of a resource r is defined as the inverse of the sum of the distances from r to all other resources.
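The definition can be sketched with a breadth-first search. The sketch assumes that statements are treated as undirected, unweighted edges and that unreachable resources are simply ignored; neither choice is stated in the slides, and the edge list is a toy example.

```python
from collections import deque

# Toy edges derived from statements, ignoring direction (an assumption).
EDGES = [("prescription1", "FGD"), ("FGD", "Glycyrrhizae"), ("FGD", "KYD")]


def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def closeness(adj, r):
    """Inverse of the summed shortest-path distances from r."""
    dist = {r: 0}
    queue = deque([r])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return 1.0 / total if total else 0.0


ADJ = build_adjacency(EDGES)
```

Here `closeness(ADJ, "FGD")` is 1/3 because FGD is one step from each of the other three resources, while a peripheral resource such as `prescription1` scores 1/5.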

  11. Semantic Graph Resource Importance • The Betweenness Centrality of a resource r is defined as the ratio of shortest paths in the graph that pass through r.
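One way to read this definition is sketched below: for every connected pair of other resources, take the fraction of their shortest paths that pass through r, then average over the pairs. The undirected treatment, the averaging, and the toy edge list are assumptions of this sketch rather than details given in the slides.

```python
from collections import deque
from itertools import combinations

# Toy edges derived from statements, ignoring direction (an assumption).
EDGES = [("prescription1", "FGD"), ("FGD", "Glycyrrhizae"), ("FGD", "KYD")]


def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_counts(adj, source):
    """Shortest distances from source and the number of shortest paths."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                sigma[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                sigma[nxt] += sigma[node]
    return dist, sigma


def betweenness(adj, r):
    """Average share of shortest paths between other pairs that cross r."""
    info = {n: bfs_counts(adj, n) for n in adj}
    dist_r, sigma_r = info[r]
    through, pairs = 0.0, 0
    others = [n for n in adj if n != r]
    for s, t in combinations(others, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # no path between s and t
        pairs += 1
        if r in dist_s and dist_s[r] + dist_r[t] == dist_s[t]:
            through += sigma_s[r] * sigma_r[t] / sigma_s[t]
    return through / pairs if pairs else 0.0


ADJ = build_adjacency(EDGES)
```

This brute-force version is fine for small graphs; for large semantic graphs an accumulation scheme such as Brandes' algorithm would be the usual choice.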

  12. Semantic Associations • pathAssociated • <prescription1 prescribes the TCM Formula FGD> AND <Formula FGD contains the Herb Glycyrrhizae>, • So that: • <prescription1 & Glycyrrhizae are pathAssociated>. • joinAssociated • <prescription1 prescribes the Formula FGD> AND <prescription1 treats the TCM Syndrome KYD>, • So that: • <FGD & KYD are joinAssociated> with prescription1 as the join point. • classPathAssociated • <Glycyrrhizae is of type Herb> AND <Atractylodis is of type Drug> AND <Herb is a subclass of Drug>, • So that: • <Glycyrrhizae & Atractylodis are classPathAssociated>.
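The three association types can be sketched over a small set of triples built from the slide's examples. The predicate names (`prescribes`, `contains`, `treats`, `type`, `subClassOf`) and the function names are hypothetical, and the checks are simplified readings of the definitions rather than the author's algorithms.

```python
TRIPLES = [
    ("prescription1", "prescribes", "FGD"),
    ("FGD", "contains", "Glycyrrhizae"),
    ("prescription1", "treats", "KYD"),
    ("Glycyrrhizae", "type", "Herb"),
    ("Atractylodis", "type", "Drug"),
    ("Herb", "subClassOf", "Drug"),
]
SCHEMA = frozenset({"type", "subClassOf"})  # schema-level predicates


def reachable(triples, start, allowed=None, excluded=frozenset()):
    """Resources reachable from start along directed statements."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for s, p, o in triples:
            if s != node or o in seen:
                continue
            if p in excluded or (allowed is not None and p not in allowed):
                continue
            seen.add(o)
            stack.append(o)
    return seen


def path_associated(triples, x, y):
    # A directed chain of instance-level statements leads from x to y.
    return y in reachable(triples, x, excluded=SCHEMA)


def join_points(triples, x, y):
    # Resources from which both x and y can be reached.
    subjects = {s for s, _, _ in triples}
    return {j for j in subjects
            if j not in (x, y)
            and {x, y} <= reachable(triples, j, excluded=SCHEMA)}


def classes_of(triples, x):
    # Direct types of x plus all of their superclasses.
    direct = {o for s, p, o in triples if s == x and p == "type"}
    result = set(direct)
    for cls in direct:
        result |= reachable(triples, cls, allowed={"subClassOf"})
    return result


def class_path_associated(triples, x, y):
    # x and y share a class once the subclass hierarchy is followed.
    return bool(classes_of(triples, x) & classes_of(triples, y))
```

With these triples, `join_points(TRIPLES, "FGD", "KYD")` returns `{"prescription1"}`, matching the join point named on the slide.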

  13. Frequent Semantic Subgraph

  14. Frequent Semantic Subgraph

  15. Pattern Interpretation • Discovered patterns can be annotated with domain knowledge based on semantic associations of concepts, and visualized as a rich graph to facilitate human interpretation. Here semantic search is used to discover latent semantic associations of concepts.

  16. Pattern Interpretation • This example pattern including four herbs and two drug efficacies, is interpreted by the fact that the formula FGD composed of these herbs has these two drug efficacies.

  17. The Semantic Network of herb-drug interactions • The TCM domain involves a complex network of drug interactions. • We use Traditional Chinese Medicine (TCM) information resources to map an extensive view of herb-drug interactions. • This network is mapped through semantic integration of legacy relational databases in the TCM domain. • Domain experts use this network to rank topologically important herbs/drugs, to retrieve semantic associations between drugs, and to discern interesting patterns such as frequent sub-graphs and community structures.

  18. The Semantic Network of herb-drug interactions: The process • Data Modeling: represent domain knowledge and facts in named semantic graphs. • Data Transformation & Integration: translate structured or semi-structured data into semantic web languages. • Entity Disambiguation: 1344 CVD-associated herbs are identified. • Interaction Identification: the nature and frequency of interactions between all pairs of drugs are discovered through semantic association. A global semantic graph of diverse drug interactions is generated by inserting a statement for every interaction. • Clustering: drug communities are discovered through semantic graph clustering.
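The last two steps of this process can be sketched under strong simplifying assumptions. Here an "interaction" is approximated by two drugs co-occurring in the same record, and "clustering" is replaced by connected components over frequent interactions; both are stand-ins for the semantic association and semantic graph clustering named on the slide, and the records and drug names are placeholders.

```python
from collections import Counter
from itertools import combinations

# Placeholder records: each set lists drugs that occur together.
RECORDS = [
    {"A", "B"},
    {"A", "B", "C"},
    {"D", "E"},
    {"D", "E"},
    {"C", "D"},
]


def interaction_counts(records):
    """Count how often each unordered pair of drugs occurs together."""
    counts = Counter()
    for drugs in records:
        for a, b in combinations(sorted(drugs), 2):
            counts[(a, b)] += 1
    return counts


def communities(counts, min_freq):
    """Connected components of the graph of frequent interactions."""
    adj = {}
    for (a, b), n in counts.items():
        if n >= min_freq:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adj[node] - group)
        seen |= group
        groups.append(group)
    return groups


COUNTS = interaction_counts(RECORDS)
GROUPS = communities(COUNTS, min_freq=2)
```

Raising `min_freq` keeps only the most frequent interactions, which is one simple way to obtain a network of frequent interactions before looking for communities.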

  19. Global network of frequent herb-drug interactions, with drugs represented by nodes with size/font proportional to degree, interactions represented by edges, and drug communities represented by colors.

  20. Key Benefits of Using Semantic Web Technology • Exposes legacy data through a semantic layer so that it can be more easily reused and recombined. • Links data across database boundaries to enable more intuitive query, search, and navigation without awareness of the boundaries. • The ontology serves as a controlled vocabulary for making semantic suggestions, such as synonyms and related concepts, to facilitate query and search. • Reasoning capabilities such as sub-classing and transitive properties can be implemented at the semantic layer to increase query expressiveness and retrieve more complete answers. • Allows for more advanced data analysis and integrative knowledge discovery based on the huge web of data.
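The point about sub-class reasoning can be illustrated with a small sketch: a query for all instances of a class also returns instances of its subclasses, which a plain lookup would miss. The class hierarchy reuses the Herb/Drug example from the Semantic Associations slide; the dictionaries and function names are hypothetical.

```python
# Hypothetical schema and instance data.
SUBCLASS_OF = {"Herb": "Drug"}
TYPES = {"Glycyrrhizae": "Herb", "Atractylodis": "Drug"}


def superclasses(cls):
    """All ancestors of cls, following subclass links transitively."""
    found = set()
    while cls in SUBCLASS_OF and SUBCLASS_OF[cls] not in found:
        cls = SUBCLASS_OF[cls]
        found.add(cls)
    return found


def instances_plain(cls):
    # Without reasoning: only resources typed directly with cls.
    return sorted(x for x, c in TYPES.items() if c == cls)


def instances_reasoned(cls):
    # With sub-class reasoning: also resources typed with a subclass of cls.
    return sorted(x for x, c in TYPES.items()
                  if c == cls or cls in superclasses(c))
```

In this toy data, `instances_plain("Drug")` finds only Atractylodis, while `instances_reasoned("Drug")` also returns Glycyrrhizae because Herb is a subclass of Drug.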

  21. Conclusion • We took the first systematic approach to leverage the progress of Biomedical Informatics to address the modernization of TCM. • Domain experts evaluate the platform's major technical features as original and productive in Drug Safety and Efficacy analysis. • This case study demonstrates the Semantic Web's advantages in representation, integration, and discovery of knowledge with complex domain models. • The work contributes to the preservation and modernization of TCM as intangible cultural heritage.
