The Ultralink – an expert system for contextual hyperlinking in knowledge management Manuel C. Peitsch Head of Systems Biology Novartis Institutes for Biomedical Research. A Knowledge Space. Connecting the Knowledge Bodies (requirements).
The Ultralink – an expert system for contextual hyperlinking in knowledge management
Manuel C. Peitsch
Head of Systems Biology
Novartis Institutes for Biomedical Research
Connecting the Knowledge Bodies (requirements)
• Intelligent integration of heterogeneous data to enable “Seamless Navigation”:
  • One-stop shop.
  • Reusable in any Web and Office application.
  • Intelligent, i.e. knows about biology, medicine, chemistry, diseases, business, people, etc.
  • On demand and easy to use.
  • Configurable.
Connecting the Knowledge Bodies (Components)
• Indexing of large heterogeneous data collections (databases, full texts) to enable semantic expansion.
• Information Retrieval and Extraction, entity recognition, semantic enrichment.
• Knowledge Map (navigating the conceptual network).
• Terminology Hub (thesauri and ontologies).
• Ontology-associated business rules.
Creating references (Terminology Hub)
Searching for a term in sources A and B may lead to different results although the underlying concept exists in both sources (false negatives in IR and IE).
• Different knowledge repositories have different ways to encode a concept:
  • Registry Number
  • Unique Internal ID
  • Concept Identifier
  • Enumerating terms
  • Just using different terms without any constraints
• Terminology Hub ensures coherent mapping:
  • Between coding systems
  • Between different representation levels (e.g. ID vs. Concept)
  • Between local terms and global terms
Over 8 GB of cross-referencing information.
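The mapping idea behind the Terminology Hub can be illustrated with a minimal sketch. It assumes a simple in-memory cross-reference table. The table layout, the function names `to_concept` and `same_concept`, and every identifier in it are hypothetical and are not taken from the actual system.

```python
from typing import Dict, Optional, Tuple

# Hypothetical cross-reference table: (coding system, local code) -> concept ID.
# All codes and IDs below are invented for illustration only.
XREF: Dict[Tuple[str, str], str] = {
    ("registry_number", "RN-0001"): "C001",
    ("internal_id", "CMP-42"): "C001",
    ("term", "compound x"): "C001",
    ("term", "cmpd-x"): "C001",
}


def to_concept(system: str, code: str) -> Optional[str]:
    """Map a source-specific code or free term to the shared concept ID."""
    # Free-text terms are matched case-insensitively; codes are matched as-is.
    key = code.lower() if system == "term" else code
    return XREF.get((system, key))


def same_concept(a: Tuple[str, str], b: Tuple[str, str]) -> bool:
    """True when two source-specific encodings resolve to one known concept."""
    concept_a, concept_b = to_concept(*a), to_concept(*b)
    return concept_a is not None and concept_a == concept_b
```

With such a table, a query expressed as a registry number in source A and as an internal ID in source B resolves to the same concept, which is how the false negatives mentioned above could be avoided.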
What entities constitute our Terminology?
• Chemical entities – IUPAC names, trivial names, trade names, INNs, compound codes, ligands.
• Biological entities – targets, genes/proteins, modes of action…
• Diseases, Indications, Side Effects, Contraindications
• Institutions, Affiliations, People
• Geographic locations
• …
The UltraLink: a revolutionary tool to navigate the “Knowledge Space”
• Zoning
  • This process uses our (meta-)knowledge about information structure and tags the relevant contexts of the documents or database records.
• Identification of terms based on the terminology or on regular expressions
  • Term Identification: identify the lexical items in a text, relate them to a term and retrieve the corresponding reference term via thesaurus relations.
  • Concept Identification: identify the concept related to the reference term(s).
  • Type Assignment: assign the concept type related to a concept identifier.
• Extraction and normalization
• Get list of rules to apply
• Verifiers
• Application of rules
• Display
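The first steps listed above (zoning, term identification, concept identification, type assignment) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the toy thesaurus, the concept ID `C100`, the `Name: text` zone format and all function names are hypothetical, and the rule, verifier and display steps are left out.

```python
import re
from typing import Dict, List, NamedTuple, Sequence


class Annotation(NamedTuple):
    zone: str          # zone of the document the term was found in
    surface: str       # lexical item exactly as it appears in the text
    reference: str     # reference term retrieved via the thesaurus
    concept_id: str    # concept related to the reference term
    concept_type: str  # type assigned to the concept


# Hypothetical toy resources; a real terminology would be far larger.
THESAURUS = {
    "mtor": "mammalian target of rapamycin",
    "mammalian target of rapamycin": "mammalian target of rapamycin",
}
CONCEPTS = {"mammalian target of rapamycin": "C100"}
TYPES = {"C100": "TARGET"}


def zone(document: str) -> Dict[str, str]:
    """Zoning: split lines of the assumed form 'Name: text' into named zones."""
    zones = {}
    for line in document.splitlines():
        name, sep, body = line.partition(":")
        if sep:
            zones[name.strip().lower()] = body.strip()
    return zones


def annotate(document: str,
             relevant: Sequence[str] = ("title", "abstract")) -> List[Annotation]:
    """Run term, concept and type identification inside the relevant zones."""
    # Longest terms first so multi-word terms win over their substrings.
    terms = sorted(THESAURUS, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE
    )
    annotations = []
    for name, text in zone(document).items():
        if name not in relevant:
            continue  # contexts outside the tagged zones are ignored
        for match in pattern.finditer(text):
            reference = THESAURUS[match.group(1).lower()]
            concept_id = CONCEPTS[reference]
            annotations.append(
                Annotation(name, match.group(1), reference,
                           concept_id, TYPES[concept_id])
            )
    return annotations
```

Calling `annotate("Title: mTOR signalling\nFooter: mTOR")` annotates only the occurrence in the title zone, since the footer is not listed as a relevant context.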
UltraLink Examples
• Treatment of ambiguities
  • WILMs TUMOR
    • DISEASE: Wilms' tumor => nephroblastoma
    • GENE NAME: WT1
    • TARGET: Wilms' tumor
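One way to represent such an ambiguous term is to keep every reading together with its concept type. The sketch below uses the three readings from the slide. The lookup structure, the function names and the optional filter by allowed types are assumptions, since the slide does not say how a reading is finally selected.

```python
import re
from typing import Dict, List, Optional, Set, Tuple

# Readings of one surface form, taken from the slide: (concept type, normalized term).
READINGS: Dict[str, List[Tuple[str, str]]] = {
    "wilms tumor": [
        ("DISEASE", "nephroblastoma"),
        ("GENE NAME", "WT1"),
        ("TARGET", "Wilms' tumor"),
    ],
}


def lookup_key(term: str) -> str:
    """Lowercase, drop apostrophes and collapse whitespace (assumed matching rule)."""
    key = term.lower().replace("'", "")
    return re.sub(r"\s+", " ", key).strip()


def readings(term: str,
             allowed_types: Optional[Set[str]] = None) -> List[Tuple[str, str]]:
    """Return all readings of a term, optionally restricted to some concept types."""
    found = READINGS.get(lookup_key(term), [])
    if allowed_types is None:
        return list(found)
    return [r for r in found if r[0] in allowed_types]
```

A caller that knows the context, for example a zone that only contains disease names, could pass `allowed_types={"DISEASE"}` to keep just the nephroblastoma reading.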
UltraLink Examples • Term extraction / Normalization -> Examples (mtor, mammalian target of rapamycin)
UltraLink Examples
• Term to UltraLink:
  • granulocyte - macrophage colony stimulating factor
• Concept Type: TARGET
• Normalized term (non-exhaustive):
  • Granulocyte-macrophage colony-stimulating factor
• Synonyms:
  • colony stimulating factor 2
  • Colony-stimulating factor, CSF, GCSF, GM-CSF
  • Granulocyte macrophage colony stimulating factor
  • Molgramostim, Sargramostim
• Local terms (non-exhaustive list of examples):
  • EMBL e.g. AC004511, AF373868, …
  • Pubmed e.g. 1569568, 1737041, …
  • NCBI e.g. 10090, 10116, 9606
  • GO e.g. GO:000512, GO:0019221, …
  • UniProt e.g. P01587, …
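The normalization in this example can be illustrated by reducing lexical variants to a common key before looking up the normalized term. This is a minimal sketch. The key rule (case, hyphens and spacing are ignored) and the names `variant_key` and `normalize` are assumptions, and only a few of the synonyms from the slide are included.

```python
import re
from typing import Dict, Optional

NORMALIZED = "Granulocyte-macrophage colony-stimulating factor"


def variant_key(term: str) -> str:
    """Collapse case, hyphens and spacing so lexical variants share one key."""
    key = re.sub(r"[\s\-]+", " ", term.lower())  # hyphen/space runs -> one space
    return key.strip()


# A few synonyms from the slide, indexed by their variant key.
SYNONYMS: Dict[str, str] = {
    variant_key(s): NORMALIZED
    for s in (
        NORMALIZED,
        "colony stimulating factor 2",
        "GM-CSF",
        "Sargramostim",
    )
}


def normalize(term: str) -> Optional[str]:
    """Return the normalized term for a known variant, else None."""
    return SYNONYMS.get(variant_key(term))
```

Under this rule, `normalize("granulocyte - macrophage colony stimulating factor")` returns the normalized term shown above, because the input and the normalized form differ only in hyphenation, spacing and case.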
MetaCore Map containing FZD4 (frizzled 4)
• Proteins for which antibodies are available are marked with an additional icon
• Mouse-over shows specificity
• Hyperlink to Antibodies Web Report
The Ultralink can be called from Internet Explorer
Internet Explorer Integration (GPS Add-in):
1. User requests analysis of a Web page.
2. The add-in sends the document for analysis to the Web Service (WSDL).
3. The add-in gets back the tagged parts.
4. Specific HTML tags are injected into the Web page (tagged document).
GPS Lexical Analysis Server tools: Terminology, Zoning, DocStructures, Lexical Extraction, Tagging, Meta-Rules.
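The tag-injection step of this round trip (step 4) can be sketched on plain text. The sketch assumes the tagged parts come back as character offsets with a concept type. The `TaggedPart` structure, the `span` element and its `class` and `data-type` attributes are hypothetical, and a real browser add-in would work on the page's DOM rather than on a string.

```python
from html import escape
from typing import List, NamedTuple


class TaggedPart(NamedTuple):
    start: int         # offset of the first character of the term
    end: int           # offset just past the last character
    concept_type: str  # e.g. "TARGET"


def inject_tags(text: str, parts: List[TaggedPart]) -> str:
    """Wrap each tagged part in a span and HTML-escape everything else."""
    out, pos = [], 0
    for part in sorted(parts):
        if part.start < pos:
            continue  # skip parts that overlap one already emitted
        out.append(escape(text[pos:part.start]))
        out.append(
            f'<span class="ultralink" data-type="{escape(part.concept_type)}">'
            f"{escape(text[part.start:part.end])}</span>"
        )
        pos = part.end
    out.append(escape(text[pos:]))
    return "".join(out)
```

Sorting the parts and skipping overlaps keeps the output well-formed even if the analysis returns the parts out of order or with nested matches.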
Activation UltraLink Annotations on any web page
The Ultralink is integrated with Microsoft Office
Microsoft Smart Tag extraction:
1. User requests analysis of an Office document.
2. The document is sent for analysis to the Web Service (WSDL).
3. The tagged parts are returned (tagged document).
GPS Lexical Analysis Server tools: Terminology, Zoning, DocStructures, Lexical Extraction, Tagging, Meta-Rules.
Acknowledgements
Thérèse Vachon, Martin Romacker, Pierre Parisot, Nicolas Grandjean, Brigitte Charpiot, Jean-Marc von Allmen, Daniel Cronenberger, Olivier Kreim
Knowledge Space and GPS Navigator • Backup slides
What constitutes the Knowledge Space
[Diagram labels, knowledge bodies: Literature, Comp. Inf., Bioinformatics, Biology, Chemistry, Internet, Research Documentation, Other]
[Diagram labels, tools and resources: Meta Data, K map, Defined workflows, Ultralinker, Text Mining, Analytics, Semantic Search, Thesauri, Ontologies, Rules]
Data Analysis – Protease modulators in CI DBs, July 2004 - ADIS & Pharmaprojects
[Figure panels: Univariate - Companies; Univariate - MOA; Univariate - Diseases conditioned by Companies; Clustering Diseases - MOAs]
Graph Navigator – Protease modulators in CI DBs July 2004 - ADIS & Pharmaprojects