Chemical name interpretations & Molecular timelines

This presentation shows the detailed record view with molecular links, and the chemicals report with a molecular timeline and mouse-over of chemical names. It then explores co-table analysis of molecules with gene IDs.


Presentation Transcript


  1. Chemical name interpretations & Molecular timelines

  2. This shows the detailed record view, with molecular links.

  3. This shows the chemicals report, with a molecular timeline and mouse-over of chemical names.

  4. Exploring co-table analysis of molecules with gene IDs. For example: show me all of the co-occurrences of these (x) molecules with these (any/all) genes.

  5. Step 1: From the main menu, select the Analyze tab.

  6. Step 2: From the Analyze menu, select the Cotable tab.

  7. Step 3: Enter the InChIKeys for the molecules of interest. Callout: "Click here to enter a sample (test) set of molecules."

  8. Step 4: Select the patent field to explore patents. Callouts: "These are the molecules of interest (InChIKeys to explore)"; "Select Patent field here."

  9. Step 5: Select facet = patent field + Gene, then click Analyze. Callouts: "Molecules"; "Facet = Patents + Genes."

  10. This shows the cotable results: co-occurrences of molecules and NCBI Gene IDs. Callouts: "These are the NCBI Gene ID numbers"; "To transpose the charts or export the data, click here."

  11. This shows the transposed chart of co-occurrences of molecules and NCBI Gene IDs. Callout: "Click here to see the patents containing this molecule and this particular gene."
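
The co-table walked through in slides 4 to 11 can be pictured as a count of documents per (molecule, gene) pair. The sketch below is only an illustration of that idea, not the application's actual implementation: the record layout, the function names (`cotable`, `transpose`, `documents_for`) and all sample values (placeholder keys such as `KEY-A`, gene numbers, document IDs) are assumptions made up for the example.

```python
from collections import Counter

# Made-up sample data: placeholder InChIKeys, gene IDs and document IDs.
DOCUMENTS = [
    {"id": "DOC-1", "molecules": {"KEY-A", "KEY-B"}, "genes": {101, 102}},
    {"id": "DOC-2", "molecules": {"KEY-A"}, "genes": {101}},
    {"id": "DOC-3", "molecules": {"KEY-C"}, "genes": {102, 103}},
]


def cotable(documents, molecules_of_interest):
    """Count the documents in which each (molecule, gene) pair co-occurs."""
    counts = Counter()
    for doc in documents:
        # Only molecules the user entered contribute rows to the table.
        for molecule in doc["molecules"] & molecules_of_interest:
            for gene in doc["genes"]:
                counts[(molecule, gene)] += 1
    return counts


def transpose(counts):
    """Swap the axes so genes index the rows and molecules the columns."""
    return Counter({(gene, molecule): n for (molecule, gene), n in counts.items()})


def documents_for(documents, molecule, gene):
    """List the documents behind one cell of the table."""
    return [d["id"] for d in documents
            if molecule in d["molecules"] and gene in d["genes"]]


table = cotable(DOCUMENTS, {"KEY-A", "KEY-B"})
for (molecule, gene), n in sorted(table.items()):
    print(molecule, gene, n)
```

Here `documents_for` plays the role of clicking a cell to see the patents containing that molecule and that gene.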

  12. Co-table Analysis. For example: show me all documents where Imitrex was mentioned with any sign and/or symptom (note: these are terms such as headache, vomiting, nausea, etc.; there are more than 680 of them).

  13. Step 1: Draw a compound of interest. Step 2: Click "view compound in co-table."

  14. Step 1: Draw a compound of interest. Step 2: Click "view compound in co-table."

  15. Step 3: Select a MeSH category for co-occurrence analysis. Step 4: Click Analyze.

  16. This shows the number of documents that contained the source molecule and ANY of the MeSH C23 terms. Click on the numbers to link to the documents.

  17. Type in a new MeSH code to change the analysis from ‘signs & symptoms’ (C23) to diseases (C01)

  18. This shows the number of documents that contained the source molecule and ANY of the MeSH – disease (C01) terms

  19. This shows the comparison of 2 drugs and the co-occurrence of MeSH Symptoms (C23) terms

  20. This shows the comparison of different statins and the co-occurrence of MeSH terms. Chart title: "Chemical Structures vs. Signs and Symptoms: Medline co-occurrence of Statin structures vs. MeSH –"
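
Switching the analysis from one MeSH code to another, as in slides 15 to 20, amounts to filtering terms by their tree-number prefix and counting documents per term. The following is a minimal sketch under stated assumptions: the record layout, the helper names (`in_category`, `term_counts`), the drug and term labels, and the tree numbers below the top level are all made-up placeholders, not real MeSH entries. It also assumes one tree number per term, which is a simplification.

```python
from collections import Counter

# Made-up sample records; term labels and sub-level tree numbers are placeholders.
DOCUMENTS = [
    {"molecules": {"drug-x"},
     "mesh": {"symptom-1": "C23.100.200", "disease-1": "C01.300"}},
    {"molecules": {"drug-x"},
     "mesh": {"symptom-1": "C23.100.200", "symptom-2": "C23.400"}},
    {"molecules": {"drug-y"},
     "mesh": {"symptom-2": "C23.400"}},
]


def in_category(tree_number, category):
    """True if the tree number is the category itself or sits beneath it."""
    # Requiring the "." separator stops "C23" from matching e.g. "C230.1".
    return tree_number == category or tree_number.startswith(category + ".")


def term_counts(documents, molecule, category):
    """Per term in the category, count documents that also mention the molecule."""
    counts = Counter()
    for doc in documents:
        if molecule not in doc["molecules"]:
            continue
        for term, tree_number in doc["mesh"].items():
            if in_category(tree_number, category):
                counts[term] += 1
    return counts


symptoms = term_counts(DOCUMENTS, "drug-x", "C23")
c01_terms = term_counts(DOCUMENTS, "drug-x", "C01")

# Side-by-side comparison of two drugs against the same category.
comparison = {drug: term_counts(DOCUMENTS, drug, "C23")
              for drug in ("drug-x", "drug-y")}
print(symptoms, c01_terms, comparison)
```

Changing the `category` argument is the counterpart of typing a new MeSH code into the analysis form.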

  21. Screenshots from our SIMPLE / SIIP Web application

  22. Chemical Search using ChemAxon w/ DB2 Search Proximal Search Nearest Neighbor Search

  23. Clustering Claims Originality BioTerm Analysis Discovery

  24. Landscape Analysis Visualization Networks

  25. IBM's Massively Parallel Probabilistic Architecture. Diagram labels: Question; Question/Topic Analysis; Query Decomposition; Primary Search; Candidate Answer Generation; Hypothesis Generation; Soft Filtering; Hypothesis & Evidence Scoring; Supporting Evidence Retrieval; Evidence Retrieval; Deep Evidence Scoring; Answer Scoring; Synthesis; Final Merging & Ranking; Trained Models; A. Sources; E. Sources; Answer, Confidence. Watson generates and scores many hypotheses using an extensible collection of Natural Language Processing, Machine Learning and Reasoning Algorithms. These gather and weigh evidence over both unstructured and structured content to determine the answer with the best confidence. Source: J Kreulen

  26. Technical issues to consider when applying QA systems like Watson

      • Nature of domain: open vs. closed. A closed domain implies that all knowledge is contained within a specific domain characterized by ontologies, so there is no need to go outside the domain. Jeopardy is an open-domain example, drawing on general knowledge.
      • Knowledge/data sources: availability. QA systems are natural language search engines; Watson goes beyond NL search. If knowledge sources are incomplete, unavailable, insufficient or inadequate, the system cannot provide an answer. Some cases call for interactive QA, in which a human guides the search. Another very important consideration is the availability of sufficient sample data for training (i.e. a training corpus).
      • Need for multi-modality. Is there a need for transcription from speech to text before a question is answered? This would require integrating speech-to-text capabilities that are not really ready for real-time applications.
      • Latency. Watson is capable of processing 500 GB of information per second with a 3-second response to questions, and held most of its knowledge sources in memory (as opposed to on disk) for speed. What is the latency requirement for the application?
      • Multi-lingual or cross-lingual support. Watson can support only English at this time; with language-specific parsers, other languages can be supported. If knowledge sources or QA are required in multiple languages, the application would not be a good candidate. Additionally, if cultural context has to be accommodated in the answer, it would not be prudent to deploy QA systems that interact directly with users.
      • Question type. Decomposition and classification of the question is critical to how QA systems work. The bulk of the question types in Jeopardy were factoid questions. Watson did not include 2 question categories: audio/video questions that require looking at a video to answer, and questions that require special instructions (e.g. verbal instructions to explain a question).
      • Answer types. Watson is not designed to curate a task-oriented system. It can handle temporal and geo-spatial reasoning in its answers. As it stands, it cannot handle business-process reasoning (e.g. to do task B, tasks A and C must be completed).

      Software stack: DeepQA Application (Java/C++); Apache Hadoop + Apache UIMA; SUSE Linux Enterprise Server 11

      Watson infrastructure
      • 90 Power 750 servers
      • Each server: 3.5 GHz POWER7 8-core processor with 4 threads/core
      • Total: 2880 POWER7 cores with 16 TB RAM
      • Processing speed: 500 GB/sec; 80 TeraFLOPS
      • 94th on Top 500 Supercomputers
      • Note: This hardware is for Jeopardy. Any other application of Watson will require appropriate sizing and optimization for purpose.
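
The infrastructure figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted on the slide; the processors-per-server value is not stated there and is inferred here from the totals, so treat it as an assumption.

```python
# Figures quoted on the slide.
servers = 90
total_cores = 2880
cores_per_processor = 8
threads_per_core = 4

# Derived values (inferred from the totals, not stated on the slide).
cores_per_server = total_cores // servers                        # 32
processors_per_server = cores_per_server // cores_per_processor  # 4
total_threads = total_cores * threads_per_core                   # 11520

print(cores_per_server, processors_per_server, total_threads)
```

In other words, the stated total of 2880 cores is consistent with 90 servers only if each server carries four of the 8-core processors.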

  27. I would like to acknowledge the IBM Almaden Research team: Jeff Kreulen, Ying Chen, Scott Spangler, Alfredo Alba, Tom Griffin, Eric Louie, Su Yan, Issic Cheng, Prasad Ramachandran, Bin He, Ana Lelescu, Qi He, Linda Kato, Brad Wade, John Colino, Meenakshi Nagarajan, Timothy J Bethea, German Attanasio, Laura Anderson, Robert Prill, plus a host of folks from IBM China Labs.

  28. Back-up slides

  29. Challenges ahead
      • Access to full text
      • Language issues: Chinese, Japanese, Korean, other
      • Legal issues
      • Web data
      • Integration with medical content

  30. Attempts to process Chinese patent documents: extracting chemical structures from Chinese patents. Chemicals from Chinese Patents

  31. Computer Curation Process Overview & integration with our collaborators. Diagram labels: Services Hosted at IBM Almaden; User Applications; Annotation Factory; ChemVerse; Selected Internet Content; Knime or Pipeline Pilot; U.S. Patents (1976-2009); ChemVerse db (Semantic Associations); eClassifier & Other Data Associations; View selected Documents & Reports; BIW; U.S. Pre-Grants (All); ADU* Database + computed Meta Data; IP Database (e.g. DB2); Data Sources; Parse & Extract data; PCT & EPO Apps; Cognos/DDQB/Other Apps; Medline Abstracts (>18 M); In-House Content; Computational Analytics; Annotator 1; ChemAxon Search; Annotator 2; SIMPLE. * ADU = Automated Data Update
