
Mining the Web: Discovering New Biomedical Knowledge



Presentation Transcript


  1. Mining the Web: Discovering New Biomedical Knowledge Aly Khan

  2. The Human Genome Project • Goal: sequence human DNA • Completed in 2003 • Public effort led by the National Institutes of Health, run in parallel with a private effort by Celera Genomics • ~25,000 genes

  3. 25,000 Genes • What do they do? • How do they interact?

  4. Finding context • Use vast amounts of published works to find novel relationships between genes • 17,000,000 records from more than 5,000 biomedical journals

  5. On searching • Biomedical literature is effectively unbounded • Biomedical publications are unstructured text

  6. Example record

  7. XML record

  8. Applications • NLP: parse text for matches using POS tags • [Query noun phrase term] “is a” [noun phrase class] • e.g. HIV is a virus • [Noun phrase class] “such as” [query noun phrase term] • e.g. genes such as 4fgf
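
The two example phrases on this slide ("hiv is a virus", "genes such as 4fgf") can be matched with a very small pattern extractor. The sketch below is only an illustration, not the presenter's implementation: it assumes a noun phrase can be approximated by a single alphanumeric token and uses regular expressions in place of POS tags. The names `TERM`, `IS_A`, `SUCH_AS`, and `extract_is_a` are hypothetical.

```python
import re

# Assumption: a noun phrase is approximated by one alphanumeric token.
# A real system would use POS tags to find noun phrase boundaries.
TERM = r"[A-Za-z0-9][A-Za-z0-9\-]*"

# "<term> is a <class>", e.g. "hiv is a virus"
IS_A = re.compile(rf"\b({TERM}) is an? ({TERM})\b", re.IGNORECASE)
# "<class> such as <term>", e.g. "genes such as 4fgf"
SUCH_AS = re.compile(rf"\b({TERM}) such as ({TERM})\b", re.IGNORECASE)


def extract_is_a(text):
    """Return (term, class) pairs found by either pattern."""
    pairs = []
    for m in IS_A.finditer(text):
        pairs.append((m.group(1), m.group(2)))
    for m in SUCH_AS.finditer(text):
        # In "<class> such as <term>" the class comes first, so swap.
        pairs.append((m.group(2), m.group(1)))
    return pairs


if __name__ == "__main__":
    print(extract_is_a("hiv is a virus"))
    print(extract_is_a("genes such as 4fgf"))
```

Both patterns return pairs in the same (term, class) order, so downstream code does not need to know which phrasing produced a match.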

  9. Applications • “The results demonstrated that KaiC interacts rhythmically with KaiA, KaiB, and SasA.” (Ozgur et al.) • Path 1: KaiC – nsubj – interacts – obj – SasA • Path 2: KaiC – nsubj – interacts – obj – SasA – conj_and – KaiA • Path 3: KaiC – nsubj – interacts – obj – SasA – conj_and – KaiB • Path 4: SasA – conj_and – KaiA • Path 5: SasA – conj_and – KaiB • Path 6: KaiA – prep_with – SasA – conj_and – KaiB
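
To make the idea of a dependency path concrete, the sketch below rebuilds Paths 1 to 5 from this slide as shortest paths in a small graph. It is an illustration under stated assumptions, not the method of Ozgur et al.: the edges are entered by hand from the slide rather than produced by a dependency parser, the graph is treated as undirected, and `EDGES`, `build_graph`, and `dependency_path` are hypothetical names. Paths are joined with a plain hyphen.

```python
from collections import deque

# Hand-entered (head, label, dependent) edges read off Paths 1-5 on the slide.
# Assumption: a real pipeline would get these from a dependency parser.
EDGES = [
    ("interacts", "nsubj", "KaiC"),
    ("interacts", "obj", "SasA"),
    ("SasA", "conj_and", "KaiA"),
    ("SasA", "conj_and", "KaiB"),
]


def build_graph(edges):
    """Build an undirected adjacency map: node -> list of (neighbor, label)."""
    graph = {}
    for head, label, dep in edges:
        graph.setdefault(head, []).append((dep, label))
        graph.setdefault(dep, []).append((head, label))
    return graph


def dependency_path(graph, start, goal):
    """Breadth-first search; returns the shortest labeled path or None."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return " - ".join(path)
        for neighbor, label in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [label, neighbor]))
    return None


if __name__ == "__main__":
    graph = build_graph(EDGES)
    for a, b in [("KaiC", "SasA"), ("KaiC", "KaiA"), ("SasA", "KaiB")]:
        print(dependency_path(graph, a, b))
```

Breadth-first search is used because the shortest path between two entity mentions is the usual choice of feature; any other path-finding strategy would work on the same graph.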

  10. Contextual representation • PTEN is transcriptionally regulated by transcription factors such as p53 and Egr-1. • In response to DNA damage, the cell-cycle checkpoint kinase CHEK2 can be activated by ATM kinase to phosphorylate p53 and BRCA1, which are involved in cell-cycle control and apoptosis.

  11. Goals • Creating a global ontology for genes, diseases, etc. • Automated discovery of relationships.
