Topic Mapping Tools for Biomedical Corpora. Gully APC Burns, USC/ISI Dave Newman, UC Irvine Bruce Herr, IU. ‘Snapshots of Neuroscience’. Society for Neuroscience Annual meeting (2000 New Orleans) ~30,000 attendees, ~12,000 posters per year. Basic Idea: Topic Modeling.
Topic Mapping Tools for Biomedical Corpora. Gully APC Burns, USC/ISI; Dave Newman, UC Irvine; Bruce Herr, IU
‘Snapshots of Neuroscience’: Society for Neuroscience Annual Meeting (2000, New Orleans), ~30,000 attendees, ~12,000 posters per year
Basic Idea: Topic Modeling. Erythropoietin (Epo), a hematopoietic cytokine, has recently been demonstrated to provide neuroprotection on nigral dopaminergic neurons. However, there is no information available about whether Epo can protect dopaminergic neurons from the neurotoxicity of 6-hydroxydopamine (6-OHDA) that is most commonly used to create a rat model of Parkinson’s disease (PD). In the present study, we tested the hypothesis that recombinant human Epo (rhEpo) would protect dopaminergic neurons and improve neurobehavioral outcomes in a rat model of progressive PD. rhEpo (20 units in 2μl of vehicle) was stereotaxically injected into one side of the striatum. The 6-OHDA lesion was made into the same side one day after rhEpo treatment. Methamphetamine-induced rotation was measured 3 and 10 weeks after the lesion, and paw reaching was also tested at 10 weeks. After the last time of behavioral test, rats were then sacrificed, and the brains were perfusion-fixed for histology and immunocytochemistry. We observed that intrastriatal administration of rhEpo significantly reduced the degree of rotational asymmetries. The rhEpo-treated animals also showed a better improvement in skilled forelimb use when compared with the control rats. In accompanying with the recovery of neurobehavioral outcomes, tyrosine hydroxylase (TH)-immunoreactive neurons of the substantia nigra were protected from progressive degeneration in the rhEpo-treated rats. TH-immunoreactivity in the 6-OHDA lesioned striatum also significantly increased in the rhEpo-treated rats. To examine if systemic administration of rhEpo could exert the similar biological effects …
Basic Idea: Topic Modeling. ... plus all remaining ‘topic mass’: together these provide a signature for each document, from which we can calculate document-document similarities (a ~12,000 x ~12,000 matrix)
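A minimal sketch of how document-document similarities could be computed from topic signatures. The slides do not say which similarity measure was used; cosine similarity is an assumption here, and the three toy signatures and the function names are hypothetical. For a real ~12,000 x ~12,000 matrix a vectorized numerical library would be the practical choice.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-proportion vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an empty signature is treated as similar to nothing
    return dot / (norm_p * norm_q)


def similarity_matrix(doc_topics):
    """Build the symmetric document-document similarity matrix."""
    n = len(doc_topics)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = cosine_similarity(doc_topics[i], doc_topics[j])
            sim[i][j] = s
            sim[j][i] = s  # similarity is symmetric, so fill both halves
    return sim


# Hypothetical topic signatures: one row per document, one column per topic.
doc_topics = [
    [0.70, 0.20, 0.10],
    [0.65, 0.25, 0.10],
    [0.05, 0.15, 0.80],
]
sim = similarity_matrix(doc_topics)
```

In this toy input the first two documents share a dominant topic, so their similarity is higher than that between the first and third.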
‘Topic Mapping’ Workflow
Literature Corpus
Topic Modeling using Gibbs Sampling
Topic Model (example topic: ischemia, cerebral, ischemic, stroke, brain, occlusion, injury, infarct, mcao, hour, reperfusion, artery, volume, model, middle, transient)
Document-Document Similarity Map
Graph Layout Processing with VxOrd / DrL
Multi-level image rendering, cluster analysis for label placement
Google Maps Application
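The workflow names Gibbs sampling for the topic modeling step but gives no further detail. The sketch below assumes a latent Dirichlet allocation (LDA) model fitted with collapsed Gibbs sampling, which is a common choice but is not confirmed by the slides. The function name, the hyperparameters `alpha` and `beta`, and the toy corpus are all hypothetical.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA. docs is a list of lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]              # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                            # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions (the document 'signature').
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word


# Hypothetical toy corpus, with words encoded as indices into vocab.
vocab = ["ischemia", "stroke", "occlusion", "dopamine", "striatum", "parkinson"]
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3], [4, 5, 3, 5, 4]]
theta, topic_word = gibbs_lda(docs, num_topics=2, vocab_size=len(vocab))
```

Each row of `theta` is a per-document distribution over topics, and the highest counts in each row of `topic_word` give the words that would label a topic.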
Implementation 1: SfN 2006 Maps @ SfN 2007. Analysis: Dave Newman, UCI. Visualization: Bruce Herr, IU
Lessons Learned
This demonstration had a high impact at SfN 2007 [shown to the Neuroinformatics Committee (NIC), PubMed Plus Panel, Program Committee, and General Council]. Why?
• The system emphasizes elegant visualization
• The application has a natural, familiar, intuitive design
• Criticisms centered on concerns about analysis validity (‘what do clusters actually mean?’)
... but the system focused on utility, not interpretations ...
Next Steps
Gary Westbrook [NIC, ex-editor of J Neurosci, external committee of the National Institute of Neurological Disorders and Stroke, NINDS] and Edmund Talley [Program Director NINDS, Channels Synapses and Circuits] requested a system to examine NINDS grants accessed from CRISP.
CRISP: Computer Retrieval of Information on Scientific Projects
Lists all funded DHHS projects from 1972 onward [including data from NIH, CDC, FDA, HRSA and AHRQ].
Goal: build a topic map of NINDS 2006 grants in relation to 13 other NIH institutes involved with funding neuroscience research. [Largest institute: NCI, ~9373 grants (2006)] [Smallest institute: NIAAA, ~1198 grants (2006)]
Downloaded 10 years of abstracts from NINDS (to weight the distribution in favor of NINDS topics) and 1 year of abstracts from each of the other 13 institutes.
NINDS staff hand-annotated ~2500 grants with SfN categories (theme, sub-theme, topic) to compare with the categories generated by the topic model.
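The slides do not describe how the hand-annotated categories were compared with the topic model's output. One simple option, shown here purely as an assumption, is a purity score: each document is assigned its dominant model topic, and we measure how often documents sharing a topic also share a hand-assigned category. The labels, topic ids, and function name below are hypothetical.

```python
from collections import Counter, defaultdict


def topic_purity(hand_labels, model_topics):
    """Fraction of documents whose hand label is the majority label of their topic."""
    if not hand_labels or len(hand_labels) != len(model_topics):
        raise ValueError("need two non-empty sequences of equal length")
    by_topic = defaultdict(Counter)
    for label, topic in zip(hand_labels, model_topics):
        by_topic[topic][label] += 1
    # For each topic, count the documents carrying its most common hand label.
    matched = sum(counts.most_common(1)[0][1] for counts in by_topic.values())
    return matched / len(hand_labels)


# Hypothetical data: one hand-assigned category and one dominant topic per grant.
hand_labels = ["stroke", "stroke", "pd", "pd", "pd", "epilepsy"]
model_topics = [0, 0, 1, 1, 0, 2]
purity = topic_purity(hand_labels, model_topics)
```

A purity near 1 would indicate that model topics line up with the hand categories. The score is inflated when there are many small topics, so it is best read alongside the number of topics.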
Additional Features for this Implementation
• Improved navigability
• Multiple maps
• Multiple labeling / coloring schemes
• Search
• Google Maps-based flags, etc.
• Full-text search within the HTML application
What’s Next?
• All 2007 abstracts from NIH (all institutes)
• Diagnostic functions within the browser: ‘heat maps’ of each individual topic, ‘cluster expansion’
• Trend analysis: Which topics are emergent? Which are in decline?
• Can we perform analysis across corpora?
  - SfN abstracts from 2001-2008
  - Medline (>8 million abstracts)
  - CRISP (funded federal project abstracts)
  - PubMed Central (~1 million full-text papers)
  - Other full-text resources
Data across many years allows trend analysis. [Figure: trends in Medline data for the topics PD, HIV, and p53]
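A minimal sketch of one way to ask which topics are emergent and which are in decline: average each topic's share of the documents per year, then fit a least-squares line through the yearly averages. The slides do not specify the trend method, so the slope heuristic, the function names, and the toy records are all assumptions.

```python
from collections import defaultdict


def yearly_topic_share(records, topic):
    """Mean proportion of one topic per year. records holds (year, theta) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for year, theta in records:
        totals[year] += theta[topic]
        counts[year] += 1
    return {year: totals[year] / counts[year] for year in sorted(totals)}


def linear_trend(series):
    """Least-squares slope of value against year for a {year: value} mapping."""
    years = list(series)
    values = [series[y] for y in years]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        return 0.0  # a single year carries no trend information
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


# Hypothetical records: publication year and a two-topic signature per document.
records = [
    (2001, [0.1, 0.9]),
    (2001, [0.2, 0.8]),
    (2002, [0.3, 0.7]),
    (2003, [0.5, 0.5]),
    (2003, [0.4, 0.6]),
]
slope_topic0 = linear_trend(yearly_topic_share(records, 0))
slope_topic1 = linear_trend(yearly_topic_share(records, 1))
```

Under this heuristic a positive slope marks a topic as gaining share over time and a negative slope marks one as losing share. A real analysis would also need to account for noise and for changes in corpus size between years.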
Acknowledgements
Funding: Information Sciences Institute (seed funding); NSF: IIS-0513650; NINDS contracts (Ned Talley)
Collaborators: Dave Newman (UCI), Bruce Herr (IU)
Developers: Tommy Ingulfsen
Contributing Computer Scientists: Padhraic Smyth (UCI), Katy Borner (IU), Patrick Pantel (ISI/Yahoo!)