Literature Mapping with PubAtlas -- extending PubMed with a "BLASTing interface"
D Stott Parker (1), WW Chu (1), FW Sabb (3), AW Toga (2), RM Bilder (3)
(1) UCLA Computer Science Dept, (2) Laboratory of Neuroimaging, (3) Dept of Psychiatry & Biobehavioral Sciences
Hypothesis Web Project, NIH RL1LM009833
PubAtlas Literature Map
PubAtlas is a "PubMed BLAST-query" service for two term sets (lexica).
Result: a contingency table for all queries (X AND Y), where X and Y are terms in the two lexica.
www.pubatlas.org
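The (X AND Y) table described above can be sketched by pairing every term of one lexicon with every term of the other. This is a minimal illustration, not PubAtlas's actual code: the function name `pairwise_queries` and the two toy lexica are assumptions made for the example, and the queries are borrowed from the sample terms elsewhere in these slides.

```python
from itertools import product


def pairwise_queries(lexicon_x, lexicon_y):
    """Return {(x_name, y_name): "(qx) AND (qy)"} for every pair of terms.

    Each lexicon is a dict mapping a term name to a PubMed query string.
    (Hypothetical helper; not part of any PubAtlas API.)
    """
    return {
        (x, y): f"({qx}) AND ({qy})"
        for (x, qx), (y, qy) in product(lexicon_x.items(), lexicon_y.items())
    }


# Toy lexica for illustration only.
tools = {
    "PubMatrix": '"PubMatrix" [TIAB]',
    "GoPubMed": '"GoPubMed" [TIAB]',
}
features = {
    "summarization": '"PubMed" [TIAB] AND (summariz* [TIAB] OR digest* [TIAB])',
}

table = pairwise_queries(tools, features)
for cell, query in table.items():
    print(cell, "->", query)
```

Each cell of the literature map would then hold the number of PubMed records matching the combined query.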
Lexica
• PubAtlas Lexica:
  • term: definition pairs
  • Term Name : PubMed Query
  • optional hierarchical structure
• Lexicon as:
  • concept base
  • ontology
  • user-defined term hierarchy (personalized MeSH hierarchy)
  • domain-specific query language
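As a rough illustration of the "Term Name : PubMed Query" format, the sketch below parses a flat lexicon from text, one term per line. The parser name `parse_lexicon`, the use of `#` for comment lines, and the choice to split on the first colon are assumptions for this example; the optional hierarchical structure is not handled here.

```python
def parse_lexicon(text):
    """Parse 'Term Name: PubMed Query' lines into an ordered dict.

    Assumptions (not from the slides): blank lines and lines starting
    with '#' are skipped, and the term name ends at the first colon.
    """
    lexicon = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, query = line.partition(":")
        if not sep or not name.strip() or not query.strip():
            raise ValueError(f"not a 'name: query' line: {line!r}")
        lexicon[name.strip()] = query.strip()
    return lexicon


SAMPLE = '''
# two entries taken from the example lexicon in these slides
map: "PubMed" [TIAB] AND ("mapping" [TIAB] OR "map" [TIAB] OR "mapped" [TIAB])
Anne O'Tate: "Anne O'Tate" [TIAB]
'''

lexicon = parse_lexicon(SAMPLE)
for name, query in lexicon.items():
    print(f"{name!r} -> {query}")
```

Splitting on the first colon keeps any later colons inside the query itself, although term names containing a colon would need a different delimiter.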
Concept BLASTing
"Concept BLASTing" seeks useful associations, much like microarray analysis.
[Diagram labels: Lexicon1 = X hierarchy; Lexicon2 = Y hierarchy; PubAtlas Literature Map: (X AND Y) association table]
MEDLINE / PubMed as a bioscience association base
Previous Work -- as an example
Semi-automated generation of a review paper -- but thorough and remaining up-to-date

Desirable extension features:
graph: "PubMed" [TIAB] AND ("graph" [TIAB] OR "network" [TIAB] OR "diagram" [TIAB])
visual: "PubMed" [TIAB] AND ("visual" [TIAB] OR "visualizing" [TIAB] OR "visualization" [TIAB] …)
friendly: "PubMed" [TIAB] AND ("friendly" [TIAB] OR "flexible" [TIAB])
better interface: "PubMed" [TIAB] AND ("interface" [TIAB] OR "interaction" [TIAB] OR "query" [TIAB]) …)
exploration: "PubMed" [TIAB] AND ("exploration" [TIAB] OR "explore" [TIAB] OR "discovery" [TIAB] …)
summarization: "PubMed" [TIAB] AND (summariz* [TIAB] OR digest* [TIAB])
map: "PubMed" [TIAB] AND ("mapping" [TIAB] OR "map" [TIAB] OR "mapped" [TIAB])
extraction: "PubMed" [TIAB] AND (extract* [TIAB] OR identif* [TIAB])
relevance: "PubMed" [TIAB] AND ("relevance" [TIAB] OR "ranking" [TIAB] OR "ordering" [TIAB])
powerful: "PubMed" [TIAB] AND ("powerful" [TIAB] OR "extended" [TIAB] OR "advanced" [TIAB])

Previous PubMed extensions:
AliBaba: "AliBaba" [TIAB] AND "PubMed" [TIAB]
Anne O'Tate: "Anne O'Tate" [TIAB]
BioIE: "BioIE" [TIAB]
ClusterMed: "ClusterMed" [TIAB]
ConceptLink: "ConceptLink" [TIAB]
GoPubMed: "GoPubMed" [TIAB]
HubMed: "HubMed" [TIAB]
PubFocus: "PubFocus" [TIAB]
PubGene: "PubGene" [TIAB]
PubMatrix: "PubMatrix" [TIAB]
PubMed Assistant: "PubMed Assistant" [TIAB]
PubNet: "PubNet" [TIAB]
PubReMiner: "PubReMiner" [TIAB]
Relemed: "Relemed" [TIAB]
SLIM: "Muin M" [au] AND "SLIM" [TIAB]
VisualNet: "VisualNet" [TIAB] OR "Visual Net" [TIAB]
XplorMed: "XplorMed" [TIAB]
PubAtlas -- interesting aspects
• PubAtlas as a tool for concept "BLASTing"
• Moving towards shared, user-defined query/concept languages
• Visual literature search with concept maps / literature maps
• Building on familiar association mining metaphor
• Extending PubMed with temporal indexing / concept evolution
• Real uses: semi-automated reviews, knowledge mgmt, ...
• Applications in Phenomics
• Phenotypes are often naturally represented as queries
• Promising applications in interdisciplinary collaboration
Knowledge Management
Who at UCLA works on Dopamine Receptors?
Many possibilities for interdisciplinary collaboration
People as Concepts
Lori Altshuler: Altshuler Lori [FAU] OR Altshuler LL [AU]
Stephen Marder: Marder Stephen [FAU] OR Marder SR [AU]
Carrie Bearden: Bearden Carrie [FAU] OR Bearden CE [AU]
Ty Cannon: Cannon Tyrone [FAU] OR Cannon TD [AU]
Michael Phelps: Phelps Michael [FAU] OR Phelps ME [AU]
John Mazziotta: Mazziotta John [FAU] OR Mazziotta J [AU]
Paul Thompson: Thompson Paul M [FAU] OR Thompson PM [AU]
Arthur Toga: Toga Arthur [FAU] OR Toga A [AU]
Roger Woods: Woods Roger [FAU] OR Woods RP [AU]
Bob Bilder: Bilder Robert [FAU] OR Bilder RM [AU]
Nelson Freimer: Freimer Nelson [FAU] OR Freimer N [AU]
. . .
Map of publications in which people X, Y both occur as authors
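The author entries above follow a regular pattern (full author name OR abbreviated author name), so a person lexicon can be generated rather than typed by hand. The sketch below is illustrative only: `author_query` and `coauthor_query` are hypothetical helper names, and the pattern is inferred from the listed examples rather than taken from PubAtlas itself.

```python
def author_query(family, given, initials):
    """Build a query in the pattern used above, e.g.
    'Altshuler Lori [FAU] OR Altshuler LL [AU]'."""
    return f"{family} {given} [FAU] OR {family} {initials} [AU]"


def coauthor_query(person_x, person_y):
    """Query for publications on which both people appear as authors."""
    return f"({person_x}) AND ({person_y})"


# Names and initials copied from the list above.
people = {
    "Lori Altshuler": author_query("Altshuler", "Lori", "LL"),
    "Arthur Toga": author_query("Toga", "Arthur", "A"),
}

joint = coauthor_query(people["Lori Altshuler"], people["Arthur Toga"])
print(joint)
```

Wrapping each person's query in parentheses before joining with AND keeps the inner OR from changing the meaning of the combined query.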
Extending PubMed with Time
Historical map of interdisciplinary collaboration at UCLA over 10 years
[Figure: maps at time points 1998, 2000, 2002, 2004, 2006, 2008]
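One plausible way to add a time dimension to a query is to restrict it to successive publication-year ranges. The sketch below assumes PubMed's publication-date field tag `[dp]` with a `first:last` year range; whether PubAtlas builds its temporal index this way is not stated in the slides, and `year_sliced_queries` is a hypothetical name.

```python
def year_sliced_queries(query, start, end, step=2):
    """Return {(first_year, last_year): query restricted to that range}.

    Assumption: a year range is written as 'YYYY:YYYY[dp]' in PubMed syntax.
    """
    slices = {}
    for first in range(start, end + 1, step):
        last = min(first + step - 1, end)  # clip the final slice to `end`
        slices[(first, last)] = f"({query}) AND {first}:{last}[dp]"
    return slices


# Two-year slices matching the time points shown on the slide.
slices = year_sliced_queries("Toga Arthur [FAU] OR Toga A [AU]", 1998, 2009)
for period, sliced in slices.items():
    print(period, "->", sliced)
```

Running the same lexicon pair once per slice would give one literature map per period, which is one way to view collaboration changing over time.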
Deeper Exploration
Visualization and interaction along with standard mining of association data
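As an example of what "standard mining of association data" could mean for one cell of the table, the sketch below computes support, confidence, and lift from document counts. These are the usual association-rule measures; the slides do not say which measures PubAtlas reports, and the counts in the example are made up.

```python
def association_measures(n_xy, n_x, n_y, n_total):
    """Standard association-rule measures for terms X and Y.

    n_xy    -- documents matching (X AND Y)
    n_x     -- documents matching X
    n_y     -- documents matching Y
    n_total -- documents in the collection
    """
    support = n_xy / n_total                   # P(X and Y)
    confidence = n_xy / n_x if n_x else 0.0    # P(Y | X)
    expected = n_x * n_y / n_total             # co-occurrences expected under independence
    lift = n_xy / expected if expected else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Made-up counts, for illustration only.
measures = association_measures(n_xy=30, n_x=100, n_y=200, n_total=1000)
print(measures)
```

A lift above 1 means the two terms co-occur more often than independence would predict, which is the kind of cell worth highlighting in a map.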
Larger Lexica
For term sets of sizes M and N, PubAtlas submits M+N PubMed queries.
This can scale to hundreds or thousands of terms.
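One way M+N queries can fill an M-by-N table is to fetch the matching document IDs once per term and intersect the ID sets locally. The sketch below assumes this approach; the slides give only the query count, not the mechanism. The `fetch_ids` callback is a hypothetical stand-in for a PubMed search, replaced here by a toy in-memory index so the example runs offline.

```python
from itertools import product


def contingency_table(lexicon_x, lexicon_y, fetch_ids):
    """Fill every (X AND Y) cell using one fetch per term (M+N in total)."""
    hits_x = {name: set(fetch_ids(q)) for name, q in lexicon_x.items()}
    hits_y = {name: set(fetch_ids(q)) for name, q in lexicon_y.items()}
    # The M*N cells come from local set intersections, not further queries.
    return {(x, y): len(hits_x[x] & hits_y[y]) for x, y in product(hits_x, hits_y)}


# Toy index standing in for PubMed: query string -> matching document IDs.
TOY_INDEX = {"q_a": {1, 2, 3}, "q_b": {3, 4}, "q_c": {2, 3, 5}}
calls = []


def toy_fetch(query):
    calls.append(query)  # record each "remote" query
    return TOY_INDEX[query]


table = contingency_table({"A": "q_a", "B": "q_b"}, {"C": "q_c"}, toy_fetch)
print(table)
print("queries submitted:", len(calls))
```

With M = 2 and N = 1 the toy run issues three fetches for two cells; the saving grows quickly, since 100 by 100 terms would need 200 fetches for 10,000 cells.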
Phenomic Vocabularies as Lexica
Diverse, complex phenotypes can be represented as queries (predicates) -- denoting the set of all relevant documents
PubMed / MEDLINE = central phenomics database
Query Expansion -- for Phenotypes
"nback" … N-back ("nback" OR "n-back" OR "wisconsin card sorting" OR "sternberg" OR "working memory capacity" OR "stroop" OR "choice reaction time" OR "paced auditory serial addition" OR "pasat" OR "digit span" OR "delayed match to sample")
[Figure labels: paced auditory serial addition, Wisconsin card sorting, choice reaction time, Sternberg, Stroop]
Queries (like "n-back test") can be expanded with terms related to their target concept (like working memory), using statistical models to identify better expansions.
Expansion can improve precision and recall of queries that are being used as models of concepts/phenotypes.
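The mechanical part of expansion, joining related terms with OR and then checking precision and recall against a set of known-relevant documents, can be sketched as follows. The statistical models used to choose the expansion terms are not described in the slides and are not reproduced here; `expand_query`, `precision_recall`, and the document IDs are illustrative assumptions.

```python
def expand_query(terms):
    """Join terms into a single OR query with each term quoted."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved ID set against a relevant ID set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Terms taken from the expanded N-back query above.
expanded = expand_query(["nback", "n-back", "digit span"])
print(expanded)

# Made-up document IDs, for illustration only.
precision, recall = precision_recall(retrieved={1, 2, 3, 4}, relevant={2, 3, 5})
print(precision, recall)
```

Comparing these two numbers before and after adding a term gives a simple check on whether that term improved the query as a model of the phenotype.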
Summary
• PubAtlas as a tool for concept "BLASTing"
• Lexica are concept bases / user-defined query languages
• PubAtlas constructs concept maps / literature maps
• Extends PubMed with temporal indexing
• Multiple features for exploration, visualization
• Real uses: semi-automated reviews, who is doing what, ...
• Many interesting directions for further work
• Applications in Phenomics
• Phenotypes are often naturally represented as queries
• Promising applications in interdisciplinary collaboration