This article explores how biomedical knowledge becomes fragmented within disciplines and how integration across disciplines occurs. It discusses the concept of non-interactive literatures, complementary knowledge, undiscovered public knowledge, and the development of hypothesis-generation and discovery support systems. The Arrowsmith tool is introduced as a multi-step, multi-tool process for citation acquisition and relationship mining. The goal is to generate testable hypotheses and facilitate the integration of biomedical knowledge.
Swanson & Smalheiser’s Arrowsmith and the Fragmentation of Biomedical Knowledge
John MacMullen
SILS Bioinformatics Journal Club
Fall 2002
How do literatures become fragmented?
• By discipline… Advances in Biophysics; Astrophysics; Journal of Mathematical Physics
• By sub-discipline… Aquatic Toxicology; Proteomics; Molecular Immunology
• By professional specialty… Annals of Oncology; Biological Research for Nursing
• By theory / practice… Experimental Gerontology; Veterinary Dermatology
• By language… Zeitschrift für Pflanzenernährung und Bodenkunde
• By topic (e.g., disease)… Cancer; Molecular Carcinogenesis; Multiple Sclerosis; Pain
• By physical subject… Journal of Fish Diseases; Yeast
• By structure… Blood; Cell; Lipids; Nucleic Acids Research
• By function… Apoptosis; Journal of Molecular Catalysis; Traffic - The International Journal of Intracellular Transport
• By technique… Electrophoresis; International Journal of Mass Spectrometry; Ultramicroscopy
How do literatures become integrated?
• By reviews… Annual Review of Genomics and Human Genetics
• By special issues… Nucleic Acids Research annual database issue; JASIS&T bioinformatics issue
• By sub-discipline… Bioinformatics; Computers & Chemistry
• By citations… within and across domains
• By search tools… indexes, databases, search engines, etc.
• By discovery support systems…?
Integration Timeline
• 1936: H.G. Wells recognizes the significance of the fragmentation of knowledge across disciplines; conceives “World Encyclopedia”
• mid-1960s: Herbert Bohnert & Manfred Kochen explore the “World Encyclopedia” concept in light of the growth of computers; Eugene Garfield develops Science Citation Index and ISI to enable “information discovery & information recovery”
• mid-1970s: Manfred Kochen revisits the “World Encyclopedia” concept in the book Integrative Mechanisms in Literature Growth
• 1980s – 1990s: Don Swanson & Neil Smalheiser explore “complementary but mutually disjoint, non-interactive literatures” and find “undiscovered public knowledge” via Arrowsmith
• 1990s – present: Claire Beghtol, Roy Davies, Susan Dumais, Sherilynn Fuller et al., Michael Gordon, Mark Spasser, Marc Weeber, et al. pursue Swanson’s ideas
Swanson’s premises
• Specialization causes knowledge to be fragmented into non-interactive (mutually disjoint) literatures
• Some non-interactive literatures are complementary
• Two non-relevant things may become relevant when joined
• Implicit vs explicit linkages
• Undiscovered public knowledge
• Toward ‘hypothesis-generation-’ / ‘discovery support systems’
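The first two premises can be made concrete with a small sketch. It treats a literature as a list of records and asks two questions: do the literatures interact (shared articles or cross-citations), and do they nonetheless share linking terms? The record fields (`id`, `cites`, `terms`) and both function names are assumptions for illustration, and the records are invented toy data loosely modeled on Swanson's well-known fish oil / Raynaud's example, not real MEDLINE entries.

```python
# Toy illustration only: field names, helpers, and records are hypothetical.

def cited_ids(literature):
    """Collect every article id cited by any record in the literature."""
    return set().union(*(rec["cites"] for rec in literature))

def is_non_interactive(lit_a, lit_c):
    """True if the literatures share no articles and never cite each other."""
    ids_a = {rec["id"] for rec in lit_a}
    ids_c = {rec["id"] for rec in lit_c}
    shares_articles = bool(ids_a & ids_c)
    cross_cites = bool(cited_ids(lit_a) & ids_c) or bool(cited_ids(lit_c) & ids_a)
    return not (shares_articles or cross_cites)

def linking_terms(lit_a, lit_c):
    """Terms used in both literatures: candidate implicit linkages."""
    terms_a = set().union(*(rec["terms"] for rec in lit_a))
    terms_c = set().union(*(rec["terms"] for rec in lit_c))
    return terms_a & terms_c

# Invented records: one "fish oil" literature, one "raynaud" literature.
lit_a = [
    {"id": "a1", "cites": {"a2"}, "terms": {"fish oil", "blood viscosity"}},
    {"id": "a2", "cites": set(), "terms": {"fish oil", "platelet aggregation"}},
]
lit_c = [
    {"id": "c1", "cites": {"c2"}, "terms": {"raynaud", "blood viscosity"}},
    {"id": "c2", "cites": set(), "terms": {"raynaud", "vasoconstriction"}},
]

print(is_non_interactive(lit_a, lit_c))
print(linking_terms(lit_a, lit_c))
```

Under these assumptions, two literatures are complementary when `is_non_interactive` holds and `linking_terms` is non-empty: no explicit link exists, but an implicit one does.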
Arrowsmith http://arrowsmith.psych.uic.edu
• Multi-step, multi-tool process
• Procedure 1: Citation acquisition
  • Search MEDLINE for topical cites (‘C’ list)
  • Apply stopword list and extract unique terms (‘B’ list)
  • Search MEDLINE for ‘B’ term cites; prune list
  • Perform MEDLINE searches for each ‘B’ term
  • Classify results into likely categories
  • Derive the intersection of each ‘B’ set with the restriction set, and the union of intersection sets (‘U’)
  • Search the resulting terms of ‘U’ set in MEDLINE
  • ‘U’ list becomes potential ‘A’ terms, with each ‘A’ term attached to the ‘B’ term that generated it
  • Rank ‘A’ term results against ‘B’ co-occurrence
• Procedure 2: Relationship Mining
  • Search for pre-existing A→C &/or A→B→C relationships
  • Search for novel A→C relationships
• Output: Display of ‘A’ & ‘C’ cites by their common ‘B’ terms
• Goal: a plausible testable hypothesis
• Human relevance judgments in each step influence future steps
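The C → B → A flow above can be sketched in a few lines. This is a heavily simplified illustration, not Arrowsmith's implementation: a tiny invented title corpus stands in for MEDLINE, word-level matching stands in for real searching, and the category classification, restriction-set intersection, and human pruning steps are omitted. `STOPWORDS`, `CORPUS`, `tokens`, `search`, and `candidate_a_terms` are all hypothetical names.

```python
from collections import defaultdict

# Hypothetical stopword list; verbs and generic words are pruned by hand here,
# standing in for the human pruning step on the slide.
STOPWORDS = {"in", "for", "and", "of", "the", "disease",
             "lowers", "reduces", "inhibits"}

# Invented titles standing in for MEDLINE records (id -> title).
CORPUS = {
    1: "blood viscosity in raynaud disease",
    2: "platelet aggregation in raynaud disease",
    3: "fish oil lowers blood viscosity",
    4: "fish oil reduces platelet aggregation",
    5: "aspirin inhibits platelet aggregation",
    6: "nifedipine for raynaud disease",
}

def tokens(title):
    """Unique non-stopword terms in a title."""
    return {w for w in title.lower().split() if w not in STOPWORDS}

def search(term):
    """Ids of records whose title contains the term (toy stand-in for a search)."""
    return {rec_id for rec_id, title in CORPUS.items() if term in tokens(title)}

def candidate_a_terms(c_term):
    """Rank candidate 'A' terms by how many 'B' terms link them to the 'C' term."""
    c_ids = search(c_term)                                   # 'C' list
    b_terms = set().union(*(tokens(CORPUS[i]) for i in c_ids)) - {c_term}
    links = defaultdict(set)                                 # 'A' term -> 'B' terms
    for b in b_terms:
        for rec_id in search(b) - c_ids:                     # 'B' cites outside 'C'
            # Dropping 'B' terms and the 'C' term removes every term that
            # already co-occurs with 'C', so only novel A→C candidates remain.
            for a in tokens(CORPUS[rec_id]) - b_terms - {c_term}:
                links[a].add(b)
    # Most linking 'B' terms first; ties broken alphabetically.
    return sorted(links.items(), key=lambda kv: (-len(kv[1]), kv[0]))

for a_term, b_links in candidate_a_terms("raynaud"):
    print(a_term, sorted(b_links))
```

In this toy corpus, "fish" and "oil" rank ahead of "aspirin" because more 'B' terms connect them to the 'C' term. Each 'A' term stays attached to the 'B' terms that generated it, which is what allows the output to be displayed by common 'B' terms and then judged by a human reader.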