Biological Storytelling: A Software Tool for Biological Information Organization Based upon Narrative Structure
Allan Kuchinsky, Kathy Graham, David Moh, Annette Adler, Ketan Babaria, Michael L. Creech
Agilent Technologies, 3500 Deer Creek Road, Palo Alto, California, USA
allan_kuchinsky@agilent.com

Understanding the Molecular Basis of Disease
Source: Weeraratna et al., “Wnt5a Signaling Directly Affects Cell Motility and Invasion of Metastatic Melanoma”, Cancer Cell, April 2002

High-Throughput Experimental Methods
DNA microarray technology enables biologists to simultaneously study an entire set of cellular processes at a molecular level (per experiment … )
Source: Stephanie Fulmer-Smentek, Robert Kincaid, Agilent Technologies

Biological Storytelling
• Reason across multiple types of experimental data
• Formulate hypotheses and high-level descriptions

Supporting Synthesis Tasks of Biomedical Researchers
• Most bioinformatics tools support the analysis activities of bioscience researchers, but not their synthesis activities
• Synthesis involves:
  • keeping track of the diverse pieces of information collected during database searches and other data analysis activities
  • organizing and using the diverse information, formulating hypotheses and higher-level explanations
    • E.g. elucidating the structure and function of biological pathways
  • validating hypotheses and higher-level explanations against detailed experimental data
  • sharing the information with colleagues and working collaboratively to refine hypotheses

Aspects of Synthesis Work: Findings from User Research
• Like detective work, “mind-mapping”
• Information is “free-form”, semi-structured
• Multiple hypotheses/alternatives
• Group work, multiple perspectives, views
• Grouping together chunks of related information
• Verbal and visual reasoning
• Notepads, whiteboards, lab notebooks

The Role of Narrative
• Many biologists talked in terms of “piecing together a story” …
  • of what a gene does, how it fits into a pathway with other genes, proteins, …
• Cognitive/social psychology research finds that people use story structure as a way of organizing and remembering information
  • Thorndyke, Schank, Middleton and Edwards, Erickson, Gershon and Ward
• Useful story development software tools exist for other domains
  • E.g. screenwriting, video production

Main Concepts of Our Software
• Free-form data model
• Narrative structure
• “Top-down” hypothesis formulation, “bottom-up” data exploration
• Annotation
• Multidisciplinary collaboration

ITEMS
• Basic “atomic” unit of information
• Represent “biological entities”: genes, proteins, …
• Sortable on multiple keys
• Group selectable
• Populate manually or semi-automatically
• Links to detailed experimental data
• Links to public data and literature
• Data values can be color-encoded

COLLECTIONS
• Free-form sets of items
• Malleable
  • split, merge, add, move
• Represent cognitive chunks
• Can be nested
• Populate manually or semi-automatically
• Links to detailed experimental data
• Links to public data and literature

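The items and collections described above can be sketched as a small data model. This is a minimal illustration, not the tool's actual implementation: the class names `Item` and `Collection`, their fields, and the sample values are all assumptions made for the example, and `GENE_B` is a placeholder entity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class Item:
    """Hypothetical atomic unit: one biological entity (gene, protein, ...)."""
    name: str
    values: Dict[str, float] = field(default_factory=dict)  # data value per experiment
    links: List[str] = field(default_factory=list)          # links to data or literature


@dataclass
class Collection:
    """Hypothetical free-form set of items; collections can be nested."""
    name: str
    members: List[Union[Item, "Collection"]] = field(default_factory=list)

    def add(self, member: Union[Item, "Collection"]) -> None:
        self.members.append(member)

    def items(self) -> List[Item]:
        """Flatten nested collections into a plain list of items."""
        out: List[Item] = []
        for m in self.members:
            out.extend(m.items() if isinstance(m, Collection) else [m])
        return out

    def sorted_items(self, *keys: str) -> List[Item]:
        """Sort on multiple keys; a missing value is treated as 0.0."""
        return sorted(self.items(),
                      key=lambda it: tuple(it.values.get(k, 0.0) for k in keys))

    def split(self, predicate: Callable[[Item], bool], name: str) -> "Collection":
        """Move directly contained items matching the predicate into a new collection."""
        moved, kept = [], []
        for m in self.members:
            (moved if isinstance(m, Item) and predicate(m) else kept).append(m)
        self.members = kept
        return Collection(name, moved)

    def merge(self, other: "Collection") -> None:
        """Absorb all members of another collection, leaving it empty."""
        self.members.extend(other.members)
        other.members = []


# Illustrative values only; they are not experimental data.
pathway = Collection("candidate pathway")
pathway.add(Item("WNT5A", {"exp1": 2.0}))
pathway.add(Item("GENE_B", {"exp1": -1.0}))
up = pathway.split(lambda it: it.values.get("exp1", 0.0) > 0, "up-regulated")
notebook = Collection("notebook", [pathway, up])  # nesting
print([it.name for it in notebook.sorted_items("exp1")])
```

Keeping `split` and `merge` as cheap list operations reflects the "malleable" requirement: regrouping should never require restructuring the underlying items.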
STORIES
• Represent state of biological hypotheses and understandings
• Represent paths explored and alternative hypotheses
• Support for deliberation via Support/Oppose story nodes
• Links to detailed experimental data
• Links to public data and literature
• Narrative structure as an organizing principle

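One possible way to picture a story with Support and Oppose nodes is as a tree of statements, each carrying a stance and optional evidence links. The sketch below is an assumption made for illustration: the `StoryNode` class, its fields, and the stance labels are hypothetical, and the node texts are placeholders rather than findings.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class StoryNode:
    """Hypothetical story node: a statement plus its stance toward the parent."""
    text: str
    stance: str = "claim"  # assumed labels: "claim", "support", "oppose"
    evidence: List[str] = field(default_factory=list)   # links to data or literature
    children: List["StoryNode"] = field(default_factory=list)

    def add(self, child: "StoryNode") -> "StoryNode":
        self.children.append(child)
        return child

    def tally(self) -> Tuple[int, int]:
        """Count direct Support and Oppose children of this node."""
        support = sum(1 for c in self.children if c.stance == "support")
        oppose = sum(1 for c in self.children if c.stance == "oppose")
        return support, oppose

    def outline(self, depth: int = 0) -> List[str]:
        """Render the story as an indented textual outline."""
        lines = ["  " * depth + f"[{self.stance}] {self.text}"]
        for c in self.children:
            lines.extend(c.outline(depth + 1))
        return lines


# Placeholder statements, for illustration only.
hypothesis = StoryNode("Wnt5a signaling affects melanoma cell motility")
hypothesis.add(StoryNode("placeholder supporting observation", "support"))
hypothesis.add(StoryNode("placeholder conflicting observation", "oppose"))
print("\n".join(hypothesis.outline()))
```

Because opposing nodes stay in the tree instead of being deleted, the paths explored and the alternatives considered remain visible to collaborators.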
Story Grammar
Source: Thorndyke, P.W. (1977), “Cognitive Structures in Comprehension and Memory of Narrative Discourse”, Cognitive Psychology, 9, pp. 77-110

PUTTING THE STORY TOGETHER GRAPHICALLY
• Graphical network diagram is a common visual metaphor in molecular biology
  • Pathways, protein/protein interaction networks
• Represent the “nouns” and “verbs” of biological stories
  • Nouns = biological entities, the players (genes, proteins)
  • Verbs = relationships between biological entities (promotes, inhibits)

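The noun/verb view above maps naturally onto a labeled directed graph, where nodes are entities and edge labels are relationships. The following is a minimal sketch under that assumption; `StoryGraph` and its methods are hypothetical names, and `GeneA`, `ProteinB`, and `GeneC` are placeholder entities.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class StoryGraph:
    """Hypothetical labeled directed graph: nouns are nodes, verbs are edge labels."""

    def __init__(self) -> None:
        # noun -> list of (verb, noun) pairs
        self.edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def relate(self, subject: str, verb: str, obj: str) -> None:
        self.edges[subject].append((verb, obj))

    def sentences(self) -> List[str]:
        """Read each edge back as a short textual statement."""
        return [f"{s} {v} {o}" for s, targets in self.edges.items() for v, o in targets]

    def downstream(self, start: str) -> List[str]:
        """List every entity reachable from the starting entity."""
        seen: List[str] = []
        stack = [start]
        while stack:
            node = stack.pop()
            for _, nxt in self.edges.get(node, []):
                if nxt not in seen:
                    seen.append(nxt)
                    stack.append(nxt)
        return seen


graph = StoryGraph()
graph.relate("GeneA", "promotes", "ProteinB")
graph.relate("ProteinB", "inhibits", "GeneC")
print(graph.sentences())
print(graph.downstream("GeneA"))
```

The `sentences` method hints at how one structure could back both a diagram and a textual story.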
SEMANTIC OVERLAYS
• For validating high-level explanations and hypotheses against data
• Juxtapose data values onto elements of graphical and textual stories
• “Step through” experimental data columns and “light up” elements of graphical and textual stories
• Analogy to qualitative simulation

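Stepping through data columns and lighting up story elements can be sketched as a mapping from each element's value in the current column to a display color. Everything here is an assumption for illustration: the function names, the threshold of 1.0, and the red/green/black/grey color convention are not taken from the tool, and the values are made up.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

DataTable = Dict[str, Dict[str, float]]  # element name -> {column name: value}


def color_for(value: Optional[float], threshold: float = 1.0) -> str:
    """Assumed convention: red = up, green = down, black = unchanged, grey = no data."""
    if value is None:
        return "grey"
    if value >= threshold:
        return "red"
    if value <= -threshold:
        return "green"
    return "black"


def overlay(elements: Iterable[str], data: DataTable, column: str) -> Dict[str, str]:
    """Juxtapose one data column onto the story elements."""
    return {name: color_for(data.get(name, {}).get(column)) for name in elements}


def step_through(elements: List[str], data: DataTable,
                 columns: List[str]) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield one colored frame per experimental data column."""
    for column in columns:
        yield column, overlay(elements, data, column)


# Made-up values for placeholder entities.
elements = ["GeneA", "ProteinB", "GeneC"]
data = {"GeneA": {"t0": 0.1, "t1": 1.8}, "GeneC": {"t0": 0.0, "t1": -1.4}}
frames = list(step_through(elements, data, ["t0", "t1"]))
for column, colors in frames:
    print(column, colors)
```

A researcher could then compare each frame against the story: for example, an entity the story says is inhibited would be expected to turn green after its inhibitor turns red.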
Support for Multidisciplinary Collaboration
• Annotation is tagged with user name and timestamp
• Support and Oppose story nodes document alternative lines of thought
• Web repository enables review by non-users of the system
“… we build up a consensus hallucination about what is going on in the living cell …” (NHGRI)

Annotation
• Every system element can have arbitrary textual notes
• Citations can be dragged and dropped from Web-based literature
• Citations can have attached notes/comments
• Annotations are interlinked with system elements and with other annotations

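The annotation behavior above, together with the user name and timestamp tagging mentioned earlier, might be modeled roughly as follows. This is a sketch under stated assumptions: `Annotation`, `AnnotationStore`, the string identifiers such as `item:WNT5A`, and the user names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Iterable, List


@dataclass
class Annotation:
    """Hypothetical note tagged with its author and creation time."""
    author: str
    text: str
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    links: List[str] = field(default_factory=list)  # ids of elements or other annotations


class AnnotationStore:
    """Hypothetical store that attaches notes to any system element by id."""

    def __init__(self) -> None:
        self.by_target: Dict[str, List[Annotation]] = {}

    def annotate(self, target_id: str, author: str, text: str,
                 links: Iterable[str] = ()) -> Annotation:
        note = Annotation(author, text, links=list(links))
        self.by_target.setdefault(target_id, []).append(note)
        return note

    def notes_for(self, target_id: str) -> List[Annotation]:
        """Return the notes on one element, oldest first."""
        return sorted(self.by_target.get(target_id, []), key=lambda n: n.created)


store = AnnotationStore()
store.annotate("item:WNT5A", "user_a", "Check against the latest array data")
store.annotate("item:WNT5A", "user_b", "See linked citation", links=["citation:1"])
for note in store.notes_for("item:WNT5A"):
    print(note.author, note.text, note.links)
```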
Usage Feedback
• How much structure is just right?
  • Flexibility in story grammar (tags vs. rigid structure)
  • Diversity in grouping (columns as well as rows)
• Diagrammatic vocabulary
  • Biochemical reactions vs. signal transduction

Related Work
• Digital storytelling
  • Storyspace (Eastgate Systems)
• Personal information management
  • Lotus Agenda
• Issue-Based Information Systems
  • Rittel, McCall, Conklin & Begeman
• Biological pathway databases
  • STKE, BIND, KEGG, EcoCyc, TransPath, SPAD, …
• Semantic overlays
  • EcoCyc, GenMAPP (gene expression on pathways)
  • Kenna Technologies (qualitative simulation)
• Diagrammatic UIs to cell and pathway information
  • CellSpace (Cellomics)
• Biological information management
  • eLabBook (LabBook, Inc.)

Future Work
• Multiple data types
• Support a “systems” perspective on biology
• Utilization of data mining, computational tools
• Duality between graphical and textual storytelling
• Richer annotation
• Richer diagrammatic semantics
• Scaling

Acknowledgements
• Agilent Technologies
  • Dean Thompson, Deborah Hall, Laurakay Bruhn, Steve Laderman, Steve Andrews, Shawn Hwang, Carl Steves, Alex Veilleux, Robert Kincaid, and Aditya Vailaya
• National Human Genome Research Institute
  • Paul Meltzer, Mike Bittner, Yidong Chen, and Jeff Trent
• Formative discussions
  • Bob Allen (UMD)
  • Abbe Don

… more info …
Allan Kuchinsky
Agilent Technologies, 3500 Deer Creek Road, Palo Alto, California, USA
allan_kuchinsky@agilent.com
+1 (650) 485-7423
http://www.agilent.com

Just Enough Molecular Biology
[Figure: flow of information from chromosome and DNA through RNA to protein, with regulation]
• DNA: DNA sequence (genes) determines which proteins are made (what the cell could theoretically do)
• Transcription to RNA: mRNA quantity determines the amount of each protein (what the cell is trying to do)
• Translation to protein: proteins perform cellular functions (what the cell is doing)
• Figure labels: gene GCA CAG GGC / CGT GTC CCG; RNA GCA CAG GGC; protein Arg - Val - Pro