Knowledge Engineering

Start with the question: “What is an ‘atom’ of scientific knowledge?”

Scientific assertions as ‘computable, citable elements’
• There is a very large number of statements like ‘mice like cheese’.
• Semantics at this level are complicated!
• For example:
  • “Novel neurotrophic factor CDNF protects midbrain dopamine neurons in vivo” [Lindholm et al 2007]
  • “Hippocampo-hypothalamic connections: origin in subicular cortex, not ammon's horn.” [Swanson & Cowan 1975]
  • “Intravenous 2-deoxy-D-glucose injection rapidly elevates levels of the phosphorylated forms of p44/42 mitogen-activated protein kinases (extracellularly regulated kinases 1/2) in rat hypothalamic parvicellular paraventricular neurons.” [Khan & Watts 2004]
• Assertions vary in their levels of reliability and specificity.
• Can we introduce a generalized formalism that could support automated reasoning?
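One way to picture an assertion as a ‘computable, citable element’ is a small record that pairs the statement with its citation. The sketch below is only an illustration, not the KEfED formalism itself: the class names `Citation` and `Assertion`, the `reliability` field, and the helper `published_in` are all hypothetical. The statements and citations are the examples listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Citation:
    """Hypothetical minimal citation: author string plus year."""
    authors: str
    year: int

    def label(self) -> str:
        return f"[{self.authors} {self.year}]"


@dataclass(frozen=True)
class Assertion:
    """Hypothetical 'atom' of knowledge: one statement plus its source."""
    statement: str
    citation: Citation
    # Assumed placeholder for a reliability score; left unset because
    # the slides do not say how reliability would be measured.
    reliability: Optional[float] = None

    def cite(self) -> str:
        return f"{self.statement} {self.citation.label()}"


ASSERTIONS = [
    Assertion(
        "Novel neurotrophic factor CDNF protects midbrain dopamine neurons in vivo",
        Citation("Lindholm et al", 2007),
    ),
    Assertion(
        "Hippocampo-hypothalamic connections: origin in subicular cortex, not ammon's horn.",
        Citation("Swanson & Cowan", 1975),
    ),
]


def published_in(items: List[Assertion], year: int) -> List[Assertion]:
    """Return the assertions whose citation carries the given year."""
    return [a for a in items if a.citation.year == year]


if __name__ == "__main__":
    for assertion in ASSERTIONS:
        print(assertion.cite())
```

Even this flat record makes assertions citable and filterable. It does not capture the semantics of the statement itself, which is the harder problem the slide raises.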

Cycles of Scientific Investigation (‘CoSI’)

E.g., ‘CDNF protects nigral dopaminergic neurons in vivo’
This statistically significant effect is the experimental basis for the findings of this study.
Our ontology engineering approach is based on experimental variables from Lindholm, P. et al. (2007), Nature, 448(7149): p. 73-7

Knowledge Engineering from Experimental Design (‘KEfED’)
Khan et al. (2007), J. Neurosci. 27:7344-60 [expt 2]

KNOWLEDGE ENGINEERING FROM EXPERIMENTAL DESIGN (‘KEfED’)
Project Overview

Project History
The KEfED formalism has been under formulation since 2006 and received its first active funding in 2007. It was initially developed in a demonstration project based on neural connectivity and has since been developed for the Michael J Fox and Kinetics Foundations for Parkinson’s research. The initial user group consists of laboratory-based neuroanatomists and neuroendocrinologists. Early phases of the project involved building initial prototypes to capture a well-understood experimental design and to generate a knowledge base for experimental data from that design. We developed numerous prototypes and released a working system from the http://www.bioscholar.org website in March 2011. Ongoing enhancements to the system include (a) ontology support, (b) the representation of statistical relations and correlations, and (c) coordination with the data management and information integration working groups.

BioScholar Application
• Develop a knowledge base framework for observations and interpretations from experiments.
• Scientists manually curate data from publications into a generic database driven by the KEfED model.
• Designs can be reused for multiple experiments.
• The design process is intuitive: scientists can build a database without informatics training.
• Ideal for non-computational biologists.
• Java / Flex Web application, one-click install.
Use Cases
Scientists want to develop a generic knowledge base driven from a corpus of PDF files stored locally within a specific laboratory.
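The idea of a generic database driven by a reusable design can be sketched roughly as follows. This is an assumption-laden illustration, not BioScholar's actual data model: it assumes a design can be reduced to a set of named experimental variables, split here into parameters and measurements, and that every curated record must supply exactly those variables. The names `ExperimentDesign`, `KnowledgeBase`, the variable names, the file names, and the values are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class ExperimentDesign:
    """Hypothetical design: the variables a curated record must fill in."""
    name: str
    parameters: Tuple[str, ...]    # assumed: conditions set by the experimenter
    measurements: Tuple[str, ...]  # assumed: values observed in the experiment

    def variables(self) -> Set[str]:
        return set(self.parameters) | set(self.measurements)


@dataclass
class KnowledgeBase:
    """Hypothetical generic store whose schema comes from the design."""
    design: ExperimentDesign
    records: List[Dict[str, str]] = field(default_factory=list)

    def add(self, source: str, **values: str) -> None:
        """Store one curated record, rejecting variables the design lacks."""
        supplied = set(values)
        expected = self.design.variables()
        if supplied != expected:
            raise ValueError(
                f"missing={sorted(expected - supplied)} "
                f"unexpected={sorted(supplied - expected)}"
            )
        self.records.append({"source": source, **values})


# One design, reused for two separately curated experiments.
tracing = ExperimentDesign(
    name="tract-tracing (placeholder)",
    parameters=("injection_site",),
    measurements=("labeling_density",),
)
experiment_a = KnowledgeBase(tracing)
experiment_b = KnowledgeBase(tracing)

# Placeholder values only; these are not data from any publication.
experiment_a.add("paper-a.pdf", injection_site="site A", labeling_density="dense")
experiment_b.add("paper-b.pdf", injection_site="site B", labeling_density="sparse")
```

Because the design is the schema, a curator who has defined the variables once does not need to write any database code for later experiments that share the same design.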

Crux Application
• Scientists within a disease foundation must plot a whole research program.
• How to keep track of hypotheses, experimental results and outcomes to plan the next phase of the project?
• System is just about to start year 2 of funding geared towards curation of raw data (not from publications).
• Possible framework to help scientists develop simple databases.
Use Cases
Decision makers at a disease foundation want to store raw data generated a generic knowledge base driven from a corpus of PDF files stored locally within a specific laboratory

Screenshots
