BioSigNet: Reasoning and Hypothesizing about Signaling Networks Nam Tran
Main points • Biomedical databases: structured data and queries. http://cbio.mskcc.org/prl/ • Next step: knowledge bases and reasoning. • Kinds of reasoning, incomplete knowledge • How can existing knowledge be revised, expanded? • Hypothesis formation • Experimental verifications
Knowledge based reasoning • Various kinds of reasoning • Prediction – side effects • Planning – designing therapies • Explanation – reasoning about unobserved aspects • Consistency checking – correctness of ontologies • Additional facets/nuances • Reasoning with incomplete knowledge. • Reasoning with defaults. • Ease of updating knowledge (elaboration tolerance)
Hypothesis formation • If: • our observations cannot be explained by our existing knowledge, • or the explanations given by our existing knowledge are invalidated by experiments, • Then: our knowledge needs to be augmented or revised. • How? • Can we use a reasoning system to propose hypotheses that can be verified through experimentation?
[Figure: the "UV leads_to cancer" example, relating the knowledge base to the hypothesis space. Labels: High UV, p53, Cancer, No cancer, (K,I) |= O]
Motivation -- summary • Goal: To emulate the abstract reasoning done by biologists, medical researchers, and pharmacology researchers. • Types of reasoning: prediction, explanation and planning. • Current systems biology approaches: mostly prediction. • Incomplete knowledge constantly needs to be updated -> Hypothesis formation
Overview of our approach • Represent signal network as a knowledge base that describes • actions/events (biological interactions, processes). • effect of these actions/events. • triggering conditions of the actions/events. • To query using the knowledge base: • Prediction; explanation; planning. • Hypothesizing to discover new knowledge • BioSigNet-RRH: Biological Signal Network – Representation, Reasoning and Hypothesizing
Foundation behind our approach • Research on representing and reasoning about dynamic systems (space shuttles, mobile robots, software agents) • causal relations between properties of the world • effects of actions (when can they be executed) • goal specification • action-plans • Research on knowledge representation, reasoning and declarative problem solving – the AnsProlog language.
Representing signal networks as a Knowledge Base • Alphabet: • Actions/Events: bind(ligand,receptor) • Fluents: high(ligand), high(receptor) • Statements: • Effect axioms: • bind(ligand,receptor) causes bound(ligand,receptor) if cond • high(other_ligand) inhibits bind(ligand,receptor) if cond • Trigger conditions: high(ligand), high(receptor) triggers bind(ligand,receptor)
Initial observations, Queries, Entailment • Entailment: (K,I) |= Q • Given • K: the knowledge base of binding • I: initially high(ligand), high(receptor) Conclude • Q = eventually bound(ligand,receptor) • Given • K: the knowledge base of binding • I’: initially high(ligand), high(receptor), high(other_ligand) Conclude Q
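The entailment check on this slide can be illustrated with a small forward simulation. This is only a sketch under assumed semantics, not the BioSigNet-RRH implementation: a trigger fires whenever its conditions hold and no inhibitor of the same action is active, and the unspecified condition "cond" is treated as always true. The function name `eventually` and the dictionary layout are hypothetical.

```python
# Toy encoding of the binding knowledge base (an assumed layout, not AnsProlog).
K = {
    "causes": [("bind(ligand,receptor)", "bound(ligand,receptor)")],
    "triggers": [
        (frozenset({"high(ligand)", "high(receptor)"}), "bind(ligand,receptor)"),
    ],
    "inhibits": [
        (frozenset({"high(other_ligand)"}), "bind(ligand,receptor)"),
    ],
}


def eventually(kb, initial, goal, max_steps=10):
    """Return True if `goal` holds in some state reached from `initial`."""
    state = set(initial)
    for _ in range(max_steps):
        if goal in state:
            return True
        # An action fires if a trigger's conditions hold and no inhibitor applies.
        fired = {
            act
            for conds, act in kb["triggers"]
            if conds <= state
            and not any(ic <= state for ic, ia in kb["inhibits"] if ia == act)
        }
        new_state = state | {eff for act, eff in kb["causes"] if act in fired}
        if new_state == state:  # fixpoint reached without the goal
            return False
        state = new_state
    return goal in state


Q = "bound(ligand,receptor)"
I = {"high(ligand)", "high(receptor)"}
I_prime = I | {"high(other_ligand)"}

print(eventually(K, I, Q))
print(eventually(K, I_prime, Q))
```

In this sketch Q is derived from I but not from I', because the inhibitor is assumed to be unconditionally active. Whether Q follows from I' in the original formalism depends on the unspecified "cond".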
Importance of a formal semantics • Besides defining prediction, explanation and planning, it is also useful in identifying: • Under what restrictions the answer given by a given algorithm will be correct. (soundness!) • Under what restrictions a given algorithm will find a correct answer if one exists. (completeness!)
bind(TNF-α,TNFR1) causes trimerized(TNFR1) • trimerized(TNFR1) triggers bind(TNFR1,TRADD)
Prediction • Given some initial conditions and observations, to predict how the world would evolve or predict the outcome of (hypothetical) interventions.
Initial Condition • bind(TNF-α,TNF-R1) occurs at 0 • Query • predict eventually apoptosis • Answer: Unknown! • Incomplete knowledge about TRADD’s bindings. • Depends on whether bind(TRADD, RIP) happened or not!
Initial Condition • bind(TNF-α,TNF-R1) occurs at t0 • Observation • TRADD’s binding with TRAF2, FADD, RIP • Query • predict eventually apoptosis • Answer: Yes!
Explanation • Given initial condition and observations, to explain why final outcome does not match expectation. • Relation to diagnosis.
Initial condition: • bound(TNF-α,TNFR1) at t0 • Observation: • bound(TRADD, TRAF2) at t1 • Query: Explain apoptosis • One explanation: • Binding of TRADD with RIP • Binding of TRADD with FADD
Planning • Given initial conditions, to plan interventions to achieve a goal. • Application in drug and therapy design.
Planning requirements • In addition to the knowledge about the pathway we need additional information about possible interventions such as: • What proteins can be introduced • What mutations can be forced.
Planning example • Defining possible interventions: • intervention intro(DN-TRAF2) • intro(DN-TRAF2) causes present(DN-TRAF2) • present(DN-TRAF2) inhibits bind(TRAF2,TRADD) • present(DN-TRAF2) inhibits interact(TRAF2,NIK) • Initial condition: • bound(NFκB,IκB) at 0 • bind(TNF-α,TNF-R1) at 0 • Goal: to keep NFκB inactive. • Query: • plan always bound(NFκB,IκB) from 0
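The planning query can be sketched as a brute-force search over sets of interventions. The slide lists only the intervention rules; the pathway rules marked HYPOTHETICAL below are a heavily simplified stand-in invented for this illustration, and the names `always_holds` and `find_plans` are assumptions as well. Interventions are assumed to occur at time 0, together with the TNF-α binding.

```python
from itertools import combinations

# action -> (fluents made true, fluents made false)
CAUSES = {
    "intro(DN-TRAF2)": ({"present(DN-TRAF2)"}, set()),           # from the slide
    "bind(TNF-α,TNF-R1)": ({"bound(TNF-α,TNF-R1)"}, set()),      # HYPOTHETICAL
    "bind(TRAF2,TRADD)": ({"bound(TRAF2,TRADD)"}, set()),        # HYPOTHETICAL
    "interact(TRAF2,NIK)": (set(), {"bound(NFκB,IκB)"}),         # HYPOTHETICAL
}
TRIGGERS = [  # HYPOTHETICAL stand-in for the pathway
    ({"bound(TNF-α,TNF-R1)"}, "bind(TRAF2,TRADD)"),
    ({"bound(TRAF2,TRADD)"}, "interact(TRAF2,NIK)"),
]
INHIBITS = [  # from the slide
    ({"present(DN-TRAF2)"}, "bind(TRAF2,TRADD)"),
    ({"present(DN-TRAF2)"}, "interact(TRAF2,NIK)"),
]
INTERVENTIONS = ["intro(DN-TRAF2)"]


def always_holds(fluent, initial, occurring, max_steps=10):
    """Simulate forward and check that `fluent` is true in every state."""
    state, actions = set(initial), set(occurring)
    for _ in range(max_steps):
        if fluent not in state:
            return False
        added, removed = set(), set()
        for act in actions:
            add, rem = CAUSES[act]
            added |= add
            removed |= rem
        new_state = (state | added) - removed
        # Actions triggered in the new state, unless an inhibitor is active.
        actions = {
            act
            for conds, act in TRIGGERS
            if conds <= new_state
            and not any(ic <= new_state for ic, ia in INHIBITS if ia == act)
        }
        if new_state == state and not actions:
            return True  # nothing more can change
        state = new_state
    return fluent in state


def find_plans(goal, initial, occurring):
    """Return the smallest intervention sets that keep `goal` true."""
    for size in range(len(INTERVENTIONS) + 1):
        plans = [
            plan
            for plan in combinations(INTERVENTIONS, size)
            if always_holds(goal, initial, set(occurring) | set(plan))
        ]
        if plans:
            return plans
    return []


print(find_plans("bound(NFκB,IκB)", {"bound(NFκB,IκB)"}, {"bind(TNF-α,TNF-R1)"}))
```

Under these assumed rules the empty plan fails, because the cascade eventually releases NFκB, while introducing DN-TRAF2 blocks the cascade at its first step.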
Future work • Further development of the language • To better approximate cellular systems • Delay triggers • Granularities of representation • Continuous processes, hybrid systems • Concurrency, durative actions • Scaled-up implementation • Kohn’s map • Networks in Reactome and other repositories • Ontologies • Integration with BioPAX
[Figure: the "UV leads_to cancer" example, relating the knowledge base to the hypothesis space. Labels: High UV, p53, Cancer, No cancer, (K,I) |= O]
Issues in this tiny example • Hypothesis formation: Theory: UV leads to cancer. Observation: wild-type p53 resists the UV effect. Hypothesis: p53 is a tumor-suppressor. • Elaboration tolerance: How do we update/revise “UV leads to cancer”? • Defaults and non-monotonic reasoning: Normally UV leads to cancer. UV does not lead to cancer if p53 is present.
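The default in the last bullet can be illustrated with a minimal sketch. It is plain Python rather than AnsProlog, and the fluent names `high(UV)` and `present(p53)` are assumed for illustration. The point is non-monotonicity: adding a fact withdraws a conclusion that held before.

```python
def uv_leads_to_cancer(facts):
    """Default rule: UV leads to cancer unless the p53 exception is known."""
    exposed = "high(UV)" in facts
    exception = "present(p53)" in facts  # assumed fluent name
    return exposed and not exception


# With only UV known, the default conclusion is drawn.
print(uv_leads_to_cancer({"high(UV)"}))
# Learning that p53 is present retracts it.
print(uv_leads_to_cancer({"high(UV)", "present(p53)"}))
```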
Construction of hypothesis space • Present: manual construction, using research literature • Future: integration of multiple data sources • Protein interactions • Pathway databases • Biological ontologies ... • Provide cues, hunches such as • A may interact with B: action interact(A,B) • A-B interaction may have effect C: interact(A,B) causes C
Generation of hypotheses • Enumeration of hypotheses • Search: computing with Smodels (an implementation of AnsProlog) • Heuristics • A trigger statement is selected only if it is the only cause of some action occurrence that is needed to explain the novel observations. • An inhibition statement is selected only if it is the only blocker of some triggered action at some time. • Maximizing preferences of selected statements
Generation … (cont’): heuristics • Knowledge base K • a causes g • b causes g • Initial condition I = { initially f } • Observation O = { eventually g } • (K,I) does not entail O • Hypothesis space: to expand K with rules among • f triggers a • f triggers b • Hypotheses: { f triggers a }, or { f triggers b }
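The example on this slide can be reproduced with a brute-force enumeration. The slides state that the actual search is computed with Smodels; the sketch below swaps in a plain subset enumeration with a naive forward simulation, and the function names are hypothetical. It keeps only minimal rule sets H such that (K ∪ H, I) entails O.

```python
from itertools import combinations

CAUSES = [("a", "g"), ("b", "g")]          # K: a causes g, b causes g
INITIAL = {"f"}                             # I: initially f
GOAL = "g"                                  # O: eventually g
CANDIDATES = [                              # hypothesis space
    (frozenset({"f"}), "a"),                # f triggers a
    (frozenset({"f"}), "b"),                # f triggers b
]


def entails_eventually(causes, triggers, initial, goal, max_steps=10):
    """Forward-simulate and report whether `goal` is ever reached."""
    state = set(initial)
    for _ in range(max_steps):
        if goal in state:
            return True
        fired = {act for conds, act in triggers if conds <= state}
        new_state = state | {eff for act, eff in causes if act in fired}
        if new_state == state:
            return False
        state = new_state
    return goal in state


def minimal_hypotheses(causes, candidates, initial, goal):
    """Enumerate candidate subsets by size, keeping only minimal explanations."""
    found = []
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            if any(set(h) <= set(subset) for h in found):
                continue  # a smaller explanation is already contained in it
            if entails_eventually(causes, list(subset), initial, goal):
                found.append(subset)
    return found


for hypothesis in minimal_hypotheses(CAUSES, CANDIDATES, INITIAL, GOAL):
    print([f"{', '.join(sorted(conds))} triggers {act}" for conds, act in hypothesis])
```

The enumeration rejects the empty set, accepts each single trigger rule, and skips the two-rule set because it is not minimal, matching the two hypotheses listed on the slide.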
Tumor suppression by p53 • p53 has 3 main functional domains • N terminal transactivator domain • Central DNA-binding domain • C terminal domain that recognizes DNA damage • Appropriate binding of N terminal activates pathways that lead to protection of cell from cancer. • Inappropriate binding (say to Mdm2) inhibits p53 induced tumor suppression.
p53 knowledge base • Stress • high(UV ) triggers upregulate(mRNA(p53)) • Upregulation of p53 • upregulate(mRNA(p53)) causes high(mRNA(p53)) • high(mRNA(p53)) triggers translate(p53) • translate(p53) causes high(p53)
p53 knowledge base (cont.) • Tumor suppression by p53 • high(p53) inhibits growth(tumor)
p53 knowledge base (cont.) • Interaction between Mdm2 and p53 • high(p53), high(mdm2) triggers bind(p53,mdm2) • bind(p53,mdm2) causes bound(dom(p53,N)) • bind(p53,mdm2) causes high([p53 : mdm2]) • bind(p53,mdm2) causes ¬high(p53), ¬high(mdm2)
Hypothesis formation • Experimental observation: • I = { initially high(UV), high(mdm2), high(ARF) } • O = { eventually ~ tumorous } • (K,I) does not entail O • Need to hypothesize the role of ARF.
Constructing hypothesis space • Levels of ARF and p53 correlate • high(ARF) triggers upregulate(mRNA(p53)) • high(p53) triggers upregulate(mRNA(ARF))
Constructing …(cont’) • Interactions of ARF with the known proteins • bind(p53,ARF) causes bound(dom(p53,N))
Constructing …(cont’) • Influence of X (=ARF) on other interactions • high(ARF) triggers upregulate(mRNA(p53)) • high(ARF) triggers translate(p53) • high(ARF) triggers bind(p53,mdm2)
Hypothesis • high(UV) triggers upregulate(mRNA(ARF)) • high(ARF), high(mdm2) triggers bind(ARF,mdm2)
Future work • Automatic construction of hypothesis space • Extraction of facts like protein interactions … • Integration of knowledge from different sources • Consistency-based integration (HyBrow) • Ontologies • Heuristics for hypothesis search • Ranking of hypotheses • Make use of numeric data such as microarray measurements?