Graphical Causal Models: Determining Causes from Observations • William Marsh • Risk Assessment and Decision Analysis (RADAR), Computer Science
RADAR Group, Computer Science • Risk Assessment and Decision Analysis • Research areas • Software engineering, safety, finance, legal • A new initiative in medical data analysis: DIADEM • Norman Fenton (group leader), Martin Neil • http://www.dcs.qmul.ac.uk/researchgp/radar/
Outline • Graphical Causal Models • Bayesian networks: prediction or diagnosis • Causal induction: learning causes from data • Causal effect estimation: strength of causal relationships from data • DIADEM project
Detecting Asthma Exacerbations • Aim to assist early detection of asthma episodes in Paediatric A&E • Using only data already available electronically • Network created by • Experts • Data
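Bayes’ Theorem • P(A | B) = P(B | A) P(A) / P(B) • P(A): prior probability of A • P(A | B): revised belief about A, given evidence B • P(B | A) / P(B): factor to update belief about A, given evidence B • P(A, B) = P(B | A) P(A): joint probability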
Bayes’ Theorem (Made Easy) • A person has a positive test result • How likely is it they are infected? • About 17% • Model: Infection (yes, no) → Test (pos, neg) • Infection rate: P(I=yes) = 1% • False positive rate: P(T=pos | I=no) = 5% • Negligible false negative rate
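The 17% figure follows directly from Bayes' theorem. A minimal sketch in Python, assuming the "negligible" false negative rate is exactly zero, so that P(T=pos | I=yes) = 1; the function and parameter names are illustrative and not from the slides.

```python
def posterior_infected(prior, false_positive_rate, sensitivity=1.0):
    """Return P(I=yes | T=pos) using Bayes' theorem.

    sensitivity = P(T=pos | I=yes); 1.0 encodes the assumption of a
    zero false negative rate.
    """
    # Joint probability of being infected and testing positive.
    joint_infected_pos = sensitivity * prior
    # Total probability of a positive test, infected or not.
    p_pos = joint_infected_pos + false_positive_rate * (1.0 - prior)
    return joint_infected_pos / p_pos


posterior = posterior_infected(prior=0.01, false_positive_rate=0.05)
print(f"P(infected | positive test) = {posterior:.1%}")  # roughly 17%
```

Although the test is quite accurate, most positives come from the 99% of people who are not infected, which is why the posterior stays well below 50%.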
Medical Uses of BNs • Diagnosis • Differential diagnosis from symptoms • Prediction • Likely outcome • Building a BN • From expert knowledge → expert system • From data → data mining
Cause versus Association • Both represent the fever-infection association • ‘Causal model’ has arrow from cause to effect • Joint probability is the same: Infection → Fever or Fever → Infection?
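The point that both arrow directions encode the same joint probability can be checked numerically. The sketch below uses made-up probabilities (they are assumptions, not values from the slides): it builds the joint from the model Infection → Fever, derives the parameters of the reversed model Fever → Infection from that joint, and confirms the two models agree on every outcome.

```python
import math
from itertools import product

# Hypothetical parameters for the model Infection -> Fever.
P_INFECTION = 0.1
P_FEVER_GIVEN_INFECTION = {True: 0.8, False: 0.1}


def bern(p_true, value):
    """Probability of a Boolean value when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


# Joint P(infection, fever) from the causal direction.
joint_causal = {
    (i, f): bern(P_INFECTION, i) * bern(P_FEVER_GIVEN_INFECTION[i], f)
    for i, f in product((True, False), repeat=2)
}

# Parameters of the reversed model Fever -> Infection, read off the joint.
p_fever = sum(p for (i, f), p in joint_causal.items() if f)
p_infection_given_fever = {
    f: joint_causal[(True, f)] / bern(p_fever, f) for f in (True, False)
}

# Joint P(infection, fever) rebuilt from the reversed direction.
joint_reversed = {
    (i, f): bern(p_fever, f) * bern(p_infection_given_fever[f], i)
    for i, f in product((True, False), repeat=2)
}

same_joint = all(
    math.isclose(joint_causal[k], joint_reversed[k]) for k in joint_causal
)
print("Both directions give the same joint:", same_joint)
```

With only two variables, observational data cannot tell the two graphs apart; the arrow direction is extra causal knowledge.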
Causal Induction • Discover causal relationships from data • Sometimes distinguishable • … different conditional independence • [Figure: two alternative graphs over A, B and C]
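To illustrate how different graphs imply different conditional independences, the sketch below compares two structures chosen for illustration: a chain A → C → B and a collider A → C ← B. The slide's actual graphs are not recoverable from the text, so both structures and all probabilities here are assumptions.

```python
from itertools import product

VALUES = (0, 1)


def bern(p_one, value):
    """Probability of a binary value when P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def chain_joint(a, b, c):
    # Chain A -> C -> B: B depends on A only through C.
    return bern(0.5, a) * bern(0.9 if a else 0.2, c) * bern(0.8 if c else 0.3, b)


def collider_joint(a, b, c):
    # Collider A -> C <- B: A and B are independent causes of C.
    return bern(0.5, a) * bern(0.5, b) * bern(0.9 if a == b else 0.1, c)


def prob(joint, **fixed):
    """Sum the joint over all outcomes consistent with the fixed values."""
    total = 0.0
    for a, b, c in product(VALUES, repeat=3):
        point = {"a": a, "b": b, "c": c}
        if all(point[name] == value for name, value in fixed.items()):
            total += joint(a, b, c)
    return total


def independent(joint, given_c=None, tol=1e-9):
    """Check whether A and B are independent, optionally given C = given_c."""
    cond = {} if given_c is None else {"c": given_c}
    norm = prob(joint, **cond)
    for a, b in product(VALUES, repeat=2):
        p_ab = prob(joint, a=a, b=b, **cond) / norm
        p_a = prob(joint, a=a, **cond) / norm
        p_b = prob(joint, b=b, **cond) / norm
        if abs(p_ab - p_a * p_b) > tol:
            return False
    return True


for name, joint in (("chain", chain_joint), ("collider", collider_joint)):
    marginal = independent(joint)
    conditional = all(independent(joint, given_c=c) for c in VALUES)
    print(f"{name}: A indep. B = {marginal}, A indep. B given C = {conditional}")
```

The chain makes A and B dependent but independent given C, while the collider does the opposite. This difference in the data is what lets induction algorithms distinguish some structures.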
Causal Induction – Application • Discover causal relationships from data • Need lots of data • Applied to gene regulatory networks • Data from micro-array experiments • Recent explanation of limitations
Estimating Causal Effects • Suppose A is a cause of B (A → B) • What is the causal effect? • Is it p(B | A)?
Benefits of Sports? • Is there a relationship between sport and exam success? • Data available • ‘Intelligence’ correlate • Is this the correct test? P(exam=pass | sport) > P(exam=pass | no-sport) • [Figure: graph over intelligence, sport and exam result]
Benefits of Sports? • When we condition on ‘sport’ • Probability for ‘exam result’ • Probability for ‘intelligence’ changes • What if I decide to start sport? • p(pass | sport) > p(pass | no-sport) • [Figure: graph over intelligence, sport (observed) and exam result; values shown: 67%, 73%]
Intervention v Observation • Causal effect differs from conditional probability • Mostly interested in consequence of change • Causal effects can be measured by a Randomised Controlled Trial • Causal effect of sport on exam results not identifiable • P(pass | do(sport)) < P(pass | do(no sport)) • [Figure: graph over intelligence, sport (changed by intervention) and exam result]
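A small numerical sketch of how observation and intervention can disagree. It assumes a graph in which intelligence affects both sport and exam result, and sport also affects exam result; every probability is invented for illustration. Note that the computation uses the distribution of intelligence, which is exactly what is not available when intelligence is unmeasured.

```python
# Hypothetical model: intelligence -> sport, intelligence -> exam, sport -> exam.
P_INTEL = {"high": 0.5, "low": 0.5}
P_SPORT_GIVEN_INTEL = {"high": 0.8, "low": 0.2}
# P(pass | intelligence, sport): sport slightly lowers the pass rate here.
P_PASS = {
    ("high", True): 0.85,
    ("high", False): 0.90,
    ("low", True): 0.45,
    ("low", False): 0.50,
}


def p_sport_value(intel, sport):
    """P(sport = given value | intelligence)."""
    p = P_SPORT_GIVEN_INTEL[intel]
    return p if sport else 1.0 - p


def p_pass_observe(sport):
    """P(pass | sport): seeing sport also updates belief about intelligence."""
    weights = {i: P_INTEL[i] * p_sport_value(i, sport) for i in P_INTEL}
    total = sum(weights.values())
    return sum(weights[i] / total * P_PASS[(i, sport)] for i in P_INTEL)


def p_pass_do(sport):
    """P(pass | do(sport)): setting sport leaves intelligence at its prior."""
    return sum(P_INTEL[i] * P_PASS[(i, sport)] for i in P_INTEL)


print(f"observe: {p_pass_observe(True):.2f} vs {p_pass_observe(False):.2f}")
print(f"do:      {p_pass_do(True):.2f} vs {p_pass_do(False):.2f}")
```

In this invented example, students who do sport pass more often, yet making a student take up sport lowers their chance of passing. The gap is entirely due to the common cause.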
Benefit of Sport • New observable variable ‘attendance at lectures’ • Causal effect of sport on exam results now identifiable • [Figure: graph over intelligence, sport (S), attendance (A) and exam result (E)]
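One way a mediating variable can make the effect identifiable is the front-door adjustment. The sketch below assumes the graph is S → A → E with unobserved intelligence affecting both S and E, and no direct S → E arrow; the slide text does not confirm this structure, and all numbers are invented. It estimates P(E=pass | do(S)) from the observational distribution over (S, A, E) alone and compares it with the value computed from the full model.

```python
from itertools import product

# Hypothetical full model; intelligence (I) is hidden from the analyst.
P_I = {1: 0.5, 0: 0.5}
P_S1_GIVEN_I = {1: 0.8, 0: 0.2}   # P(S=1 | I)
P_A1_GIVEN_S = {1: 0.4, 0: 0.8}   # P(A=1 | S)
P_E1_GIVEN_IA = {(1, 1): 0.9, (1, 0): 0.7, (0, 1): 0.6, (0, 0): 0.3}


def bern(p_one, value):
    """Probability of a binary value when P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def full_joint(i, s, a, e):
    return (P_I[i] * bern(P_S1_GIVEN_I[i], s)
            * bern(P_A1_GIVEN_S[s], a) * bern(P_E1_GIVEN_IA[(i, a)], e))


# Observational distribution over (S, A, E): intelligence is summed out.
OBS = {
    (s, a, e): sum(full_joint(i, s, a, e) for i in (0, 1))
    for s, a, e in product((0, 1), repeat=3)
}


def p_obs(**fixed):
    """Observational probability of the fixed values of s, a and/or e."""
    total = 0.0
    for (s, a, e), p in OBS.items():
        point = {"s": s, "a": a, "e": e}
        if all(point[k] == v for k, v in fixed.items()):
            total += p
    return total


def front_door(s):
    """P(E=1 | do(S=s)) = sum_a P(a|s) * sum_s2 P(E=1|a,s2) * P(s2)."""
    total = 0.0
    for a in (0, 1):
        p_a_given_s = p_obs(s=s, a=a) / p_obs(s=s)
        inner = sum(
            p_obs(s=s2, a=a, e=1) / p_obs(s=s2, a=a) * p_obs(s=s2)
            for s2 in (0, 1)
        )
        total += p_a_given_s * inner
    return total


def true_effect(s):
    """P(E=1 | do(S=s)) computed with access to the hidden variable."""
    return sum(
        bern(P_A1_GIVEN_S[s], a)
        * sum(P_I[i] * P_E1_GIVEN_IA[(i, a)] for i in (0, 1))
        for a in (0, 1)
    )


for s in (1, 0):
    print(f"do(S={s}): front-door {front_door(s):.3f}, truth {true_effect(s):.3f}")
```

The front-door estimate matches the true interventional probability without ever using intelligence, which is the sense in which the effect becomes identifiable from observed variables.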
Estimating Causal Effects • Rules to convert causal to statistical questions • Generalises e.g. stratification, potential outcomes • Assumptions: a causal model • Some assumptions may be testable • Causal model • Some variables observed, others not measured • Some causal effects identifiable • Challenges • Causal models for complex applications • Statistical implications
Example Application • Royal London trauma service • Criteria for activation of the trauma team • Aim to prevent unnecessary trauma team calls • Extensive records of trauma patient outcomes • US study of 1495 admissions proposed new ‘triage’ criteria • Significant decrease in overtriage: 51% → 29% • Insignificant increase in undertriage: 1% → 3% • None of the patients undertriaged by new criteria died • Does this show safety of new criteria?
Digital Economy in Healthcare • DIADEM: Data Information and Analysis for clinical DEcision Making • EPSRC Digital Economy • Cluster • Partnership between solution providers and clinical data analysis problem holders • Summarise unsolved data analysis needs, in relation to the analysis techniques available • Join the DIADEM cluster
Cluster Activities and Outcomes • Engage stakeholders and build a community: • Creation of a community web-site and forum • Meetings with potential ‘problem holders’ • Workshops • A road map: data and information • Follow-up proposal • A self-sustaining website – health data analytics
Summary • Bayesian networks • Prediction and diagnosis • Causal induction • Identify (some) causal relationships from (lots of) data • Causal effects • Experimental results from … • … non-experimental data • … assumptions (causal model) • Join the DIADEM cluster