gR2002

gR2002 Peter Spirtes Carnegie Mellon University

Graphs often given causal interpretation • Graphs can be used to represent both causal hypotheses and probability distributions • e.g. in a directed acyclic graph (DAG) A → B means A is a direct cause of B • DAG also represents a set of distributions sharing conditional independence relations • Causal interpretation is common in social science applications (structural equation modelling) • Causal representation of genetic regulatory networks
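The conditional independence relations that a DAG represents can be read off with d-separation. The sketch below is one way to check it (the moralized ancestral graph criterion), not TETRAD's implementation; the graph A → B → C with A → D ← C and the function names are hypothetical, chosen only for illustration.

```python
from itertools import combinations

def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen

def d_separated(parents, x, y, z):
    """True if x and y are d-separated given the set z (x, y not in z)."""
    keep = ancestors(parents, {x, y} | set(z))
    # Moralize the ancestral graph: link child to parents, and co-parents to each other.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)
    # x and y are d-separated iff no undirected path avoids the nodes in z.
    blocked = set(z)
    stack, seen = [x], {x}
    while stack:
        n = stack.pop()
        if n == y:
            return False
        for m in adj[n] - blocked - seen:
            seen.add(m)
            stack.append(m)
    return True

# Hypothetical DAG (node -> list of parents): A -> B -> C, A -> D <- C
dag = {"B": ["A"], "C": ["B"], "D": ["A", "C"]}
print(d_separated(dag, "A", "C", set()))        # False: the chain A -> B -> C is open
print(d_separated(dag, "A", "C", {"B"}))        # True: B blocks the chain
print(d_separated(dag, "A", "C", {"B", "D"}))   # False: conditioning on the collider D opens a path
```

Every distribution that factorizes according to this DAG satisfies the independencies the check reports as True.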

TETRAD • Dedicated to search for causal models under a variety of different assumptions about what is known • Has several different kinds of graphs, depending upon background assumptions • Has a number of different kinds of search strategies • Allows some explicit representation of background knowledge • Has some modules for calculating equivalence class of given graph • Recently developed graphical interface • Should have module for calculating effects of interventions

The causal interpretation of graphical models suggests several unusual operations • Calculation of effect of manipulation • Calculation of equivalence class (aid to calculation of effect of manipulation)

Kinds of graphical models in TETRAD • Directed acyclic graphs (discrete, normal) • Directed cyclic graphs (normal) • Pattern • Mixed ancestral graphs (normal) • Partial ancestral graphs

Difference between calculation of manipulation versus conditioning • In conditioning, the result depends only upon the joint distribution and the event conditioned on • In manipulating, the result depends upon the joint distribution, the event manipulated, and the causal relations among the variables • This means that locating alternative good models is essential for correct prediction of manipulation

Conditioning P(Lung Cancer = yes|Smoking = yes) = ¾

Manipulating Smoking – First Step

Manipulating Smoking – After waiting P(Lung Cancer = yes||Smoking = yes) = ¾ = P(Lung Cancer = yes|Smoking = yes)
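A minimal sketch of why conditioning needs only the joint distribution while manipulating also needs the causal structure. Only the value ¾ comes from the slides; the rest of the joint distribution, the two candidate structures, and the function names are assumptions made for illustration, and no latent variables are assumed.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(smoking, cancer), chosen so that
# P(cancer = yes | smoking = yes) = 3/4 as on the slide.
joint = {
    ("yes", "yes"): F(3, 8),
    ("yes", "no"): F(1, 8),
    ("no", "yes"): F(1, 8),
    ("no", "no"): F(3, 8),
}

def condition(joint, smoking):
    """P(cancer = yes | smoking): computed from the joint distribution alone."""
    p_smoking = sum(p for (s, _), p in joint.items() if s == smoking)
    return joint[(smoking, "yes")] / p_smoking

def manipulate(joint, smoking, structure):
    """P(cancer = yes || smoking) under an assumed two-variable causal structure."""
    if structure == "smoking->cancer":
        # Cancer keeps its mechanism P(cancer | smoking); smoking is set from outside.
        return condition(joint, smoking)
    if structure == "cancer->smoking":
        # Setting smoking removes the edge into it; cancer keeps its marginal.
        return sum(p for (_, c), p in joint.items() if c == "yes")
    raise ValueError(f"unknown structure: {structure}")

print(condition(joint, "yes"))                       # 3/4
print(manipulate(joint, "yes", "smoking->cancer"))   # 3/4
print(manipulate(joint, "yes", "cancer->smoking"))   # 1/2
```

Both structures fit the same joint distribution, yet they give different answers for the manipulation, which is why locating the alternative good models matters.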

Calculation of effect of manipulation • When there are no latent variables and structure is known - simple • When there are latent variables and the structure is known (Pearl 2001) • When the structure is partially known (SGS 2001)
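For the simple case (known structure, no latent variables), the manipulated distribution can be obtained by dropping the factor of the manipulated variable from the DAG factorization. The sketch below assumes a hypothetical structure Z → S, Z → C, S → C with made-up probabilities; the variable Z and all numbers are illustrative, not taken from the slides.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical parameters for the DAG Z -> S, Z -> C, S -> C (all binary).
p_z = {1: F(1, 2), 0: F(1, 2)}                      # P(Z = z)
p_s1_given_z = {1: F(3, 4), 0: F(1, 4)}             # P(S = 1 | Z = z)
p_c1_given_sz = {(1, 1): F(3, 4), (1, 0): F(1, 2),  # P(C = 1 | S = s, Z = z)
                 (0, 1): F(1, 2), (0, 0): F(1, 4)}

def bern(p_one, value):
    """Probability of a binary value given P(value = 1)."""
    return p_one if value == 1 else 1 - p_one

def joint(z, s, c, do_s=None):
    """DAG factorization; when do_s is set, the factor P(s | z) is replaced by a point mass."""
    if do_s is None:
        p_s = bern(p_s1_given_z[z], s)
    else:
        p_s = F(1) if s == do_s else F(0)
    return p_z[z] * p_s * bern(p_c1_given_sz[(s, z)], c)

def prob_c1(do_s=None, given_s=None):
    """P(C = 1), optionally after manipulating S (do_s) or conditioning on S (given_s)."""
    num = den = F(0)
    for z, s, c in product((0, 1), repeat=3):
        if given_s is not None and s != given_s:
            continue
        p = joint(z, s, c, do_s)
        den += p
        if c == 1:
            num += p
    return num / den

print(prob_c1(given_s=1))  # 11/16, conditioning
print(prob_c1(do_s=1))     # 5/8, manipulating
```

Here the common cause Z makes the conditional and the manipulated probabilities differ, even though the structure is fully known.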

Calculation of Effect of Manipulation – Equivalence Class • [Figure: DAGs G1 and G2 over A, B, C, D, and the Pattern that represents both] • G1 and G2 represent the same distribution, agree on the effect on D of manipulating B, and disagree about the effect on A of manipulating B
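Whether two DAGs without latent variables belong to the same equivalence class can be checked with the standard criterion: same adjacencies and same unshielded colliders. The edges of G1 and G2 below are hypothetical (the slide's figure is not reproduced here); they were chosen only so that the two graphs agree about B and D and disagree about A and B.

```python
from itertools import combinations

def skeleton(edges):
    """Adjacencies of a DAG given as (parent, child) pairs, ignoring direction."""
    return {frozenset(e) for e in edges}

def unshielded_colliders(edges):
    """Triples (p, child, q) with p -> child <- q and p, q not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, ps in parents.items():
        for p, q in combinations(sorted(ps), 2):
            if frozenset((p, q)) not in skel:
                found.add((p, child, q))
    return found

def markov_equivalent(g1, g2):
    """True if the two DAGs entail the same conditional independence relations."""
    return (skeleton(g1) == skeleton(g2)
            and unshielded_colliders(g1) == unshielded_colliders(g2))

# Hypothetical graphs over A, B, C, D.
g1 = [("A", "B"), ("B", "D"), ("C", "D")]   # A -> B -> D <- C
g2 = [("B", "A"), ("B", "D"), ("C", "D")]   # A <- B -> D <- C
g3 = [("A", "B"), ("D", "B"), ("C", "D")]   # A -> B <- D <- C

print(markov_equivalent(g1, g2))  # True
print(markov_equivalent(g1, g3))  # False
```

In this example g1 and g2 both contain B → D, so they agree on the effect on D of manipulating B, while only g2 says that manipulating B affects A.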

Calculation of Effect of Manipulation – Equivalence Classes • [Figure: DAGs G1 and G2 over A, B, C, D, with the corresponding Pattern and PAG; the PAG has circle (o) edge marks] • Pattern represents the equivalence class of DAGs if there are no latent variables • PAG represents the equivalence class of DAGs if there might be latent variables

Edge types in different graphs • → • ↔ • – • o→ • o–o • combinations of edges subject to varying constraints

The Statistical Theory for some graphical models is only partially worked out • MAGs and PAGs • know how to parameterize in linear cases • may not be a unique maximum likelihood estimate • PAG – not known how to efficiently determine if arbitrary combination of edges is PAG

Specific searches • Assuming no latent variables or cycles • Hill climbing – BIC, posterior probability (normal, discrete) • Constraint based – PC (normal, discrete) • Combined • Assuming no cycles • Hill climbing – BIC (normal) • Constraint based – FCI (normal, discrete)
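The constraint-based idea behind a PC-style search can be sketched through its adjacency phase: start from a complete graph and remove an edge whenever some conditioning set makes its endpoints independent. This is a simplified illustration, not TETRAD's code; the `independent` callback is an assumed stand-in for a statistical test, here replaced by a hand-written oracle for the hypothetical chain A → B → C.

```python
from itertools import combinations

def pc_skeleton(variables, independent):
    """Adjacency phase of a PC-style search.

    independent(x, y, cond) should return True when x and y are judged
    independent given the set cond.
    """
    adj = {v: set(variables) - {v} for v in variables}
    sepset = {}
    size = 0
    # Grow the conditioning-set size until no node has enough neighbours left.
    while any(len(adj[x]) - 1 >= size for x in variables):
        for x in variables:
            for y in sorted(adj[x]):
                for cond in combinations(sorted(adj[x] - {y}), size):
                    if independent(x, y, set(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        size += 1
    return adj, sepset

def chain_oracle(x, y, cond):
    """Independence facts of the hypothetical chain A -> B -> C: only A _||_ C | B."""
    return {x, y} == {"A", "C"} and "B" in cond

adj, sepset = pc_skeleton(["A", "B", "C"], chain_oracle)
print(adj)     # A and C are no longer adjacent; A-B and B-C remain
print(sepset)  # the set {B} is recorded as separating A and C
```

The recorded separating sets are what a later orientation step would use to decide which unshielded triples are colliders.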

Other features • Estimate parameters - DAGs (discrete, normal) • Representation of background knowledge • Find equivalence class of given DAG (no latents, possibly cyclic) • Graphical interface • Should have module to calculate effects of manipulations • Known structure, no latents • Known structure, latent variables • Partially known structure

As probabilistic models, graphical models require the usual operations • Search • Estimation • Testing • Scoring • Conditioning

Summary • The causal interpretation of graphical models offers an opportunity to provide functionality not found in most other kinds of models (e.g. predicting effects of manipulations)

Summary • Added functionality, different domains and different background knowledge require a variety of different kinds of graphical models • desirability of flexibility in graphical representation • desirability of allowing each type to inherit as much as possible from more general representations

Summary • Because of the need to locate good alternative models • Search plays a very important role (score-based, constraint-based, and combinations) • Calculating equivalence classes is essential • Collection and representation of background knowledge to guide search is very important
