Learning Causal Models of Multivariate Systems and the Value of it for the Performance Modeling of Computer Programs. Jan Lemeire, December 19th 2007. Supervisor: Prof. dr. ir. Erik Dirkx. Learning causal models for the performance analysis of programs executed on various computer systems.
Learning Causal Models of Multivariate Systems and the Value of it for the Performance Modeling of Computer Programs
Jan Lemeire
December 19th 2007
Supervisor: Prof. dr. ir. Erik Dirkx
Learning causal models for the performance analysis of programs executed on various computer systems.
• Intermezzo I: Causal Inference.
• Practical deployment of the causal learning algorithms.
• Philosophical and theoretical study of causal inference.
• Intermezzo II: Kolmogorov Minimal Sufficient Statistics.
• The importance of qualitative properties.
Causal Inference & Performance Analysis
What is Parallel Processing?
• Computational work is executed on a parallel system.
• Ideally: speedup = number of processors.
Parallel Overhead
• Speedup = 2.55
• Overhead = time the processors are not spending on useful work = lost processor cycles
Overhead Analysis
Impact of overhead on speedup
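The relation between overhead and speedup can be made concrete with a small sketch. It assumes the usual definitions (speedup as sequential runtime over parallel runtime, total overhead as processor time beyond the sequential runtime). The timings and the processor count are hypothetical, chosen only to reproduce the speedup of 2.55 quoted on the slide; the slides do not state them.

```python
def speedup(t_seq: float, t_par: float) -> float:
    """Speedup = sequential runtime divided by parallel runtime."""
    return t_seq / t_par


def total_overhead(t_seq: float, t_par: float, p: int) -> float:
    """Processor time not spent on useful work, summed over all p processors."""
    return p * t_par - t_seq


def speedup_from_overhead(t_seq: float, overhead: float, p: int) -> float:
    """The same speedup, written as the ideal value p reduced by the overhead ratio."""
    return p / (1.0 + overhead / t_seq)


# Hypothetical timings (seconds) and processor count, not taken from the slides.
T_SEQ, T_PAR, P = 7.65, 3.0, 3
S = speedup(T_SEQ, T_PAR)
O = total_overhead(T_SEQ, T_PAR, P)
print(f"speedup = {S:.2f}, overhead = {O:.2f} s, efficiency = {S / P:.2f}")
```

With zero overhead the last function returns exactly the number of processors, which is the ideal case; every lost processor cycle lowers the speedup below it.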
Experimental Parallel Performance Analysis: Data Acquisition
EPDA: Multivariate Analysis
[Note: possibly show the experiments as an animation, without (a) and (b).]
Intermezzo I: Causal Inference
Causal Inference for Performance Analysis
Its utility rests on the following properties:
• Dependency analysis: how variables relate.
• Markov property.
• A causal model corresponds to a decomposition.
Execution of program gives cache misses
[Figure: variables datatype (integer, float, double, …), data size in bytes, and cache misses]
Markov Property
• Datatype and cache misses are correlated.
• With information about the data size, they become independent.
• Provides explanations.
• Differentiates direct from indirect relations.
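A small simulation can illustrate the Markov property on this example. The sketch assumes the chain datatype → data size → cache misses, and a toy miss model (one miss per 64-byte line plus noise); the element counts, the line size and the noise level are all hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)
samples = []
for _ in range(20000):
    elem_bytes = rng.choice([4, 8])            # datatype: 4-byte or 8-byte elements
    count = rng.choice([100, 200, 400, 800])   # number of elements (hypothetical)
    data_size = elem_bytes * count             # data size in bytes
    misses = data_size / 64 + rng.gauss(0, 2)  # toy model: one miss per 64-byte line, plus noise
    samples.append((elem_bytes, data_size, misses))

# Marginal: datatype and cache misses are correlated.
r_all = pearson([s[0] for s in samples], [s[2] for s in samples])

# Conditional: within one data size, the datatype carries no further information.
stratum = [s for s in samples if s[1] == 1600]
r_given_size = pearson([s[0] for s in stratum], [s[2] for s in stratum])
print(f"r(datatype, misses) = {r_all:.2f}; given data size = 1600 bytes: {r_given_size:.2f}")
```

The correlation that vanishes once the data size is fixed is what marks the relation between datatype and cache misses as indirect.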
Can we Observe Causal Relations?
[Figure: a correlation (~) between two variables, with alternative causal explanations]
What is Causality?
• A causal relation denotes a mechanism: a variable is 'produced' by its causes.
• However, a causal relation is not directly observable.
• "Causality is a relic of a bygone age" (Bertrand Russell)
• "Mmmh…" (Judea Pearl)
• But: we want to learn something about the underlying system (the goal of statistics).
Second Cause [figure]
V-structure Property
The angle is independent of the gunpowder, but the two become dependent when the distance is known.
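The v-structure angle → distance ← gunpowder can be checked numerically. This is only a sketch: the range formula (distance proportional to the charge times sin of twice the angle), the value ranges and the distance window are assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)
shots = []
for _ in range(20000):
    angle = rng.uniform(10.0, 40.0)   # elevation in degrees, drawn independently
    powder = rng.uniform(1.0, 2.0)    # gunpowder charge in arbitrary units, drawn independently
    distance = powder * math.sin(math.radians(2 * angle))  # toy range formula (assumption)
    shots.append((angle, powder, distance))

# Marginally, the two causes are independent.
r_marginal = pearson([s[0] for s in shots], [s[1] for s in shots])

# Conditioning on the common effect: keep only shots that landed in a narrow distance window.
selected = [s for s in shots if 0.9 <= s[2] <= 1.0]
r_conditional = pearson([s[0] for s in selected], [s[1] for s in selected])
print(f"marginal r = {r_marginal:.2f}, r given distance in [0.9, 1.0] = {r_conditional:.2f}")
```

Once the distance is known, a low angle implies a large charge and vice versa, so the two causes become strongly (here negatively) dependent.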
Conditional Independencies Make Causal Inference Possible
• Conditional independencies follow from a causal structure, irrespective of the mechanisms.
• Markov
• V-structure
Graph is a Description of Independencies
• Graphical criterion: d-separation
• Intuitive
• Faithfulness property: independencies in the graph correspond to independencies in reality
Causal Structure Learning
In two steps:
• Undirected graph
• Orientation
[Note: this can also wait until later, when discussing uniqueness.]
Result
• Partially directed acyclic graph: “We know what parts are unknown.”
• Faithfulness assumption: all independencies follow from the causal structure
[Note: redo the figure as PNG, without lossless compression.]
Experimental Results (Contribution 1)
(1) Automatic learning of accurate performance models
(2) Model validation
(3) Identification of unexpected dependencies
(4) Explanations for outliers
Practical Causal Inference
The following limitations had to be overcome:
• Non-linear relations: form-free independence test
• Mixture of continuous, discrete and categorical data: general independence test
• Deterministic relations: augmented causal model and extended learning algorithms
Form-Free and General Dependency Test
• Example [scatter plot of Y against X]: Pearson R_xy = 0.083 => X and Y show no linear dependence
• Mutual information
• Kernel density estimation of P(X, Y): I(X;Y) = 0.90 bits => dependent
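The contrast between a linear and a form-free test can be sketched as follows. Note the substitution: the slide estimates the joint density by kernel density estimation, whereas this sketch uses a plain histogram to stay dependency-free. The quadratic test relation, the noise level and the bin count are hypothetical choices.

```python
import math
import random
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mutual_information_bits(xs, ys, bins=10):
    """Plug-in estimate of I(X;Y) in bits from a bins x bins histogram."""
    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(xs), to_bins(ys)
    n = len(xs)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    # Sum over occupied cells of p(i,j) * log2(p(i,j) / (p(i) * p(j))).
    return sum((c / n) * math.log2(c * n / (px[i] * py[j])) for (i, j), c in joint.items())


rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
ys = [x * x + rng.gauss(0, 0.05) for x in xs]    # non-linear dependence on X
zs = [rng.uniform(-1.0, 1.0) for _ in range(5000)]  # independent of X

r_xy = pearson(xs, ys)
mi_xy = mutual_information_bits(xs, ys)
mi_xz = mutual_information_bits(xs, zs)
print(f"Pearson r = {r_xy:.3f}, I(X;Y) = {mi_xy:.2f} bits, I(X;Z) = {mi_xz:.2f} bits")
```

The Pearson coefficient stays near zero for the quadratic relation, while the mutual information is clearly above the value obtained for the independent pair. The histogram estimate is biased slightly upward, so in practice the result is compared against a threshold rather than against exactly zero.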
Deterministic Relations
• Data size and data type are information equivalent with respect to cache misses
• During learning, connect the least complex relation
Complexity Criterion (Contribution 2a)
Correct models are learned under the Complexity Increase Assumption.
[Note: this must be included! Perhaps without the details?]
Reestablishment of Faithfulness (Contribution 2b)
• Consequences are considered
• Information equivalences
• Independence and simplicity
• D-separation extension
• Faithful model: represents all independencies
• Information is added to the model
• Basic information equivalences
Extension of PC Learning Algorithm (Contribution 2c)
• Detection of information equivalences
• Among information equivalent relations, the simplest one is chosen
• Orientation rules remain the same
Correct models are learned from data containing deterministic relations.
[Note: add the scientists' dates.]
Inductive Inference
• Occam’s Razor: “Among equivalent models choose the simplest one.” (William of Ockham)
• But: is there an objective measure of complexity?
Kolmogorov Complexity
• Kolmogorov complexity of a binary string: the length of the shortest program that computes the string and halts (Andrey Kolmogorov)
• Applied to Occam’s Razor: “Select the model that describes the observations minimally”
Shortest Programs
• 001001001001001001001001001001001: regularity of repetition allows compression
• 011000110101101010111001001101000: random information = incompressible
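Kolmogorov complexity itself is uncomputable, but a general-purpose compressor gives an upper bound that shows the same contrast. The sketch below uses `zlib` as that stand-in; the strings are longer than those on the slide (3000 characters, a hypothetical choice) because the compressor's fixed overhead would dominate on 33 characters.

```python
import random
import zlib


def compressed_size(bits: str) -> int:
    """Size in bytes of the zlib-compressed string: an upper bound on its description length."""
    return len(zlib.compress(bits.encode("ascii"), 9))


regular = "001" * 1000                       # pure repetition of the pattern 001
rng = random.Random(0)
irregular = "".join(rng.choice("01") for _ in range(len(regular)))  # no pattern zlib can exploit

size_regular = compressed_size(regular)
size_irregular = compressed_size(irregular)
print(f"regular: {size_regular} bytes, irregular: {size_irregular} bytes (raw: {len(regular)} bytes)")
```

The irregular string still shrinks relative to its raw size, because each character is stored as a full byte but carries only one bit. A caveat worth keeping in mind: the irregular string comes from a seeded pseudo-random generator, so a short program for it does exist; `zlib` simply cannot find it, which is why the compressed size is only an upper bound.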
Randomness versus Regularity
Kolmogorov Minimal Sufficient Statistics (KMSS): formal separation of meaningful information (regularities) from accidental information (randomness)
• 001001001001001001001001001001001: repetition, 11 times, of 001
• 011000110101101010111001001101000: only random information (incompressible)
Learning = finding regularities = maximal compression
• Structure of a diamond: regularities
• Exact size: random
Meaningful Information of Probability Distributions (Contribution 3a)
• Meaningful information (Theorem 1)
• Kolmogorov Minimal Sufficient Statistic if the graph and the conditional probability distributions (CPDs) are incompressible (Theorem 2)
• A graph with random CPDs is faithful (Theorem 4)
Causal Aspect of Causal Models = Decomposition
• Canonical decomposition: quasi-unique and minimal decomposition into atomic and independent components (the CPDs)
• Corresponds to reality (mechanisms)
[Note: add a small figure of holism and reductionism.]
Causal Component Relies on Reductionism
• The world can be studied in parts.
• Or, even more: the world is made up of indivisible parts.
• When the DAG of a Bayesian network is a complete graph: no meaningful information, holism.
Validity of Causal Inference (Contribution 3b)
How OK is the learned causal model?
• Do CPD components correspond to physical mechanisms?
• Minimal model?
• Faithful?
• Other regularities?
Well-known Example of Unfaithfulness
• 'Normally': A and D correlate.
• A and D become independent if the influences along paths 1 and 2 cancel each other out.
• The mechanisms are then related: there is a regularity among them.
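The cancellation can be reproduced with a linear toy model. This is a sketch under assumptions: the intermediate variables B and C (path 1: A → B → D, path 2: A → C → D), the unit coefficients and the Gaussian noise are hypothetical, since the slide's figure is not available.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def correlation_a_d(b_to_d: float, c_to_d: float, n: int = 20000, seed: int = 0) -> float:
    """Simulate A -> B -> D and A -> C -> D, and return the correlation between A and D."""
    rng = random.Random(seed)
    a_vals, d_vals = [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = a + rng.gauss(0, 1)   # path 1, first edge: A -> B with coefficient 1
        c = a + rng.gauss(0, 1)   # path 2, first edge: A -> C with coefficient 1
        d = b_to_d * b + c_to_d * c + rng.gauss(0, 1)
        a_vals.append(a)
        d_vals.append(d)
    return pearson(a_vals, d_vals)


r_normal = correlation_a_d(1.0, 1.0)    # both paths reinforce each other
r_cancel = correlation_a_d(1.0, -1.0)   # path 2 exactly undoes path 1
print(f"'normal' parameters: r(A, D) = {r_normal:.2f}; cancelling parameters: r(A, D) = {r_cancel:.2f}")
```

The independence in the second case holds only for this exact pairing of coefficients, which is the regularity among the mechanisms that the slide points to; any small perturbation of one coefficient restores the correlation.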
Regularities are Qualitative Properties
• Different from quantitative information.
• Allow for qualitative reasoning.
• Qualitative properties determine behavior.
Communication Schemes on Network Topologies
Communication time?
Generic Performance Model (Contribution 4a)
• Good predictions for combinations of random schemes and random topologies
[Note: show with less obvious figures: broadcast not star-shaped, shift line-shaped, add a torus.]
Combinations of Patterns (Contribution 4b)
Performance depends on match!
Qualitative Properties
• Faithfulness: “graph should describe all independencies”
• KMSS: “model should describe all regularities”
• Qualitative information: explicitly describes regularities
• Quantitative information: contains no more regularities
Explicitly Mention Qualitative Properties!
Conclusions
• Contribution to performance analysis.
• Automatic causal analysis.
• Useful add-on in combination with other techniques.
• The value of causal inference is underlined.
• The importance of regularities or qualitative properties.
Future Work
• Application of the learned performance models for optimization.
• Is the failure of generic performance models only due to regularities?
• Augment models with qualitative properties.
• But: how to define, recognize and reason with regularities?