
Jan Lemeire, December 19th, 2007

Learning Causal Models of Multivariate Systems and the Value of it for the Performance Modeling of Computer Programs. Jan Lemeire, December 19th, 2007. Supervisor: Prof. dr. ir. Erik Dirkx. Learning causal models for the performance analysis of programs executed on various computer systems.


Presentation Transcript


  1. Learning Causal Models of Multivariate Systems and the Value of it for the Performance Modeling of Computer Programs. Jan Lemeire, December 19th, 2007. Supervisor: Prof. dr. ir. Erik Dirkx

  2. Learning causal models for the performance analysis of programs executed on various computer systems. • Intermezzo I: Causal inference. • Practical deployment of the causal learning algorithms. • Philosophical and theoretical study of causal inference. • Intermezzo II: Kolmogorov Minimal Sufficient Statistics. • The importance of qualitative properties. Causal Inference & Performance Analysis

  3. Learning causal models for the performance analysis of programs executed on various computer systems. • Intermezzo I: Causal Inference. • Practical deployment of the causal learning algorithms. • Philosophical and theoretical study of causal inference. • Intermezzo II: Kolmogorov Minimal Sufficient Statistics • The importance of qualitative properties.

  4. What is Parallel Processing? Ideally: Speedup = number of processors. [Figure labels: Computational work; Parallel system]

  5. Parallel Overhead • Speedup = 2.55 • Overhead = time the processors are not spending on useful work = lost processor cycles

  6. Overhead Analysis: impact of overhead on speedup
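
The link between overhead and speedup can be made concrete with a small sketch. It assumes the usual definitions (speedup = sequential runtime / parallel runtime, overhead = processor time not spent on useful work). The processor count and timings are hypothetical and are not the measurements behind the 2.55 figure on the slides.

```python
def speedup(t_seq: float, t_par: float) -> float:
    """Speedup = sequential runtime divided by parallel runtime."""
    return t_seq / t_par


def total_overhead(t_seq: float, t_par: float, processors: int) -> float:
    """Processor time not spent on useful work, summed over all processors."""
    return processors * t_par - t_seq


def speedup_from_overhead(t_seq: float, overhead: float, processors: int) -> float:
    """The same speedup written in terms of overhead: p / (1 + overhead / t_seq)."""
    return processors / (1.0 + overhead / t_seq)


# Hypothetical run: 3 processors, 100 s sequential, 40 s in parallel.
p, t_seq, t_par = 3, 100.0, 40.0
s = speedup(t_seq, t_par)               # below the ideal value of 3
ovh = total_overhead(t_seq, t_par, p)   # lost processor time in seconds
print(s, ovh, speedup_from_overhead(t_seq, ovh, p))
```

With zero overhead the last formula reduces to the ideal case, speedup = number of processors.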

  7. Experimental Parallel Performance Analysis: Data Acquisition

  8. EPDA: Multivariate Analysis

  9. (Note: possibly show the experiments as an animation, without (a) and (b).) Intermezzo I: Causal Inference

  10. Causal Inference for Performance Analysis. Utility based on the following properties: • Dependency analysis: how variables relate. • Markov property. • A causal model corresponds to a decomposition.

  11. Execution of program gives cache misses. [Figure labels: datatype (integer, float, double, …); data size in bytes]

  12. Markov Property. Correlated; with information about the data size: [figure not preserved]. • Provides explanations • Differentiates direct from indirect relations
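
A minimal simulation can illustrate the Markov property, assuming the causal chain data type → data size → cache misses suggested by the neighbouring slides. The element sizes and the linear cache-miss mechanism are made up for illustration and are not taken from the experiments.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical element sizes in bytes for each data type.
SIZE_OF = {"char": 1, "integer": 4, "float": 4, "double": 8}

# Assumed causal chain: data type -> data size -> cache misses.
samples = []
for _ in range(4000):
    dtype = random.choice(list(SIZE_OF))
    size = SIZE_OF[dtype]
    misses = 100 * size + random.gauss(0, 10)  # made-up mechanism
    samples.append((dtype, size, misses))


def mean_misses(dtype):
    """Average number of cache misses observed for one data type."""
    return mean(m for d, _, m in samples if d == dtype)


# Marginally, the data type is informative about the cache misses ...
gap_between_sizes = abs(mean_misses("double") - mean_misses("integer"))
# ... but among types with the same data size (4 bytes) it adds nothing.
gap_within_size = abs(mean_misses("float") - mean_misses("integer"))
print(gap_between_sizes, gap_within_size)
```

Once the data size is known, the data type is only an indirect cause and carries no further information about the cache misses.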

  13. Can we Observe Causal Relations? [Figure not preserved]

  14. What is Causality? A causal relation denotes a mechanism: a variable is 'produced' by its causes. However, it is not directly observable. Mmmh... "Causality is a relic of a bygone age." [Pictured: Bertrand Russell, Judea Pearl] But: we want to learn something about the underlying system (the goal of statistics).

  15. Second Cause [Figure not preserved]

  16. V-structure Property: angle is independent of gunpowder, but the two become dependent when the distance is known.
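
The v-structure property can be checked on simulated data. The sketch below assumes a toy mechanism in which distance is simply angle plus gunpowder plus a little noise. This is not real ballistics, only a stand-in for two independent causes with a common effect.

```python
import math
import random

random.seed(1)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Two independent causes with a common effect (toy mechanism).
shots = []
for _ in range(5000):
    angle = random.random()
    gunpowder = random.random()
    distance = angle + gunpowder + random.gauss(0, 0.01)
    shots.append((angle, gunpowder, distance))

r_marginal = pearson([a for a, _, _ in shots], [g for _, g, _ in shots])

# Conditioning on the effect: keep only shots that landed near distance 1.0.
near = [(a, g) for a, g, d in shots if abs(d - 1.0) < 0.05]
r_given_distance = pearson([a for a, _ in near], [g for _, g in near])
print(r_marginal, r_given_distance)
```

Among shots that reach the same distance, a larger angle must be compensated by less gunpowder, so the two causes become strongly (negatively) correlated.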

  17. Conditional Independencies Make Causal Inference Possible • Conditional independencies follow from a causal structure, irrespective of the mechanisms. • Markov • V-structure

  18. Graph is a Description of Independencies • Graphical criterion: d-separation • Intuitive • Faithfulness property: independencies in the graph correspond to independencies in reality

  19. Causal Structure Learning. In two steps: • Undirected graph • Orientation
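
The two steps can be sketched as follows. This is a simplified constraint-based learner in the spirit of the PC algorithm, not the implementation used in the thesis. It tries all subsets of the remaining variables as conditioning sets (PC restricts them to neighbours), and it orients only v-structures, leaving out the further orientation rules. The function `learn_structure` and the independence oracle are hypothetical names introduced for illustration.

```python
from itertools import combinations


def learn_structure(variables, independent):
    """Two-step constraint-based learning driven by an independence test."""
    pair = frozenset
    # Step 1: start from the complete undirected graph and remove an edge
    # as soon as some conditioning set makes its endpoints independent.
    edges = {pair(p) for p in combinations(variables, 2)}
    sepset = {}
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        for size in range(len(others) + 1):
            hit = next((set(s) for s in combinations(others, size)
                        if independent(x, y, set(s))), None)
            if hit is not None:
                edges.discard(pair((x, y)))
                sepset[pair((x, y))] = hit
                break
    # Step 2: orient v-structures x -> z <- y for non-adjacent x and y whose
    # common neighbour z is not in the set that separated them.
    arrows = set()
    for x, y in combinations(variables, 2):
        if pair((x, y)) in edges:
            continue
        for z in variables:
            if (z not in (x, y) and pair((x, z)) in edges
                    and pair((y, z)) in edges
                    and z not in sepset[pair((x, y))]):
                arrows.update({(x, z), (y, z)})
    return edges, arrows


def oracle(x, y, given):
    # Hypothetical oracle: angle and gunpowder are independent,
    # but only as long as the distance is not conditioned on.
    return {x, y} == {"angle", "gunpowder"} and "distance" not in given


edges, arrows = learn_structure(["angle", "gunpowder", "distance"], oracle)
print(sorted(tuple(sorted(e)) for e in edges), sorted(arrows))
```

In practice the oracle is replaced by a statistical independence test on the data. Edges that no rule can orient stay undirected in the result.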

  20. (Note: this could also come later, when discussing unique.) Result • Partially directed acyclic graph: "We know what parts are unknown." • Faithfulness assumption: all independencies follow from the causal structure

  21. (Note: redo the figure as PNG, without lossless compression.) Experimental Results. Contribution 1: (1) Automatic learning of accurate performance models (2) Model validation (3) Identification of unexpected dependencies (4) Explanations for outliers

  22. Learning causal models for the performance analysis of programs executed on various computer systems. • Intermezzo I: Causal Inference. • Practical deployment of the causal learning algorithms. • Philosophical and theoretical study of causal inference. • Intermezzo II: Kolmogorov Minimal Sufficient Statistics • The importance of qualitative properties.

  23. Practical Causal Inference. The following limitations had to be overcome: • Non-linear relations: form-free independence test • Mixture of continuous, discrete and categorical data: general independence test • Deterministic relations: augmented causal model and extended learning algorithms

  24. Form-Free and General Dependency Test • Example: Pearson R_xy = 0.083 => X and Y show no linear dependence • Mutual information • Kernel density estimation of P(X, Y): I(X;Y) = 0.90 bits => dependent [Figures: scatter plot of Y against X; estimated density P(X, Y)]
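
The contrast between a linear and a form-free test can be reproduced on synthetic data. The sketch assumes a hypothetical relation Y = X² + noise and estimates mutual information with a plain histogram. That is a simpler stand-in for the kernel density estimation named on the slide, and the numbers it prints are not those of the slide.

```python
import math
import random
from collections import Counter

random.seed(2)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def mutual_information_bits(xs, ys, bins=10):
    """Histogram estimate of I(X;Y) in bits (stand-in for a kernel estimate)."""
    def binned(vs):
        lo, hi = min(vs), max(vs)
        span = (hi - lo) or 1.0
        return [min(int((v - lo) / span * bins), bins - 1) for v in vs]

    bx, by = binned(xs), binned(ys)
    n = len(xs)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(c / n * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())


xs = [random.uniform(-1, 1) for _ in range(5000)]
ys = [x * x + random.gauss(0, 0.05) for x in xs]   # non-linear dependence
zs = [random.uniform(-1, 1) for _ in range(5000)]  # independent control

r_xy = pearson(xs, ys)
mi_dependent = mutual_information_bits(xs, ys)
mi_independent = mutual_information_bits(xs, zs)
print(r_xy, mi_dependent, mi_independent)
```

The correlation coefficient stays near zero because the relation is symmetric around X = 0, while the mutual information is clearly positive. The independent control shows the small positive bias of the histogram estimate.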

  25. Deterministic Relations • Data size and data type are information equivalent with respect to cache misses • During learning, connect the least complex relation

  26. Complexity Criterion. Contribution 2a: Correct models are learned under the Complexity Increase Assumption

  27. (Note: this must be included!! Perhaps not the details?) Reestablishment of Faithfulness. Contribution 2b • Consequences are considered • Information equivalences • Independence and simplicity • D-separation extension • Faithful model: represents all independencies • Information is added to the model • Basic information equivalences

  28. Extension of PC Learning Algorithm. Contribution 2c • Detection of information equivalences • Among information equivalent relations, the simplest one is chosen • Orientation rules remain the same. Correct models are learned from data containing deterministic relations.

  29. Learning causal models for the performance analysis of programs executed on various computer systems. • Intermezzo I: Causal Inference. • Practical deployment of the causal learning algorithms. • Philosophical and theoretical study of causal inference. • Intermezzo II: Kolmogorov Minimal Sufficient Statistics • The importance of qualitative properties.


  31. (Note: add the scientists' years.) Inductive Inference • Occam's Razor: "Among equivalent models choose the simplest one." (William of Ockham) BUT: Objective measure of complexity?

  32. Kolmogorov Complexity. Kolmogorov complexity of a binary string: the length of the shortest program that computes the string and halts (Andrey Kolmogorov). Applied to Occam's Razor: "Select the model that describes the observations minimally."

  33. Shortest Programs • 001001001001001001001001001001001: regularity of repetition allows compression • 011000110101101010111001001101000: random information = incompressible
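
Kolmogorov complexity itself is not computable, but a general-purpose compressor gives a rough, computable upper bound that shows the same contrast. The sketch below uses `zlib` for that purpose, on longer strings than the slide's 33 bits so that the compressor's fixed overhead does not dominate. It is an illustration, not a measurement of Kolmogorov complexity.

```python
import random
import zlib

random.seed(3)


def compressed_size(bits: str) -> int:
    """Size in bytes after zlib compression: an upper-bound proxy only."""
    return len(zlib.compress(bits.encode("ascii"), 9))


regular = "001" * 1000                                     # pure repetition
noisy = "".join(random.choice("01") for _ in range(3000))  # pseudo-random bits

size_regular = compressed_size(regular)
size_noisy = compressed_size(noisy)
print(size_regular, size_noisy)
```

Strictly speaking the pseudo-random string has a short description too (the generator plus its seed), but `zlib` cannot exploit that, which is why compressors only bound the true complexity from above.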

  34. Randomness versus Regularity. Kolmogorov Minimal Sufficient Statistics (KMSS): formal separation of meaningful information (regularities) from accidental information (randomness) • 001001001001001001001001001001001: repetition 11 times, 001 • 011000110101101010111001001101000: only random information (incompressible)

  35. Learning = finding regularities = maximal compression [Figure labels: Structure of a diamond; Exact size; regularities; random]

  36. Meaningful Information of Probability Distributions. Contribution 3a • Meaningful information (Theorem 1) • Kolmogorov Minimal Sufficient Statistic if graph and CPDs (conditional probability distributions) are incompressible (Theorem 2) • A graph with random CPDs is faithful (Theorem 4)

  37. Causal Aspect of Causal Models = Decomposition • Canonical decomposition: quasi-unique and minimal decomposition into atomic and independent components (the CPDs) • Corresponds to reality (mechanisms)

  38. (Note: add a small figure of holism and reductionism.) Causal Component Relies on Reductionism • The world can be studied in parts. Or, even more: • The world is made up of indivisible parts. • When the DAG of a Bayesian network is a complete graph: • no meaningful information • holism

  39. Validity of Causal Inference. Contribution 3b. How valid is the learned causal model? Do CPD components correspond to physical mechanisms? Minimal model? Faithful? Other regularities?

  40. Well-known Example of Unfaithfulness. 'Normally': A and D correlate. A and D become independent if the influences along paths 1 and 2 cancel each other out. The mechanisms are related: there is a regularity among them.
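
The cancellation can be reproduced with a small linear simulation. The slide names only A, D and the two paths. The intermediate variables B and C, the unit coefficients and the Gaussian noise below are assumptions made for the sketch.

```python
import math
import random

random.seed(4)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_a_d(weight_path_2, n=5000):
    """A -> B -> D (path 1) and A -> C -> D (path 2), linear with noise."""
    a_vals, d_vals = [], []
    for _ in range(n):
        a = random.gauss(0, 1)
        b = a + random.gauss(0, 1)
        c = a + random.gauss(0, 1)
        d = b + weight_path_2 * c + random.gauss(0, 1)
        a_vals.append(a)
        d_vals.append(d)
    return pearson(a_vals, d_vals)


r_generic = correlation_a_d(weight_path_2=-0.5)  # paths do not cancel
r_cancel = correlation_a_d(weight_path_2=-1.0)   # path 2 cancels path 1
print(r_generic, r_cancel)
```

Only the exact weight -1.0 removes the correlation. Such a precise match between two mechanisms is itself a regularity, which is why unfaithful models are treated as the exception.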

  41. Learning causal models for the performance analysis of programs executed on various computer systems. • Intermezzo I: Causal Inference. • Practical deployment of the causal learning algorithms. • Philosophical and theoretical study of causal inference. • Intermezzo II: Kolmogorov Minimal Sufficient Statistics • The importance of qualitative properties.

  42. Regularities are Qualitative Properties • Different from quantitative information. • Allow for qualitative reasoning. • Qualitative properties determine behavior.

  43. Communication Schemes on Network Topologies: communication time?

  44. Generic Performance Model. Contribution 4a • Good predictions for combinations of random schemes and random topologies

  45. (Note: show this with less obvious figures: broadcast not in a star shape, shift in a line shape, add a torus.) Combinations of Patterns. Contribution 4b: Performance depends on match!

  46. Qualitative Properties • Faithfulness: "graph should describe all independencies" • KMSS: "model should describe all regularities" • Qualitative information: explicitly describes regularities • Quantitative information: contains no more regularities

  47. Explicitly Mention Qualitative Properties!

  48. Conclusions • Contribution to performance analysis. • Automatic causal analysis. • Useful add-on in combination with other techniques. • The value of causal inference is underlined. • The importance of regularities or qualitative properties.

  49. Future Work • Application of the learned performance models for optimization. • Is the failure of generic performance models only due to regularities? • Augment models with qualitative properties. • But: how to define, recognize and reason with regularities?
