
On applying pattern recognition to systems management


Presentation Transcript


  1. On applying pattern recognition to systems management Moises Goldszmidt

  2. High complexity of current/future systems • Expensive to come up with a closed-form characterization of • Behavior • Interrelationship between components • Dynamic nature of • Workload/inputs • Infrastructure (software/hardware) • Opacity • Layers of abstraction (virtualization) • OEMs

  3. A proposal… • Problems: expensive closed-form characterization, dynamics, opacity • Remedies: cheap automatic characterization, adaptation, induction of mappings and estimation of state • Pipeline: Raw data → Features → P(rt|x) → Decisions (Observe System → Induce Models → Perform Inferences)

  4. Issues… • (Automatic) evaluation of models • Accuracy • Percentage of patterns captured • False positives vs false negatives • Decision making power • Uncertainty and confidence • Calibration • Amount of data • Decisions about model parameterization, tradeoffs between complexity and computation, overfitting and generalization • Uncertainty!

  5. Hope… • Advances in data mining, machine learning, computational statistics… • Representation • Computation • Computational power • Search • Matrix inversion • Numerical techniques

  6. Inducing models of black box storage arrays • Ira Cohen, Kim Keeton, Terence Kelly • Problem: • Given a trace of I/O response times of an XP512 • A specification for “fast” and “slow” • Forecast the response time (fast or slow) of any individual I/O request • Obstacles: Array is a black box • Applications: • Scheduling – serving compound web pages • Performance monitoring and anomaly detection

  7. System under study

  8. Methodology • Collect data (training set) • Induce probabilistic model • Priors • Mixture of regressions (MOR) • Naïve Bayes Classifier (NBC) • Provide decision procedure • Evaluation of the models on “unseen” data

  9. Priors-based model • Model = P(rt) • Decision procedure: given threshold for fast • If P(rt < fast) > 50% then announce fast, otherwise slow • Note: this forecast is constant and is independent of other characteristics of the input • Complexity of algorithms and computation • Trivial
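A minimal sketch of this baseline in Python. The names rt_train and fast_threshold are hypothetical, and estimating P(rt < fast) as an empirical frequency over the training trace is our assumption; the slide only specifies the decision rule.

```python
import numpy as np

def priors_forecast(rt_train, fast_threshold):
    """Baseline forecaster: uses only the marginal P(rt) from the training trace.

    rt_train:       array of observed I/O response times (the training set).
    fast_threshold: response times below this value count as 'fast'.
    Returns one constant label for every future request, plus P(fast).
    """
    p_fast = np.mean(rt_train < fast_threshold)  # empirical P(rt < fast)
    label = "fast" if p_fast > 0.5 else "slow"
    return label, p_fast

# Example: priors_forecast(np.array([0.002, 0.011, 0.004]), fast_threshold=0.005)
```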

  10. MOR model (diagram: cache-simulator outputs and other features feed a hidden cache state and the response time; several linear relationships depending on cache state) • Model: P(rt|cs,of) = Σc P(rt|of,c)·P(c|cs) • Decision procedure: given threshold t for fast, if P(rt < t|cs,of) > 50% then announce fast, otherwise announce slow
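A sketch of how that mixture could be evaluated at decision time. The fitted gate giving P(c|cs), the per-state linear regressors, and the Gaussian residual assumption are all ours, not the slide's; the slide gives only the mixture factorization.

```python
import numpy as np
from scipy.stats import norm

def mor_forecast(cs, of, gate, regressors, sigmas, t):
    """Mixture of regressions: P(rt | cs, of) = sum_c P(rt | of, c) * P(c | cs).

    gate:       callable mapping cache-simulator outputs cs to an array P(c | cs).
    regressors: per-state weight vectors; state c predicts rt as regressors[c] @ of.
    sigmas:     per-state residual std devs (Gaussian noise is our assumption).
    t:          the response-time threshold separating fast from slow.
    """
    p_c = gate(cs)
    # P(rt < t | cs, of) = sum_c P(c | cs) * Phi((t - w_c @ of) / sigma_c)
    p_fast = sum(p * norm.cdf(t, loc=w @ of, scale=s)
                 for p, w, s in zip(p_c, regressors, sigmas))
    return ("fast" if p_fast > 0.5 else "slow"), p_fast
```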

  11. NBC model (diagram: class node t with the cache-simulator and other-feature nodes as children) • Induce a model based on the threshold t between fast and slow • Model: P(t|cs,of) = α·Πi P(ofi|t)·Πj P(csj|t) • Decision procedure: announce fast if P(fast|cs,of) > P(slow|cs,of)
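A runnable sketch with scikit-learn. The Gaussian class-conditionals and the stand-in data are our assumptions; the slide only gives the naive-Bayes factorization and the decision rule.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 8))        # stand-in for [cs, of] feature vectors
y_train = (X_train[:, 0] > 0).astype(int)   # stand-in fast(0)/slow(1) labels from threshold t
X_test = rng.normal(size=(200, 8))

# Naive Bayes: P(t | cs, of) = alpha * prod_i P(of_i | t) * prod_j P(cs_j | t)
nbc = GaussianNB().fit(X_train, y_train)
p_slow = nbc.predict_proba(X_test)[:, 1]           # P(slow | cs, of)
labels = np.where(p_slow > 0.5, "slow", "fast")    # decide: P(fast|·) vs P(slow|·)
```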

  12. Evaluating models • Classification power: did the model + decision procedure capture the patterns accurately? • Accuracy = percentage of correct predictions • Appropriate for anomaly detection • As decision makers: what is the confidence/risk of each decision? • Utility based: pay according to confidence in each decision • Brier score = Σx (slowx − P(slow|x))² • Appropriate for scheduling decisions • How much data: • When can we trust the model?
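Both scores are straightforward to compute on held-out data. This sketch reports the mean rather than the raw sum for the Brier score, which only rescales it; the variable names are hypothetical.

```python
import numpy as np

def evaluate(p_slow, is_slow):
    """Score probabilistic fast/slow forecasts on unseen requests.

    p_slow:  model forecasts P(slow | x), one per request.
    is_slow: 1 if the observed response time was slow, else 0.
    """
    accuracy = np.mean((p_slow > 0.5) == is_slow)   # fraction of correct predictions
    brier = np.mean((is_slow - p_slow) ** 2)        # mean of (slow_x - P(slow|x))^2
    return accuracy, brier

# Example: evaluate(np.array([0.9, 0.2, 0.6]), np.array([1, 0, 0]))
```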

  13. Accuracy results

  14. Accuracy per RAID group

  15. Classifiers as decision makers • Brier score Σx (slowx − Pm(slow|x))² = calibration + refinement • Calibration: if Pm(slow|x) = 10%, then E[P(slow | Pm(slow|x))] = 10% • Refinement: how close the forecast is to being certain
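One common way to estimate the two terms is to bin the forecasts and compare each bin's average forecast with its empirical frequency; the binning scheme below is our choice, not the slide's.

```python
import numpy as np

def brier_decomposition(p_slow, is_slow, n_bins=10):
    """Binned calibration + refinement decomposition of the Brier score.

    Calibration penalizes bins whose average forecast differs from the observed
    slow-frequency; refinement is small when forecasts are near-certain.
    """
    bins = np.minimum((p_slow * n_bins).astype(int), n_bins - 1)
    n = len(p_slow)
    calibration = refinement = 0.0
    for b in range(n_bins):
        mask = bins == b
        if not mask.any():
            continue
        weight = mask.sum() / n
        freq = is_slow[mask].mean()   # empirical P(slow | forecasts in this bin)
        fcst = p_slow[mask].mean()    # average forecast in this bin
        calibration += weight * (fcst - freq) ** 2
        refinement += weight * freq * (1 - freq)
    return calibration, refinement
```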

  16. On being calibrated • We can use P as a measure of confidence • Refinement establishes a bound on the Bayes error • Accuracy may improve: • Threshold of 50% is optimal for the real P • Calibration brings the model's estimates closer to the real P • Calibration procedure (DeGroot): map the estimated P to the training-set P • (Work with Ira Cohen)
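A minimal sketch of such a remapping, using histogram binning on a held-out calibration set. The DeGroot reference on the slide describes the idea; the binning scheme and the fallback value for empty bins are our assumptions.

```python
import numpy as np

def fit_calibration_map(p_est, is_slow, n_bins=10):
    """Learn a map from the model's estimated P to the empirically observed P."""
    bins = np.minimum((p_est * n_bins).astype(int), n_bins - 1)
    table = np.empty(n_bins)
    for b in range(n_bins):
        mask = bins == b
        # Empirical frequency in the bin; fall back to the bin midpoint if empty.
        table[b] = is_slow[mask].mean() if mask.any() else (b + 0.5) / n_bins

    def calibrate(p):
        return table[np.minimum((np.asarray(p) * n_bins).astype(int), n_bins - 1)]
    return calibrate

# Usage: calibrate = fit_calibration_map(p_est_cal, is_slow_cal)
#        p_conf = calibrate(p_est_test)   # calibrated confidences for decisions
```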

  17. NBC before calibration

  18. After calibration (training examples: 686,091; test examples: 343,046; RG3, days 27–30)

  19. Learning curves: accuracy

  20. Learning curves: calibration

  21. Learning curves: refinement

  22. A dialogue… • Sys: Awesome, let's forecast whether a 3-tier system will meet its SLO. What should I measure? • Pat: Measure everything!! We will then establish a search over the measurements according to one of the different scores • Sys: Can you tell me whether the system will meet the SLO? • Pat: I can tell you the probability that the system will meet the SLO. Uncertainty is a fact of your world. I can provide decision procedures to deal with it • Sys: What happens if the workload changes, or the metrics change? • Pat: Then my model P will change • Sys: I can characterize the statistics of the workload and maybe other things… • Pat: Great! I can incorporate those characterizations in my models and decision-making procedures

  23. Summary/discussion • Presented statistical pattern recognition as a worthwhile approach for decision making in the context of current/future infrastructure • Presented a specific example and provided perspective on the issues • EVALUATE YOUR MODELS!!! • Benefits for systems • Deal with characterization issues, dynamics, opacity • Benefits for SPR • New application domain → forces new developments (results about calibration) • Other applications: • SLO characterization and diagnosis in 3-tier systems • ROC ???

  24. Quote slide • “Seguro está el cielo que no lo caga zamuro.” (roughly: “The sky is safe, for no vulture can soil it.”) • Juan Bimba, Venezuela

  25. HP logo
