
Exploiting Nonstationarity for Performance Prediction

Christopher Stewart (University of Rochester), Terence Kelly and Alex Zhang (HP Labs)


Presentation Transcript


  1. Exploiting Nonstationarity for Performance Prediction • Christopher Stewart (University of Rochester) • Terence Kelly and Alex Zhang (HP Labs)

  2. Motivation • Enterprise applications are hard to manage • Complex software hierarchy executes on (globally) distributed platforms • Application-level performance metrics are more complicated than system-level metrics • Infrastructure is fragile; system modifications (even for measurement purposes) are not always practical for real applications

  3. Previous Work • Performance models ease the burden of system management • Reduce complex system configurations to end-user response time or throughput prediction • Achieved via kernel modification [barham-osdi-2004], runtime libraries [chandra-eurosys-2007], and controlled benchmarking [stewart-nsdi-2005, urgaonkar-sigmetrics-2005] • Can we apply model-driven system management when intrusive measurement tools are impractical?

  4. Observation • Relative frequencies of transaction types in real enterprise applications are nonstationary • i.e., they change over time • Nonstationarity allows model calibration using passive observations of application-level performance and system metrics

  5. An Example • We want the mean value of a metric for each transaction type • Nonstationarity allows for model calibration • Solve a set of linear equations: type A = 1, type B = 2 • Passive observations are sufficient to calibrate performance models for real systems
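
The example on slide 5 can be illustrated with a minimal sketch. The interval counts and totals below are invented for illustration (they are not from the talk); they are chosen so that the solution matches the slide's result of type A = 1 and type B = 2. The helper name `solve_2x2` is hypothetical.

```python
def solve_2x2(counts, totals):
    """Recover the mean per-transaction cost of two types from two intervals.

    counts: [(n_a, n_b), (n_a, n_b)] transaction counts per interval
    totals: aggregate metric observed in each interval
    """
    (a1, b1), (a2, b2) = counts
    t1, t2 = totals
    det = a1 * b2 - a2 * b1
    if det == 0:
        # Proportional mixes (a stationary workload) give dependent equations.
        raise ValueError("transaction mixes are proportional; cannot calibrate")
    cost_a = (t1 * b2 - t2 * b1) / det  # Cramer's rule
    cost_b = (a1 * t2 - a2 * t1) / det
    return cost_a, cost_b


# Hypothetical passive observations: the mix differs between the two intervals.
counts = [(3, 1), (1, 2)]
totals = [5.0, 5.0]
cost_a, cost_b = solve_2x2(counts, totals)
print(cost_a, cost_b)
```

If both intervals had the same mix ratio, for example (1, 2) and (2, 4), the determinant would be zero and the per-type values could not be separated, which is why nonstationarity matters for calibration.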

  6. Outline • Transaction mix nonstationarity is real • Investigate 2 production enterprise applications • Implications of nonstationarity • A performance model for real enterprise applications • Performance-aware server consolidation • Conclusion

  7. Commercial Applications • Codename: VDR • Internal business-critical HP application • Services HP users and external customers • 1-week trace • Codename: ACME • Large Internet retailer (circa 2000) • 5-day trace

  8. Nonstationarity in Real Applications • VDR Application • Relative frequency of the two most popular transaction types • Each point reflects an observation during a 5-minute interval • Almost every ratio is represented • Transaction-type popularity is not fixed [Figure axes: fraction of most popular vs. fraction of 2nd most popular transaction type]

  9. Nonstationarity in Real Applications • ACME Application • Fraction of “add-to-cart” transactions in the ACME workload • Each point reflects an observation during a 5-minute window • Frequencies vary by 2 orders of magnitude [Figure: fraction of “add-to-cart” transactions vs. time (hours), 0 to 120]

  10. Implications of Nonstationarity • Performance models • A wide range of transaction mixes is a first-order concern for real production applications • Models that consider only request rate are likely to provide poor predictive accuracy under real-world conditions

  11. Implications of Nonstationarity • Workload generators • Popular benchmarks (e.g., RUBiS and TPC-W) use first-order Markov models • First-order Markov models yield stationary mixes (in the long term) • RUBiS browse-mix shown • Rethink workload generation [Figure axes: fraction of most popular vs. fraction of 2nd most popular transaction type]
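
The claim that first-order Markov models yield stationary mixes can be sketched numerically. The transition matrix below is made up for illustration and is not the RUBiS or TPC-W matrix; the function names are hypothetical. The sketch assumes an ergodic chain (here every transition probability is positive), for which the long-run mix does not depend on the starting state.

```python
def step(dist, P):
    """Advance a distribution over transaction types by one Markov transition."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def long_run_mix(start, P, steps=200):
    """Repeatedly apply the transition matrix to approximate the long-run mix."""
    dist = list(start)
    for _ in range(steps):
        dist = step(dist, P)
    return dist


# Hypothetical 3-type transition matrix; each row sums to 1.
P = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.3, 0.1],
]

mix_a = long_run_mix([1.0, 0.0, 0.0], P)  # every session starts with type 0
mix_b = long_run_mix([0.0, 0.0, 1.0], P)  # every session starts with type 2
print(mix_a)
print(mix_b)
```

Both starting points converge to the same mix, so a generator driven by a fixed matrix of this kind cannot reproduce the wide spread of mixes reported for VDR and ACME.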

  12. Outline • Transaction mix nonstationarity is real • A performance model for real enterprise applications • Passive observations in real applications • Model design • Model validation • Performance-aware server consolidation • Conclusion

  13. Model Overview • Measurements under real workloads are sufficient (with some analytics) to predict application-level performance • We will carefully build a model that can be calibrated from passive observations of response times and resource utilizations

  14. Passive Observations • Certain system metrics are easy to acquire and widely available in production environments • Response times, CPU, and disk utilizations are routinely collected by tools in commodity operating systems

  15. Model Design • Each term considers one aspect of response time • The first term considers service time • $N_{ij}$ - The count of transaction type j in interval i • $\alpha_j$ - Typical service time of transaction type j

  16. Model Design • The second term considers queuing delay • $U_{ir}$ - The utilization of resource r at interval i • $\lambda_i$ - The arrival rate of all transactions during interval i • Resource utilization is not known a priori • Independently calibrated as a function of transaction mix
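
One way the two terms on slides 15 and 16 could combine is sketched below. This is an assumption, not a formula taken from the slides: it takes $y_i$ to be the sum of response times in interval $i$ (the quantity plotted in the validation slide), writes the service-time term with $N_{ij}$ and a per-type service time $\alpha_j$, and models waiting at each resource $r$ with the standard M/M/1 queueing-delay expression in terms of utilization $U_{ir}$ and arrival rate $\lambda_i$.

```latex
\[
  y_i \;=\; \underbrace{\sum_{j} \alpha_j \, N_{ij}}_{\text{service time}}
  \;+\; \underbrace{\sum_{r} \frac{U_{ir}^{2}}{\lambda_i \,(1 - U_{ir})} \sum_{j} N_{ij}}_{\text{queuing delay}}
\]
```

Under this form, the per-transaction waiting time $U_{ir}^{2} / (\lambda_i (1 - U_{ir}))$ is multiplied by the number of transactions in the interval so that both terms are in units of summed response time.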

  17. Model Calibration • For performance prediction, we must acquire $\alpha_j$ • The second term is constant for each interval i • Solve (minimize error) a set of linear equations • Regression technique: least absolute residuals (LAR) • Robust to outliers, no tunable parameters, maximizes retrospective accuracy
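
The calibration step can be sketched for a toy case with two transaction types. All numbers below are invented, and `lar_fit_two_types` is a hypothetical helper rather than the authors' implementation. Because the queuing term is a known constant per interval, it is subtracted first, and the remainder is regressed on the transaction counts. The sketch finds the least-absolute-residuals fit by brute force, relying on a standard property of L1 regression: some optimal fit passes exactly through as many observations as there are parameters. A production implementation would more likely use a linear-programming solver.

```python
from itertools import combinations


def lar_fit_two_types(counts, targets):
    """Least-absolute-residuals fit of targets[i] ~ a*counts[i][0] + b*counts[i][1].

    Tries every pair of intervals, solves the 2x2 system exactly, and keeps
    the candidate with the smallest sum of absolute residuals.
    """
    best, best_err = None, float("inf")
    for i, k in combinations(range(len(counts)), 2):
        (a1, b1), (a2, b2) = counts[i], counts[k]
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # same mix ratio in both intervals: no unique solution
        alpha_a = (targets[i] * b2 - targets[k] * b1) / det
        alpha_b = (a1 * targets[k] - a2 * targets[i]) / det
        err = sum(
            abs(t - (na * alpha_a + nb * alpha_b))
            for (na, nb), t in zip(counts, targets)
        )
        if err < best_err:
            best, best_err = (alpha_a, alpha_b), err
    return best


# Hypothetical passive observations for five intervals.
counts = [(3, 1), (1, 2), (2, 2), (4, 1), (1, 5)]  # (type A, type B) per interval
observed = [5.5, 5.2, 26.3, 6.4, 11.1]  # summed response times; third is an outlier
queueing = [0.5, 0.2, 0.3, 0.4, 0.1]    # assumed per-interval queuing term

targets = [y - q for y, q in zip(observed, queueing)]
alpha = lar_fit_two_types(counts, targets)
print(alpha)
```

The fit recovers service times close to 1 and 2 despite the outlying third interval, which illustrates the robustness to outliers mentioned on the slide.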

  18. Model Validation • VDR trace • ½ for calibration • ½ for prediction • Our model robustly predicts past and future performance [Figure: sum of response times (sec.) per 5-min interval, in trace order]

  19. Model Validation • VDR trace • Median error • 7% calibrated set • 9% predicted set • ACME: 12% median predictive error • An accurate model from passive observations [Figure: CDF of absolute percentage error, |predict - actual| / actual]
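
The error metric named on this slide, |predict - actual| / actual, and its median can be computed as in the short sketch below. The predicted and actual values are placeholders for illustration, not data from the VDR or ACME traces.

```python
from statistics import median


def absolute_percentage_errors(predicted, actual):
    """Return |predict - actual| / actual for each interval."""
    return [abs(p - a) / a for p, a in zip(predicted, actual)]


# Placeholder per-interval sums of response times (seconds).
predicted = [110.0, 95.0, 210.0, 48.0]
actual = [100.0, 100.0, 200.0, 50.0]

errors = absolute_percentage_errors(predicted, actual)
median_error = median(errors)
print(f"median absolute percentage error: {median_error:.0%}")
```

Sorting `errors` and plotting each value against its rank fraction would give the CDF shown on the slide.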

  20. Outline • Transaction mix nonstationarity is real • Performance prediction for real enterprise applications • Performance-aware server consolidation • Problem statement • Extending our model for server consolidation • Validation • Conclusion

  21. Problem Statement • Performance-aware server consolidation • Given passive observations of enterprise applications running separately • Predict post-consolidation performance for each application • For this work, the hardware platform does not change

  22. Performance-Aware Server Consolidation • Post-consolidation performance model • Application consolidation primarily affects the queuing delay for each application • Simplifying assumption: Post-consolidation utilization is the sum of pre-consolidation utilizations
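
The simplifying assumption on this slide can be sketched as follows. Summing utilizations comes from the slide; the M/M/1 waiting-time expression, the summed arrival rate, and all numbers and function names are assumptions made here for illustration only.

```python
def consolidated_utilization(u_app1, u_app2):
    """Slide's assumption: post-consolidation utilization is the sum of the parts."""
    total = u_app1 + u_app2
    if total >= 1.0:
        raise ValueError("consolidated resource would be saturated")
    return total


def waiting_time(utilization, arrival_rate):
    """Assumed M/M/1 mean waiting time per transaction: U^2 / (lambda * (1 - U))."""
    return utilization ** 2 / (arrival_rate * (1.0 - utilization))


# Hypothetical pre-consolidation observations of one resource (e.g., CPU).
u1, rate1 = 0.3, 10.0  # application 1: utilization, arrivals per second
u2, rate2 = 0.2, 5.0   # application 2

before = waiting_time(u1, rate1)
u_post = consolidated_utilization(u1, u2)
after = waiting_time(u_post, rate1 + rate2)  # assumes arrival rates also add
print(before, after)
```

In this toy case the predicted waiting time for application 1 rises after consolidation, which is the effect on queuing delay that the slide describes.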

  23. Validation • Experimental setup • RUBiS and StockOnline • Custom nonstationary workloads • Observed on ACME-variant • Consolidated on VDR-variant • 10-hour consolidation with 30-second measurement intervals • Passively calibrated model predicts post-consolidation performance • Median error 6% and 11% [Figure: CDF of absolute percentage error, |predict - actual| / actual]

  24. Outline • Transaction mix nonstationarity is real • Performance prediction for real enterprise applications • Performance-aware server consolidation • Problem statement • Model-driven server consolidation • Validation • Conclusion

  25. Future Work • Performance prediction across multi-core processor configurations • Passive observations calibrate simple yet effective models of processor utilization • Performance anomaly detection • Predictions are used to identify situations where performance does not match model expectations [stewart-hotdep-2006, kelly-worlds-2005]

  26. Take Away Points • Transaction mix nonstationarity is a real phenomenon in production applications • Passive observations are sufficient to calibrate performance models • Passively calibrated performance models can guide system management decisions
