
Computational Risk Management for Building Highly Reliable Network Services

HotDep’05. Computational Risk Management for Building Highly Reliable Network Services. Chaki Ng, Brent N. Chun, Philip Buonadonna. Network Service Performance: desire for hard performance guarantees (“99.999% availability,” “all trades < 30 seconds”), which are difficult to achieve consistently.


Presentation Transcript


  1. HotDep’05: Computational Risk Management for Building Highly Reliable Network Services
     Chaki Ng, Brent N. Chun, Philip Buonadonna

  2. Network Service Performance
     • Desire for Hard Performance Guarantees
     • “99.999% availability,” “all trades < 30 seconds”
     • Difficult to Achieve Consistently
     • Demand: workload varies and can be bursty
     • Supply: resource needs vary and are hard to plan for
     • Dedicated and Over-Provisioning
     • $$$, low utilization
     • Shared Infrastructure
     • Resource supply varies – competition, failures
     • Tradeoff supply and performance guarantees
     Chaki Ng || Computational Risk Management

  3. Computational Service Provider (CSP)
     • Goal: mechanism to manage supply
     • Resources (e.g. server nodes)
     • Accommodate peak demand of most services
     • Markets of nodes
     • Each node sells resource contracts
     • Spot, futures, options
     • Contracts priced based on supply and demand

  4. Measure Risk
     • How to quantify performance guarantees
     • Risk metrics: simple statistical summaries of undesirable outcomes
     • Example: Value-at-Risk (VaR)
     • Finance: “The Fidelity mutual fund will lose no more than $25MM monthly, with 95% probability”
     • Computation: “Amazon.com will process orders in less than 30 seconds daily for 95% of all orders”
     • Two challenges: calculate VaR and sensitivity analysis of VaR
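The slides do not give a formula for VaR. One way to make the computational example precise, assuming the standard quantile definition and writing T for the (random) order-processing time and α for the confidence level, is:

```latex
\mathrm{VaR}_{\alpha}(T) \;=\; \inf \{\, t : \Pr[\, T \le t \,] \ge \alpha \,\}
```

Under this reading, the Amazon.com target on the slide corresponds to requiring \mathrm{VaR}_{0.95}(T) < 30 seconds.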

  5. Calculate VaR
     • Calculate expected performance distribution
     • Example method: historical
     • Methods: Variance, Monte Carlo, Stress Testing
     [Figure: two probability distributions. Fidelity Fund Profit/Loss, 95% VaR: -$27MM. Amazon.com Order Time (axis 0 to 36 seconds), 95% VaR: 33 seconds.]
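As an illustration of the historical method named on this slide, here is a minimal sketch that reads the 95% VaR off a set of past observations using a nearest-rank quantile. The function name `historical_var` and the sample data are assumptions made for this example; they are not taken from the talk.

```python
import math


def historical_var(samples, confidence=0.95):
    """Nearest-rank quantile: the smallest observed value that at least
    `confidence` of the samples do not exceed."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < confidence <= 1:
        raise ValueError("confidence must be in (0, 1]")
    ordered = sorted(samples)
    # 1-based nearest rank; the small epsilon guards against float round-up.
    rank = max(1, math.ceil(confidence * len(ordered) - 1e-9))
    return ordered[rank - 1]


# Hypothetical order-processing times in seconds (not data from the talk).
order_times = [12, 14, 15, 15, 16, 18, 18, 19, 20, 21,
               22, 22, 23, 24, 25, 26, 28, 30, 33, 36]
var_95 = historical_var(order_times, 0.95)
print(f"95% VaR: {var_95} seconds")
```

With these made-up samples the 95% VaR comes out at 33 seconds, so a 30-second target would not be met.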

  6. Compute VaR: Model Supply and Demand
     [Diagram labels: Own Service Workload Forecast; Supply; Set of Accessible Node Resources; Node Performance and Trade Forecast; Aggregate Workload Forecast; VaR]

  7. Sensitivity Analysis of VaR
     • Goal: model how VaR varies as the set of resource contracts changes
     • VaR = F(set of resource contracts)
     • Forecast demand and supply
     • Nodes and aggregate workload forecast
     • Own client workload forecast
     • Model portfolio VaR
     • Swap set of resource contracts
     • Calculate VaR improvements
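The swap-and-recompute loop on this slide can be sketched as follows. Everything concrete here is an assumption made for illustration: the toy model F (per-scenario latency equals forecast demand divided by total forecast capacity), the node names, and the forecast numbers. The talk does not specify how F is modeled.

```python
import math


def var(samples, confidence=0.95):
    """Nearest-rank quantile of the samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(confidence * len(ordered) - 1e-9))
    return ordered[rank - 1]


def portfolio_var(contracts, capacity_forecast, demand_forecast, confidence=0.95):
    """Toy F(set of contracts): latency per scenario = demand / total capacity."""
    latencies = []
    for i, demand in enumerate(demand_forecast):
        capacity = sum(capacity_forecast[node][i] for node in contracts)
        latencies.append(demand / capacity if capacity > 0 else math.inf)
    return var(latencies, confidence)


def swap_improvements(held, available, capacity_forecast, demand_forecast):
    """VaR reduction for every single swap (sell one held node, buy one available)."""
    base = portfolio_var(held, capacity_forecast, demand_forecast)
    results = {}
    for out in held:
        for new in available:
            candidate = (held - {out}) | {new}
            after = portfolio_var(candidate, capacity_forecast, demand_forecast)
            results[(out, new)] = base - after
    return results


# Hypothetical per-scenario capacity (requests/s) and demand (requests) forecasts.
capacity_forecast = {
    "node1": [10, 10, 2, 10],   # assumed to degrade badly in scenario 3
    "node2": [10, 10, 10, 10],
    "node3": [10, 10, 10, 10],
    "node4": [10, 10, 10, 8],
}
demand_forecast = [300, 400, 600, 500]

improvements = swap_improvements({"node1", "node2", "node3"}, {"node4"},
                                 capacity_forecast, demand_forecast)
for (sold, bought), gain in sorted(improvements.items()):
    print(f"sell {sold}, buy {bought}: VaR improves by {gain:.2f} s")
```

In this toy setup only the swap that removes the unreliable node improves VaR; swapping out a steady node leaves it unchanged.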

  8. Portfolio Management
     • Goal: meet target VaR within budget and minimal cost
     • Continuous portfolio optimization
     • Find available set of resources
     • Find sets that achieve best VaR
     • Trade resource contracts
     • Buy best set within budget
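A minimal sketch of the selection step, assuming a brute-force search over all node subsets (the talk does not say which optimization method is used). The helper `cheapest_portfolio`, the toy VaR model, the forecasts, and most prices are hypothetical; only the $50 and $30 figures echo the Node1 and Node4 prices shown on a later slide.

```python
import math
from itertools import combinations


def cheapest_portfolio(prices, var_of, target_var, budget):
    """Return (cost, nodes, var) for the cheapest subset that meets the
    target VaR within budget, or None if no subset qualifies."""
    best = None
    nodes = sorted(prices)
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            cost = sum(prices[n] for n in subset)
            if cost > budget:
                continue
            value = var_of(subset)
            if value > target_var:
                continue
            if best is None or cost < best[0]:
                best = (cost, subset, value)
    return best


# Hypothetical forecasts: capacity per node and demand, one entry per scenario.
CAPACITY = {"node1": [10, 10, 2, 10], "node2": [10, 10, 10, 10],
            "node3": [10, 10, 10, 10], "node4": [10, 10, 10, 8]}
DEMAND = [300, 400, 580, 500]


def toy_var(subset, confidence=0.95):
    """Toy model: latency = demand / total capacity, then nearest-rank quantile."""
    latencies = sorted(d / sum(CAPACITY[n][i] for n in subset)
                       for i, d in enumerate(DEMAND))
    rank = max(1, math.ceil(confidence * len(latencies) - 1e-9))
    return latencies[rank - 1]


prices = {"node1": 50, "node2": 40, "node3": 45, "node4": 30}
best = cheapest_portfolio(prices, toy_var, target_var=30.0, budget=120)
print(best)
```

Exhaustive search is exponential in the number of nodes, so this only suits small examples; it is meant to show the constraint structure (target VaR, budget, minimal cost), not a scalable strategy.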

  9. Finance: Manage Portfolio VaR
     • Target VaR: “The Fidelity mutual fund will lose no more than $25MM monthly with 95% probability.”
     • 95% VaR: -$27MM
     • Sell IBM @ $75, Buy EBay @ $37
     [Diagram: VaR portfolio of MSFT, ORCL, EBAY, IBM; probability distribution of Fidelity Profit/Loss; Financial Markets]

  10. Computation: Manage Portfolio VaR
     • Target VaR: “Amazon.com will process orders in less than 30 seconds for 95% of all orders.”
     • 95% VaR: 33 seconds
     • Sell Node1 @ $50, Buy Node4 @ $30
     [Diagram: VaR portfolio of Node1, Node2, Node3, Node4; probability distribution of Amazon.com Order Time (axis 0 to 36 seconds); CSP]

  11. Open Problems
     • Resource Contracts: pricing, base units
     • Programming: model, API
     • Modeling Supply and Demand
     • Portfolio Strategies: “standard portfolios”
     • Interoperability: across different CSPs

  12. Conclusion
     • Dedicated vs. shared
     • CSP: share resources via markets
     • Achieve performance goals in the context of shared CSP
     • Quantify performance goal via risk metrics like VaR
     • Calculation and sensitivity analysis
     • Portfolio optimization

  13. Backup Slides

  14. Simple Experiment
     • Each request tries N nodes randomly
     • If both nodes down → failed request
     • Daily Service Availability = Successful Requests / All Requests
     [Diagram: Service Workload, Failover, Node Failures]
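The experiment described on this slide can be approximated with a small simulation. This is a sketch under stated assumptions, not the authors' setup: the failure model (each node is independently down for a whole hour with probability `p_down`), the node count, and all function names are hypothetical. The default of 100 requests per hour follows the Results slide.

```python
import math
import random


def daily_availability(num_nodes, tries, p_down,
                       requests_per_hour=100, hours=24, rng=None):
    """Fraction of requests in one day that reach at least one live node."""
    rng = rng or random.Random()
    successful = 0
    for _ in range(hours):
        # Assumption: node up/down state is redrawn independently each hour.
        down = [rng.random() < p_down for _ in range(num_nodes)]
        for _ in range(requests_per_hour):
            picked = rng.sample(range(num_nodes), tries)  # N distinct random nodes
            if not all(down[n] for n in picked):
                successful += 1
    return successful / (requests_per_hour * hours)


def availability_at_risk(daily_values, confidence=0.95):
    """Availability level that roughly `confidence` of the days meet or exceed."""
    ordered = sorted(daily_values)
    worst = len(ordered) - math.ceil(confidence * len(ordered) - 1e-9)
    return ordered[worst]


rng = random.Random(2005)  # fixed seed so repeated runs give the same numbers
for tries in (1, 2, 3):
    days = [daily_availability(num_nodes=10, tries=tries, p_down=0.1, rng=rng)
            for _ in range(100)]  # 100 daily runs
    print(f"N={tries}: 95% of days have availability >= "
          f"{availability_at_risk(days):.4f}")
```

Redrawing node state hourly is one of several plausible failure models; a per-request or correlated-failure model would give different numbers.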

  15. Results • Each point: 100 daily runs, 100 requests/hr
