Towards Efficient Large-Scale VPN Monitoring and Diagnosis under Operational Constraints
Yao Zhao, Zhaosheng Zhu, Yan Chen, Northwestern University
Dan Pei, Jia Wang, AT&T Labs - Research
Outline • Motivation • Problem Definition • Monitor Setup • Single-round monitoring • Multi-round monitoring • Evaluation • Related Work • Conclusion
Motivation [Figure: a VPN backbone of provider edge (PE) routers connecting customer edge (CE) routers at VPN1 Site1, VPN1 Site2, VPN2 Site1, and VPN2 Site2]
Motivation • VPN performance monitoring • Reliability • Quality of service (SLA) • Approaches • Passive measurements: SNMP-based monitoring • Fixed poll rate • Difficult to measure end-to-end path-level features (e.g. delay, bandwidth) • Active measurements • Operational constraints • E.g. monitor, link, path constraints
Problem Definition • Challenges with operational constraints • Optimization problem → constraint satisfaction problem • All paths measured simultaneously? • Operational constraints • Each monitor can measure <= c paths • Each replier can reply to <= r paths • Each link is on <= b measured paths • Traffic isolation between VPNs • Goal: continuously monitor and diagnose VPN performance under operational constraints [Figure: the VPN backbone with PE and CE routers for VPN1 and VPN2 (Site1, Site2), annotated with the constraints above]
VScope System Architecture • Two phases: • VScope Setup • VScope Operation: monitoring + diagnosis • Provides a smooth tradeoff between measurement frequency and monitor deployment/management costs
Two Phases • Monitor setup phase • Given a set of monitor candidates, how to select a minimal number of monitors that, in the measurement phase, can measure a selected set of paths covering all links in the network under the given measurement constraints? • NP-hard even without considering constraints • Monitoring and fault diagnosis phase • When faulty paths are discovered in the path monitoring phase, how to quickly select additional paths to measure, under the operational constraints, so that the faulty link(s) can be accurately identified?
Monitoring Strategies [Figure: two timelines over time t. Single-Round Monitoring: paths P1 to P6 are measured in Round 1. Multi-Round Monitoring: paths P1 to P6 are spread over Round 1 and Round 2.]
Multi-Round Monitoring • Pros • Relax tight constraints • Reduce number of monitors • Cons • Lower monitoring frequency • Monitor Selection Algorithm • Consider R rounds of back-to-back measurements • Step 1: convert the multi-round monitor selection problem to a single-round problem and solve that single-round problem • Relax monitor and link bandwidth constraints by a factor of R • Step 2: schedule the measured paths across the R rounds
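Step 1 above can be illustrated with a minimal sketch. The `Constraints` type and `relax_for_rounds` function are hypothetical names introduced here. The sketch assumes the relaxation is a plain multiplication by R and leaves the replier limit unchanged, because the slide names only the monitor and link bandwidth constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraints:
    monitor_paths: int        # c: paths one monitor may measure
    replier_paths: int        # r: paths one replier may answer
    link_budget_kbps: float   # probing bandwidth allowed on a link


def relax_for_rounds(single: Constraints, rounds: int) -> Constraints:
    """Build the single-round instance that stands in for `rounds` rounds.

    Assumption: only the monitor and link limits scale with the number of
    rounds; the replier limit is kept as-is.
    """
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    return Constraints(
        monitor_paths=single.monitor_paths * rounds,
        replier_paths=single.replier_paths,
        link_budget_kbps=single.link_budget_kbps * rounds,
    )


relaxed = relax_for_rounds(Constraints(12, 24, 100.0), rounds=3)
print(relaxed)
```

The relaxed instance is then handed to the single-round selection algorithm, and Step 2 has to spread the chosen paths over the R rounds so that the original per-round limits hold as closely as possible.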
Single-Round Monitor Selection • Monitor Selection Problem • Related to the Minimum Set Cover problem • NP-hard without constraints [Bejerano, Infocom03] • Pure Greedy Algorithm • Simple and locally optimal • Greedy-assisted Integer Linear Programming (ILP) based algorithm • Linear programming is good at dealing with constraints • ILP is NP-hard • Need to relax ILP to LP
Pure Greedy Algorithm • Two-level nested Minimum Set Cover Problem and Maximum Coverage Problem • Iteratively select a candidate router as a new monitor that can measure paths covering maximum number of un-covered links before the selection • Computing the maximum gain of adding a router as a monitor is a variant of Maximum Coverage problem (also NP-hard) • Iteratively select a path of the router that • will not violate the link bandwidth constraints and • covers maximum number of un-covered links before the selection • Until the number of selected paths reaches the monitor’s constraint
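The two nested greedy loops can be sketched as follows. This is an illustration under simplifying assumptions, not the VScope implementation: every name is hypothetical, each measured path adds one unit of load to every link it crosses, `b` is the maximum number of measured paths per link, `c` is the per-monitor path limit, and the replier constraint is omitted for brevity.

```python
def best_paths_for(paths, uncovered, link_load, c, b):
    """Inner greedy: pick up to c paths of one candidate router."""
    chosen, covered, load = [], set(), dict(link_load)
    remaining = list(paths)
    while len(chosen) < c:
        best, best_gain = None, 0
        for p in remaining:
            if any(load.get(link, 0) + 1 > b for link in p):
                continue  # adding this path would violate a link constraint
            gain = len((p & uncovered) - covered)
            if gain > best_gain:
                best, best_gain = p, gain
        if best is None:
            break  # no admissible path covers anything new
        chosen.append(best)
        covered |= best & uncovered
        remaining.remove(best)
        for link in best:
            load[link] = load.get(link, 0) + 1
    return chosen, covered, load


def pure_greedy(candidates, links, c, b):
    """Outer greedy: repeatedly add the router with the largest gain."""
    uncovered, link_load, monitors = set(links), {}, {}
    while uncovered:
        best = None
        for router, paths in candidates.items():
            if router in monitors:
                continue
            chosen, covered, load = best_paths_for(
                paths, uncovered, link_load, c, b)
            if best is None or len(covered) > len(best[2]):
                best = (router, chosen, covered, load)
        if best is None or not best[2]:
            break  # the remaining links cannot be covered
        router, chosen, covered, link_load = best
        monitors[router] = chosen
        uncovered -= covered
    return monitors, uncovered


candidates = {
    "R1": [frozenset("ab"), frozenset("c")],
    "R2": [frozenset("cd")],
    "R3": [frozenset("a"), frozenset("d")],
}
monitors, uncovered = pure_greedy(candidates, "abcd", c=2, b=1)
print(sorted(monitors), uncovered)
```

In this toy input R1 is picked first because its two paths cover three links. R2 can then no longer use its only path, since link c already carries one measured path, so R3 is selected to cover link d.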
Integer Linear Programming • Objective: minimize the number of monitors • A path is monitored iff its source router is selected as a monitor • A link is covered if at least one selected path contains the link • Constraints • Link bandwidth constraint • Monitor constraint • Replier constraint • It is NP-hard!
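One plausible way to write such a program is sketched below. The notation is an assumption made for illustration, not the authors' exact formulation: x_i = 1 if router i is selected as a monitor, y_j = 1 if path j is measured, s(j) and d(j) are the source and replier of path j, and c, r, b_l are the monitor, replier, and per-link limits from the Problem Definition slide.

```latex
\begin{aligned}
\min \quad & \sum_{i} x_i \\
\text{s.t.} \quad
& y_j \le x_{s(j)} && \forall j && \text{(a path needs its source as monitor)} \\
& \sum_{j \ni \ell} y_j \ge 1 && \forall \ell && \text{(every link is covered)} \\
& \sum_{j \ni \ell} y_j \le b_\ell && \forall \ell && \text{(link bandwidth constraint)} \\
& \sum_{j : s(j) = i} y_j \le c && \forall i && \text{(monitor constraint)} \\
& \sum_{j : d(j) = k} y_j \le r && \forall k && \text{(replier constraint)} \\
& x_i,\; y_j \in \{0, 1\}
\end{aligned}
```

Replacing the last line with 0 <= x_i, y_j <= 1 gives the linear programming relaxation.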
Relaxation with Random Rounding • Relax the Integer Linear Programming to Linear Programming • Suppose the solution of the linear program is x*_i, y*_i • Rounding rule: [equation not preserved]
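For orientation, a common textbook form of randomized rounding, which may differ from the rule VScope uses, sets each 0/1 variable to 1 with probability equal to its fractional LP value. The sketch below applies that idea; `randomized_round` and `source_of` are hypothetical names, and dropping paths whose source router was not selected is an added assumption that keeps the result consistent.

```python
import random


def randomized_round(x_frac, y_frac, source_of, rng):
    """Round fractional LP values to a 0/1 selection.

    x_frac: router -> fractional value in [0, 1]
    y_frac: path   -> fractional value in [0, 1]
    source_of: path -> source router of that path
    """
    # Each router becomes a monitor with probability equal to its LP value.
    monitors = {i for i, v in x_frac.items() if rng.random() < v}
    # A path is kept only if its source was selected (assumption) and its
    # own coin flip succeeds.
    paths = {
        j for j, v in y_frac.items()
        if source_of[j] in monitors and rng.random() < v
    }
    return monitors, paths


rng = random.Random(0)
x_star = {"R1": 1.0, "R2": 0.0}
y_star = {"p1": 1.0, "p2": 1.0, "p3": 0.0}
source_of = {"p1": "R1", "p2": "R2", "p3": "R1"}
print(randomized_round(x_star, y_star, source_of, rng))
```

Because the outcome is random, a rounded solution can leave some links uncovered or exceed a limit, so in practice the rounding is repeated or repaired afterwards.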
Greedy Assisted Linear Programming • Use Linear Programming to select a set of monitors and corresponding measurement paths • Not all links are covered • Use greedy algorithm to cover uncovered links • Similar to the pure greedy algorithm
Multi-Round Path Scheduling • NP-hard • Can reduce minimum graph coloring problem to path scheduling problem • Three algorithms • Random algorithm • Randomly schedule paths independently • Run random algorithm multiple times to get the best one • Greedy algorithm • Minimize link utilization in every step • LP based Randomization Algorithm • ILP + relaxation and random rounding • Optimization metrics • Maximum link violation degree (MLVD) • Average link violation degree (ALVD)
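A minimal sketch of a greedy scheduler and the two metrics follows. The slides do not define the violation degree, so the code assumes it is the relative excess of a link's worst per-round load over its budget, and zero when the budget is met. The function names and the uniform per-path cost are also assumptions.

```python
def greedy_schedule(paths, rounds, cost=1.0):
    """Place each path in the round where its busiest link is least loaded."""
    load = [dict() for _ in range(rounds)]
    assignment = {}
    for name, links in paths.items():
        best_round = min(
            range(rounds),
            key=lambda r: max(load[r].get(link, 0.0) for link in links),
        )
        assignment[name] = best_round
        for link in links:
            load[best_round][link] = load[best_round].get(link, 0.0) + cost
    return assignment, load


def violation_degrees(load, budget):
    """Return (MLVD, ALVD) under the assumed definition of violation."""
    worst = {}
    for per_round in load:
        for link, used in per_round.items():
            worst[link] = max(worst.get(link, 0.0), used)
    degrees = [
        max(0.0, (worst.get(link, 0.0) - cap) / cap)
        for link, cap in budget.items()
    ]
    return max(degrees), sum(degrees) / len(degrees)


paths = {"p1": {"a", "b"}, "p2": {"a"}, "p3": {"b", "c"}, "p4": {"a", "c"}}
assignment, load = greedy_schedule(paths, rounds=2)
mlvd, alvd = violation_degrees(load, {"a": 1.0, "b": 1.0, "c": 1.0})
print(assignment, mlvd, alvd)
```

In this toy input three paths share link a but there are only two rounds, so some violation on link a is unavoidable. The random and LP-based algorithms would be scored with the same two metrics.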
Evaluation • Topologies • Synthetic topologies generated by BRITE • Real topologies from a tier-1 ISP: one IP backbone topology (IP-EX), one VPN backbone topology (VB), and two VPN infrastructure topologies (V1-EX, V2-EX) • Scale from hundreds of nodes to hundreds of thousands of nodes • Heterogeneous real link bandwidth (1.54 Mbps to 10 Gbps) • Operational constraints • From ISP management team • E.g. percentage of link bandwidth allowed for probing: 1% • Evaluation metrics • Percentage of monitors selected • Maximum (average) link violation degree after scheduling • Running speed
Experimental setup • Default configuration • Monitor constraint = 12 • Replier constraint = 24 • Probing rate per path = 4 pkt/sec • Measurement bandwidth consumed per path = 1.6 Kbps • Link constraint = 1% × (link capacity)
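The default configuration implies a per-link limit on the number of measured paths, which a short calculation makes concrete. The sketch assumes 1 Kbps = 1000 bit/s and that the 1.6 Kbps per path is entirely probe traffic. The function names are hypothetical.

```python
PROBE_RATE_PPS = 4     # probes per second per path
PATH_BW_BPS = 1600     # 1.6 Kbps consumed per measured path
LINK_PERCENT = 1       # share of link capacity allowed for probing


def implied_probe_size_bytes():
    """Probe size implied by 1.6 Kbps at 4 packets per second."""
    return PATH_BW_BPS // PROBE_RATE_PPS // 8


def max_paths_per_link(capacity_bps):
    """Number of measured paths a link can carry within its probing budget."""
    budget_bps = capacity_bps * LINK_PERCENT // 100
    return budget_bps // PATH_BW_BPS


print(implied_probe_size_bytes())          # 50 bytes per probe
print(max_paths_per_link(1_540_000))       # 1.54 Mbps link: 9 paths
print(max_paths_per_link(10_000_000_000))  # 10 Gbps link: 62500 paths
```

Under these assumptions the slowest links in the evaluation admit only 9 measured paths at a time, while the fastest are effectively unconstrained.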
Baseline Monitor Selection Results (VB Topology) [Figure panels: Vary Monitor Constraint; Vary Link Constraint] • LP+Greedy selects fewer monitors than the pure greedy algorithm.
Multi-Round Monitor Selection Results (V1-EX Topology) [Figure panels: Vary Monitor Constraint; Vary Link Constraint] • More rounds require fewer monitors, with diminishing returns.
Multi-Round Monitor Selection Results (V1-EX Topology) [Figure panels: Link violation degree; Percentage of links with violation. Curves: Greedy, Random, LP] • LP > Random > Greedy.
Related Work • Path Selection • The monitoring problem is not considered or too simple • Complex path selection goal (basis, SVD, Bayesian experimental design) • Monitoring Placement • Active Monitoring Systems • Similar problem without operational constraints [Bejerano, Infocom03] • Robustness consideration • Passive Monitoring Systems • SNMP Polling • Traffic sampling
Conclusions • VScope for continuous monitoring and diagnosis • Consider operational constraints • Design multi-round monitor selection algorithms • Single-round monitor selection • Monitoring path scheduling • Evaluated with synthetic and real topologies • Our algorithms are efficient in minimizing the number of monitors with low constraint violation
Q & A? Thanks!