Measuring Service in Multi-Class Networks Aleksandar Kuzmanovic and Edward W. Knightly Rice Networks Group http://www.ece.rice.edu/networks
Background • QoS services • SLA guaranteed rate (ex. class X serviced at minimum rate R) • Relative performance (ex. class X has strict priority over class Y) • Statistical service (ex. P(class X pkt. delay > 100 ms) < 0.001) • QoS mechanisms • Priority queues (rate-based, delay-based, ...) • Policing (rate limiting, ...) • Over-engineering (just add more bandwidth...) Need: Tools for network clients to assess the network's QoS capabilities
Inverse QoS Problem • Is a class rate limited? • What is the inter-class relationship? • Fair/weighted fair/strict priority • Is resource borrowing fully allowed or not? • Is the service’s upper bound identical to its lower bound? • What are the service’s parameters?
Applications - Network Example Providers reluctant to divulge precise QoS policy (if any...) • SLA validation for VPNs • Is the SLA fulfilled? • Capacity planning • What is the relationship among classes? • Edge-based admission control [CK00] and implementation [SSYK01]
Performance Monitoring and Resource Management • Single Web server • CPU resource sharing • Listen queue differentiation • Admission control • Distributed Web server • Load balancing • Internet Data Center • Machine migration Goal: Estimate a class's net “guaranteed rate”
“Off-Line” Solution is Simple • Consider a router with unknown QoS mechanisms
“On-Line” Case: Operational Network • Undesirable to disrupt on-going services • High rate probes to detect inter-class relationships would degrade performance • Impossible to force other classes to be idle • … to detect policers
System Model and Problem Formulation • Two stage server • Non-work conserving elements • Multi-class scheduler • Observations • Arrival and departure times • Class ID • Packet size
Determine... • Infer the service discipline • Most likely hypothesis among WFQ, EDF and SP • Detect the existence of non-work conserving elements • Rate limiters (ex. leaky bucket policers) • Estimate the system parameters • WFQ guaranteed rates, EDF deadlines, rate limiter values
Remaining Outline • Inter-class Resource Sharing Theory • Empirical Arrival and Service Models • MLE of Parameters • EDF/WFQ/SP Hypothesis Testing • Simulation Results and Conclusions
Theoretical Tool: Statistical Service Envelopes [QK99] • General statistical char. for a (virtual) minimally backlogged flow • Flows receive additional service beyond min rate • Function of other flow demand • Function of scheduler • General characterization of inter-class resource sharing • Framework for admission control for EDF/WFQ/SP
Strategy • Inter-class theory • Key technique: • Passively monitor arrivals and services at edges • Devise hypothesis tests to jointly: • Detect most likely hypothesis • Estimate unknown parameters
Empirical Arrival Model • Envelopes characterize arrivals as a function of interval length • Statistical traffic envelope [QK99] • Empirical envelope: measure first two moments of arrivals over multiple time scales Goal: assuming Gaussian distribution for B [Figure: arrivals over an interval from t to t + I, with example E*(I) = 3]
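The empirical envelope idea above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes non-overlapping windows, byte counts as the unit, and a hypothetical multiplier `alpha` applied to the standard deviation under the Gaussian assumption. The function names are invented for this example.

```python
import math
from statistics import mean, pvariance

def empirical_envelope(arrivals, sizes, interval, horizon):
    """Mean and variance of bytes arriving per window of length `interval`.

    Assumption: windows are consecutive and non-overlapping over [0, horizon).
    """
    n_windows = int(horizon // interval)
    totals = [0.0] * n_windows
    for t, size in zip(arrivals, sizes):
        k = int(t // interval)
        if 0 <= k < n_windows:
            totals[k] += size
    return mean(totals), pvariance(totals)

def gaussian_bound(mu, var, alpha=3.0):
    """Hypothetical envelope value: mean plus alpha standard deviations."""
    return mu + alpha * math.sqrt(var)

# Repeat the measurement over several time scales (values are made up).
arrival_times = [0.1, 0.2, 1.5, 2.1, 2.2, 2.3]
packet_sizes = [100] * 6
for interval in (0.5, 1.0, 1.5):
    mu, var = empirical_envelope(arrival_times, packet_sizes, interval, 3.0)
    print(interval, mu, var, gaussian_bound(mu, var))
```

Measuring the same trace at several interval lengths is what turns two moments into a function of time scale.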
Empirical Service Model • A real-world paradigm for statistical service envelope • Observe: Service can be measured only when packets are backlogged
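A minimal sketch of the observation on this slide: service samples are only taken from windows in which the class is continuously backlogged. It assumes each packet is observed as an `(arrival, departure, size)` tuple and that a class is backlogged from a packet's arrival until its departure. Both helper names are hypothetical.

```python
def backlogged_periods(packets):
    """Merge per-packet [arrival, departure] spans into maximal busy periods."""
    periods = []
    for arrival, departure, _size in sorted(packets):
        if periods and arrival <= periods[-1][1]:
            # Packet arrived while the class was still backlogged: extend.
            periods[-1][1] = max(periods[-1][1], departure)
        else:
            periods.append([arrival, departure])
    return [tuple(p) for p in periods]

def service_samples(packets, interval):
    """Bytes served in each window of length `interval` inside a busy period."""
    samples = []
    for start, end in backlogged_periods(packets):
        n_windows = int((end - start) // interval)
        for k in range(n_windows):
            lo = start + k * interval
            hi = lo + interval
            served = sum(size for _a, d, size in packets if lo < d <= hi)
            samples.append(served)
    return samples

# Made-up trace: one busy period of 3 s, then a short one of 0.5 s.
trace = [(0.0, 1.0, 100), (0.5, 2.0, 100), (1.5, 3.0, 100), (5.0, 5.5, 100)]
print(backlogged_periods(trace))
print(service_samples(trace, 1.0))
```

Busy periods shorter than the time scale contribute no samples, which is one reason longer time scales need more traffic to be measurable.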
Empirical Service Distributions • For each class and time scale • Expected service distributions • Service measures (data) • Empirical service distributions [Figure panels: WFQ (400 ms), SP (400 ms)]
Parameter Estimation and Scheduler Inference • GLRT for each time scale • Under MLE parameters for each scheduler • Choose most likely scheduler • Apply majority rule over all time scales
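The per-time-scale decision and majority rule can be sketched as follows. This is an illustration under strong assumptions, not the authors' code: each scheduler hypothesis is reduced to a Gaussian `(mean, variance)` for the service at each time scale, taken as already fitted by MLE, and the likelihood ratio threshold is 1 (pick the hypothesis with the larger maximized likelihood). All names and numbers are hypothetical.

```python
import math
from collections import Counter

def gaussian_loglik(samples, mu, var):
    """Log-likelihood of the samples under a Gaussian with given mean, variance."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        for x in samples
    )

def infer_scheduler(samples_by_scale, models_by_scale):
    """Pick the most likely scheduler per time scale, then take a majority vote."""
    votes = []
    for scale, samples in samples_by_scale.items():
        scores = {
            name: gaussian_loglik(samples, mu, var)
            for name, (mu, var) in models_by_scale[scale].items()
        }
        votes.append(max(scores, key=scores.get))
    return Counter(votes).most_common(1)[0][0]

# Made-up service samples and fitted models for three time scales (seconds).
samples = {0.1: [10, 11, 9], 0.4: [40, 41, 39], 1.0: [100, 120, 110]}
models = {
    0.1: {"WFQ": (10, 1), "SP": (15, 1)},
    0.4: {"WFQ": (40, 1), "SP": (60, 1)},
    1.0: {"WFQ": (150, 25), "SP": (110, 25)},
}
print(infer_scheduler(samples, models))
```

Voting across time scales keeps a single misleading scale from deciding the outcome.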
EDF/WFQ Testing • Correctness ratio: true WFQ 94%, true EDF 100% • Importance of time scales • Short time scales • Fluid vs. packet model • Long time scales • Ratio of delay shift and time scale decreases as time scale increases (d1 = 25 ms)
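A small worked example of the long-time-scale effect, using the slide's d1 = 25 ms. The time scales below are assumptions chosen only for illustration; the point is that the delay shift becomes a smaller fraction of the measurement interval as the interval grows.

```python
def delay_shift_ratio(delay_shift_s, time_scale_s):
    """Ratio of the delay shift to the measurement time scale."""
    if time_scale_s <= 0:
        raise ValueError("time scale must be positive")
    return delay_shift_s / time_scale_s

D1 = 0.025  # 25 ms, from the slide
for time_scale in (0.1, 0.4, 1.0):  # hypothetical time scales in seconds
    print(f"I = {time_scale} s -> ratio = {delay_shift_ratio(D1, time_scale):.4f}")
```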
Measurable Regions • What if there is no traffic in a particular class? • What traffic load “allows” inferences? • Region where we are able to estimate the true value within 5% • Typical utilization should be > 62% for a 1.5 Mbps link • Otherwise, active probing required
Conclusions • Framework for clients of multi-class services to assess a system’s core QoS mechanisms • Scheduler type • Estimate parameters (both work-conserving and non-work-conserving) • General multiple time-scale traffic and service model to characterize a broad set of behaviors within a unified framework
Ongoing Work • Unknown cross-traffic • Cannot monitor all system inputs/outputs • Treat cross-traffic statistics as another unknown • Web servers • Evaluation of the framework in a single web server through trace-driven simulations • Capacity is statistically characterized
WFQ Parameter Estimation • Class 1: 65-68 flows • Class 2: 25-28 flows • Large windows improve confidence level • T = 2 s: 95% confidence of being within 11% of the true value • T = 10 s: 95% confidence of being within 1.4% of the true value • Flow-level dynamics and non-stationarities must be considered
Rate Limited Class State Detection • Can include parameter r in service envelope equations for each class • Importance of time scales • Example • Class-based fair queuing • C = 1.5 Mbps, r = 1 Mbps • Probability decreases with time scale • Higher errors when measuring multi-level leaky buckets
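One way to see how the parameter r enters a service bound is the standard leaky-bucket constraint, written here as a sketch. The burst parameter \sigma and the notation S(t, t+I) for the service received in an interval of length I are assumptions not given on the slide.

```latex
% Service of a class policed by a leaky bucket with rate r and burst \sigma:
S(t, t+I) \;\le\; \sigma + r\,I
% With the slide's example values, C = 1.5\ \text{Mbps} and r = 1\ \text{Mbps}:
\frac{S(t, t+I)}{I} \;\le\; \frac{\sigma}{I} + 1\ \text{Mbps}
\;\xrightarrow[I \to \infty]{}\; 1\ \text{Mbps} \;<\; C
```

The burst term \sigma / I shrinks as the interval grows, so the cap at r, below the link capacity C, shows most clearly in the long-run service rate.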
Generalized Likelihood Ratio Test • Detection with unknowns • Note: we do not find a single value of the unknown parameter that maximizes the likelihood ratio • Under mild conditions (asymptotically, as the number of observations grows), GLRT is Uniformly Most Powerful (maximizes the probability of detection)
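For reference, the textbook form of the test is sketched below. The symbols are assumptions, since the slide's own notation did not survive: x is the observed data, \theta_0 and \theta_1 are the unknown parameters under hypotheses H_0 and H_1, and \gamma is the decision threshold.

```latex
\Lambda(x) \;=\;
\frac{\max_{\theta_1} \, p(x;\, \theta_1, H_1)}
     {\max_{\theta_0} \, p(x;\, \theta_0, H_0)}
\;\underset{H_0}{\overset{H_1}{\gtrless}}\; \gamma
```

The numerator and denominator are maximized separately, each at its own MLE, which is why no single parameter value maximizes the ratio itself.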