
Modeling Time Correlation in Passive Network Loss Tomography


Presentation Transcript


  1. Modeling Time Correlation in Passive Network Loss Tomography Jin Cao (Alcatel-Lucent, Bell Labs), Aiyou Chen (Google Inc), Patrick P. C. Lee (CUHK) June 2011

  2. Outline • Motivation • Loss model • Include correlation • Profile likelihood inference • Basic approach • Extensions • Simulation results

  3. Motivation • Monitoring a network’s health is critical for reliability guarantees • to identify bottlenecks/failures of network elements • to plan resource provisioning • It’s challenging to monitor a large-scale network • Collection of statistics can bring huge overhead • Network loss tomography • compute statistical estimates of internal losses through end-to-end external measurements

  4. Loss Tomography Overview • Active probing • Consider a tree setting • Send unicast probes to different receivers (leaves) • Collect statistics at receivers • Assume probes may be lost at links • Our goal: infer loss rate of common link (root-to-middle-node link) • Key idea: time correlation of packet losses • neighboring packets likely experience similar loss behavior on the common link • [Figure: probes 1–4 sent down a two-level tree]

  5. Passive Loss Tomography • Drawback of active probing: • introduce probing overhead • require collaboration of both senders and receivers • Passive loss tomography: • Monitor underlying traffic • E.g., use TCP data and ACKs to infer losses • Challenges: • Limited control. Time correlation highly varies. • Can we model time correlation?

  6. Prior Work on Loss Tomography • Multicast loss inference [Cáceres et al. ’99, Ziotopoulos et al. ’01, Arya et al. ’03] • Send multicast probes • Drawback: requires multicast to be enabled • Unicast loss inference [Coates & Nowak ’00, Harfoush et al. ’00, Duffield et al. ’06] • Send unicast probes to different receivers • Drawback: introduces probing overhead • Passive loss tomography [Tsang et al. ’01, Brosh et al. ’05, Padmanabhan et al. ’03] • Use existing traffic for inference • Drawback: no explicit model of time correlation

  7. Our Objective • Propose passive loss tomography with explicit modeling of the time correlation of packet losses

  8. Our Contributions • Formulate a loss model as a function of time correlation • Show our loss model is identifiable • Develop a profile-likelihood method for simple and accurate inference • Extend our method for complex topologies • Model and network simulations with R and ns2

  9. Where to Apply Our Work? • An extension for a TCP loss inference platform • use packet retransmissions to infer losses • Identify packet pairs: neighboring packets sent to different leaf branches • [Figure: TCP packets/ACKs are processed to determine loss samples and packet pairs on the common link; our inference approach then infers the loss rate of the common link shared by leaves 1, 2, …, K] • Note: our work is not on how to sample, but uses existing samples to accurately compute loss rates

  10. Loss Modeling • Main idea: use packet pairs to capture loss correlation • Issues to address: • How to integrate correlation into loss model? • Is the model identifiable? • What is the inference error if we wrongly assume perfect correlation?

  11. Loss Model • [Figure: two-level tree; common link with success rate p carries packet pair (U, V) to leaves 1, 2 via leaf links with success rates p1, p2] • Define: • A packet pair (U, V) to different leaves • p, p1, p2 = link success rates • Zu, Zv = success events on the common link • ρ(Δ) = correlation(Zu, Zv) with time difference Δ • 0 ≤ ρ(Δ) ≤ 1 (by definition) • ρ(0) = 1 • ρ(Δ) is monotonically decreasing w.r.t. Δ • Probability that both U and V are successfully delivered from root to their respective leaf nodes: • r11 = p p1 p2 (p + (1 – p) ρ(Δ)) • if ρ(Δ) = 1, r11 = p p1 p2 • if ρ(Δ) = 0, r11 = p² p1 p2
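The two boundary cases of r11 can be checked numerically. A minimal sketch in Python (function and variable names are illustrative, not from the paper):

```python
def r11(p, p1, p2, rho):
    # P(both packets of a pair delivered) under the correlated loss model:
    # r11 = p1 * p2 * P(Zu=1, Zv=1), where P(Zu=1, Zv=1) = p**2 + rho * p * (1 - p)
    return p * p1 * p2 * (p + (1.0 - p) * rho)

p, p1, p2 = 0.95, 0.98, 0.97

# perfect correlation: the pair behaves like one packet on the common link
assert abs(r11(p, p1, p2, rho=1.0) - p * p1 * p2) < 1e-12

# no correlation: the two packets see independent losses on the common link
assert abs(r11(p, p1, p2, rho=0.0) - p**2 * p1 * p2) < 1e-12
```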

  12. Modeling Time Correlation • Perfect correlation: ρ(Δ) = 1 • In practice, ρ(Δ) < 1 for Δ > 0 (i.e., decaying) • Assuming perfect correlation over-estimates r11 = p p1 p2 (p + (1 – p) ρ(Δ)) • Consider two specific approximations: • Linear form (exponent linear in Δ): ρ(Δ) = exp(–aΔ), where a is the decay constant • Quadratic form: ρ(Δ) = exp(–aΔ²) • If Δ is small, these are good enough approximations to capture the time decay of correlation • Claim: better than simply assuming perfect correlation
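A quick numeric check that both decay forms satisfy the model's requirements on ρ(Δ), namely ρ(0) = 1 and monotone decrease in Δ (a sketch; names are illustrative):

```python
import math

def rho_linear(a, dt):
    # exponent linear in the time gap dt
    return math.exp(-a * dt)

def rho_quadratic(a, dt):
    # exponent quadratic in dt: decays slowly at first, then faster
    return math.exp(-a * dt * dt)

a = 2.0
for rho in (rho_linear, rho_quadratic):
    assert rho(a, 0.0) == 1.0            # rho(0) = 1
    assert rho(a, 0.2) > rho(a, 0.5)     # monotonically decreasing in dt
    assert 0.0 < rho(a, 1.0) <= 1.0      # stays in (0, 1]
```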

  13. Theorems • Theorem 1: Under the loss correlation model, the link success rates p, p1, p2 and the decay constant a are identifiable, given that ρ(0) = 1 • Theorem 2: If perfect correlation is wrongly assumed in a setting with imperfect correlation, then the estimates suffer an asymptotic bias that does not vanish • See proofs in the paper

  14. Profile Likelihood Inference • [Figure: two-level, K-leaf tree; common link with success rate p, leaf links with success rates p1, p2, …, pK] • Given the loss model, how to estimate loss rate? • Inputs: • single-packet end-to-end measurements • packet-pair end-to-end measurements • Topology: • Two-level, K-leaf tree • Profile likelihood (PL) inference: • Focus on parameters of interest (i.e., link loss rates to be inferred) • Replace nuisance unknowns with appropriate estimates

  15. Profile Likelihood Inference • Step 1: apply end-to-end success rates • Let Pi = end-to-end success rate to leaf i, i.e., Pi = p pi • Re-parameterize r11 (for every pair of leaves) as a function of p and the Pi’s: • r11 = PU PV p⁻¹ (p + (1 – p) ρ(Δ)) • Solve for {p, P1, P2, …, PK, a} • But this is challenging with many variables to solve

  16. Profile Likelihood Inference • Step 2: remove nuisance parameters • Based on profile likelihood [Murphy ’00], replace nuisance unknowns with appropriate estimates • Replace Pi with its maximum likelihood estimate P̂i = Mi / Ni • Ni = number of packets going to leaf i • Mi = number of total successes to leaf i • Only two variables to solve: p and a

  17. Profile Likelihood Inference • Step 3: estimate p when ρ(.) is unknown • Approximate ρ(.) with either linear or quadratic form • To solve for p and a, we optimize log-likelihood function using BFGS quasi-Newton method • See paper for details
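Steps 1–3 can be sketched in Python. This is only an illustration of the shape of the objective: a coarse grid search stands in for the BFGS quasi-Newton step used in the paper, and the data layout (`pairs`, `P_hat`) is an assumption, not the paper's interface:

```python
import math

def rho(a, dt):
    # linear-form correlation decay (quadratic form would use dt * dt)
    return math.exp(-a * dt)

def r11(p, P_u, P_v, a, dt):
    # pair success prob. after re-parameterization: r11 = PU PV p^-1 (p + (1-p) rho)
    return P_u * P_v / p * (p + (1.0 - p) * rho(a, dt))

def profile_loglik(p, a, pairs, P_hat):
    # pairs: (u, v, dt, y) with y = 1 iff both packets of the pair arrived;
    # P_hat: plug-in MLEs P_hat[i] = M_i / N_i replacing the nuisance parameters
    ll = 0.0
    for u, v, dt, y in pairs:
        r = min(max(r11(p, P_hat[u], P_hat[v], a, dt), 1e-9), 1.0 - 1e-9)
        ll += y * math.log(r) + (1 - y) * math.log(1.0 - r)
    return ll

def estimate(pairs, P_hat, p_grid, a_grid):
    # maximize the profile log-likelihood over the two remaining variables (p, a);
    # crude grid search here, BFGS quasi-Newton in the paper
    return max(((p, a) for p in p_grid for a in a_grid),
               key=lambda pa: profile_loglik(pa[0], pa[1], pairs, P_hat))
```

Note that any candidate p must satisfy p ≥ max_i P̂i, since Pi = p pi with pi ≤ 1; the grid should be chosen accordingly.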

  18. Extension: Remove Skewness • If some leaf has only a few packets (i.e., Mi, Ni are small), the estimate of Pi will be inaccurate • Especially when there are many leaf branches • Heuristic: let Pi be the same for all i • Intuition: remove skewness of traffic loads among leaves by taking the aggregate average • Let: • N = total number of packets to all leaves • M = total number of successes to all leaves • Take the approximation: P̂i = M / N for all i
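The two plug-in choices can be contrasted on a small example (the counts below are illustrative):

```python
def p_hat_self(M, N):
    # per-leaf MLE: P_hat_i = M_i / N_i  (noisy when N_i is small)
    return {i: M[i] / N[i] for i in N}

def p_hat_equal(M, N):
    # aggregate average: P_hat_i = (sum of M_i) / (sum of N_i) for every leaf i
    P = sum(M.values()) / sum(N.values())
    return {i: P for i in N}

# heavily skewed traffic: leaf 2 sees only 5 packets
N = {1: 1000, 2: 5}
M = {1: 950, 2: 3}

assert p_hat_self(M, N)[2] == 0.6                       # unreliable: based on 5 packets
assert abs(p_hat_equal(M, N)[2] - 953 / 1005) < 1e-12   # pooled over all leaves
```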

  19. Extension: Large-Scale Topology • If there are many levels in a tree, we decompose it into many two-level problems • Estimate the cumulative loss rates f0 (down to the upper endpoint of the link of interest) and f1 (down to its lower endpoint) • Loss rate of the link itself: f = max(0, (f1 – f0) / (1 – f0))
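The decomposition formula follows from composing success rates: 1 – f1 = (1 – f0)(1 – f). A small sketch (function name is illustrative):

```python
def link_loss(f0, f1):
    # loss rate of a single link, given cumulative end-to-end loss rates
    # f0 (above the link) and f1 (below it): since 1 - f1 = (1 - f0)(1 - f),
    # f = (f1 - f0) / (1 - f0), floored at 0 for noisy estimates
    return max(0.0, (f1 - f0) / (1.0 - f0))

# composing the link with the upstream path recovers the downstream loss rate
f0, f = 0.02, 0.05
f1 = 1.0 - (1.0 - f0) * (1.0 - f)
assert abs(link_loss(f0, f1) - f) < 1e-12

# noisy estimates can give f1 < f0; the max(0, .) clamp keeps f a valid rate
assert link_loss(0.03, 0.02) == 0.0
```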

  20. Network Simulations • [Figure: two-level, K-leaf tree carrying TCP/UDP flows; common link p, leaf links p1, p2, …, pK] • We use model simulations to verify the correctness of our models under ideal settings • See details in the paper • Network simulations with ns2: • Traffic models: • Short-lived TCP sessions • Background UDP on-off flows • Loss models: • Links follow an exponential ON-OFF loss model • Queue overflow due to UDP bursts • Both loss models are justified in practice and show loss correlation

  21. Network Simulations • Three estimation methods: • est.equal: take the aggregate average of end-to-end success rates (P̂i = M / N for all i) • est.self: take individual end-to-end success rates (P̂i = Mi / Ni) • est.perfect: use est.self but assume perfect correlation

  22. Experiment 1: ON-OFF Loss • Consider a two-level, K-leaf tree with exponential on-off loss • est.perfect is the worst among all • [Plots: estimation results for p = 2%, pi = 0 and for p = 2%, pi = 2%]

  23. Experiment 2: Skewed Traffic • Uneven traffic (let K = 10) • β: fraction of traffic going to leaves 1–5 • 1 – β: fraction of traffic going to leaves 6–10 • est.equal is robust to skewed traffic • [Plots: estimation results for p = 2%, pi = 0 and for p = 2%, pi = 2%]

  24. Experiment 3: Large Topology • Goal: verify if two-level inference can be extended for multi-level topology

  25. Experiment 3: Large Topology • [Figure: multi-level topology with links of interest at levels 1, 2, 3; losses occur only in the links of interest]

  26. Experiment 3: Large Topology • est.equal is the best among all • around 5%, 10%, 20% errors at levels 1, 2, 3 resp. • [Plots: estimation errors at levels 1, 2, 3; losses occur only in the links of interest]

  27. Conclusions • Provide the first attempt to explicitly model time correlation in loss tomography • Propose profile likelihood inference • Remove nuisance parameters • Simplify loss inference without compromising accuracy • Conduct extensive model/network simulations • Assuming perfect correlation is not a good idea • est.equal is robust in general, even for skewed traffic loads and large topologies
