Time Series from their Observed Sums: Network Tomography. Edoardo M. Airoldi, School of Computer Science, Carnegie Mellon University (joint work with Christos Faloutsos). SIGKDD, Seattle, WA, August 23rd, 2004. Acknowledgements. Srinivasan Seshan, CSD, CMU
Time Series from their Observed Sums: Network Tomography • Edoardo M. Airoldi, School of Computer Science, Carnegie Mellon University (joint work with Christos Faloutsos) • SIGKDD, Seattle, WA, August 23rd, 2004
Acknowledgements • Srinivasan Seshan, CSD, CMU • Russel Yount and Frank Kietzke, Network Development, CMU • Stephen Fienberg, Statistics, CMU • Jin Cao, Bell Labs • Claudia Tebaldi, NCAR • Yin Zhang, AT&T Labs
Outline • Introduction / Motivation • Survey • Proposed Methods • Results • Conclusions
Application Domains • Communication Networks • goal: Who is sending to whom • refs: Cao et al (2001), Liang & Yu (2003), Zhang et al (2004) • Transportation Networks • goal: Who is going where • Network Probing (Rish et al, IBM) • goal: Which server is down • refs: Rish et al (2002, 2004)
Communication Networks • A large ISP network has 100s of nodes, 1000s of links, 10000s of routes, and over 1 petabyte (10^15 bytes) of traffic per day • Reliability analysis • Predict link loads under unexpected/planned router/link failures • Traffic engineering • Optimize routes to minimize congestion • Capacity planning • Forecast future capacity requirements [Figure: example network with nodes A, B, C; labels: OD flows, link loads]
Mathematical Formulation • Link Flows = Routing Matrix A × OD Flows • One Constraint: Total $\sum_i Y_i = 0$ [Figure: OD flows X = (X1, X2, X3, X4) and link flow Y; situation at time = t]
Problem Definition • Given: topology, fixed routing scheme A[nxm], and traffic on the links of the network Y(t)=[Y1(t), …, Yn(t)] over time t = 1, …, T • Find: non-observable traffic between origin-destination (OD) pairs X(t)=[X1(t), …, Xm(t)] over time t = 1, …, T • Y(t) = A·X(t) is under-constrained: there are fewer link measurements than OD flows
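The relation Y(t) = A·X(t) can be sketched on a toy example. The routing matrix and the OD flow values below are illustrative assumptions (a hypothetical 2-node star network with 4 OD flows and 3 measured links), not the authors' data.

```python
# Hypothetical 2-node star network: 4 OD flows, 3 measured links.
# The routing matrix is an illustrative assumption, not the authors' data.
A = [
    [1, 1, 0, 0],  # link 1: traffic originating at node 1 (X1 + X2)
    [0, 0, 1, 1],  # link 2: traffic originating at node 2 (X3 + X4)
    [1, 0, 1, 0],  # link 3: traffic destined to node 1   (X1 + X3)
]

def link_loads(A, x):
    """Return Y = A·X for one time point."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Made-up OD flows X(t) for two time points (in Kb).
X = [[10, 5, 3, 7], [12, 4, 6, 2]]

# In practice only Y is observed; X is what we want to recover.
Y = [link_loads(A, x) for x in X]
print(Y)
```

Going from X to Y is a simple matrix product; the tomography problem is the reverse direction, which has no unique answer when A has more columns than rows.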
A Glance at the Data • Measure link flows Y(t) = [Y1(t), Y2(t), Y3(t)] at times t1, …, t4 • Find OD flows X(t) = [X1(t), X2(t), X3(t), X4(t)] at the same times [Figure: link flows in Kb vs. hour of the day]
Our Problem: No Traffic Matrix • Traffic matrix • Gives traffic volumes between origin and destination • Very difficult to directly measure • Direct measurement [Feldmann et al. 2000] • Collect flow-level data around the whole edge of the network • Combine with routing data • Semi-standard router feature: Netflow • Cisco, Juniper, etc. • Not always well supported • Potential performance impact on routers • Huge amount of data (500GB/day) • Widely available SNMP data gives only link loads • Even this data is not perfect (glitches, loss, …)
Outline • Introduction / Motivation • Survey • Proposed Methods • Results • Conclusions
Infinite Exact Solutions • Measurements (Yt) and routing scheme A[3x4] allow for many feasible OD flows (Xt) • The problem is under-constrained and we need some assumptions [Figure: example sets of OD flows that match the same link measurements]
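A small numerical check makes the non-uniqueness concrete. The 3x4 routing matrix and the flow values are assumptions chosen for illustration; the point is only that adding a multiple of a null-space vector of A changes the OD flows without changing the link loads.

```python
# Hypothetical 3x4 routing matrix (illustrative assumption).
A = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]

def link_loads(A, x):
    """Return Y = A·X for one time point."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# v lies in the null space of A: every row of A sums v to zero.
v = [1, -1, -1, 1]

x_a = [10, 5, 3, 7]                             # one feasible set of OD flows
x_b = [xi + 3 * vi for xi, vi in zip(x_a, v)]   # another one: [13, 2, 0, 10]

print(link_loads(A, x_a), link_loads(A, x_b))   # identical link loads
```

Both x_a and x_b are non-negative and reproduce the same measurements, so the link data alone cannot tell them apart.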
Related Work • Solutions in the past • Direct solution: SVD • Scoring criterion: GLS, maximum likelihood, entropy, Bayesian methods, … • Regularization: assume independent OD flows • Estimate OD flows xt from a window of link measurements around time t, under y = Ax [Figure: estimated OD flows in Kb vs. hour of the day]
Pitfalls of Past Approaches • Unrealistic Models: Gaussian or Poisson OD traffic flows. But we observe bursty, log-Normal traffic flows. • Time Dependence across Epochs: never explicitly addressed; xt is typically assumed independent over time. But we observe time dependence within single OD flows.
Empirical Laws: log-Normality • Aggregate OD flows look log-log Normal [Figure: histograms of counts vs. log bytes and counts vs. log-log bytes; 12321 OD time series, CMU validation data]
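The kind of skew described here can be reproduced with simulated data. The sketch below draws synthetic log-Normal "flows" (the parameters 8.0 and 1.5 are arbitrary assumptions, and no real traffic data is used) and compares mean and median before and after a log transform. It illustrates plain log-Normality only, not the log-log variant mentioned on the slide.

```python
import math
import random
import statistics

random.seed(0)

# Synthetic flows drawn from a log-Normal distribution.
# mu=8.0 and sigma=1.5 are arbitrary illustrative values.
flows = [random.lognormvariate(8.0, 1.5) for _ in range(10000)]

# On the raw scale the distribution is right-skewed: mean far above median.
mean_raw = statistics.mean(flows)
median_raw = statistics.median(flows)

# On the log scale the sample is symmetric: mean and median nearly agree.
logs = [math.log(f) for f in flows]
mean_log = statistics.mean(logs)
median_log = statistics.median(logs)

print(mean_raw > median_raw, abs(mean_log - median_log))
```

A Gaussian or Poisson model fitted to the raw values would miss this heavy right tail, which is what the bursts in the traffic correspond to.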
Outline • Introduction / Motivation • Survey • Proposed Method • 1st Stage - Linear Dynamical Systems • 2nd Stage - Bayesian Dynamical Systems • Results • Conclusions
The Model • A smooth average process { λt : t > 0 } • A possibly bursty process { xt : t > 0 } to model the OD traffic flows
Parameter Estimation • Estimate parameters underlying the average process { λt : t > 0 } • Calibrate priors for the parameters driving the dynamics of the OD flows process { xt : t > 0 } • Estimate the OD flows using a Particle Filter
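For readers unfamiliar with particle filters, here is a minimal bootstrap filter on a scalar toy problem. It is a generic textbook sketch, not the authors' filter: the random-walk dynamics, the Gaussian observation noise, and all parameter values (`proc_sd`, `obs_sd`, the initial particle spread) are assumptions made for illustration.

```python
import math
import random

def bootstrap_filter(ys, n_particles=500, proc_sd=0.2, obs_sd=0.5, seed=1):
    """Bootstrap particle filter for a scalar random-walk state
    observed with Gaussian noise (toy model, assumed for illustration)."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for y in ys:
        # 1. Propagate each particle through the assumed dynamics.
        particles = [p + rng.gauss(0.0, proc_sd) for p in particles]
        # 2. Weight each particle by the likelihood of the observation.
        weights = [math.exp(-0.5 * ((y - p) / obs_sd) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:  # all weights underflowed: fall back to uniform
            weights = [1.0] * n_particles
            total = float(n_particles)
        # 3. Posterior mean estimate, then multinomial resampling.
        estimates.append(sum(w * p for w, p in zip(weights, particles)) / total)
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

# Synthetic observations of a constant true state (5.0) plus noise.
obs_rng = random.Random(0)
ys = [5.0 + obs_rng.gauss(0.0, 0.5) for _ in range(40)]
est = bootstrap_filter(ys)
print(round(est[-1], 2))
```

The same propagate, weight, resample loop applies when the state is a vector of OD flows and the observation is the vector of link loads; only the dynamics and the likelihood change.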
Outline • Introduction / Motivation • Survey • Proposed Method • 1st Stage - Linear Dynamical Systems • 2nd Stage - Bayesian Dynamical Systems • Results • Conclusions
Introducing Time Dependence • We introduce explicit time dependence: λ(t) = F[nxn] λ(t-1) + e(t) • The distinct OD flows, components of λ(t), are assumed to be independent • Use the EM algorithm
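The first-order recursion on this slide can be simulated in a few lines. The sketch assumes a diagonal F (in line with the independence assumption on the components) and Gaussian noise e(t); the function name `simulate_lds` and all numeric values are hypothetical.

```python
import random

def simulate_lds(state0, F_diag, noise_sd, T, seed=0):
    """Simulate state(t) = F · state(t-1) + e(t) with diagonal F and
    independent Gaussian noise on each component (assumed for illustration)."""
    rng = random.Random(seed)
    path = [list(state0)]
    for _ in range(T):
        prev = path[-1]
        path.append([f * p + rng.gauss(0.0, noise_sd)
                     for f, p in zip(F_diag, prev)])
    return path

# Two independent components with different persistence.
path = simulate_lds([100.0, 50.0], F_diag=[0.9, 1.0], noise_sd=2.0, T=24)
print(len(path), path[-1])
```

Setting `noise_sd=0.0` reduces the recursion to repeated multiplication by F, which is a quick sanity check. Estimating F and the noise variance from link data, as the EM step on the slide does, is not shown here.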
Introducing Time Dependence • Our Linear Dynamical System contains the models by Cao et al. as a special case
Outline • Introduction / Motivation • Survey • Proposed Method • 1st Stage - Linear Dynamical Systems • 2nd Stage - Bayesian Dynamical Systems • Results • Conclusions
Bayesian Dynamical System • Gamma and log-Normal OD flows (Xt) • Use preliminary estimates of { λt : t > 0 }, the average OD flows, to softly constrain the dynamical behavior of the OD flows and identify the correct solution for Xt
Non-Deterministic Dynamics • Introduce explicit non-deterministic dynamics (F) on the average OD flows: λ'(t+1) = F'[nxn] · λ'(t) • Diagonal matrix F'[nxn] : F'[i,i] ~ log-Normal
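A multiplicative update with log-Normal diagonal entries is the same thing as an additive Gaussian update on the log scale. The sketch below checks this numerically; the starting values and the log-Normal parameters (0.0, 0.1) are arbitrary assumptions.

```python
import math
import random

rng = random.Random(0)

# Average OD flows at time t (made-up values).
avg_t = [100.0, 50.0, 20.0]

# Diagonal of the dynamics matrix: each entry is log-Normal, hence positive.
F_diag = [rng.lognormvariate(0.0, 0.1) for _ in avg_t]

# Multiplicative update on the original scale.
avg_next = [f * a for f, a in zip(F_diag, avg_t)]

# Equivalent additive update on the log scale: log F[i,i] is Gaussian.
log_next = [math.log(f) + math.log(a) for f, a in zip(F_diag, avg_t)]

print(all(math.isclose(math.log(n), l) for n, l in zip(avg_next, log_next)))
```

Because every diagonal entry is positive, the updated averages stay positive, which a Gaussian additive update on the original scale would not guarantee.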
Learning Latent Dynamics • We want a preliminary estimate for Ft in: λt+1 = Ft+1 λt • Example: from P(λ246|Y246) and P(λ247|Y247), solve for F247
Outline • Introduction / Motivation • Survey • Proposed Methods • Results • Datasets • Importance of Time Dependence • Importance of non-Gaussianity • Informative Priors for non-Gaussian BDS • Conclusions
Validation Data Sets • Consider star network topologies [ 4 OD flows, 9 OD flows and 16 OD flows ] • Carnegie Mellon [ 12321 time series ] • Lucent Technologies [ 32 time series ] [Figure: OD flows X = (X1, X2, X3, X4) and link flow Y; situation at time = t]
Log-Normal OD Traffic Flows • The validation OD traffic flows are skewed on both data sets
Outline • Introduction / Motivation • Survey • Proposed Methods • Results • Datasets • Importance of Time Dependence • Importance of non-Gaussianity • Informative Priors for non-Gaussian BDS • Conclusions
Reduce Variability • Narrower range of possible values for the OD traffic flows: those which receive positive posterior probability
Robust Estimates • Capture sharp changes in the distribution of the OD traffic flows
Outline • Introduction / Motivation • Survey • Proposed Methods • Results • Datasets • Importance of Time Dependence • Importance of non-Gaussianity • Informative Priors for non-Gaussian BDS • Conclusions
Capture Several Bursts [Figure: OD traffic flows in Kb vs. time]
Outline • Introduction / Motivation • Survey • Proposed Methods • Results • Datasets • Importance of Time Dependence • Importance of non-Gaussianity • Informative Priors for non-Gaussian BDS • Conclusions
Priors and Bayesian Inference • Informative priors on { λt : t > 0 } lead to uni-modal posteriors [Figure: posteriors with the true values marked]
Speed and Scalability • The computing time is about 3 minutes [ 4 OD flows, 3 links, using R on a Mac G4 667 ] • Linear in (#OD) for each time point • 1 day's worth of data in 45 minutes
Outline • Introduction / Motivation • Survey • Proposed Methods • Results • Conclusions
Past Approaches • Unreasonable Models: Gaussian or Poisson arrivals • Time Dependence: never explicitly addressed
Conclusions • Log-Normal models account for skewed and bursty, non-observable OD flows • Novel BDS captures time dependence of data thus reducing the variability of the estimates • Informative priors serve as soft constraints to overcome the under-determinacy of the problem
Future Work • More tests on bigger networks • from 2-star (4-D) to 4-star (16-D) • Fit non-parametric seasonal components for the non-observable OD flows
Network Engineering • State-of-the-Art: guess and tweak • Guess based on experience & intuition • Manually tweak things, and hope for the best • Disadvantages • Manual process: time consuming, error prone • Not very reliable: intuition may be wrong, unexpected side effects • Suboptimal performance: wastes resources/time • Need to repeat the exercise when the traffic pattern changes
A More Scientific Approach? • A: "Well, we don't know the topology, we don't know the traffic matrix, the routers don't automatically adapt the routes to the traffic, and we don't know how to optimize the routing configuration. But, other than that, we're all set!" [Rexford2000, Kurose2003] • Slide annotations: Feldmann et al. 2000 • Shaikh et al. 2002 • Tomography • Fortz et al. 2002
Contributions • Realistic Models: Gamma and log-Normal P( OD Flows(t) | λ(t) ) • Explicit Time Dependence: E( OD Flows(t) | y(t) … y(1) )
Contributions • Informative priors in a Bayesian Dynamical System for an under-constrained problem • Drive our inferences to the correct solution • Get high quality particles • Easy solution for Sparse Traffic
Exploring the OD space • Gibbs sampler with Metropolis steps is able to explore P(Xt|Yt) • We prove irreducibility of the chains [ Gamma, log-Normal ] [Figure: regions of the OD space labeled P(Xt|Yt) > 0 and P(Xt|Yt) = 0]
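A toy version of this exploration can be written as a Metropolis random walk over the OD flows that reproduce the observed link loads exactly. Everything below is an assumption made for illustration: the 3x4 routing matrix, the link loads, the independent Gamma-shaped target, and the single-direction proposal. It is not the authors' Gibbs sampler.

```python
import math
import random

# Hypothetical routing matrix and observed link loads (illustrative only).
A = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]]
Y = [15, 10, 13]
V = [1, -1, -1, 1]  # null-space direction: moving along V keeps A·X = Y

def link_loads(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def log_target(x, shape=2.0, scale=5.0):
    """Unnormalized log-density: independent Gamma(shape, scale) per OD flow.
    Returns -inf outside the region where all flows are positive."""
    if min(x) <= 0:
        return float("-inf")
    return sum((shape - 1.0) * math.log(xi) - xi / scale for xi in x)

def metropolis(x0, n_steps=2000, seed=0):
    """Random-walk Metropolis restricted to the solutions of A·X = Y."""
    rng = random.Random(seed)
    x, samples = list(x0), []
    for _ in range(n_steps):
        step = rng.choice([-1, 1])  # symmetric proposal along V
        prop = [xi + step * vi for xi, vi in zip(x, V)]
        log_ratio = log_target(prop) - log_target(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = prop  # accept; zero-probability proposals are never accepted
        samples.append(list(x))
    return samples

samples = metropolis([5, 10, 8, 2])  # start from a feasible point
print(len({tuple(s) for s in samples}))  # number of distinct states visited
```

In this toy case the feasible set is a single line segment, so unit steps along V can reach every positive solution. In larger networks the zero-probability regions can separate feasible points, which is why an irreducibility argument is needed.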
Non-Deterministic Dynamics • Introduce explicit non-deterministic dynamics (F) on the average OD flows: λ'(t+1) = F'[nxn] · λ'(t) • Diagonal matrix F'[nxn] : F'[i,i] ~ log-Normal • This leads to: $\lambda'(t+1) = F' \cdot \lambda'(t)$, i.e. $e^{\lambda(t+1)} = e^{F} \cdot e^{\lambda(t)}$, so $\lambda(t+1) = F + \lambda(t)$
Better OD Flows in 4 Steps [Figure: four panels numbered 1 to 4]
Immanuel Kant + o(1) In making inferences on non-observable quantities we find the model we look for! Assume a model that reasonably approximates real OD flows, and of course it does not hurt to have a prior opinion about it …