This presentation by Joe and Wenjie Jiang introduces the concept of tomography and link delay inference using the EM algorithm. It covers topics such as internal link delay inference, basic EM, and includes a simple example. The talk also discusses the motivation behind tomography and its importance in network analysis.
A General Introduction to Tomography & Link Delay Inference with EM Algorithm Presented by Joe, Wenjie Jiang 21/02/2004
Outline of Talk • Why tomography? • Introduction to tomography • Internal Link Delay Inference • Basic EM • A simple example to infer internal link delay using EM algorithm • Conclusion
Terminology: “Tomography” • Brain tomography: access is difficult! • Network tomography: access is difficult! (Vardi 1996)
Why tomography? What is the: • Bandwidth? • Loss rate? • Link delay? • Traffic demand? • Connectivity of links in the network (topology inference)? Path: a connection between two end nodes, consisting of several links. Link: a direct connection with no intermediate routers/hosts.
Motivation • Identify congestion points and performance bottlenecks • Dynamic routing • Optimized service providing • Security: detection of anomalous/malicious behavior • Capacity planning
Why tomography - Difficulty • The internal network is decentralized, heterogeneous and unregulated. • Individuals have no incentive to collect and distribute this information freely. • Collecting all statistics imposes an impracticable overhead. • ISPs regard these statistics as highly confidential. • Relaying measurements to the decision-making point consumes bandwidth.
Why tomography - Solution • Widespread internal network monitoring is expensive and infeasible • Edge-based measurement and statistical analysis is practical and scalable
Where are you? • Why tomography? • Introduction to tomography • Internal Link Delay Inference • Basic EM • A simple example to infer internal link delay using EM algorithm • Conclusion
Introduction to tomography • Use a limited number of measurements to infer network (link) performance parameters, assuming a prior model and using: -- Maximum Likelihood Estimation -- Expectation Maximization -- Bayesian inference • Categories of problems: -- Link-level parameter estimation -- Sender-receiver traffic intensity -- Topology inference
Introduction to tomography (2) • Two forms of network tomography: -- link-level metric estimation based on end-to-end traffic measurements (counts of sent/received packets, time delays between sent/received packets) -- path-level (sender-receiver path) traffic intensity estimation based on link-level measurements (counts of packets through nodes) • Passive or active measurements? • Multicast or unicast?
Problem Description • To solve the linear system: $y = A\theta + \epsilon$ • $A$, $\theta$ and $\epsilon$ have special structures. • Goal: to maximize the likelihood function
Problem Description (2) • $A$ = routing matrix (graph) • $\theta$ = packet queuing delays for each link • $y$ = packet delays measured at the edge • $\epsilon$ = noise, inherent randomness in traffic measurements • Statistical likelihood function: $L(\theta) = p(y \mid \theta)$
Problem Description (3) [Figure: a virtual multicast tree with links l1 to l7 and four receivers Y1, Y2, Y3, Y4] Example: Y1 = X1 + X2 + X4
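The relation between link delays and edge measurements can be sketched in a few lines of Python. This is only an illustration: the slide states just the path for Y1 (links 1, 2 and 4), so the other three paths, the tree layout and the per-link delay values below are assumptions, and the noise term is left out.

```python
# Hypothetical 7-link binary tree: link 1 leaves the sender, links 2 and 3
# branch from it, and links 4-7 reach the four receivers. Only the path for
# Y1 (links 1, 2, 4) comes from the slide; the other three are assumed.
PATHS = {
    "Y1": [1, 2, 4],
    "Y2": [1, 2, 5],
    "Y3": [1, 3, 6],
    "Y4": [1, 3, 7],
}
NUM_LINKS = 7


def routing_matrix(paths, num_links):
    """Return A as a list of rows; A[r][l - 1] is 1 if path r uses link l."""
    return [
        [1 if link in links else 0 for link in range(1, num_links + 1)]
        for links in paths.values()
    ]


def path_delays(A, link_delays):
    """Noise-free end-to-end delays y = A x for a single probe."""
    return [sum(a * x for a, x in zip(row, link_delays)) for row in A]


A = routing_matrix(PATHS, NUM_LINKS)
x = [2, 1, 0, 3, 0, 1, 4]  # made-up per-link delays X1..X7
y = path_delays(A, x)      # first entry is Y1 = X1 + X2 + X4
```

Each row of `A` marks the links on one sender-receiver path, which is why the inference problem is hard: four path sums are observed, but seven link delays are unknown.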
Where are you? • Why tomography? • Introduction to tomography • Internal Link Delay Inference • Basic EM • A simple example to infer internal link delay using EM algorithm • Conclusion
Physical Topology Measure end-to-end (from sender to receiver) delays
Logical Topology Logical topology is formed by considering only the branching points in the physical topology Infer the logical link-level queuing delay distributions!
The basic idea of internal link delay tomography • Send a back-to-back packet pair from a sender, each packet heading to a different receiver • Use the fact that delays are highly correlated on shared links • The queuing delay difference between the two end-to-end measurements can be attributed to the unshared links
Delay Estimation • Measure end-to-end delay of packet pairs • Both packets experience the same delay on link 1 • d2 = dmin = 0 and d3 > 0: extra delay on link 3!
Packet-pair measurements • Key assumptions: -- Fixed, known routes -- Temporal independence -- Spatial independence -- Packet-pair delays are identical on shared links • N delay measurements in all
Parameters • αi = parameter of delay pmf on link i [Figure: logical tree with links labeled α1 to α9]
Link delay model • αi = delay pmf on link i • Link delay model could be multinomial • Quantized delay model: delay ∈ {0, 1, 2, 3, …, L, ∞} • αi = {αi0, αi1, αi2, …, αiL, αi∞} • αij = P{delay(link i) = j} • αi0 + αi1 + αi2 + … + αiL + αi∞ = 1
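A minimal sketch of the quantized model in Python follows. The slide does not say how a measured delay is mapped to a level, so the bin width, the rule that anything above level L becomes ∞, and both function names are assumptions made for illustration.

```python
import math


def quantize_delay(delay, bin_width, max_level):
    """Map a non-negative delay to a level in {0, 1, ..., max_level, inf}.

    Assumption: levels are equal-width bins, and any delay beyond the last
    bin is reported as infinity.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    level = int(delay // bin_width)
    return level if level <= max_level else math.inf


def is_valid_pmf(alpha, tol=1e-9):
    """Check that alpha_i0, ..., alpha_iL, alpha_i_inf form a pmf."""
    return all(p >= 0 for p in alpha) and abs(sum(alpha) - 1.0) <= tol


level = quantize_delay(2.5, bin_width=1.0, max_level=3)
lost = quantize_delay(10.0, bin_width=1.0, max_level=3)
ok = is_valid_pmf([0.5, 0.2, 0.2, 0.05, 0.05])  # L = 3 plus the inf bin
```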
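Goal • $p(y_n \mid \alpha)$ is the probability of the n-th measurement • $p(y \mid \alpha) = \prod_{n=1}^{N} p(y_n \mid \alpha)$ is the probability of all measurements • Our goal: find $\hat{\alpha} = \arg\max_{\alpha} p(y \mid \alpha)$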
Where are you? • Why tomography? • Introduction to tomography • Internal Link Delay Inference • Basic EM • A simple example to infer internal link delay using EM algorithm • Conclusion
Review of MLE (Maximum Likelihood Estimation) • The basic idea of MLE: assume that the outcome actually observed is the one that was most likely to occur -- The MLE of θ is the value that makes the observed sample most probable: $\hat{\theta} = \arg\max_{\theta} L(\theta \mid X)$, with $L(\theta \mid X) = p(X \mid \theta) = \prod_{n=1}^{N} p(x_n \mid \theta)$ • Note we assume X = {x1, …, xN} to be i.i.d. • The solution could be easy or hard depending on the form of p(X|θ) • e.g. if p(x|θ) is a single Gaussian with θ = (μ, σ²), we can set the derivative of log L(θ|X) to zero and solve it directly.
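For the Gaussian case, setting the derivative of the log-likelihood to zero gives the sample mean and the (biased, divide-by-N) sample variance. A small sketch, with made-up sample values and a hypothetical function name:

```python
def gaussian_mle(samples):
    """Closed-form MLE (mu, sigma^2) for i.i.d. Gaussian samples.

    The variance estimate divides by N, not N - 1, because that is the
    value that maximizes the likelihood.
    """
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var


mu_hat, var_hat = gaussian_mle([2.0, 4.0, 4.0, 6.0])  # made-up data
```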
Complete Data • The sample X = {x1, …, xN} together with the missing (or latent) data Y is called the complete data. • The complete likelihood is $L(\theta \mid X, Y) = p(X, Y \mid \theta)$, where p(x, y|θ) is the joint density of X and Y given the parameter θ. • The complete log-likelihood is $\log L(\theta \mid X, Y) = \log p(X, Y \mid \theta)$
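Complete MLE • By the definition of conditional density, $p(x, y \mid \theta) = p(y \mid x, \theta)\, p(x \mid \theta)$, where p(y|x,θ) is the conditional density of Y given X = x and θ • The complete MLE: $\hat{\theta} = \arg\max_{\theta} \log p(x, y \mid \theta)$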
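Basic idea of EM • Given X = x and $\theta = \theta^{t-1}$, where $\theta^{t-1}$ is the current estimate of the unknown parameters • $\log p(x, Y \mid \theta)$ is a function of Y whose unique best Mean Squared Error (MSE) predictor is $Q(\theta, \theta^{t-1}) = E[\log p(x, Y \mid \theta) \mid X = x, \theta^{t-1}]$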
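The magic of EM • The direct MLE of $\log p(x \mid \theta)$ is relatively hard to solve • But the MLE of the complete log-likelihood $\log p(x, y \mid \theta)$ is relatively easier to obtain • Since $\log p(x, y \mid \theta)$ is a function of x and y (and y is hidden), we take its expectation over y given x and $\theta^{t-1}$ • So: E-step: $Q(\theta, \theta^{t-1}) = E[\log p(x, Y \mid \theta) \mid X = x, \theta^{t-1}]$ M-step: $\theta^{t} = \arg\max_{\theta} Q(\theta, \theta^{t-1})$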
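Why maximizing the expected complete log-likelihood helps can be seen from the standard decomposition below. It is a textbook argument rather than something stated on the slides; $Q(\theta, \theta') = E[\log p(x, Y \mid \theta) \mid x, \theta']$ is the E-step quantity and $H$ is an auxiliary symbol introduced here.

```latex
\begin{align*}
\log p(x \mid \theta)
  &= \log p(x, y \mid \theta) - \log p(y \mid x, \theta)
     && \text{(conditional density)} \\
  &= Q(\theta, \theta') - H(\theta, \theta'),
     \quad H(\theta, \theta') = E\big[\log p(Y \mid x, \theta) \mid x, \theta'\big]
     && \text{(expectation over } Y \sim p(\cdot \mid x, \theta')) \\
\log p(x \mid \theta) - \log p(x \mid \theta')
  &= \big[Q(\theta, \theta') - Q(\theta', \theta')\big]
   + \big[H(\theta', \theta') - H(\theta, \theta')\big] \\
  &\ge Q(\theta, \theta') - Q(\theta', \theta')
     && \text{(Gibbs' inequality: } H(\theta', \theta') \ge H(\theta, \theta'))
\end{align*}
```

So any $\theta$ that increases $Q$ over its value at $\theta' = \theta^{t-1}$ cannot decrease the observed-data log-likelihood.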
Where are you? • Why tomography? • Introduction to tomography • Internal Link Delay Inference • Basic EM • A simple example to infer internal link delay using EM algorithm • Conclusion
EM in link delay inference Note: here x and y have the opposite meaning from the EM review above: x denotes the hidden link delays and y the observed end-to-end delays. [Figure: logical tree with link delays x1 to x9 and pmf parameters α1 to α9]
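EM in link delay inference (2) • Complete data Z = (X, Y) • The complete data log-likelihood: $\log P_{\alpha}[X, Y] = \log P_{\alpha}[Y \mid X] + \log P_{\alpha}[X] = \sum_{i} \sum_{j} m_{i,j} \log \alpha_{i,j} + C$ • $P_{\alpha}[Y \mid X]$ does not depend on α (it contributes only the constant C) • $m_{i,j}$ is the total number of packets that experience delay j on link i over the N measurements.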
EM in link delay inference (3) The MLE of α would be $\hat{\alpha}_{i,j} = m_{i,j} / \sum_{k} m_{i,k}$
EM in link delay inference (4) The MLE $\hat{\alpha}_i = m_i / N$ is the relative frequency of event i. A simple example: toss a die N times, with P(result is i) = αi (i = 1, 2, …, 6) and mi = the number of times we see result i.
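The die example translates directly into counting. The sketch below uses a made-up sequence of tosses and a hypothetical helper name; it simply divides each count by the number of tosses.

```python
from collections import Counter


def multinomial_mle(outcomes, categories):
    """Relative-frequency MLE: alpha_i = m_i / N for each category i."""
    counts = Counter(outcomes)
    n = len(outcomes)
    return {c: counts[c] / n for c in categories}


tosses = [1, 3, 3, 6, 2, 3, 6, 1, 5, 3]  # made-up die tosses, N = 10
alpha_hat = multinomial_mle(tosses, range(1, 7))
```

Faces that never appear (face 4 in this made-up data) get probability zero, which is a known limitation of the plain MLE on small samples.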
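EM in link delay inference (5) • We notice that $Q(\alpha, \alpha^{t-1}) = E[\log P_{\alpha}[X, Y] \mid Y = y, \alpha^{t-1}]$ has the same form as the complete data log-likelihood, the only difference being that $m_{i,j}$ is replaced by its conditional expectation $\hat{m}_{i,j} = E[m_{i,j} \mid y, \alpha^{t-1}]$ • So the MLE: $\alpha^{t}_{i,j} = \hat{m}_{i,j} / \sum_{k} \hat{m}_{i,k}$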
EM in link delay inference (6) Probability Propagation
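A simple example • Delay on each link falls into {0, 1, 2, 3} • αij = P{delay(link i) = j} [Figure: two-receiver tree with nodes 0 to 3; link delay x1 on the shared link from node 0 to node 1, link delays x2 and x3 on the links to nodes 2 and 3, end-to-end delays y1 and y2 measured at the receivers]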
A simple example (2) Suppose there are 5 measurements: {(3,2), (4,2), (6,5), (0,0), (4,1)} [Figure: two-receiver tree with nodes 0 to 3, link delays x1, x2, x3 and measurements y1, y2]
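A simple example (3) Bayes formula [Figure: two-receiver tree with nodes 0 to 3, link delays x1, x2, x3 and measurements y1, y2]
A simple example (4) [Figure: two-receiver tree with nodes 0 to 3, link delays x1, x2, x3 and measurements y1, y2]
A simple example (5) Similarly: [Figure: two-receiver tree with nodes 0 to 3, link delays x1, x2, x3 and measurements y1, y2]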
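The Bayes step for this tree can be sketched as follows. The formulas on these slides did not survive, so this is a reconstruction under stated assumptions: y1 = x1 + x2 and y2 = x1 + x3 (the slides do not say which receiver is which), link delays are independent, and the starting pmfs are uniform. The function name is hypothetical.

```python
from fractions import Fraction

DELAYS = range(4)  # each link delay falls into {0, 1, 2, 3}


def posterior_x1(y1, y2, alpha):
    """P(x1 = j | y1, y2) by Bayes' formula.

    Assumes y1 = x1 + x2 and y2 = x1 + x3, so fixing x1 = j determines
    x2 = y1 - j and x3 = y2 - j. alpha[i][j] = P(delay on link i+1 = j).
    """
    weights = {}
    for j in DELAYS:
        x2, x3 = y1 - j, y2 - j
        if x2 in DELAYS and x3 in DELAYS:  # keep only feasible splits
            weights[j] = alpha[0][j] * alpha[1][x2] * alpha[2][x3]
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


uniform = [[Fraction(1, 4)] * 4 for _ in range(3)]  # assumed starting point
post_32 = posterior_x1(3, 2, uniform)
post_41 = posterior_x1(4, 1, uniform)
```

With uniform pmfs, the measurement (3,2) leaves x1 equally likely to be 0, 1 or 2, while (4,1) pins down x1 = 1 because x2 cannot exceed 3 and x3 cannot be negative.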
A simple example (6) mi,j computed in the first iteration.
A simple example (7) The physical meaning of αi,0: the number of packets that experience delay 0 on link i divided by the total number of packets that travel through link i.
A simple example (8) αi,j computed in the first iteration
A simple example (9) Iteration: repeat the E-step and M-step until a termination criterion is satisfied. After 6 iterations, αi,j converges to a fixed value.
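The whole iteration for the two-receiver tree fits in a short Python sketch. It is not the presenters' code: it assumes y1 = x1 + x2 and y2 = x1 + x3, independent link delays, a uniform starting pmf, and a fixed iteration count in place of a convergence test; the function names are hypothetical.

```python
MEASUREMENTS = [(3, 2), (4, 2), (6, 5), (0, 0), (4, 1)]  # from the slides
NUM_LEVELS = 4  # link delays fall into {0, 1, 2, 3}


def em_step(alpha, measurements):
    """One E-step plus M-step; alpha[i][j] = P(delay on link i+1 = j)."""
    # E-step: expected counts m[i][j] of delay j on link i+1.
    m = [[0.0] * NUM_LEVELS for _ in range(3)]
    for y1, y2 in measurements:
        weights = {}
        for j in range(NUM_LEVELS):  # candidate shared-link delay x1 = j
            x2, x3 = y1 - j, y2 - j
            if 0 <= x2 < NUM_LEVELS and 0 <= x3 < NUM_LEVELS:
                weights[j] = alpha[0][j] * alpha[1][x2] * alpha[2][x3]
        total = sum(weights.values())
        for j, w in weights.items():
            p = w / total            # posterior P(x1 = j | y1, y2)
            m[0][j] += p
            m[1][y1 - j] += p
            m[2][y2 - j] += p
    # M-step: every probe crosses every link, so divide by N.
    n = len(measurements)
    return [[count / n for count in row] for row in m]


def run_em(measurements, iterations):
    """Start from uniform pmfs (an assumption) and iterate EM."""
    alpha = [[1.0 / NUM_LEVELS] * NUM_LEVELS for _ in range(3)]
    for _ in range(iterations):
        alpha = em_step(alpha, measurements)
    return alpha


alpha_first = run_em(MEASUREMENTS, 1)
alpha_six = run_em(MEASUREMENTS, 6)
```

Under these assumptions the first iteration gives link 1 the pmf (4/15, 11/30, 1/6, 1/5). The estimates after later iterations depend on the starting point, so they may differ from the values shown in the original slides.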
A simple example (9) Measurements: {(3,2), (4,2), (6,5), (0,0), (4,1)} [Figure: two-receiver tree with nodes 0 to 3, link delays x1, x2, x3 and measurements y1, y2]