This paper explores inferring network topology from external measurements in network tomography. It presents an approach that relies solely on host-based unicast measurements of the correlation between receivers. The study introduces the novel sandwich probing scheme and a stochastic search method for topology identification, along with a likelihood formulation, the maximum likelihood tree, and a Bayesian approach to learning optimal trees. It also includes simulation results and practical implications for network tomography.
Maximum Likelihood Network Topology Identification
Mark Coates (McGill University), Robert Nowak and Rui Castro (Rice University)
DYNAMICS, May 5th, 2003
Network Tomography
• Inferring network topology from “external” end-to-end measurements
• Traceroute requires cooperation of routers, which may not be available in practice
• This paper assumes no internal network cooperation
• Solely host-based unicast measurements
How does it work? The Problem Statement
• Unique sender
• Set of receivers R
How does it work? Information we have
• End-to-end measurements that measure the degree of correlation between receivers
• Associate a metric $\gamma_{i,j}$ with each pair of receivers $i, j \in R$
• Monotonicity property: let $p_i$, $p_j$, $p_k$ be the paths from the sender to $i$, $j$, $k$. If $p_i$ shares more links with $p_j$ than with $p_k$, then $\gamma_{i,j} > \gamma_{i,k}$
An example
• Here $\gamma_{18,19} > \gamma_{i,19}$ for all other $i$
• Examples?
• Simple bottom-up merging algorithms can be used to identify the full logical topology
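The bottom-up merging idea on this slide can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the function name `merge_bottom_up`, the dictionary layout, and the rule that a merged node inherits the average metric of its two children are all assumptions made for the example. It always returns a binary tree written as nested tuples.

```python
from itertools import combinations

def merge_bottom_up(gamma):
    """Greedily merge the most similar pair of nodes until one tree remains.

    gamma maps frozenset({a, b}) to the metric of receivers a and b.
    Returns the tree as nested tuples, e.g. ((1, 2), (3, 4)).
    """
    sim = dict(gamma)
    # Collect the receivers in a fixed order so the result is deterministic.
    nodes = sorted(set().union(*sim))
    while len(nodes) > 1:
        # The pair with the largest metric shares the most links: make them siblings.
        a, b = max(combinations(nodes, 2), key=lambda p: sim[frozenset(p)])
        parent = (a, b)
        nodes.remove(a)
        nodes.remove(b)
        for other in nodes:
            # Assumed update rule: the parent takes the average of its children's metrics.
            sim[frozenset((parent, other))] = 0.5 * (
                sim[frozenset((a, other))] + sim[frozenset((b, other))]
            )
        nodes.append(parent)
    return nodes[0]

# Hypothetical metrics for four receivers: 1 and 2 share the most, then 3 and 4.
example = {frozenset(p): 1.0 for p in combinations([1, 2, 3, 4], 2)}
example[frozenset((1, 2))] = 3.0
example[frozenset((3, 4))] = 2.0
tree = merge_bottom_up(example)
```

With these made-up values, receivers 1 and 2 are merged first, then 3 and 4, and the two subtrees are joined last, so `tree` is `((1, 2), (3, 4))`.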
Two-fold Contribution
• Novel measurement scheme: sandwich probing
  • Each probe: three packets
  • Main idea: a small packet queues behind the large one, inducing extra separation between the small packets on shared links
• A stochastic search method for topology identification
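Sandwich Probing
[Figure: probe packets p1 and p2]
• No cross-traffic
• $\delta_{01}$: queuing delay of p2 on link 01, so $\gamma_{35} = \delta_{01}$
• $\gamma_{ij}$: sum of the $\delta$'s on the links shared by the paths to receivers $i$ and $j$
Sandwich Probing
• $\gamma_{34} = \delta_{01} + \delta_{12}$
• $\gamma_{35} = \delta_{01}$
• More shared queues give a larger $\gamma$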
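The rule that the metric for a receiver pair is the sum of the per-link queuing delays on their shared links can be checked with a short sketch. The tree below (links 0→1, 1→2, 2→3, 2→4, 1→5) is inferred from the two equations on the slides, and the delay values in `delta` are made up purely for illustration.

```python
def path_links(parent, node):
    """Return the set of (parent, child) links on the path from the root to node."""
    links = set()
    while node in parent:
        links.add((parent[node], node))
        node = parent[node]
    return links

def pair_metric(parent, delta, i, j):
    """Sum the per-link delays over the links shared by the paths to i and j."""
    shared = path_links(parent, i) & path_links(parent, j)
    return sum(delta[link] for link in shared)

# Assumed topology: sender 0, internal nodes 1 and 2, receivers 3, 4 and 5.
parent = {1: 0, 2: 1, 3: 2, 4: 2, 5: 1}
# Hypothetical queuing delays of the trailing small packet on each link.
delta = {(0, 1): 2.0, (1, 2): 1.5, (2, 3): 0.5, (2, 4): 0.5, (1, 5): 0.5}

gamma_34 = pair_metric(parent, delta, 3, 4)  # links 01 and 12 are shared
gamma_35 = pair_metric(parent, delta, 3, 5)  # only link 01 is shared
```

Receivers 3 and 4 share two links while 3 and 5 share one, so `gamma_34` exceeds `gamma_35`, which is the monotonicity property in miniature.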
Advantages over loss- and delay-based metrics
• Probe loss is rare on the Internet, so loss-based metrics require a large number of measurements
• Measuring delay requires clock synchronization
• Here, every measurement contributes
Measurement framework
• Measurement of $\gamma_{ij}$ is contaminated by cross traffic
• Cross traffic: zero-mean effect on the measurement
• Multiple measurements: central limit theorem (CLT)
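A small simulation illustrates why averaging many probes helps when the cross-traffic disturbance is zero-mean. The noise model here (a shifted exponential, deliberately non-Gaussian) and every numeric value are assumptions chosen for the example, not measurements from the study.

```python
import random
import statistics

def averaged_metric(true_gamma, noise_scale, n_probes, rng):
    """Average n_probes noisy measurements and report the standard error."""
    # Zero-mean, skewed noise: an exponential variate shifted by its mean of 1.
    samples = [
        true_gamma + noise_scale * (rng.expovariate(1.0) - 1.0)
        for _ in range(n_probes)
    ]
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / n_probes ** 0.5
    return mean, std_err

rng = random.Random(0)  # fixed seed so the run is repeatable
few = averaged_metric(3.5, 2.0, 10, rng)
many = averaged_metric(3.5, 2.0, 4000, rng)
```

The standard error shrinks roughly as one over the square root of the number of probes, and by the central limit theorem the averaged estimate is approximately normal even though the individual disturbances are not.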
Likelihood Formulation
• Estimated metrics $x$ are randomly distributed according to a density $p$
• $p$ is parameterized by the underlying topology $T$ and the set of true metric values $\gamma$
• When $p(x \mid T, \gamma)$ is viewed as a function of $T$ and $\gamma$, it is called the likelihood of $T$ and $\gamma$
Likelihood Formulation
• Maximum likelihood tree, given the estimated metrics $x$: $T^* = \arg\max_{T \in F} \max_{\gamma \in G} p(x \mid T, \gamma)$
• $F$ denotes the forest of all possible trees
• $G$ denotes the set of all metrics satisfying the monotonicity property
• The maximization involved is formidable
• Brute-force method: for $N = 10$, more than $1.8 \times 10^6$ trees
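Simplifying the problem
• Parameters are chosen to maximize the likelihood of the estimated metrics $x$ for a given tree $T$: $\hat{\gamma}_T = \arg\max_{\gamma \in G} p(x \mid T, \gamma)$
• This gives the very best fit $T$ can provide to the data
• Log-likelihood of $T$: $L(T) = \log p(x \mid T, \hat{\gamma}_T)$
• The maximum likelihood tree is the one in the forest with the largest likelihood value: $T^* = \arg\max_{T \in F} L(T)$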
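For three receivers, the "best fit per tree, then pick the best tree" idea can be worked through by hand. The sketch below assumes independent Gaussian measurement errors with a common known variance and considers only the three binary trees, each identified by its sibling pair. These modelling choices, and the names `profile_log_likelihood` and `ml_tree`, are assumptions for illustration. In a tree where a given pair are siblings, monotonicity forces that pair's metric to be at least the common metric of the other two pairs.

```python
def profile_log_likelihood(x, pair, sigma=1.0):
    """Best-fit Gaussian log-likelihood (up to a constant) for one candidate tree.

    x maps each frozenset pair of receivers to its measured metric; `pair` is
    the sibling pair of the candidate tree. The fitted sibling metric a must
    be >= the fitted metric b shared by the other two pairs.
    """
    a_obs = x[pair]
    others = [value for key, value in x.items() if key != pair]
    b_obs = sum(others) / len(others)
    if a_obs >= b_obs:
        # Constraint inactive: the unconstrained least-squares fit is feasible.
        a_hat, b_hat = a_obs, b_obs
    else:
        # Constraint active: the best feasible fit pools all three measurements.
        a_hat = b_hat = (a_obs + sum(others)) / (1 + len(others))
    sse = (a_obs - a_hat) ** 2 + sum((value - b_hat) ** 2 for value in others)
    return -sse / (2 * sigma ** 2)

def ml_tree(x, sigma=1.0):
    """Return the sibling pair whose tree has the largest profile log-likelihood."""
    return max(x, key=lambda pair: profile_log_likelihood(x, pair, sigma))

# Hypothetical measured metrics for receivers 1, 2 and 3.
x = {frozenset((1, 2)): 3.1, frozenset((1, 3)): 1.2, frozenset((2, 3)): 0.8}
best = ml_tree(x)
```

With these made-up measurements the tree that pairs receivers 1 and 2 fits almost perfectly, while the other two trees must pool all three values and fit poorly, so `best` is the pair {1, 2}.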
Stochastic Search
• Reversible Markov chain Monte Carlo (MCMC) method
• Using the above techniques, the authors devise a rapid search method to find optimal trees
• “Learning using Bayesian statistics”
• Prior and posterior distributions
• Main idea: the posterior distribution gives the region of high-likelihood trees in $F$
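The claim that sampling from the posterior concentrates the search on high-likelihood trees can be illustrated with a plain Metropolis sampler over a handful of candidates. This is a generic textbook sampler, not the authors' method: the five candidate "trees" are just indices, and their log-posterior scores are invented for the example.

```python
import math
import random
from collections import Counter

def metropolis(log_post, n_steps, rng):
    """Sample candidate indices in proportion to exp(log_post[index])."""
    n_states = len(log_post)
    state = rng.randrange(n_states)
    visits = Counter()
    for _ in range(n_steps):
        # Symmetric proposal: pick any candidate uniformly at random.
        proposal = rng.randrange(n_states)
        log_ratio = log_post[proposal] - log_post[state]
        # Accept with probability min(1, posterior ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            state = proposal
        visits[state] += 1
    return visits

# Hypothetical unnormalized log-posterior scores for five candidate trees.
log_post = [-5.0, -3.0, 0.0, -2.0, -6.0]
visits = metropolis(log_post, 5000, random.Random(1))
most_visited = visits.most_common(1)[0][0]
```

The chain spends most of its time on the candidate with the highest score, so the visit counts point to the region worth searching without every candidate being evaluated exhaustively.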
Birth Move (insert node)
[Figure: trees T1 and T2]
Death Move (delete node)
[Figure: trees T2 and T1]
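One plausible reading of the two moves, based only on the slide titles (insert node, delete node), is sketched below. The tree representation (a dictionary from each internal node to its list of children) and the function names `birth` and `death` are assumptions for this example. The birth move groups some children of a node under a new internal node, and the death move removes an internal node and hands its children to its parent.

```python
def birth(children, node, subset, new_node):
    """Insert new_node below node; it adopts the children listed in subset."""
    subset = list(subset)
    if not set(subset) <= set(children[node]):
        raise ValueError("subset must contain only children of node")
    if len(subset) < 2 or len(subset) >= len(children[node]):
        raise ValueError("subset must hold at least two but not all children")
    tree = {key: list(value) for key, value in children.items()}
    tree[node] = [c for c in tree[node] if c not in subset] + [new_node]
    tree[new_node] = subset
    return tree

def death(children, parent, node):
    """Delete internal node; its children are re-attached to parent."""
    tree = {key: list(value) for key, value in children.items()}
    adopted = tree.pop(node)
    tree[parent] = [c for c in tree[parent] if c != node] + adopted
    return tree

# Hypothetical three-receiver star tree.
t1 = {"root": ["a", "b", "c"]}
t2 = birth(t1, "root", ["a", "b"], "n1")  # a and b now share a new parent n1
t1_again = death(t2, "root", "n1")        # undoing the birth restores the star
```

Each move is the inverse of the other, which lets a sampler step back and forth between trees with different numbers of internal nodes.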
ns-2 Simulations
[Figure: simulated topology with a source and nodes 1 to 9]
Simulation results
[Figure: % correct (20 to 100) versus number of probes (4000 to 8000) for MPLT and DBT]
MCMC Algorithm
[Figure: true topology and MCMC topology]
• Can Layer 2 branching points
• High-speed connections can fool tomography
Summary
• Delay-based measurement with no need for clock synchronization
• MCMC algorithm to explore the forest and identify the maximum (penalized) likelihood tree