Factor Analysis of Network Flow Throughput Measurements for Inferring Congestion Sharing. Dogu Arifler and Brian L. Evans Eastern Mediterranean University - The University of Texas at Austin European Signal Processing Conference Antalya, Turkey, September 4-8, 2005. http://www.emu.edu.tr.
Factor Analysis of Network Flow Throughput Measurements for Inferring Congestion Sharing
Dogu Arifler and Brian L. Evans
Eastern Mediterranean University - The University of Texas at Austin
European Signal Processing Conference, Antalya, Turkey, September 4-8, 2005
http://www.emu.edu.tr
http://www.ece.utexas.edu
Inference of congested path sharing
• Motivation: Network managers need information about resource sharing in other networks to better plan for services and diagnose performance problems
  • Internet service providers need to diagnose configuration errors and link failures in peer networks
  • Content providers need to balance workload and plan cache placement
• Problem: In general, properties of networks outside one's administrative domain are unknown
  • Little or no information on routing, topology, or link utilizations
• Solution: Network tomography
  • Inferring characteristics of networks from available network traffic measurements
Throughput Correlation
• Throughputs of TCP flows that temporally overlap at a congested resource are correlated
• Removing large- and small-sized flows helps in capturing positive throughput correlations due to resource sharing
[Figure: available capacity over time, with TCP flow 1 and TCP flow 2 and their temporal overlap marked]
[Figure: autoregressive model for available capacity, duration of f1 = 20; throughput correlation plotted against the start time of f2, with high correlation for temporally overlapping flows]
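The overlap effect described on this slide can be illustrated with a small simulation. The sketch below is not the authors' experiment: it assumes a zero-mean AR(1) process for the deviation of available capacity from its mean, and it models each flow's throughput as the average capacity over the flow's lifetime. The coefficient `phi`, the trial count, the burn-in length, and all function names are hypothetical choices; only the flow duration of 20 comes from the slide.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def ar1_series(n_steps, phi, rng):
    """AR(1) series c[t] = phi * c[t-1] + noise (assumed capacity model)."""
    value, series = 0.0, []
    for _ in range(n_steps):
        value = phi * value + rng.gauss(0.0, 1.0)
        series.append(value)
    return series

def mean_over(series, start, duration):
    """Throughput proxy: mean available capacity over the flow's lifetime."""
    window = series[start:start + duration]
    return sum(window) / len(window)

def throughput_correlation(offset, duration=20, phi=0.9,
                           trials=1000, burn_in=100, seed=0):
    """Correlation between throughputs of f1 and f2 when f2 starts
    `offset` steps after f1. Both flows last `duration` steps."""
    rng = random.Random(seed)
    f1, f2 = [], []
    for _ in range(trials):
        # burn_in steps let the AR(1) process forget its zero start value
        capacity = ar1_series(burn_in + offset + duration, phi, rng)
        f1.append(mean_over(capacity, burn_in, duration))
        f2.append(mean_over(capacity, burn_in + offset, duration))
    return pearson(f1, f2)

overlapping = throughput_correlation(offset=5)    # f2 overlaps f1 for 15 steps
disjoint = throughput_correlation(offset=200)     # f2 starts long after f1 ends
print(f"overlapping: {overlapping:.2f}, disjoint: {disjoint:.2f}")
```

Under these assumptions the overlapping pair should show a clearly positive correlation, while the widely separated pair should be close to zero, which matches the qualitative shape the slide describes.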
Measured data: component variances
• Use 4 flow classes: AOL1, AOL2, HotMail1, and Hotmail2
• Filter flow records based on
  • Packets: Discard flows consisting of only 1 packet
  • Duration: Discard flows with duration shorter than 1 second
  • Size: Discard flows with sizes < 8 kB or > 64 kB
• Normalized component variances:
  • 2 significant components with explanatory power of 72% for Dataset2002 and 63% for Dataset2004
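The three filtering rules above can be expressed as a single predicate over flow records. The following is a minimal sketch: the `FlowRecord` fields and function names are hypothetical, and it assumes 1 kB = 1024 bytes, which the slides do not specify.

```python
from dataclasses import dataclass

KB = 1024  # assumption: the slides do not say whether kB means 1000 or 1024 bytes

@dataclass
class FlowRecord:
    """Hypothetical flow record; field names are illustrative only."""
    flow_class: str
    packets: int
    duration_s: float
    size_bytes: int

def keep_flow(flow):
    """Return True if the flow passes all three filters from the slide."""
    if flow.packets <= 1:                      # discard single-packet flows
        return False
    if flow.duration_s < 1.0:                  # discard flows shorter than 1 second
        return False
    if flow.size_bytes < 8 * KB or flow.size_bytes > 64 * KB:
        return False                           # discard sizes < 8 kB or > 64 kB
    return True

def filter_flows(flows):
    """Keep only the flows that pass keep_flow, preserving order."""
    return [f for f in flows if keep_flow(f)]

sample = [
    FlowRecord("AOL1", 12, 3.2, 20 * KB),       # kept
    FlowRecord("AOL2", 1, 0.1, 500),            # dropped: single packet
    FlowRecord("HotMail1", 40, 0.5, 30 * KB),   # dropped: too short
    FlowRecord("Hotmail2", 90, 8.0, 200 * KB),  # dropped: too large
    FlowRecord("Hotmail2", 9, 1.0, 8 * KB),     # kept: exactly on both boundaries
]
kept = filter_flows(sample)
print([f.flow_class for f in kept])
```

The boundary values (exactly 1 second, exactly 8 kB or 64 kB) are kept here because the slide words the rules as strict inequalities.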
Measured data: factor analysis
• Based on 2 significant components, determine factor loadings
• Rotated factor loading estimates:
  • Rows correspond to classes
  • Columns correspond to shared infrastructure
• Estimate 95% bootstrap confidence intervals for loadings to establish accuracy†
• With 95% confidence, we can identify which flow classes share infrastructure!
[Tables: rotated factor loading estimates for Dataset2002 and Dataset2004, with rows AOL1, AOL2, HotMail1, Hotmail2; the loading values are not recoverable from this transcript]
† D. Arifler, Network Tomography Based on Flow Level Measurements, Ph.D. Dissertation, 2004.
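The slide does not say how the bootstrap intervals were computed; the cited dissertation is the reference for that. As a rough illustration of the general idea, the sketch below applies a percentile bootstrap to a simpler stand-in statistic: the throughput correlation between two flow classes, rather than a rotated factor loading. The data are synthetic, and all names and parameters are hypothetical.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def corr_stat(pairs):
    """Statistic to bootstrap: correlation between the two columns."""
    xs, ys = zip(*pairs)
    return pearson(xs, ys)

def bootstrap_ci(samples, statistic, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap: resample observations with replacement,
    recompute the statistic, and read off the empirical quantiles."""
    rng = random.Random(seed)
    n = len(samples)
    estimates = []
    for _ in range(n_boot):
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * n_boot)]
    upper = estimates[min(n_boot - 1, int((1.0 - alpha) * n_boot))]
    return lower, upper

# Synthetic paired throughputs for two classes that share a common component
# (an assumption standing in for a shared congested resource).
rng = random.Random(42)
pairs = []
for _ in range(200):
    shared = rng.gauss(0.0, 1.0)
    pairs.append((shared + rng.gauss(0.0, 0.5), shared + rng.gauss(0.0, 0.5)))

lower, upper = bootstrap_ci(pairs, corr_stat)
print(f"95% bootstrap CI for the correlation: [{lower:.2f}, {upper:.2f}]")
```

An interval that excludes zero is the kind of evidence the slide refers to when it says sharing can be identified with 95% confidence; applying the same resampling to factor loadings would additionally require refitting and rotating the factor model on every resample.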