This project presentation discusses the end-host route selection problem in the CHEETAH networking solution, providing both model-based and measurement-based solutions. The goal is to achieve high-speed, rate-guaranteed end-to-end circuits with call-by-call based bandwidth sharing. The presentation concludes with future work and potential improvements.
Master’s Project Presentation
End-host Route Selection in the CHEETAH Networking Solution
Zhanxiang Huang, 05/01/2006
Advisor: Malathi Veeraraghavan
Acknowledgement: This work was carried out under the sponsorship of NSF ITR-0312376, NSF ANI-0335190, NSF ANI-0087487, and DOE DE-FG02-04ER25640 grants.
Outline • CHEETAH project overview • End-host route selection problem • Model-based solution • Measurement-based solution • Conclusion and future work
Circuit-switched High-speed End-to-End Transport ArcHitecture (CHEETAH)
Goal: high-speed, rate-guaranteed end-to-end circuits with call-by-call-based bandwidth sharing.
[Figure: options for an end-to-end connection. Telephony network: 64 kbps circuits. Connectionless best-effort Internet: congestion, delay, jitter, loss. Long-term leased line: under-utilized and expensive.]
CHEETAH Applications
• Applications:
  • video telephony
  • high-speed file transfer
  • remote visualization
• Especially relevant to the eScience community, e.g. the Terascale Supernova Initiative (TSI) project.
Current CHEETAH Network
[Figure: diagram of the current CHEETAH high-speed network. Labels recoverable from the slide: SN16000 switches, each with a control card (signaling engine), OC192 cards, and GbE or GbE/10GbE cards; OC-192 links; site labels CUNY, UVA, NCSU, NC, Atlanta, ORNL, and GTech; a Cray X-1; end-host software; dynamic signaling scheme.]
CHEETAH End-host Software Architecture
[Figure: two end-hosts, each running an application over TCP/IP and the CHEETAH software (OCS Client, Routing Decision, RSVP-TE Module, C-TCP). NIC I of each host connects to the Internet and NIC II connects to the CHEETAH network.]
• OCS Client: checks Optical Connection Service availability.
• Routing Decision: chooses between the circuit and the Internet path for each file transfer.
• RSVP-TE Module: provisions circuits dynamically.
• C-TCP: transport layer protocol optimized for circuits.
Circuit or Internet Path?
• Circuit setup requests may be denied.
• The choice depends on the data transfer delays on the two paths.
An extreme example: transfer a 1-KByte file using TCP.
• Internet (best-effort path): round trip time = 24 ms, bottleneck link rate = 100 Mbps. Internet transfer delay is about 100 ms.
• CHEETAH network (circuit): round trip time = 8 ms, circuit rate = 1 Gbps, setup delay = 5 seconds. Circuit transfer delay is about 5.1 seconds.
What Determines Data Transfer Delays?
• Over paths:
  • Circuit:
    • Circuit rate
    • Round trip time
    • Setup delay
  • Internet:
    • Round trip time
    • Bottleneck link rate
    • Packet loss rate
• At end-hosts:
  • Transport layer protocol and parameter settings
  • OS process scheduling
  • Hard disk throughput
How to Estimate Data Transfer Delays? • Model-based solution • Construct mathematical models for computing file transfer delays over the circuit and Internet paths. • Measurement-based solution • Estimate file transfer delays based on delay measurements of past file transfers.
Model-based Solution
• Modeling TCP delay over the Internet path
  • TCP Reno delay model [UMass98]
• Modeling delay over the CHEETAH circuit
  • Let Pb be the call blocking probability.
  • Average delay over circuit is [equation not preserved in the slide text]
Inputs to Delay Models
Inputs to TCP Reno delay model:
• File size
• Bottleneck link rate
• Round trip time
• Packet loss rate
• Initial congestion window size
• Sender and receiver buffer sizes
Inputs to circuit delay model:
• File size
• Circuit rate
• Round trip time over the circuit path
• Round trip time over the signaling path
• Call processing delay at each switch
• Signaling engine call load
• Number of switches on the path
• Call blocking probability
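As a rough illustration of how the TCP inputs above could feed a delay estimate, the sketch below uses the steady-state Reno throughput approximation from [UMass98] and divides the file size by the resulting rate. This is not the delay model used in the project: it ignores connection setup and slow start, treats the sender and receiver buffers as a single `buffer_bytes` limit, and the retransmission timeout `t0_s`, the packets-per-ACK value `b`, and `mss_bytes` are assumed defaults.

```python
import math


def reno_throughput_pps(rtt_s, loss_rate, w_max_pkts, t0_s=1.0, b=2):
    """Approximate steady-state TCP Reno throughput in packets per second."""
    window_limited = w_max_pkts / rtt_s
    if loss_rate <= 0:
        # No loss: the rate is limited only by the maximum window.
        return window_limited
    p = loss_rate
    denom = (rtt_s * math.sqrt(2 * b * p / 3)
             + t0_s * min(1.0, 3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return min(window_limited, 1.0 / denom)


def estimate_internet_delay_s(file_bytes, bottleneck_bps, rtt_s, loss_rate,
                              buffer_bytes, mss_bytes=1460):
    """Hypothetical helper: file size divided by the estimated sustained rate."""
    w_max_pkts = buffer_bytes / mss_bytes
    rate_bps = reno_throughput_pps(rtt_s, loss_rate, w_max_pkts) * mss_bytes * 8
    rate_bps = min(rate_bps, bottleneck_bps)  # cannot exceed the bottleneck link
    return file_bytes * 8 / rate_bps
```

For short files the slow-start phase dominates, so an estimate of this kind is only meaningful for large transfers.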
Limitations of the Model-based Solution
• Packet loss rate is difficult to measure. (Tools that I tested include Sting, iperf, ping, and badabing.)
• The same holds for call blocking probability and signaling engine call load.
• Many TCP variants are emerging, but there are no delay models for them yet.
  • e.g. BIC-TCP has been included in Linux kernel 2.6 but has not been modeled yet.
Measurement-based Solution
• Assumptions
  • Fixed circuit rates, e.g. 1 Gbps, 100 Mbps, …
  • The number of destinations with which an end-host typically communicates is not large.
  • Internet traffic has repeating patterns over time, which means that during a specific time period, round trip time, packet loss rate, and call blocking probability are likely to be the same.
[Figure: delay versus file size for the Internet and circuit paths; the two curves intersect at the crossover file size.]
Idea: Discretize time and file size. In each time slot, for each destination and each circuit rate, measure the delays of file transfers over both paths to find the crossover file size.
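The idea above amounts to a lookup table of crossover file sizes. A minimal sketch follows, assuming a hypothetical key of (destination, circuit rate, weekday, hour); the slot granularity, the `SlotKey` type, and the fallback to the Internet path when no entry exists are all assumptions, not details taken from the slides.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SlotKey:
    dest_ip: str
    circuit_rate_bps: int
    weekday: int  # 0 = Monday ... 6 = Sunday
    hour: int     # 0..23, assumed one-hour time slots


def slot_key(dest_ip, circuit_rate_bps, when):
    """Map a query time onto its discretized time slot."""
    return SlotKey(dest_ip, circuit_rate_bps, when.weekday(), when.hour)


def choose_path(table, dest_ip, circuit_rate_bps, file_bytes, when):
    """Return 'circuit' for files above the stored crossover size."""
    crossover = table.get(slot_key(dest_ip, circuit_rate_bps, when))
    if crossover is None:
        return "internet"  # assumption: no measurement yet, use best-effort path
    return "circuit" if file_bytes > crossover else "internet"


# Usage with a made-up crossover value of 70 MB for the Sunday 02:00 slot.
sunday_2am = datetime(2006, 4, 30, 2, 0)
table = {slot_key("128.109.34.22", 10**9, sunday_2am): 70e6}
decision = choose_path(table, "128.109.34.22", 10**9, 10**9, sunday_2am)
```

Keeping one crossover value per slot keeps each query a single dictionary lookup, which is consistent with the low query delay the approach aims for.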
Active and Passive Measurements • Active measurements • Traffic is injected into the network explicitly for the purpose of obtaining measurements. • Passive measurements • Data is collected under normal network usage.
A Best-case Active-measurement Experiment
• Best-case means that the packet loss rate and the call blocking probability are both zero.
• TCP buffers are set to the bandwidth-delay product values.
• Drawback: significant measurement traffic overhead.
Active Measurements
Delays on the Internet path and the circuit are random variables, DI and DC.
1. Find an interval (min, max) that contains the crossover file size;
2. Measure delays on both paths for file size mid = (min+max)/2;
3. If |E(DI)-E(DC)| < e, crossover = mid;
4. If E(DI) > E(DC), max = mid;
5. If E(DI) < E(DC), min = mid;
6. Go to 2.
[Figure: delay versus file size for the Internet and circuit paths, with min, mid, and max marked on the file-size axis around the crossover.]
Let M be the initial max file size and N be the initial min file size. Traffic size = O(M*log(M-N)).
Drawback: measurement traffic overhead.
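The search steps on this slide are a bisection on file size. A minimal sketch, assuming two hypothetical callables `measure_internet` and `measure_circuit` that return the mean measured delay in seconds for a given file size in bytes; the iteration cap `max_iter` is an added safeguard, not part of the slide.

```python
def find_crossover(measure_internet, measure_circuit, lo, hi, eps, max_iter=64):
    """Bisect on file size until the mean delays differ by less than eps."""
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        diff = measure_internet(mid) - measure_circuit(mid)
        if abs(diff) < eps:
            return mid
        if diff > 0:
            hi = mid  # Internet is slower at mid: crossover lies below mid
        else:
            lo = mid  # circuit is slower at mid: crossover lies above mid
    return (lo + hi) / 2


# Usage with toy delay functions (illustrative only, not measured data).
def toy_internet(size_bytes):
    return 0.024 + size_bytes * 8 / 100e6


def toy_circuit(size_bytes):
    return 5.0 + 0.008 + size_bytes * 8 / 1e9


crossover = find_crossover(toy_internet, toy_circuit, 0.0, 1e9, eps=1e-3)
```

In a real deployment each call to a measurement function is an actual file transfer, which is where the traffic overhead noted on the slide comes from.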
Passive Measurements
• Initialize (min, max) with (0, +inf).
• If file size < min, choose the Internet;
• If file size > max, choose the circuit;
• If min <= file size <= max, choose each path with probability ½. Record the data transfer delays.
• Once there are sufficient records to compute Pr(DI-DC>0) for a file size in (min, max), adjust min or max based on Pr(DI-DC>0).
[Figure: p = Pr(DI-DC>0) versus file size, with p = 1/2 at the crossover and min and max on either side of it.]
(Note that min and max are file sizes taken from application queries, and that DI and DC are assumed to follow normal distributions.)
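The selection rule above can be sketched as follows. The slide does not state how min and max are adjusted, so the `threshold` rule in `adjust_interval` is an assumption made for illustration: the interval shrinks only when the recorded delays favor one path with high probability.

```python
import math
import random


def choose_path_passive(file_bytes, lo, hi, rng=random):
    """Pick a path; inside (lo, hi) choose each path with probability 1/2."""
    if file_bytes < lo:
        return "internet"
    if file_bytes > hi:
        return "circuit"
    return "internet" if rng.random() < 0.5 else "circuit"


def adjust_interval(lo, hi, file_bytes, p_internet_slower, threshold=0.9):
    """Narrow (lo, hi) using p = Pr(DI - DC > 0) estimated for file_bytes.

    Assumed rule: a clear circuit win lowers max, a clear Internet win
    raises min, and anything in between leaves the interval unchanged.
    """
    if p_internet_slower >= threshold:
        hi = min(hi, file_bytes)
    elif p_internet_slower <= 1 - threshold:
        lo = max(lo, file_bytes)
    return lo, hi


# Usage: start from (0, +inf) and tighten as records accumulate.
interval = (0.0, math.inf)
interval = adjust_interval(*interval, file_bytes=5e6, p_internet_slower=0.02)
interval = adjust_interval(*interval, file_bytes=5e8, p_internet_slower=0.97)
```

Because no extra traffic is injected, the interval converges only as fast as applications happen to request file sizes near the crossover.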
Hybrid Measurements
• Fast startup
  • Find the bottleneck link rate of the Internet path and the circuit setup delay through either passive or active measurement.
  • Solve the equation for “file_size”.
  • Initialize (min, max) with (file_size/2, file_size*2).
• Use active measurements when initiated by administrators.
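The equation to be solved is not reproduced in the slide text. One plausible form, assumed here purely for illustration, equates the transmission time on the Internet path with the setup delay plus the transmission time on the circuit, ignoring round trip times and losses: file_size/bottleneck_rate = setup_delay + file_size/circuit_rate. The function name and parameters below are hypothetical.

```python
def initial_interval(bottleneck_bps, circuit_rate_bps, setup_delay_s):
    """Return (min, max) in bytes around an assumed first-order crossover."""
    if circuit_rate_bps <= bottleneck_bps:
        # Under this simplified model the circuit can never win.
        raise ValueError("circuit rate must exceed the bottleneck link rate")
    size_bits = setup_delay_s / (1 / bottleneck_bps - 1 / circuit_rate_bps)
    size_bytes = size_bits / 8
    return size_bytes / 2, size_bytes * 2


# Usage with the example figures quoted earlier in the slides:
# 100 Mbps bottleneck link, 1 Gbps circuit, 5 s setup delay.
lo, hi = initial_interval(100e6, 1e9, 5.0)
```

Starting from a bracket like this, rather than from (0, +inf), is what lets the passive procedure converge with fewer recorded transfers.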
Interaction Between CHEETAH Software Modules and Applications
[Figure: numbered message sequence (steps 1 to 7) between the application and the CHEETAH software modules.]
Evaluation
• Experiment setup
  • The Routing Decision server and an application run on a Linux-2.6 box with 2 Xeon 2.8 GHz CPUs and 1 GB memory.
  • The application queries with parameters <128.109.34.22, 1 Gbps circuit rate, 1 GByte file size, time slot 02:00 Sunday>. The database has an entry corresponding to this IP and time slot.
  • Internet path: bottleneck link rate = 100 Mbps; round trip time = 24 ms. Circuit: round trip time = 8 ms.
• Delay
  • An application submits 100 queries.
  • Mean query delay = 0.0055 sec < round trip time << 5 sec (the average setup delay).
  • Query delay standard deviation = 2.3608e-4 sec (< 0.3 ms).
Conclusion and Future Work
• Conclusion
  • The measurement-based solution is better than the model-based solution:
    • Adaptive to new TCP variants
    • Adaptive to traffic pattern changes
    • Adaptive to hardware or software configuration changes
    • Low overhead
• Future work
  • Scalability issues
    • For a computer that communicates with a large number of end-hosts (e.g. a web server), we can separate the RD module from the computer and run a separate RD server for it.
    • For computers in the same LAN and with the same hardware and software configurations, we can create one RD server for the whole LAN.
References
[CHEETAH] M. Veeraraghavan, X. Zheng, H. Lee, M. Gardner, and W. Feng. CHEETAH: Circuit-switched High-speed End-to-End Transport ArcHitecture. In Proc. of Opticomm 2003, Dallas, TX, Oct. 13-17, 2003. Won Best Student Paper Award.
[C-TCP] A. P. Mudambi, X. Zheng, and M. Veeraraghavan. A Transport Protocol for Dedicated End-to-End Circuits. Accepted by ICC 2006.
[UMass98] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. Modeling TCP throughput: A simple model and its empirical validation. In SIGCOMM ’98, September 1998.
How to compute Pr(DI-DC>0)?
• Assume the delays observed on the Internet path and the circuit are normally distributed random variables, DI and DC. Each file size has its own pair of random variables.
[Figure: probability density of DI-DC, centered at E(DI-DC), with 0 marked on the DI-DC axis.]
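Under the normality assumption above, the probability can be computed from the sample means and variances of the recorded delays. The sketch below additionally assumes that DI and DC are independent, so that DI-DC is normal with mean E(DI)-E(DC) and variance Var(DI)+Var(DC); the slide does not state this, and the function name is hypothetical. At least two records per path are needed to estimate a variance.

```python
import math
from statistics import NormalDist, mean, variance


def prob_internet_slower(internet_delays, circuit_delays):
    """Estimate Pr(DI - DC > 0) from recorded delays for one file size."""
    mu = mean(internet_delays) - mean(circuit_delays)
    # Independence assumption: variances of the two paths add.
    sigma = math.sqrt(variance(internet_delays) + variance(circuit_delays))
    if sigma == 0:
        # Degenerate case: all recorded delays are identical on both paths.
        return 1.0 if mu > 0 else (0.5 if mu == 0 else 0.0)
    return 1.0 - NormalDist(mu, sigma).cdf(0.0)


# Usage with made-up delay records in seconds.
p = prob_internet_slower([10.0, 11.0, 12.0], [1.0, 2.0, 3.0])
```

A value of p close to 1 means the circuit is almost always faster for this file size, and a value near 1/2 means the file size is close to the crossover.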
CHEETAH network
[Figure: detailed CHEETAH network topology, by Xuan Zheng, xuan@virginia.edu. Recoverable labels include the sites NYC, CUNY, UVa, WASH, NCSU, MCNC, ORNL, and Atlanta; the nodes cheetah-nc, cheetah-ornl, and cheetah-atl; the hosts Zelda1 to Zelda5 (10.0.0.11 to 10.0.0.15), Wukong (152.48.249.102), Wuneng (152.48.249.103), and Compute-0-0 to Compute-0-4 (152.48.249.2 to 152.48.249.6); equipment such as Force10 E300, Juniper T320, Catalyst 4948, Catalyst 7600, and FastIron FESX448; and link types 1G, GbE, 10GbE, OC192, VLANs, and MPLS tunnels.]
Acronyms
• CHEETAH – Circuit-switched High-speed End-to-End Transport ArcHitecture
• PLR – Packet Loss Rate
• SD – Setup/Teardown Delay
• RTT – Round Trip Time
• AB – Available Bandwidth
• GMPLS – Generalized Multi-Protocol Label Switching
• SONET – Synchronous Optical NETwork
• SDH – Synchronous Digital Hierarchy