Transport and Application Layer Approaches to Improve End-to-end Performance in the Internet PhD thesis defense Amit Mondal Committee: Aleksandar Kuzmanovic, Asst. Professor, Northwestern Univ Peter Dinda, Assoc. Professor, Northwestern Univ Yan Chen, Assoc. Professor, Northwestern Univ Jin Li, Principal Researcher, Microsoft Research
Internet: A multiservice IP network • The Internet is a commercial infrastructure used by a diverse set of applications and services • Examples: VoIP, FTP, IPTV, video conferencing, streaming, gaming
Challenges involved… • Applications have end-to-end network performance requirements • Jitter, latency, packet loss, bandwidth, etc. • Original Internet • Best-effort service • No service assurance • TCP ensures only reliable, in-order packet delivery • Destination-based IP routing • Emerging applications ask for “high throughput” or “low delay” • Need to support this new set of emerging applications in the Internet
Application classification based on QoS My focus: • Low-latency interactive TCP applications (Chapter II and III) • Telnet, SSH, network games, e-commerce, etc. • Interactive multimedia services (Chapter IV and V) • Audio/video conferencing, VoIP, streamed multimedia services, etc.
The spectrum of QoS provisioning • Figure: protocol layers (Physical, Data Link, Network, Transport, Application) arranged along a spectrum from infrastructure to endpoint
Research thesis • Despite much work to improve end-to-end performance in the Internet, significant room for improvement remains. In my dissertation, I develop techniques to narrow that gap. • For example, I propose techniques that improve • Response times of short TCP flows by a factor of five in certain scenarios • Median Mean Opinion Score (MOS) of VoIP calls over WiFi by a factor of two
Outline • Chapter I: Introduction • Chapter II: Improving performance of thin-stream TCP applications • Chapter III: Removing exponential backoff from TCP • Chapter IV: Multi-constraint QoS routing framework • Chapter V: Audio/video performance issues: Diagnosis and solutions • Conclusion
Chapter II: Improving thin-stream TCP flows • Upgrading mice to elephants • Figure labels: data packets, “dummy” packets, strict priority, TCP-fair rate; packet switched vs. circuit switched • A. Mondal and A. Kuzmanovic, “When TCP Friendliness Becomes Harmful”, IEEE INFOCOM 2007 • A. Mondal and A. Kuzmanovic, “Upgrading Mice to Elephants: Effects and End-Point Solutions”, IEEE/ACM Transactions on Networking, Volume 18, Issue 2, April 2010
Chapter III: Removing Exponential Backoff from TCP • V. Jacobson, “Congestion Avoidance and Control,” in ACM CCR, 18(4): 314-329, Aug 1988. • Exponential retransmit timer backoff • Implicit packet conservation principle • Response times of short and interactive flows improve by a factor of five in certain scenarios • A. Mondal and A. Kuzmanovic, “Removing Exponential Backoff from TCP”, In ACM SIGCOMM CCR, Volume 38, Number 5, October 2008.
Chapter IV: Multi-constraint QoS routing framework • We design a framework that finds a path under multiple constraints without NP-hard computation • Dijkstra’s algorithm optimizes a single additive metric; finding a path that satisfies several constraints at once is NP-hard in general • Hybrid of a path vector protocol and on-demand route discovery • Using simulation based on real-world data, we demonstrate that our solution is both efficient and scalable • Built a functional prototype using the Click modular router • A. Mondal, P. Sharma, S. Banerjee, and A. Kuzmanovic, “Supporting Application Network Flows with Multiple QoS Constraints”, In IEEE IWQoS 2009
Chapter V: Audio/video performance issues: Diagnosis and solutions • Identify challenges towards high quality audio/video conferencing over the Internet • Understand loss and jitter behavior in shorter time scale and quantify impacts of various network scenarios • Investigate solutions • A. Mondal, R. Cutler, C. Huang, J. Li, and A. Kuzmanovic, “SureCall: Towards Glitch-Free Real-time Audio/Video Conferencing”, In IEEE IWQoS 2010 • A. Mondal, C. Huang, M. Jain, J. Li, and A. Kuzmanovic, “A Case of WiFi Relay: Improving VoIP Quality for WiFi Users”, In IEEE ICC 2010
SureCall platform • A distributed measurement and experiment platform • Understand problems and experiment with solutions • Agents installed on volunteers’ machines • Measurements and experiments driven by masters • SureCall agents are upgradeable without user intervention • Available from http://research.microsoft.com/~chengh/SureCall/SureCall.htm
SureCall measurement • Emulated bidirectional audio/video sessions using UDP • 5 minutes per hour • Audio bitrate: 24 kbps • Video bitrate: 192 kbps • STUN NAT traversal protocol for home users • Detailed packet-level traces collected • Network connectivity close to the clients • ICMP packet pair with TTL=2 • Traceroute to the other endpoint at the beginning and end of each session • Environmental details on client machines • CPU load, network interface type
SureCall deployment • Microsoft global enterprise network • Many residential networks • Current deployment status • 80 unique machines • Enterprise - 32 • Home – 20 • Both – 28 • Enterprise trace and Home trace • Two separate masters (within enterprise network and in public Internet)
SureCall dataset • 4,800 hours of packet traces • 4,100 from enterprise • 700 from home • 1968 unique IP addresses • Enterprise - 1212 • Home - 756 • Trace classification and stratification • Intra-continental vs. inter-continental • Wired vs. wireless • Audio-only vs. audio+video • Trace preprocessing • Clock skew removal • Figure: clock skew in the wild
Jitter computation algorithm • Multiple algorithms to compute jitter • Variance of one-way-delay samples • Time difference between actual packet receiving time and ideal receiving time • Most relevant for multimedia streaming/conferencing with playout buffer
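The second definition on the slide above (actual minus ideal receiving time) can be sketched as follows. This is a minimal illustration, not the SureCall implementation: the function name `playout_jitter`, the fixed sending interval, and anchoring the ideal schedule at the first received packet are all assumptions.

```python
def playout_jitter(arrivals, interval):
    """Per-packet jitter as actual minus ideal receiving time.

    arrivals: list of (sequence_number, receive_time_seconds), in arrival order.
    interval: nominal spacing between consecutive packets at the sender (seconds).
    Assumption: the ideal schedule is anchored at the first received packet.
    """
    if not arrivals:
        return []
    seq0, t0 = arrivals[0]
    jitter = []
    for seq, t in arrivals:
        ideal = t0 + (seq - seq0) * interval  # when the packet "should" arrive
        jitter.append(t - ideal)              # positive means late
    return jitter


# Packet 3 is missing (lost); packet 2 arrives 15 ms late.
trace = [(0, 1.000), (1, 1.020), (2, 1.055), (4, 1.080)]
print(playout_jitter(trace, interval=0.020))
```

A playout buffer of depth B can absorb a packet only if its jitter stays below B, which is why this definition suits conferencing workloads better than a variance of delay samples.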
Jitter in enterprise and residential networks • Figure panels: US-US, wired traces; Inter-continental, wired traces • Residential networks have significantly higher jitter than enterprise networks and are greatly affected by inter-continental links.
Jitter variation across hosts • Figure panels: Home; Enterprise • Jitter variation is much higher in residential networks than in enterprise networks. • The 95th-percentile jitter values are significantly worse than median jitter values in home networks.
Packet loss in residential and enterprise networks • Even well-provisioned enterprise networks can become quite congested at short time scales. • Both enterprise and home networks show a long tail in the loss burst size distribution.
Impact of WiFi connections • Figure panels: Home; Enterprise • In both enterprise and home networks, wireless traces show significantly worse jitter statistics than wired traces. • The degradation due to WiFi is more severe in enterprise scenarios than in home scenarios.
Impact of VPN on performance • Figure panels: Jitter; Loss • A VPN connection causes more degradation than a wireless connection.
Can jitter predict future loss events? • To what extent are loss and jitter correlated, i.e., can an abrupt jitter increase serve as a precursor of network congestion and predict future loss events? • If so, audio/video conferencing applications can take anticipatory action • Loss events preceded by a > 10 ms average increase in end-to-end delay over the last three packets • Enterprise networks: ~82% • Home networks: ~80%
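A rough sketch of how such a statistic could be computed from a per-packet delay trace follows. The slide does not say what the increase is measured against, so the baseline here (the median delay of the whole trace) is an assumption, as are the function name and the trace format.

```python
from statistics import median


def loss_events_with_delay_rise(delays, k=3, threshold_ms=10.0):
    """Count loss events preceded by a delay rise.

    delays: one-way delay in ms per sequence number, None where the packet was lost.
    Returns (flagged, events): events is the number of loss bursts that follow at
    least one received packet; flagged is how many of them were preceded by an
    average delay over the last k received packets exceeding baseline + threshold.
    Assumption: the baseline is the median delay over the whole trace.
    """
    received = [d for d in delays if d is not None]
    if not received:
        return 0, 0
    baseline = median(received)
    flagged = events = 0
    for i, d in enumerate(delays):
        # A loss event starts at a lost packet whose predecessor was received.
        if d is None and i > 0 and delays[i - 1] is not None:
            window = [x for x in delays[max(0, i - k):i] if x is not None]
            events += 1
            if sum(window) / len(window) - baseline > threshold_ms:
                flagged += 1
    return flagged, events


trace = [20, 20, 21, 20, 35, 40, 45, None, None, 20, 20, 20, 21, None, 20]
print(loss_events_with_delay_rise(trace))
```

In the sample trace the first loss burst follows a visible delay ramp while the second does not, so one of the two events is flagged.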
Correlation between loss burst size and jitter • Figure panels: Home; Enterprise • End-to-end delay increases significantly before loss events in both enterprise and home networks. • The increase in end-to-end delay is not a great indicator of loss burst size in enterprise networks.
Network audio diagnostics • Concealed: percent of packets interpolated or extrapolated due to unrecovered packet loss • Stretched: percent of packets stretched via time compression • Classifier operates as follows • Supervised training with ground-truth objectively determined by PESQ score
Audio classifier performance • The classifier achieves a true positive rate > 80% and a false positive rate < 1% for T1 = T2 = 0.07.
WiFi Relay: Improving VoIP Quality for WiFi Users • Large number of WiFi clients in both enterprise and residential networks • 43% of enterprises provide only WiFi connections to their employees • 36% use VoIP over WiFi • WiFi links can significantly degrade VoIP performance • Possible reasons • Dense deployment of APs, overloading of an AP, other wireless devices in the vicinity, etc.
Effectiveness of redundancy • Passive analysis with voice packet replication • Replication ratio r = 2, 3, 4, or 5 • Packet losses can be effectively mitigated using application-layer packet replication
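One hypothetical way to run such a passive analysis on a recorded loss trace is sketched below. The replication scheme is an assumption, not taken from the slides: each payload is carried by r consecutive transmissions, so it is unrecoverable only if all of those transmissions were lost in the original trace.

```python
def residual_loss(lost, r):
    """Fraction of payloads still lost after replaying a trace with replication r.

    lost: list of booleans, True where the transmission at that index was lost.
    r:    replication ratio (r = 1 means no replication).
    Assumption: payload i rides on transmissions i .. i+r-1.
    """
    n = len(lost)
    if n == 0:
        return 0.0
    unrecovered = 0
    for i in range(n):
        copies = lost[i:i + r]  # transmissions that carry payload i
        if all(copies):         # every copy was lost
            unrecovered += 1
    return unrecovered / n


trace = [False, True, True, False, True, False, False, True, True, True]
for r in (1, 2, 3):
    print(r, residual_loss(trace, r))
```

Because the copies are spread over consecutive transmissions, the benefit shrinks when losses arrive in long bursts, which ties back to the long-tailed burst size distribution reported earlier.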
Overhead of replication • Typical audio packet size = 60 bytes • Encapsulated with RTP (12 bytes), UDP (8 bytes), IP (20 bytes), 802.11 MAC (28 bytes), and PHY (20 µs for 802.11g) headers • Without ACK: air time = DIFS + PHY header + (60 + 76 bytes) / 54 Mbps ≈ 70 µs • Replicating audio packets at the application layer causes only a marginal increase in air time
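The air-time arithmetic on this slide can be reproduced with a few lines. The 76-byte overhead and the 20 µs PHY header come from the slide's formula; the DIFS value of 28 µs is an assumption (it corresponds to 802.11g with short slot time), and the function name is illustrative.

```python
def air_time_us(payload_bytes, overhead_bytes=76, rate_mbps=54.0,
                difs_us=28.0, phy_us=20.0):
    """Air time of one unacknowledged frame in microseconds.

    difs_us = 28 is an assumed value (802.11g, short slot time);
    overhead_bytes and phy_us follow the numbers on the slide.
    """
    bits = (payload_bytes + overhead_bytes) * 8
    return difs_us + phy_us + bits / rate_mbps  # Mbps == bits per microsecond


single = air_time_us(60)
print(round(single, 1))                     # close to the slide's ~70 us
print(round(air_time_us(120) - single, 1))  # cost of 60 extra payload bytes
```

Most of the air time is fixed per-frame cost (DIFS and PHY header), so adding payload bytes to a frame is cheap compared with sending an additional frame.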
WiFi relay solution • Nearby wired endpoints as relays • Heavy replication between relays and wireless endpoints • No dedicated infrastructure
Evaluation • Evaluated on SureCall platform • Upgrade SureCall clients to support relay • Simultaneous direct and relayed VoIP calls between each pair of SureCall agents • Apples-to-apples comparison • One-hop overlay (only one wireless endpoint) • Two-hop overlay (both endpoints are wireless) • Relay node selection based on enterprise internal database
Impact of relay on jitter • No dedicated infrastructure, ordinary endpoints as relay nodes • Figure panels: CDF of jitter difference at the 50th percentile; CDF of jitter difference at the 95th percentile • Relay has negligible impact on end-to-end jitter
Improvement with WiFi relay • Mean Opinion Score (MOS) • Calculated from packet loss rate and jitter (Cole et al. CCR’01) • Fixed de-jitter buffer of 100 ms • WiFi relay greatly reduces packet loss • WiFi relay significantly improves VoIP quality for WiFi users
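For readers unfamiliar with this style of MOS estimate, here is a minimal sketch of the reduced E-model as commonly quoted from Cole and Rosenbluth. The slides do not give the codec or the exact parameters used, so the loss-impairment constants below (the values usually cited for G.729a) and the way late packets are folded into the loss rate are assumptions.

```python
import math


def mos_from_r(r):
    """Standard E-model mapping from R-factor to MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)


def estimate_mos(loss_rate, late_rate, one_way_delay_ms, buffer_ms=100.0,
                 g1=11.0, g2=40.0, g3=10.0):
    """Estimate MOS from loss, late arrivals, and delay.

    loss_rate: fraction of packets lost in the network.
    late_rate: fraction of received packets whose jitter exceeds the buffer
               (treated as lost; this is how jitter enters the score).
    g1, g2, g3: codec-dependent constants; the defaults are an assumption.
    """
    d = one_way_delay_ms + buffer_ms                 # mouth-to-ear delay
    e = loss_rate + (1 - loss_rate) * late_rate      # effective loss
    delay_impairment = 0.024 * d + (0.11 * (d - 177.3) if d > 177.3 else 0.0)
    loss_impairment = g1 + g2 * math.log(1 + g3 * e)
    return mos_from_r(94.2 - delay_impairment - loss_impairment)


print(round(estimate_mos(0.00, 0.0, 50), 2))  # clean path
print(round(estimate_mos(0.10, 0.0, 50), 2))  # 10% loss, same delay
```

The logarithmic loss term means the first few percent of loss cost the most quality, which is consistent with a relay that removes most WiFi losses producing a large MOS gain.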
Summary of Chapter V • SureCall, a distributed experimental platform, to address the challenges of audio/video communications over the Internet • Characterized enterprise and residential networks over a wide variety of network scenarios • Classifier that accurately predicts when network issues are most likely to cause audio quality degradation • WiFi relay that significantly improves VoIP quality for WiFi clients
Conclusion • Proposed easily deployable techniques to improve performance of TCP-based interactive applications • Demonstrated that exponential backoff can be altogether removed from TCP without any stability issues • Designed an overlay framework to support multimedia services with multiple QoS constraints • Developed a distributed experimental framework, SureCall, to understand the challenges of IP-based audio/video communications and to rapidly evaluate new protocols
Publications [1] A. Mondal and A. Kuzmanovic, “When TCP Friendliness Becomes Harmful”, In IEEE INFOCOM 2007 [2] A. Mondal and A. Kuzmanovic, “A Poisoning-Resilient TCP Stack”, In IEEE ICNP 2007 [3] A. Mondal and A. Kuzmanovic, “Removing Exponential Backoff from TCP”, In ACM SIGCOMM CCR, Volume 38, Number 5, October 2008 [4] A. Mondal, P. Sharma, S. Banerjee, and A. Kuzmanovic, “Supporting Application Network Flows with Multiple QoS Constraints”, In IEEE IWQoS 2009 [5] A. Kuzmanovic, A. Mondal, S. Floyd, and K.K. Ramakrishnan, “Adding Explicit Congestion Notification (ECN) Capabilities to TCP’s SYN/ACK Packets”, RFC 5562, June 2009 [6] A. Mondal and A. Kuzmanovic, “Upgrading Mice to Elephants: Effects and End-Point Solutions”, In IEEE/ACM Transactions on Networking, Volume 18, Issue 2, April 2010 [7] A. Mondal, R. Cutler, C. Huang, J. Li, and A. Kuzmanovic, “SureCall: Towards Glitch-Free Real-time Audio/Video Conferencing”, In IEEE IWQoS 2010 [8] A. Mondal, C. Huang, M. Jain, J. Li, and A. Kuzmanovic, “A Case of WiFi Relay: Improving VoIP Quality for WiFi Users”, In IEEE ICC 2010 [9] A. Mondal, I. Trestian, Z. Quin, and A. Kuzmanovic, “P2P as CDN (Akamizing BitTorrent)”, Under submission [10] J. Miller, A. Mondal, R. Potharaju, P. Dinda, and A. Kuzmanovic, “Network Monitoring is People: Understanding End-user Perception of Network Problems”, Under submission
QoS and the Internet • QoS Architectures • Integrated Services (IntServ) • Differentiated Services (DiffServ) • Multiprotocol Label Switching (MPLS) • Traffic engineering and constraint-based routing • Key Challenges • Scalability issues in the core • Complex signaling protocols • Deployment overhead • The current Internet still offers only a best-effort service • This motivates investigating easily deployable solutions that improve end-to-end network performance
QoS using transport and application layer techniques without network support • Explicit congestion notification [Floyd’94] • Packet marking and differential dropping [Guo and Matta’01] • Limited transmit [Allman et al.’01] • Service differentiation [Noureddine and Tobagi’02] • Differential congestion notification [Le et al.’04] • TCP smart framing [Mellia et al.’05] • ECN+ [Kuzmanovic’05] • Early retransmit [Allman et al.’06] • TCP SAReno [Yang and Vecinia’02] • PCP [Anderson et al.’06]
Going beyond TCP-fair • Differentiated minRTO • Application-limited flows use reduced minRTO value • Short-term padding with dummy packets • Application data followed by three tiny dummy packets • Diversity approach • Application layer FEC-based approach • The simplest FEC scheme is replication
Why Exponential Backoff? • Jacobson adopted exponential backoff from the classical shared-medium Ethernet protocol • “IP gateway has essentially the same behavior as Ether in a shared-medium network.” • Not true!
Removing exponential backoff from TCP and its implications • Other reasons: no admission control, finite flow size, skewed traffic distribution, etc. • When to resend a packet? • Implicit packet conservation principle • As soon as the retransmission timeout expires • Removing exponential backoff from TCP can only improve end-to-end performance • Implications • Significant improvement of response times for short and interactive TCP flows
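To make the timing difference concrete, the sketch below lists when successive retransmissions of the same segment would fire with and without timer doubling. It is a simplified illustration with a hypothetical function name: it ignores RTO re-estimation and the maximum-RTO cap that real TCP stacks apply.

```python
def retransmit_times(rto, attempts, backoff=True):
    """Times (seconds after the original send) at which retransmissions fire.

    rto:      initial retransmission timeout in seconds.
    attempts: number of consecutive retransmissions of the same segment.
    backoff:  True doubles the timeout after every expiry (classic TCP);
              False resends as soon as the unchanged timeout expires.
    Simplification: no RTO cap and no RTT re-estimation.
    """
    times = []
    elapsed = 0.0
    timeout = rto
    for _ in range(attempts):
        elapsed += timeout
        times.append(elapsed)
        if backoff:
            timeout *= 2
    return times


print(retransmit_times(1.0, 4, backoff=True))   # [1.0, 3.0, 7.0, 15.0]
print(retransmit_times(1.0, 4, backoff=False))  # [1.0, 2.0, 3.0, 4.0]
```

After four consecutive losses the backed-off sender has waited 15 s against 4 s, which is where the response-time gain for short and interactive flows comes from.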
Multiple QoS Constraints • The Internet is evolving towards a global multiservice IP network • Diverse applications and different QoS requirements • Many applications have multiple QoS requirements • Video streaming, VoIP, video conferencing, etc. • Need support for end-to-end QoS guarantees under multiple constraints • Multiple QoS constraints often make the routing problem intractable
QoS provisioning using overlay networks • Build Overlay Backbone • Deploy overlay nodes at strategic locations in the Internet • Provide support for per-flow forwarding • e.g., Anagran Flow Aware Routers • Flow route management architecture • Discover and set up end-to-end paths for individual flows with diverse flow QoS requirements • Monitor end-to-end flow performance to trigger path adaptation
Overlay flow QoS management architecture • Example request: find a path to X with b/w > b, delay < d and loss < l% • Sense local link characteristics • Configure intermediate overlay nodes for per-flow forwarding • Adapt to a different path dynamically when the current path fails to meet QoS parameters • Figure: overlay spanning AS1, AS2, AS3, and AS4; legend: end user, overlay node, physical link, logical link
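A request of this form can be checked against a candidate path by composing per-link measurements. The sketch below is illustrative only: the dictionary keys and function names are hypothetical, and it assumes link losses are independent so that delivery probabilities multiply.

```python
def path_metrics(links):
    """Compose per-link metrics into end-to-end path metrics.

    links: list of dicts with keys "bw" (Mbps), "delay" (ms), "loss" (fraction).
    Bandwidth is a bottleneck (min), delay is additive, and loss composes
    multiplicatively under the assumption of independent link losses.
    """
    bw = min(link["bw"] for link in links)
    delay = sum(link["delay"] for link in links)
    delivered = 1.0
    for link in links:
        delivered *= 1.0 - link["loss"]
    return bw, delay, 1.0 - delivered


def is_feasible(links, min_bw, max_delay, max_loss):
    """True if the path satisfies b/w > b, delay < d and loss < l."""
    bw, delay, loss = path_metrics(links)
    return bw > min_bw and delay < max_delay and loss < max_loss


path = [
    {"bw": 10.0, "delay": 20.0, "loss": 0.01},
    {"bw": 5.0, "delay": 30.0, "loss": 0.02},
]
print(path_metrics(path))
print(is_feasible(path, min_bw=4.0, max_delay=60.0, max_loss=0.05))
```

Checking one path is cheap; the hard part is searching the space of paths, since the three metrics compose differently and cannot be ranked by a single shortest-path pass.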
Contribution • Design a scalable QoS routing protocol that finds a path under multiple constraints • Propose a distributed algorithm for dynamic path adaptation • Evaluate accuracy, efficiency and scalability of the protocol using large-scale simulation and compare with existing approaches • Build a functional prototype using the Click modular router
Design challenges • Multiple QoS metrics • Finding a feasible path under multiple constraints is NP-complete; Dijkstra’s algorithm handles only a single additive metric • Randomized and approximation algorithms • Single composite metric derived from multiple metrics • Paths might not meet individual QoS constraints • Dynamic overlay-link properties • Increase control message overhead
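The weakness of a single composite metric can be shown with a toy example. Everything here is hypothetical: the two candidate paths, the weight of 500 on loss, and the constraint values are chosen only to illustrate how the best composite score can still violate an individual constraint.

```python
def composite(path, loss_weight=500.0):
    """Hypothetical composite cost: delay in ms plus a weighted loss fraction."""
    return path["delay"] + loss_weight * path["loss"]


def meets_constraints(path, max_delay, max_loss):
    """Check each QoS constraint individually."""
    return path["delay"] < max_delay and path["loss"] < max_loss


candidates = {
    "A": {"delay": 40.0, "loss": 0.05},    # fast but lossy
    "B": {"delay": 80.0, "loss": 0.001},   # slower but clean
}

best = min(candidates, key=lambda name: composite(candidates[name]))
print(best, meets_constraints(candidates[best], max_delay=100.0, max_loss=0.01))
```

Path A wins on the composite score yet breaks the 1% loss bound, while path B satisfies both bounds but is never selected. Tuning the weight only moves the problem to a different pair of paths.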