Carrier-grade vs. Internet VoIP Henning Schulzrinne (with Wenyu Jiang) Columbia University FCC Technical Advisory Council III Washington, DC – October 20, 2003
Overview • Previous talk: interactive communication services • signaling & media • Now focus on overall architecture: • network & service availability • signaling services: SIP, H.323 • supporting services: DNS, DHCP, LDAP, … • network transport • network quality-of-service • packet loss, delay, jitter
Overview (ongoing work, preliminary results, still looking for measurement sites, …) • Service availability • Measurement setup • Measurement results • call success probability • overall network loss • network outages • outage-induced call abortion probability
Service availability • Users do not care about QoS • at least not about packet loss, jitter, delay • rather, they care about service availability: how likely is it that I can place a call and not get interrupted? • availability = MTBF / (MTBF + MTTR) • MTBF = mean time between failures • MTTR = mean time to repair • availability = successful calls / first call attempts • equipment availability: 99.999% (“5 nines”) → about 5 minutes of downtime per year • AT&T (2003): • Sprint IP frame relay SLA: 99.5%
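The availability formula and the "5 nines" figure can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the MTBF/MTTR values in the usage lines are hypothetical, and a year is assumed to be 365.25 days.

```python
# Sketch: availability from MTBF/MTTR and the downtime implied by an
# availability figure. Function names are illustrative, not from the talk.

MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected unavailable minutes per year for a given availability."""
    return (1.0 - avail) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Hypothetical equipment: fails every 999 hours, takes 1 hour to repair.
    print(f"{availability(999, 1):.3f}")
    # "5 nines" from the slide: roughly 5 minutes per year.
    print(f"{downtime_minutes_per_year(0.99999):.2f} min/year")
    # The 99.5% SLA figure from the slide, expressed in hours per year.
    print(f"{downtime_minutes_per_year(0.995) / 60:.1f} h/year")
```

Under these assumptions, a 99.5% SLA allows on the order of 44 hours of downtime per year, against about 5 minutes for "5 nines".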
Availability – PSTN metrics • PSTN metrics (World Bank study): • fault rate • “should be less than 0.2 per main line” • fault clearance (~ MTTR) • “next business day” • call completion rate • during network busy hour • “varies from about 60% - 75%” • dial tone delay
Example PSTN statistics Source: World Bank
Measurement setup • Active measurements • call duration 3 or 7 minutes • UDP packets: • 36 bytes alternating with 72 bytes (FEC) • 40 ms spacing • September 10 to December 6, 2002 • 13,500 call hours
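The probe traffic parameters above imply a fixed packet count per call and a low bit rate. The sketch below works these out. It assumes the 36 and 72 byte figures are UDP payload sizes (the slide does not say whether headers are included) and that the two sizes strictly alternate; all names are illustrative.

```python
# Sketch: packet count and payload bit rate implied by the probe setup.
# Assumption: 36/72 bytes are payload sizes, excluding UDP/IP headers.

PACKET_SPACING_S = 0.040   # 40 ms between packets (from the slide)
PAYLOAD_BYTES = (36, 72)   # alternating sizes; 72-byte packets carry FEC


def packets_per_call(call_minutes: float) -> int:
    """Number of probe packets sent in one direction during a call."""
    return round(call_minutes * 60 / PACKET_SPACING_S)


def mean_payload_rate_bps() -> float:
    """Average payload bit rate when the two sizes strictly alternate."""
    mean_bytes = sum(PAYLOAD_BYTES) / len(PAYLOAD_BYTES)
    return mean_bytes * 8 / PACKET_SPACING_S


if __name__ == "__main__":
    for minutes in (3, 7):
        print(minutes, "min call:", packets_per_call(minutes), "packets")
    print(f"{mean_payload_rate_bps() / 1000:.1f} kb/s payload")
```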
Call success probability • 62,027 calls succeeded, 292 failed → 99.53% availability • roughly constant across I2, I2+, commercial ISPs
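The 99.53% figure follows from the call counts using the earlier definition, availability = successful calls / call attempts. A quick check, with illustrative variable names:

```python
# Sketch: call success probability from the counts on the slide.

def call_availability(succeeded: int, failed: int) -> float:
    """Fraction of call attempts that succeeded."""
    return succeeded / (succeeded + failed)


if __name__ == "__main__":
    avail = call_availability(62_027, 292)
    print(f"availability   = {avail:.2%}")
    print(f"unavailability = {1 - avail:.2%}")
```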
Overall network loss • PSTN: once connected, call usually of good quality • exception: mobile phones • compute periods of time below loss threshold • 5% loss causes degradation for many codecs • other codecs remain acceptable up to 20% loss
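One way to compute "periods of time below a loss threshold" is to split a packet trace into fixed windows and count the windows whose loss rate stays under the threshold. The sketch below is a minimal illustration, not the authors' method: the trace format (one boolean per packet) and the window length are assumptions.

```python
# Sketch: fraction of time windows whose packet loss is below a threshold.
# Assumptions: the trace is one boolean per sent packet (True = received),
# and windows are non-overlapping with a caller-chosen length in packets.
from typing import List, Sequence


def window_loss_rates(received: Sequence[bool], window: int) -> List[float]:
    """Loss rate of each complete, non-overlapping window of packets."""
    rates = []
    for start in range(0, len(received) - window + 1, window):
        chunk = received[start:start + window]
        lost = window - sum(chunk)
        rates.append(lost / window)
    return rates


def fraction_below(rates: Sequence[float], threshold: float) -> float:
    """Share of windows with a loss rate strictly below the threshold."""
    if not rates:
        return 0.0
    return sum(r < threshold for r in rates) / len(rates)


if __name__ == "__main__":
    # Hypothetical trace: 200 packets with one burst of 10 losses.
    trace = [True] * 100 + [False] * 10 + [True] * 90
    rates = window_loss_rates(trace, window=50)
    print(rates)
    print(fraction_below(rates, 0.05))  # share of windows under 5% loss
```

At 40 ms spacing, a 50-packet window corresponds to 2 seconds. The window length is a free choice here, not a value from the talk.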
Network outages • sustained packet losses • arbitrarily defined at 8 packets • far beyond any recoverable loss (FEC, interpolation) • 23% outages • make up significant part of 0.25% unavailability • symmetric: an outage on A→B tends to coincide with one on B→A • spatially correlated: an outage on A→B tends to coincide with one on A→X • not correlated across networks (e.g., I2 and commercial)
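The outage definition can be turned into a simple run-length scan over a packet trace. The sketch below assumes "8 packets" means 8 or more consecutive losses, and that the trace is one boolean per sent packet; both are interpretations rather than details given in the talk.

```python
# Sketch: find outages, read here as runs of >= 8 consecutive lost packets.
# Assumption: trace is one boolean per sent packet (True = received).
from typing import List, Sequence, Tuple

PACKET_SPACING_S = 0.040  # 40 ms spacing from the measurement setup


def find_outages(received: Sequence[bool],
                 min_run: int = 8) -> List[Tuple[int, int]]:
    """Return (start_index, length) for each loss run of at least min_run."""
    outages = []
    run_start = None
    for i, ok in enumerate(received):
        if not ok:
            if run_start is None:
                run_start = i          # a new loss run begins
        else:
            if run_start is not None and i - run_start >= min_run:
                outages.append((run_start, i - run_start))
            run_start = None           # the run, if any, has ended
    # A loss run may extend to the end of the trace.
    if run_start is not None and len(received) - run_start >= min_run:
        outages.append((run_start, len(received) - run_start))
    return outages


if __name__ == "__main__":
    # Hypothetical trace: a short 3-packet loss, then a 10-packet outage.
    trace = ([True] * 5 + [False] * 3 + [True] * 5
             + [False] * 10 + [True] * 2)
    for start, length in find_outages(trace):
        print(f"outage at packet {start}: {length} packets, "
              f"{length * PACKET_SPACING_S:.2f} s")
```

With 40 ms spacing, the 8-packet minimum corresponds to about 0.32 s without any received packet.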
Outage-induced call abortion probability • Long interruption → user likely to abandon call • from E.855 survey: P[holding] = e^(-t/17.26) (t in seconds) • half the users will abandon call after 12s • 2,566 calls have at least one outage • 946 of 2,566 expected to be dropped → 1.53% of all calls
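The holding-time model can be evaluated directly to confirm the 12 s half-life and the 1.53% figure. In the sketch below, the function names are illustrative, the outage durations passed to `expected_drops` are hypothetical, and the 1.53% is reproduced by dividing by the 62,027 successfully placed calls, which appears to be the denominator used.

```python
# Sketch: user holding probability during an outage, P[holding] = e^(-t/17.26).
import math
from typing import Iterable

TAU_S = 17.26  # time constant from the E.855 survey fit on the slide


def p_holding(outage_s: float) -> float:
    """Probability that a user is still holding after outage_s seconds."""
    return math.exp(-outage_s / TAU_S)


def median_abandon_time() -> float:
    """Outage length at which half the users have abandoned the call."""
    return TAU_S * math.log(2)


def expected_drops(outage_durations_s: Iterable[float]) -> float:
    """Expected abandoned calls, given one outage duration per affected call."""
    return sum(1.0 - p_holding(t) for t in outage_durations_s)


if __name__ == "__main__":
    print(f"median abandon time: {median_abandon_time():.1f} s")
    # Hypothetical outage durations (seconds) for three affected calls.
    print(f"expected drops: {expected_drops([0.5, 12.0, 60.0]):.2f}")
    # Slide figures: 946 expected drops out of 62,027 placed calls.
    print(f"drop share: {946 / 62_027:.2%}")
```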
Conclusions from measurement • Availability in space is (mostly) solved; availability in time restricts usability for new applications • initial investigation into service availability for VoIP • need to define metrics for, say, web access • unify packet loss and “no Internet dial tone” • far less than “5 nines” • working on identifying fault sources and locations • looking for additional measurement sites
What’s next? • Existing SLAs are mostly useless • too many exceptions • wrong time scales: month vs. minutes • no guarantees for interconnects • Existing measurements similarly dubious • Limited ability to learn from mistakes • what are the primary causes of service unavailability? • what can I do to protect myself – multi-homing via same fiber? diverse access mechanisms? • Consumers of services have no good ways to compare service availability • only some very large customers may get access to carrier-internal data • Thus, market failure • Need published metrics • similar to switch availability reporting