Reliable and Scalable Multimedia Communication. PhD thesis proposal by Kundan Singh Advisor: Henning Schulzrinne Nov 29, 2004. Agenda for the presentation. What is the problem? Why is it important? Results so far Difference with related work Plan for finishing. 27 slides (plus backup).
Reliable and Scalable Multimedia Communication PhD thesis proposal by Kundan Singh Advisor: Henning Schulzrinne Nov 29, 2004
Agenda for the presentation • What is the problem? • Why is it important? • Results so far • Difference with related work • Plan for finishing 27 slides (plus backup)
Outline of the proposal document
Table of contents
• Introduction
• Related work
• Internet telephony infrastructure
  Architecture (unified messaging, conferencing, SIP-H.323 translator, IVR, GUI, security, NAT traversal), interoperability
• Request routing and user registration
  Requirements, redundancy, P2P
• Multi-party collaboration
  Reliability and scalability
• Research plan
Publications
• Nossdav’01 (Towards junking the PBX)
• IEEE IC’02 (Integrating Internet telephony services) + detailed TR
• Iptel’01 (Centralized conferencing using SIP)
• IPTS’01 (Unified messaging using SIP and RTSP) + TR
• Iptel’00 (Interworking between SIP/SDP and H.323) + TR + ID
• ICC’03 (Integrating VoiceXML with SIP services) + NYMAN’02
• WMASH’03 (MobileNAT) + pending journal
• NYMAN’04 (P2P IP telephony) + TR
• TR (Failover and load sharing in SIP)
• MMCN’04 (Comprehensive multi-platform collaboration) + detailed TR
Telephone reliability (PSTN: Public Switched Telephone Network)
• Database (SCP) for freephone, calling card, …: 10 million customers, 2 million lookups/hour
• Signaling router (STP): 1 million customers, 1.5 million calls/hour
• Regional telephone switch (class 4 switch): 100,000 customers, 150,000 calls/hour
• Local telephone switch (class 5 switch): 10,000 customers, 20,000 calls/hour
[Figure: signaling network (SS7) with databases (SCP) and signaling routers (STP), above the “bearer” network of telephone switches (SSP)]
Internet telephony (SIP: Session Initiation Protocol)
[Figure: SIP call flow with labels REGISTER, INVITE, INVITE, DNS, DB; alice@yahoo.com, yahoo.com, example.com, bob@example.com; 129.1.2.3, 192.1.2.4]
SIP network architecture: Scalability requirement depends on role
[Figure: IP side with cybercafe, ISPs, IP network, IP phones, gateways (GW), media gateways (MG), SIP/MGC and SIP/PSTN GW; carrier network; PSTN side with PBX, T1 PRI/BRI, PSTN phones]
Reliability and scalability: for call routing, registration, conferencing, voicemails
• Requirements • Reliable • Mean Time Between Failures (MTBF), Mean Time To Recover (MTTR), percentage availability • Scalable • Registration rate, call rate, #requests/s • Server and network components • Proposed solutions • Server redundancy • Apply existing web-redundancy designs • Evaluate quantitatively • Peer-to-peer • Novel P2P-SIP architecture • Evaluate quantitatively
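As a rough illustration of how the reliability metrics listed on this slide relate to each other, the sketch below uses the standard steady-state formula availability = MTBF / (MTBF + MTTR). The function names and the sample numbers are illustrative assumptions, not measurements from the proposal.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the service is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected downtime in minutes over a 365-day year."""
    return (1.0 - avail) * 365 * 24 * 60


if __name__ == "__main__":
    # Hypothetical server: fails about every 999 hours, takes 1 hour to recover.
    a = availability(mtbf_hours=999.0, mttr_hours=1.0)
    print(f"availability = {a:.5f}")
    print(f"downtime     = {downtime_minutes_per_year(a):.0f} minutes/year")
```

Shortening MTTR (faster failover) improves availability just as lengthening MTBF does, which is why the failover latency of a redundant design matters.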
Server redundancy: The problem (failure or overload)
• Replicate registration or search on call
[Figure: REGISTER and INVITE requests arriving at several servers]
Server redundancy: Known techniques
• Client-based • Cisco phones: primary and backup proxy • DNS • NAPTR, SRV • IP address takeover • Database redundancy
High availability: Failover in our test bed (CINEMA)
• Proxies: P1 = phone.cs.columbia.edu, P2 = sip2.cs.columbia.edu
• Databases: D1 (master/slave) and D2 (slave/master) with replication; web scripts on both
• DNS: _sip._udp SRV 0 0 5060 phone.cs.columbia.edu; SRV 1 0 5060 sip2.cs.columbia.edu
• Phone configuration for REGISTER: proxy1 = phone.cs, backup = sip2.cs
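A minimal sketch of the DNS-based failover shown on this slide: a client orders the two SRV records by priority and uses the first server that answers. The `Srv` type, the `first_reachable` helper and the `is_up` probe are hypothetical names introduced for illustration; a real client would learn reachability from a timeout or an ICMP error rather than from a callback.

```python
from typing import Callable, Iterable, NamedTuple, Optional


class Srv(NamedTuple):
    priority: int  # lower value is tried first
    weight: int
    port: int
    target: str


# The two records from the slide, deliberately listed out of order.
RECORDS = [
    Srv(1, 0, 5060, "sip2.cs.columbia.edu"),
    Srv(0, 0, 5060, "phone.cs.columbia.edu"),
]


def first_reachable(records: Iterable[Srv],
                    is_up: Callable[[str], bool]) -> Optional[Srv]:
    """Return the lowest-priority-value record whose target responds."""
    for rec in sorted(records, key=lambda r: r.priority):
        if is_up(rec.target):
            return rec
    return None


if __name__ == "__main__":
    primary_down = lambda host: host != "phone.cs.columbia.edu"
    print(first_reachable(RECORDS, lambda host: True).target)
    print(first_reachable(RECORDS, primary_down).target)
```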
High availability: More issues
• Client re-sends INVITE to P2 • Immediately on ICMP error • Or after 10s otherwise • sipd has in-memory cache • Refresh registration much before expiry • Cisco phone registers to P1 and P2 • Web access gets delayed information
High availability: Measurements on failover
Call setup latency Client retry timeout (T1), DNS TTL User unavailability None (refresh; double register) Registration refresh interval (Tr), cache refresh interval (Tc), client retry timeout (T2), DB replication delay, DNS TTL Web access latency #servers Tradeoff: reliability vs capacity
[Figure: failover timeline among Caller, Callee, DNS, proxies P1 and P2, and master/slave databases D1 and D2, annotated with T1, T2, Tc, Td, Tr and A]
Scalability: Load sharing (redundant proxies and databases)
• REGISTER • Write to D1 & D2 • INVITE • Read from D1 or D2 • Database write/synchronization traffic becomes bottleneck
[Figure: proxies P1, P2, P3 sharing databases D1 and D2]
Scalability: Load sharing (divide the user space)
• Proxy and database on the same host • Stateless proxy can become overloaded • Use many • Hashing • Static vs dynamic
[Figure: P1/D1 serves a-h, P2/D2 serves i-q, P3/D3 serves r-z]
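The two ways of dividing the user space can be sketched as follows. The first function reproduces the static letter ranges from the figure (a-h, i-q, r-z); the second is a hash-based variant. The server names reuse the P1..P3 labels from the slide, while the function names and the choice of SHA-1 are assumptions made only for this illustration.

```python
import hashlib

SERVERS = ["P1", "P2", "P3"]


def by_first_letter(user: str) -> str:
    """Static partition by first letter, as drawn on the slide."""
    c = user[0].lower()
    if "a" <= c <= "h":
        return "P1"
    if "i" <= c <= "q":
        return "P2"
    # r-z, and (as an assumption here) anything that is not a-q.
    return "P3"


def by_hash(user: str, servers=SERVERS) -> str:
    """Hash partition: spreads users evenly regardless of name distribution."""
    digest = hashlib.sha1(user.lower().encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


if __name__ == "__main__":
    for u in ["alice", "bob", "kundan", "sam"]:
        print(u, by_first_letter(u), by_hash(u))
```

Letter ranges are easy to configure but can be unbalanced if names cluster; a hash modulo the number of servers balances better, but changing the number of servers remaps most users, which is one reason the slide distinguishes static from dynamic schemes.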
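Scalability: Comparison of the two designs
Total time per DB
• Redundant proxies and databases (each write goes to every database): $\left(\frac{tr}{D} + 1\right) T N = \frac{A}{D} + B$
• Divided user space (high scale, low reliability): $\left(\frac{tr + 1}{D}\right) T N = \frac{A}{D} + \frac{B}{D}$
where $A = trTN$ and $B = TN$ follow from matching terms, and
D = number of database servers
N = number of writes (REGISTER)
r = #reads/#writes = (INV+REG)/REG
T = write latency
t = read latency/write latency
[Figure: left, proxies P1, P2, P3 in front of replicated databases D1, D2; right, P1/D1 for a-h, P2/D2 for i-q, P3/D3 for r-z]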
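To see how the two expressions for total time per DB behave, the sketch below evaluates both with the symbols defined on the slide (D, N, r, T, t). The function names and the sample parameter values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def time_per_db_replicated(D: int, N: int, r: float, T: float, t: float) -> float:
    """((t*r)/D + 1) * T * N: reads are shared, every write hits every DB."""
    return ((t * r) / D + 1) * T * N


def time_per_db_partitioned(D: int, N: int, r: float, T: float, t: float) -> float:
    """((t*r + 1)/D) * T * N: both reads and writes are split across DBs."""
    return ((t * r + 1) / D) * T * N


if __name__ == "__main__":
    # Hypothetical workload: 1000 writes, 2 reads per write,
    # 10 ms write latency, reads half as expensive as writes.
    params = dict(N=1000, r=2.0, T=0.01, t=0.5)
    for D in (1, 2, 3, 6):
        rep = time_per_db_replicated(D, **params)
        par = time_per_db_partitioned(D, **params)
        print(f"D={D}: replicated={rep:.2f}s partitioned={par:.2f}s")
```

With replication the write term does not shrink as D grows, so the per-database time levels off at T·N; with partitioning it keeps falling as 1/D, at the cost of having no second copy of each user's data.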
Reliability and scalability: Two stage architecture for CINEMA
• First stage (s1, s2, s3, and backup ex) for example.com: _sip._udp SRV 0 40 s1.example.com; SRV 0 40 s2.example.com; SRV 0 20 s3.example.com; SRV 1 0 ex.backup.com
• Second stage, a*@example.com (a1, a2 as master/slave) for a.example.com: _sip._udp SRV 0 0 a1.example.com; SRV 1 0 a2.example.com
• Second stage, b*@example.com (b1, b2 as master/slave) for b.example.com: _sip._udp SRV 0 0 b1.example.com; SRV 1 0 b2.example.com
• Example: sip:bob@example.com becomes sip:bob@b.example.com
• Request-rate = f(#stateless, #groups) • Bottleneck: CPU, memory, bandwidth? • Failover latency: ?
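The first-stage SRV records on this slide combine load sharing (weights 40/40/20 at priority 0) with failover (priority 1). The sketch below is a simplified version of SRV selection: it picks among the reachable records with the lowest priority value, in proportion to weight. The `pick` function and the `down` set are hypothetical helpers, and ports are omitted because the slide omits them.

```python
import random
from collections import Counter

# (priority, weight, target) for example.com, as listed on the slide.
STAGE1 = [
    (0, 40, "s1.example.com"),
    (0, 40, "s2.example.com"),
    (0, 20, "s3.example.com"),
    (1, 0, "ex.backup.com"),
]


def pick(records, rng, down=frozenset()):
    """Choose a target: lowest priority value first, then weighted at random."""
    alive = [r for r in records if r[2] not in down]
    if not alive:
        return None
    best = min(p for p, _, _ in alive)
    group = [r for r in alive if r[0] == best]
    weights = [w for _, w, _ in group]
    if sum(weights) == 0:
        # All weights zero: fall back to a uniform choice.
        return rng.choice(group)[2]
    return rng.choices([t for _, _, t in group], weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(2004)
    shares = Counter(pick(STAGE1, rng) for _ in range(10000))
    for target, n in sorted(shares.items()):
        print(target, n / 10000)
```

The backup record is only used once all three priority-0 servers are unreachable, so it adds reliability without taking a share of normal load.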
Reliability and scalability: Analysis, simulation and measurement proposal
• When is stateless proxy stage needed • What are the optimal values for S, B, P • for required scalability (1-10 million BHCA) and reliability (99.999%) • using commodity hardware
[Figure: two-stage architecture with S=3 (s1, s2, s3), B=2, P=1+1 (a1, a2 and b1, b2 as master/slave), and ex; labels Rp, Mp, Rs, Ms, r, p, s, /B; = R + P, REGISTER + INVITE, etc.]
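To put the targets on this slide into concrete units, the sketch below converts busy hour call attempts (BHCA) into calls per second and estimates the availability of a 1+1 pair. It assumes that failures of the two servers are independent, which is a simplification; the per-server availability used in the demo is a hypothetical value.

```python
def calls_per_second(bhca: float) -> float:
    """Busy hour call attempts expressed as an average per-second rate."""
    return bhca / 3600.0


def parallel_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant servers, assuming independent failures."""
    return 1.0 - (1.0 - single) ** copies


if __name__ == "__main__":
    for bhca in (1_000_000, 10_000_000):
        print(f"{bhca} BHCA = {calls_per_second(bhca):.0f} calls/s")
    # Hypothetical commodity server with 99.9% availability, deployed as 1+1.
    print(f"1+1 pair availability = {parallel_availability(0.999, 2):.6f}")
```

Under these assumptions, a pair of 99.9% servers would nominally exceed the 99.999% target, provided failover itself is fast and reliable.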
Server-based vs peer-to-peer
• Server-based • Cost: maintenance, configuration • Central points of failures • Controlled infrastructure (e.g., DNS) • Peer-to-peer • Robust: no central dependency • Self organizing, no configuration • Scalability ?
[Figure: server-based topology vs peer-to-peer topology; node labels C, P, S]
We propose: P2P-SIP • Unlike server-based SIP architecture • Unlike proprietary Skype architecture • Robust and efficient lookup using DHT • Interoperability • DHT algorithm uses SIP communication • Hybrid architecture • Lookup in SIP+P2P • Unlike file-sharing applications • Data storage, caching, delay, reliability • Disadvantages • Lookup delay and security
P2P-SIP: Background on DHT (Chord)
• Identifier circle • Keys assigned to successor • Evenly distributed keys and nodes • Finger table: log N • The i-th finger points to the first node that succeeds n by at least $2^{i-1}$ • Stabilization for join/leave
[Figure: identifier circle; labels 1, 54, 8, 58, 10, 14, 47, 21, 42, 38, 32, 38, 24, 30]
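The finger rule can be made concrete with a small sketch, reading it as: finger i of node n is the first node at or after (n + 2^(i-1)) mod 2^m on the identifier circle. The node set below is a hypothetical ring on a 6-bit identifier space; it borrows some identifiers from the figure, but which of the figure's labels are nodes and which are keys is not recoverable, so treat the set as an assumption.

```python
import bisect

M_BITS = 6  # identifier space 0..63 (assumption for this example)
NODES = [1, 8, 14, 21, 32, 38, 42, 47, 54, 58]  # sorted, hypothetical ring


def successor(nodes, ident):
    """First node whose identifier is >= ident, wrapping around the circle."""
    i = bisect.bisect_left(nodes, ident)
    return nodes[i % len(nodes)]


def finger_table(nodes, n, m=M_BITS):
    """finger[i] = successor((n + 2^(i-1)) mod 2^m), for i = 1..m."""
    size = 2 ** m
    return [successor(nodes, (n + 2 ** (i - 1)) % size) for i in range(1, m + 1)]


if __name__ == "__main__":
    for node in (8, 58):
        print(node, finger_table(NODES, node))
```

Because each finger roughly doubles the distance covered, a lookup can halve the remaining distance at every hop, which is where the O(log N) lookup cost comes from.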
P2P-SIP: Design alternatives
• Use DHT in server farm • Use DHT for all clients; but some are resource limited • Use DHT among super-nodes
[Figure: three layouts (servers with attached clients; all clients on the ring; super-nodes), with a Route(d46a1c) example over identifiers d471f1, d467c4, d46a1c, d462ba, d4213f, d13da3, 65a1fc]
P2P-SIP: Node architecture (registrar, proxy, user agent)
• DHT communication using SIP REGISTER • Known node: sip:15@192.2.1.3 • Unknown node: sip:17@sippeer.net • User: sip:alice@example.com
[Figure: node components: user interface (buddy list, etc.), user location, discover, DHT (Chord), SIP, ICE, RTP/RTCP, codecs, audio devices. Events: on startup, on reset, signup, find buddies, IM, call, signout, transfer, join, find, leave, peer found, detect NAT, multicast REG, REG, INVITE, MESSAGE]
P2P-SIP: Implementation
• sippeer: C++, Linux, Chord • Nodes join and form the DHT • Node failure is detected and DHT updated • Registrations transferred on node shutdown • Co-located sipc can use sippeer service
[Figure: example DHT; labels 1, 30, 26, 9, 19, 11, 31, 29, 31, 25, 26, 15]
P2P-SIP: Evaluation - scalability
• #messages depends on • Keep-alive and finger table refresh rate • Call arrival distribution • User registration refresh interval • Node join, leave, failure rates
$M = \{ r_s + r_f (\log N)^2 \} + c \log N + (k/t) \log N + (\log N)^2 / N$
• #nodes = f(capacity, rates) • CPU, memory, bandwidth • Verify by measurement and profiling
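The message-rate expression on this slide can be evaluated numerically to get a feel for how slowly it grows with N. The slide does not define its symbols, so the reading used below is an assumption that pairs each term with the bullets in order: rs as the keep-alive rate, rf as the finger refresh rate, c as the call arrival rate, k as the number of registered users per node, and t as the registration refresh interval. The base-2 logarithm is also an assumption.

```python
import math


def messages_per_node(N: int, rs: float, rf: float,
                      c: float, k: float, t: float) -> float:
    """M = {rs + rf*(log N)^2} + c*log N + (k/t)*log N + (log N)^2 / N."""
    lg = math.log2(N)  # assumed base 2
    keepalive_and_fingers = rs + rf * lg ** 2
    calls = c * lg
    registrations = (k / t) * lg
    churn = lg ** 2 / N
    return keepalive_and_fingers + calls + registrations + churn


if __name__ == "__main__":
    # Hypothetical rates, per second, for illustration only.
    for n in (2 ** 7, 2 ** 10, 2 ** 14, 2 ** 20):
        m = messages_per_node(n, rs=1.0, rf=0.1, c=0.5, k=10, t=100.0)
        print(f"N={n}: M={m:.2f}")
```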
P2P-SIP: Evaluation – reliability and call setup latency
• User availability depends on • Super-node failure distribution • Node keep-alive and finger refresh rate • User registration refresh rate • Replicate user registration • Measure effect of each • Call setup latency • Same as DHT lookup latency: O(log(N)) • Calls to known locations (“buddies”) are direct • DHT optimization can further reduce latency • User availability and retransmission timers • Measure effect of each
Research plan Thank you
Publications: Conference, workshop, technical report, magazine
• H. Schulzrinne, K. Singh and X. Wu, "Programmable Conference Server", Columbia University Technical Report CUCS-040-04, NY, Oct 2004.
• K. Singh and H. Schulzrinne, "Peer-to-peer Internet Telephony using SIP", New York Metro Area Networking Workshop, CUNY, NY, Sep 2004. K. Singh and H. Schulzrinne, "Peer-to-peer Internet Telephony using SIP", Columbia University Technical Report CUCS-044-04, NY, Oct 2004.
• K. Singh and H. Schulzrinne, "Failover and Load Sharing in SIP Telephony", Columbia University Technical Report CUCS-011-04, NY, May 2004.
• K. Singh, Xiaotao Wu, J. Lennox and H. Schulzrinne, "Comprehensive Multi-platform Collaboration", MMCN 2004 - SPIE Conference on Multimedia Computing and Networking, Santa Clara, CA, Jan 2004. K. Singh, Xiaotao Wu, J. Lennox and H. Schulzrinne, "Comprehensive Multi-platform Collaboration", Columbia University Technical Report CUCS-027-03, NY, Nov 2003.
• M. Buddhikot, A. Hari, K. Singh and S. Miller, "MobileNAT: A new Technique for Mobility across Heterogeneous Address Spaces", WMASH 2003 - ACM International Workshop on Wireless Mobile Applications and Services on WLAN Hotspots, San Diego, CA, Sep 2003.
• K. Singh, A. Nambi and H. Schulzrinne, "Integrating VoiceXML with SIP services", ICC 2003 - Global Services and Infrastructure for Next Generation Networks, Anchorage, Alaska, May 2003. K. Singh, A. Nambi and H. Schulzrinne, "Integrating VoiceXML with SIP services", Second New York Metro Area Networking Workshop, Columbia University, NY, Sep 2002.
• K. Singh, W. Jiang, J. Lennox, S. Narayanan and H. Schulzrinne, "CINEMA: Columbia InterNet Extensible Multimedia Architecture", Columbia University Technical Report CUCS-011-02, NY, May 2002. W. Jiang, J. Lennox, H. Schulzrinne and K. Singh, "Towards Junking the PBX: Deploying IP Telephony", NOSSDAV 2001. W. Jiang, J. Lennox, S. Narayanan, H. Schulzrinne, K. Singh and X. Wu, "Integrating Internet Telephony Services", IEEE Internet Computing (magazine), May/June 2002 (Vol. 6, No. 3).
• K. Singh, Gautam Nair and H. Schulzrinne, "Centralized Conferencing using SIP", 2nd IP-Telephony Workshop (IPTel'2001), April 2001.
• K. Singh and H. Schulzrinne, "Unified Messaging using SIP and RTSP", IP Telecom Services Workshop 2000, Atlanta, Georgia, U.S.A, Sept 2000. K. Singh and H. Schulzrinne, "Unified Messaging using SIP and RTSP", Columbia University Technical Report CUCS-020-00, NY, Oct 2000.
• K. Singh and H. Schulzrinne, "Interworking Between SIP/SDP and H.323", 1st IP-Telephony Workshop (IPTel'2000), April 2000. K. Singh and H. Schulzrinne, "Interworking Between SIP/SDP and H.323", Columbia University Technical Report CUCS-015-00, NY, May 2000.
Internet telephony infrastructure: CINEMA (Columbia InterNet Extensible Multimedia Architecture)
CINEMA servers
• sipd: Proxy, redirect, Registrar server
• sipconf: Conference server
• sipum: Unified messaging
• rtspd: media server
• siph323: SIP-H.323 translator
[Figure: the servers shown with SIP, VXML (vxml), a web server (cgi, web based configuration), SQL database, RTSP clients (Quicktime RTSP), H.323 (NetMeeting), SIP/PSTN gateway, department PBX (internal telephone extn: 7040, 713x), telephone switch, PSTN (local/long distance, 1-212-5551212)]
CINEMA: My contribution in design and implementation
CINEMA applications
• rtspd: RTSP media server
• sipvxml: SIP/VoiceXML browser
• sip323: SIP/H.323 gateway
• sipconf: SIP/RTP conferencing
• sipum: SIP/RTSP unified messaging
• sipd: SIP proxy server
CINEMA libraries: libNT, libcine, libsip, librtsp, libsipapi, rtplib++, libmedia, libconf, libdict, libsnmp, libcanon, libdb++
Library functions: Win32 stub; utilities, parsing, IPv6; basic SIP library; RTSP client; SIP UA library; RTP library; RTP audio mixer; recording, files; hash table; mySQL interface; SIP MIB; canonicalize
Other software shown: Xerces-C, Flite, OpenH323, MySQL, PWLib, Resparse
… and web-based GUI
C/C++: 58K out of 187 KLOC • Tcl: 30 KLOC
MobileNAT: Architecture
• Two IP addresses • Virtual IP (fixed host-id) • Actual IP (routable; changes) • DHCP, NAT, mobility manager
[Figure: protocol stack Application, Socket, TCP/UDP, IP, Shim layer (virtual IP, actual IP), Net IF; Addr “V” stays fixed while Addr “A” moves; CN 128.59.16.149; MN with V=135.180.32.4 and A=135.180.54.7; anchor node (AN); 135.180.32.6; packets between 128.59.16.149 port 80 and 135.180.32.4 port 1733]
MobileNAT: Comparison with other work
Legend: Y: yes; N: no; -: N/A; O: optional; IN: independent; UD: under development
1: We assume Mobile IP with UDP tunneling for NAT
Interoperability with Nortel MCS • Nortel • We have MCS 5100 (and phones) • It uses proprietary protocol and SIP • CINEMA+Nortel: two models • Use both at the same time (reliable) • Split user base between the two • User registration, call setup and conferencing • Security and trust: out of scope
Related work: IP telephony and multimedia communication
• Unlike low cost VoIP: Vonage, AT&T • We provide enterprise infrastructure • There are enterprise IPtel: Cisco, Nortel • But redundancy architecture, interoperability, distributed components model differ • Collaboration: CSCW, SIGGROUP • Unlike web-centric, or application specific • We provide standard-based multimedia collaboration platform • Multimedia conferencing: Mbone, H.323 • Ours is SIP-based infrastructure, reuse existing tools and protocols such as RTSP, media server
Related work: Comprehensive multi-platform collaboration
Goal: Alternate between synchronous and asynchronous communication, and access from different devices and clients.
• Synchronous (tightly coupled): video conference, IM, screen sharing, floor control, …
• Asynchronous (loosely coupled): file sharing, message board, …
• Messaging and notifications
• Personalized view: per-user calendar, access control, address book
We try to incorporate…
• Long lived groups: design teams, committees, college classes
• Asymmetric events: lecture and lecture series
• Short-lived spontaneous interaction
Current practice
• Email, teleconference
• Vendor specific tools, platform dependence
• Application specific, e.g., collaborative software development
Multi-party collaboration: What is done, and what is left
• Sipconf: conference server • Audio, video, IM, screen, shared browsing, floor control • No XCON yet: use web interface • Small to medium size conferences • Cascaded conference mixer • #participants, audio delay • Failover • State sharing between servers
Related work: Availability for (web) servers
• Availability = f(reliability, maintainability) • Reliability: time to failure pdf • Maintainability: time to recover pdf • Existing work on failover • TCP connection migration • IP address takeover • MAC address takeover • Reliable server pooling • Requires new protocol support in clients • Reliability analysis tools (www.relexsoftware.com) • Availability in the face of (DoS) attacks
Related work: Scalability for (web) servers
• Existing work • Connection dispatcher • Content/session-based redirection • DNS-based load sharing • HTTP vs SIP • UDP+TCP, signaling not bandwidth intensive, no caching of response, read/write ratio is comparable for DB • SIP scalability bottleneck • Signaling (chapter 4), real-time media data, gateway • 302 redirect to less loaded server, REFER session to another location, signal upstream to reduce
Related work: SIPStone (SIP server performance metric)
• Steady state rate for • successful registration, forwarding and unsuccessful call attempts measured using 15 min test runs. • Measure: #requests/s with given delay constraint. • Performance = f(#user, #DNS, UDP/TCP, g(request), L) where g = type and arrival pdf (#request/s), L = logging? • For register, outbound proxy, redirect, proxy480, proxy200. • Parameters • Measurement interval, transaction response time, register/s, calls/s, transaction failure probability < 5% • Shortcomings: • does not consider forking, scripting, Via header, packet size, different call rates, SSL. Is there a linear combination of results?
Related work: 3GPP (release 5)’s IP Multimedia core network Subsystem uses SIP
• Proxy-CSCF (call session control function) • First contact in visited network. 911 lookup. Dialplan. • Interrogating-CSCF • First contact in operator’s network. • Locate S-CSCF for register • Serving-CSCF • User policy and privileges, session control service • Registrar • Connection to PSTN • MGCF and MGW
Related work: Skype
From the KaZaA community • Host cache of some super nodes • Bootstrap IP addresses • Auto-detect NAT/firewall settings • Similar to STUN and TURN • Protocol among super nodes – ?? • Allows searching a user (e.g., kun*) • History of known buddies • All communication is encrypted • Promote to super node • Based on availability, capacity • Conferencing • Problems: • Proprietary, single service, centralized login
[Figure: overlay of peers, each labeled P]
Related work: P2P
• P2P networks • Unstructured (Kazaa, Gnutella, …) • Structured (DHT: Chord, CAN, …) • Skype and related systems • Flooding based chat, groove, Magi • P2P-SIP telephony • Proprietary: NimX, Peerio • File sharing: SIPShare
Why did we choose Chord?
• Chord can be replaced by another • As long as it can map to SIP • High node join/leave rates • Provable probabilistic guarantees • Easy to implement • X proximity based routing • X security, malicious nodes
Related work: JXTA vs Chord in P2P-SIP
• JXTA • Protocol for communication (peers, groups, pipes, etc.) • Stems from unstructured P2P • P2P-SIP • Instead of SIP, JXTA can also be used • Separate search (JXTA) from signaling (SIP)
P2P-SIP: Node startup
• SIP • REGISTER with SIP registrar • DHT • Discover peers: multicast REGISTER • Join DHT using node-key=Hash(ip) • REGISTER with DHT using user-key=Hash(alice@columbia.edu) • Dialing out • Call, instant message, etc. INVITE sip:hgs10@columbia.edu MESSAGE sip:alice@example.com • Last seen, SIP NAPTR/SRV, DHT
[Figure: REGISTER alice@columbia.edu to sipd (with DB) at columbia.edu; detect peers; REGISTER alice=42 and REGISTER bob=12 into a DHT with identifiers 58, 42, 12, 14, 32]
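The two keys used at startup, node-key=Hash(ip) and user-key=Hash(user address), can be sketched as a hash truncated to the size of the identifier circle. The slide does not name the hash function; SHA-1 is assumed here, the `chord_key` helper is hypothetical, and the 6-bit ring and the sample IP address are used only to keep the printed identifiers small.

```python
import hashlib


def chord_key(name: str, m_bits: int = 160) -> int:
    """Map a string (IP address or user URI) onto a 2^m_bits identifier circle."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m_bits)


if __name__ == "__main__":
    # Hypothetical node address; the user address is the one from the slide.
    print("node-key:", chord_key("192.0.2.10", m_bits=6))
    print("user-key:", chord_key("alice@columbia.edu", m_bits=6))
```

Hashing nodes and users into the same identifier space is what lets a REGISTER for a user be routed to the node whose key succeeds the user's key.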
P2P-SIP: Node leaves
• Graceful leave • Un-REGISTER • Transfer registrations • Failure • Attached nodes detect and re-REGISTER • New REGISTER goes to new super-nodes • Super-nodes adjust DHT accordingly
[Figure: DHT with node 42; messages REGISTER key=42, REGISTER, OPTIONS]
P2P-SIP: Advanced services
• Offline messages • INVITE or MESSAGE fails => Responsible node stores voicemail, instant message. • Conferencing • Mixer, full mesh, multicast
P2P-SIP: Security – open issues (threats, solutions, issues)
• More threats than server-based • Privacy, confidentiality • Malicious node • Don’t forward all calls, log call history (spy), … • “free riding”, motivation to become super-node • Existing solutions • Focus on file-sharing (non-real time) • Centralized components (boot-strap, CA) • Assume co-operating peers • works for server farm in DHT • Collusion • Hide security algorithm (e.g., yahoo, skype) • Chord • Recommendations, design principles, …
My contribution in CINEMA: Sip-h323 (signaling translator)
• Background: ITU-T’s H.323 • Binary ASN.1 PER, collection of protocols (H.245, H.225.0, Q.931, RAS, H.450.x) • H.323 gatekeeper is similar to, but not the same as, a SIP server • Problems in interworking • Multi-stage dialing in H.323v1 • Fast start in v2 is optional • User registration • Both SIP and H.323 users should be reachable • Session description is more complex • End system should select the codecs • Security and QoS: end-to-end or not? • Solution • List different scenarios • No modification in SIP or H.323 • Direct RTP traffic if possible • Implementation