Peer-to-peer IP telephony
Kundan Singh, Henning Schulzrinne and Salman Abdul Baset
April 21, 2004
IRT lab - internal talk

Agenda
What is P2P? How does it apply to IP telephony?
• Napster (1999-2001)
• Napster clones: KaZaA, Gnutella, FreeNet, …
• Academic research: CAN, Chord, Pastry, Tapestry, …
• Distributed computing: SETI@home, folding@home, …
Total about 40 slides

Computer systems
• Centralized: mainframes, workstations
• Distributed
  • Client-server (C, S): flat (RPC, HTTP), hierarchical (DNS, mount)
  • Peer-to-peer (P): pure (Gnutella, Chord), hybrid (Napster, Groove, Kazaa)
• File sharing: Napster, Gnutella, Kazaa, Freenet
• Communication and collaboration: Magi, Groove, Skype
• Distributed computing: SETI@Home, folding@Home
Key idea
• Share the resources of individual peers
• CPU, disk, bandwidth, information, …

Goals
• Resource aggregation - CPU, disk, …
• Cost sharing/reduction
• Improved scalability/reliability
• Interoperability - heterogeneous peers
• Increased autonomy at the network edge
• Anonymity/privacy
• Dynamic (join, leave), self organizing
• Ad hoc communication and collaboration

Napster
• Centralized index
  • File names => active holder machines
• Sophisticated search
• Easy to implement
• Ensure correct search
• Centralized index
  • Lawsuits
  • Denial of service
  • Can use server farms
Figure: peers P1 to P5 around a central server S; a peer asks S "Where is 'quit playing games'?" and the file is then transferred between peers by FTP.

Gnutella
• Flooding
• Overlay network
• Decentralized
• Robust
• Not scalable. Use TTL. Query can fail
• Cannot ensure correctness
Figure: overlay network of interconnected peers (P).
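
The flooding behaviour above can be sketched in a few lines. This is a minimal illustration, not Gnutella's wire protocol: the overlay dictionary, the peer names and the `flood_query` function are all hypothetical. It shows why a TTL bounds the cost of a query and also why a query can fail even though the file exists.

```python
from collections import deque

def flood_query(overlay, start, wanted, ttl):
    """Breadth-first flood from `start`; return peers holding `wanted` within `ttl` hops."""
    seen = {start}
    hits = []
    queue = deque([(start, ttl)])
    while queue:
        peer, hops_left = queue.popleft()
        if wanted in overlay[peer]["files"]:
            hits.append(peer)
        if hops_left == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbor in overlay[peer]["neighbors"]:
            if neighbor not in seen:  # drop duplicate queries
                seen.add(neighbor)
                queue.append((neighbor, hops_left - 1))
    return hits

# Hypothetical four-peer chain A - B - C - D; only D holds the file.
overlay = {
    "A": {"neighbors": ["B"], "files": set()},
    "B": {"neighbors": ["A", "C"], "files": set()},
    "C": {"neighbors": ["B", "D"], "files": set()},
    "D": {"neighbors": ["C"], "files": {"song.mp3"}},
}

print(flood_query(overlay, "A", "song.mp3", ttl=2))  # D is 3 hops away: no hit
print(flood_query(overlay, "A", "song.mp3", ttl=3))  # now D is reached
```

With a TTL of 2 the query from A never reaches D, which is the "query can fail" case; raising the TTL finds the file at the price of more messages.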

KaZaA (FastTrack)
• Super-nodes
• Election based on
  • capacity: bandwidth, storage, CPU
  • availability: connection time, public address
• Use heterogeneity of peers
• Inherently non-scalable
  • If flooding is used
Figure: peers (P) attached to super-nodes.

FreeNet
• File is cached on reverse search path
• Anonymity
• Replication, cache
• Similar keys on same node
• Empirical log(N) lookup
• TTL limits search
• Only probabilistic guarantee
• Transaction state
• No remove( )
  • Use cache replacement
Figure: a query passing between peers (P) in numbered steps 1 to 12.

Goals [re-visited]
P2P systems: unstructured and structured
• If present => find it
• Flooding is not scalable
• Blind search is inefficient
• Efficient searching
• Proximity
• Locality
• Data availability
• Decentralization
• Scalability
• Load balancing
• Fault tolerance
• Maintenance
• Join/leave
• Repair
Query time, number of messages, network usage, per node state

Distributed Hash Tables
• Types of search
  • Central index (Napster)
  • Distributed index with flooding (Gnutella)
  • Distributed index with hashing (Chord)
• Basic operations
  find(key), insert(key, value), delete(key)

CAN: Content Addressable Network
• Each key maps to one point in the d-dimensional space
• Each node responsible for all the keys in its zone.
• Divide the space into zones.
Figure: unit square (0.0 to 1.0 on each axis) divided into zones A, B, C, D and E.

CAN [2]
• Node Z joins
• Node X locates (x,y)=(.3,.1)
• State = 2d
• Search = d × n^(1/d)
Figure: two unit squares (axes 0.0, .25, .5, .75, 1.0) with zones A, B, C, D, E and X; in one, node Z joins; in the other, node X locates the point (x,y).
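
As a rough sketch of the CAN idea, the code below hashes a key to a point in the unit square and finds the node whose zone contains it. The zone layout, the node names and the use of SHA-1 are assumptions made for illustration; CAN itself does not prescribe them here, and real nodes only know their neighbours' zones rather than the full map.

```python
import hashlib

def key_to_point(key, d=2):
    """Hash a key to a point in [0, 1)^d (assumes d <= 5, since SHA-1 yields 20 bytes)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    # Use 4 bytes of the digest for each coordinate.
    return tuple(
        int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2**32 for i in range(d)
    )

def owner(zones, point):
    """Return the node whose zone [lo, hi) contains `point`."""
    for node, (lo, hi) in zones.items():
        if all(l <= x < h for l, x, h in zip(lo, point, hi)):
            return node
    raise LookupError("no zone covers this point")

# Hypothetical split of the unit square into four equal zones.
zones = {
    "A": ((0.0, 0.0), (0.5, 0.5)),
    "B": ((0.5, 0.0), (1.0, 0.5)),
    "C": ((0.0, 0.5), (0.5, 1.0)),
    "D": ((0.5, 0.5), (1.0, 1.0)),
}

print(owner(zones, (0.3, 0.1)))                       # the slide's example point
print(owner(zones, key_to_point("alice@columbia.edu")))
```

In this layout the point (.3, .1) falls in zone A. A global zone map is used only to keep the example short; in CAN a lookup is forwarded greedily from neighbour to neighbour, which is where the d × n^(1/d) search cost comes from.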

Chord
• Identifier circle
• Keys assigned to successor
• Evenly distributed keys and nodes
Figure: identifier circle with IDs 1, 8, 10, 14, 21, 24, 30, 32, 38, 42, 47, 54, 58.

Chord [2]
• Finger table: log N entries
• i-th finger points to first node that succeeds n by at least 2^(i-1)
• Stabilization after join/leave
Figure: identifier circle with IDs 1, 8, 10, 14, 21, 24, 30, 32, 38, 42, 47, 54, 58.
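
The successor rule and the finger table can be sketched as follows. The 6-bit identifier space and the choice of which slide numbers are node IDs are assumptions (the slide does not say which labels are nodes and which are keys), and every node is given the full membership list here, which a real Chord node does not have.

```python
from bisect import bisect_left

M = 6  # assumed identifier bits, so IDs live on a circle of 2**M = 64 positions

# Assumed node IDs, borrowed from the numbers on the identifier circle.
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 47, 54, 58])

def successor(ident, nodes=NODES):
    """Return the first node ID at or after `ident`, wrapping around the circle."""
    i = bisect_left(nodes, ident % 2**M)
    return nodes[i % len(nodes)]

def finger_table(n, nodes=NODES):
    """The i-th finger (i = 1..M) is the successor of n + 2**(i-1)."""
    return [successor(n + 2 ** (i - 1), nodes) for i in range(1, M + 1)]

print(successor(10))    # key 10 is stored at node 14
print(successor(60))    # wraps around the circle to node 1
print(finger_table(8))  # [14, 14, 14, 21, 32, 42]
```

Each node keeps only M = log2(64) fingers, yet the fingers double in distance, so a lookup can halve the remaining distance to the target at each hop.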

Tapestry
• ID with base B = 2^b
• Route to numerically closest node to the given key
• Routing table has O(B) columns. One per digit in node ID.
• Similar to CIDR – but suffix-based
Figure: nodes 763, 427, 364, 123, 324, 365, 135, 564; a route resolves **4 => *64 => 364.
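
The suffix-matching route from the figure (**4 => *64 => 364) can be reproduced with a small sketch. It is a simplification under stated assumptions: each hop picks from a global node list instead of a per-node routing table, the starting node 763 is an arbitrary choice, and the function names are hypothetical.

```python
def shared_suffix_len(a, b):
    """Number of trailing digits that `a` and `b` have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def route(nodes, start, key):
    """Hop to a node matching a longer suffix of `key` until no better node exists."""
    path = [start]
    current = start
    while True:
        have = shared_suffix_len(current, key)
        better = [n for n in nodes if shared_suffix_len(n, key) > have]
        if not better:
            return path
        # Take the smallest improvement, so the match grows gradually as in the figure.
        current = min(better, key=lambda n: (shared_suffix_len(n, key), n))
        path.append(current)

nodes = ["763", "427", "364", "123", "324", "365", "135", "564"]
print(route(nodes, "763", "364"))  # ['763', '324', '564', '364']
```

The path matches one more trailing digit at each hop: 324 matches "4", 564 matches "64", and 364 matches the whole key.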

Pastry
• Prefix-based
• Route to a node whose ID shares a prefix with the key that is at least one digit longer than the prefix this node shares.
• Neighbor set, leaf set and routing table.
Figure: nodes d471f1, d467c4, d46a1c, d462ba, d4213f, d13da3, 65a1fc; Route(d46a1c).

Other schemes
• Distributed TRIE
• Viceroy
• Kademlia
• SkipGraph
• Symphony
• …
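
P2P for IP telephony
Figure (client-server SIP): Alice's host (128.59.19.194) sends REGISTER to columbia.edu, which stores alice@columbia.edu => 128.59.19.194; Bob's host sends INVITE alice@columbia.edu through columbia.edu to 128.59.19.194.
Figure (P2P overlay): Alice (128.59.19.194) issues JOIN; a FIND for alice returns 128.59.19.194.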

Skype
From the KaZaA community
• Host cache of some super nodes
• Bootstrap IP addresses
• Auto-detect NAT/firewall settings
• Protocol among super nodes – ??
• Guaranteed to find if exists and logged in recently (< 72 hours)
• Allows searching a user (e.g., kun*)
• History of known buddies
• All communication is encrypted
• Promote to super node
  • Based on availability, capacity
• Conferencing
Figure: overlay of peers (P).

Figure (cartoon speech bubbles): "I am waiting to be 'killer'" / "Ya right!! Napster got killed already!" / "P2P is my next 'killer' application"

Lessons learnt
• Auto-configure
• Adaptive
• Client-server, P2P
• Search options, state overhead
• Node, super node
• No blind search, flooding
• Use DHT

SIPeer
Proposed extension of SIPua/SIPc
• Goals
  • P2P design, but SIP-based
  • No configuration
  • Conferencing, offline messaging
  • Interoperate with existing SIP systems
• Inspired by Skype
• Use existing DHT schemes
  • Key=hash(user@domain)
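
A minimal sketch of Key=hash(user@domain) follows. The slides do not name a hash function or a key size, so SHA-1 and the `bits` parameter are assumptions, and `user_key` is a hypothetical helper.

```python
import hashlib

def user_key(address, bits=160):
    """Map a user@domain address to an integer key in [0, 2**bits)."""
    digest = hashlib.sha1(address.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)

# Full-size key, as a structured overlay might use it.
print(hex(user_key("alice@columbia.edu")))

# A tiny 6-bit identifier space, comparable to the small numbers drawn on the slides.
print(user_key("alice@columbia.edu", bits=6))
```

Every node that hashes the same address obtains the same key, which is what lets a caller find the node responsible for a user without any central directory. Whether the address should be normalized (for example, lowercasing the domain) before hashing is left open here.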

Option-1: No REGISTER
• Node computes key based on user ID
• Nodes join the overlay based on ID
• One node one user
Option-2: With REGISTER
• User REGISTERs with nodes responsible for its key
• Refreshes periodically
• Allows offline messages (?)
Figure: overlay nodes labelled with IDs (12, 14, 24, 32, 42, 56, 58); bindings alice=42, bob=12, sam=24; messages Find(user), REGISTER alice=42 and REGISTER bob=12.

Design alternatives
• Use DHT in server farm
• Use DHT for all clients; but some are resource limited
• Use DHT among super-nodes
  • Hierarchy
  • Dynamically adapt
Figure: example overlays with servers and clients labelled.

Node starts up
• REGISTER with SIP registrar
• Discover peers
  • First-time bootstrap peers
  • Detect if multicast is supported (How?)
  • Multicast with incremental TTL
  • Cache for subsequent startup
• Detect local NAT/firewall
• Detect existing “buddies”
Figure: node alice@columbia.edu performs (1) DNS, (2) REGISTER with sipd and its DB, (3) Detect peers.

Node joins
• Super-nodes are SIP registrars
• REGISTER with a Super-node
  REGISTER sip:node-address
  To: <sip:user@domain>
• Periodically monitor peers
  • OPTIONS as heart-beat message
Figure: node sends REGISTER and OPTIONS to a super-node in the DHT.

Super-nodes
• Initial bootstrap super-nodes
• Never allow capacity to exceed
• When to become super-node
  • Local decision; can be influenced by existing peer
• If REGISTER received
  • Local key => store locally
  • Else, forward REGISTER to appropriate nodes
• Super-node refreshes REGISTER on behalf
• Should be in “public” address space (?)
Figure: REGISTER with key=42 forwarded through the DHT to node 42.
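
The "If REGISTER received" rule can be sketched as below. Several things are assumed for illustration: responsibility follows a Chord-style successor rule, every super-node sees the whole ring, and the `SuperNode` class and its methods are hypothetical rather than part of SIPeer.

```python
class SuperNode:
    def __init__(self, node_id, ring):
        self.node_id = node_id
        self.ring = ring     # assumed global view: node_id -> SuperNode
        self.bindings = {}   # key -> contact address stored at this node

    def responsible_id(self, key):
        """First node ID at or after `key`, wrapping to the smallest ID."""
        ids = sorted(self.ring)
        for node_id in ids:
            if node_id >= key:
                return node_id
        return ids[0]

    def on_register(self, key, contact):
        """Store the binding if the key is local, else forward the REGISTER."""
        target = self.responsible_id(key)
        if target == self.node_id:
            self.bindings[key] = contact
            return self.node_id
        return self.ring[target].on_register(key, contact)

# Hypothetical ring of four super-nodes.
ring = {}
for node_id in (12, 24, 42, 58):
    ring[node_id] = SuperNode(node_id, ring)

stored_at = ring[12].on_register(42, "128.59.19.194")
print(stored_at, ring[42].bindings)
```

A REGISTER for key 42 that arrives at node 12 ends up stored at node 42. In a real overlay the forward would take several hops through the DHT instead of jumping straight to the responsible node.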

Node leaves
• Graceful leave
  • Un-REGISTER; what about server=>client?
• Node leaves: no problem
• Super-node leaves
  • Attached nodes detect and re-REGISTER
  • New REGISTER goes to new super-nodes
  • Super-nodes adjust DHT accordingly
Figure: OPTIONS heart-beat detects the departure; REGISTER with key=42 is sent again and routed through the DHT to the node now responsible for 42.

Dialing out
• Call, instant message, etc.
  INVITE sip:hgs10@columbia.edu
  MESSAGE sip:alice@yahoo.com
• If existing buddy, use cache first
• If not found
  • SIP-based lookup (DNS NAPTR, SRV,…)
  • P2P lookup
    • Send to super-nodes: proxy
    • Use DHT to locate: proxy or redirect
Figure: INVITE with key=42 routed through the DHT to node 42; labels "Last seen" and 302.
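
The lookup order for dialing out (buddy cache, then SIP-based lookup, then P2P lookup) can be sketched like this. The two lookup functions are hypothetical stand-ins for DNS NAPTR/SRV resolution and a DHT query, and caching a successful result is an added assumption, not something the slide states.

```python
def locate(user, buddy_cache, sip_lookup, p2p_lookup):
    """Return (contact, source), trying cache, then SIP lookup, then P2P lookup."""
    contact = buddy_cache.get(user)
    if contact is not None:
        return contact, "cache"
    for source, lookup in (("sip", sip_lookup), ("p2p", p2p_lookup)):
        contact = lookup(user)
        if contact is not None:
            buddy_cache[user] = contact  # assumption: remember it for the next call
            return contact, source
    return None, "not found"

def sip_lookup(user):
    """Stand-in for DNS NAPTR/SRV resolution; pretends no SIP server is published."""
    return None

dht = {"alice@columbia.edu": "128.59.19.194"}  # stand-in for the overlay

def p2p_lookup(user):
    """Stand-in for a DHT query."""
    return dht.get(user)

cache = {}
print(locate("alice@columbia.edu", cache, sip_lookup, p2p_lookup))  # found via p2p
print(locate("alice@columbia.edu", cache, sip_lookup, p2p_lookup))  # now from cache
```

Trying the SIP-based lookup before the overlay keeps the client interoperable with existing SIP domains and uses the DHT only when the domain has no reachable server.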

Offline messages
• INVITE or MESSAGE fails
• Responsible node stores voicemail, instant message.
• Delivered using MWI (?) or when online detected
• Replicate the message at redundant nodes
• Sequence number prevents duplicates
• Security: How to avoid spies?
• How to recover if all responsible nodes leave?
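
One way to read "sequence number prevents duplicates" is sketched below: each message carries a per-sender sequence number, replicas store it under (sender, sequence), and delivery merges the replicas so each message appears once. The `OfflineStore` class and `deliver` function are hypothetical, and the open questions on the slide (security, loss of all replicas) are not addressed.

```python
class OfflineStore:
    """Messages held by one responsible node on behalf of offline users."""

    def __init__(self):
        self.pending = {}  # recipient -> {(sender, seq): body}

    def store(self, recipient, sender, seq, body):
        self.pending.setdefault(recipient, {})[(sender, seq)] = body

    def collect(self, recipient):
        """Hand over and forget everything stored for `recipient`."""
        return self.pending.pop(recipient, {})

def deliver(replicas, recipient):
    """Merge all replicas; identical (sender, seq) pairs collapse into one message."""
    merged = {}
    for replica in replicas:
        merged.update(replica.collect(recipient))
    return [body for _, body in sorted(merged.items())]

primary, backup = OfflineStore(), OfflineStore()
for node in (primary, backup):                 # message 1 is replicated on both nodes
    node.store("alice", "bob", 1, "hi")
primary.store("alice", "bob", 2, "call me")    # message 2 reached only one node

print(deliver([primary, backup], "alice"))     # ['hi', 'call me']
```

The replicated message is delivered once, and a message that reached only one replica is still delivered.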

Sophisticated search
• *@columbia.edu; keywords
• Some super nodes can act as search servers
• Use redirect response to appropriate nodes when a query (e.g., INVITE) is received
• Alternative: Hierarchical design (not pure P2P)
Figure: overlay nodes 14, 29, 32, 38 and 58 with entries *@columbia.edu=38 and Interest:movies=29, and a user alice@columbia.edu with Interest:movies.

Conferencing
• Conference servers REGISTER all the conference addresses
  • Bad – centralized conferencing
• Which peer should act as mixer?
• Proximity info for application level multicast
• Cascaded mixers

NAT/firewall
• Super-node tunnels to internal node on TCP connection initiated by internal node
• Use flow control without retransmission, if possible (?)
• Codecs that work best on TCP (?)

Mobile nodes
• Mobile-IP: no issues
• SIP-based mobility
  • Periodically detect local IP
  • If it changes, re-REGISTER
• If super-node moves, similar to leave+join

Embedded devices
• Automatically detect available resources
• Cap on host cache size
• Cap on CPU/memory/bandwidth utilization
• Select best codec
• Automatically disable P2P if local domain registrar is found (?)
• . . .

Explosive growth
• Cache replacement at super-nodes
  • Last seen many days ago
  • Cap on local disk usage (automatic)
• Forcing a node to become super node
• Graceful denial of service if overloaded
• Switching between flooding, CAN, Chord, …
• . . .

Implementation
• Implementation
  • No use unless FREE
  • and used by masses
• Simulation
  • Not different from existing DHT; on paper only
• Combine
• . . .

More open issues
• Security
  • Anonymity, encryption
  • Attack/DOS-resistant, SPAM-resistant
  • Malicious version of SIPeer
  • Protecting voicemails from storage nodes
• Optimization
  • Locality, proximity
• Motivation
  • Why should I run as super-node?

Conclusions
• P2P useful
  • Scalable, reliable
  • No configuration
  • Not as fast as client/server
• P2P/SIP
  • Basic operations easy
  • Some potential issues
    • Security
    • Performance
    • Quality (audio)