Peer-to-Peer Intro 5.4.2005 Jani & Sami Peltotalo
Overview of P2P • Overlay networks • Current P2P applications • P2P file sharing • Instant messaging / voice over IP • P2P distributed computing
1G P2P: Centralized Network • Fast search/query response times • Simple protocol • Provides a high degree of performance and resilience • Susceptible to being shut down, since it relies on a single server or server farm • e.g. Napster
2G P2P: Decentralized Network • Slow search/query response times, and queries generate large volumes of network traffic • Network resilience and performance governed by users' PCs and their network connectivity • No central points of failure or control • e.g. Gnutella 0.4
3G P2P: Hybrid Architecture • Improved search/query response times, with less traffic generated per query than decentralized networks • The deployment of super-peers provides a high degree of performance and resilience • No central points of failure or control • e.g. FastTrack, Gnutella 0.6 super-peers
4G P2P: Different types of architectures • BitTorrent: centralized • eDonkey2000: semi-centralized • Overnet: decentralized
FastTrack • Clients: KaZaA, iMesh, Grokster... • MP3s & entire albums, videos, games • Decentralized network, supernodes act as temporary indexing servers (hierarchical architecture) • Control data encrypted • Everything in HTTP request and response messages • Optional parallel downloading of files
FastTrack: Architecture • Each peer is either a supernode (SN) or is assigned to a supernode • Selection criteria: CPU, memory, network connection • Each SN has about 100-150 child nodes and has 30-50 TCP connections with other supernodes • SN tracks the content and IP of its child nodes, not the content under its neighboring SNs • [Figure: supernodes and ordinary nodes]
FastTrack: Metadata • When an ordinary node (ON) connects to a supernode (SN), it uploads its metadata • For each file: • File name • File size • Content hash (MD5+CRC) • File descriptors: used for keyword matches during query • ContentHash: • When peer A selects a file at peer B, peer A sends the ContentHash in the HTTP request • If the download of a specific file fails (completes only partially), the ContentHash is used to search for a new copy of the file
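The metadata upload and keyword matching on this slide can be sketched as a small in-memory index. This is only an illustration: the names `FileMeta`, `make_meta` and `SupernodeIndex` are hypothetical, exact-keyword matching is an assumption, and plain MD5 stands in for the MD5+CRC content hash (the real FastTrack protocol is proprietary).

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    name: str
    size: int
    content_hash: str   # stand-in for the MD5+CRC ContentHash
    descriptors: tuple  # keywords used for matching during a query


def make_meta(name, data, descriptors):
    """Build the metadata record for one shared file (hypothetical helper)."""
    return FileMeta(name, len(data), hashlib.md5(data).hexdigest(),
                    tuple(d.lower() for d in descriptors))


class SupernodeIndex:
    """Hypothetical supernode index: content and IP of its child nodes only."""

    def __init__(self):
        self.by_child = {}  # child IP -> list of FileMeta

    def upload(self, child_ip, metas):
        """Called when an ordinary node connects and uploads its metadata."""
        self.by_child[child_ip] = list(metas)

    def search(self, keyword):
        """Return (child IP, FileMeta) pairs whose descriptors contain keyword."""
        kw = keyword.lower()
        return [(ip, m) for ip, metas in self.by_child.items()
                for m in metas if kw in m.descriptors]

    def find_by_hash(self, content_hash):
        """Locate copies of a file by hash, e.g. after a failed download."""
        return [ip for ip, metas in self.by_child.items()
                if any(m.content_hash == content_hash for m in metas)]
```

Keeping the hash in every record is what lets a node look for a replacement copy without repeating the keyword query.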
FastTrack: Overlay Maintenance • List of potential supernodes is included in the software download • New peer goes through the list until it finds an operational supernode • Node “pings” 5-6 supernodes on the list and connects to the first SN that replies • After connecting, it obtains a more up-to-date list with 200 entries • SNs in the updated list are “close” to the ON • If its supernode goes down, the node goes through the updated list and finds a new supernode
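A minimal sketch of the bootstrap step, assuming the node probes the list in batches and that "first to reply" can be modelled as the smallest reply time. The function name `pick_supernode` and the injected `ping` callable are hypothetical; no real network traffic is involved.

```python
def pick_supernode(candidates, ping, probes=5):
    """Walk the supernode list in batches and return the first SN to reply.

    `ping(addr)` is supplied by the caller and returns the reply time in
    seconds, or None if the supernode did not answer.
    """
    for start in range(0, len(candidates), probes):
        replies = []
        for addr in candidates[start:start + probes]:
            reply_time = ping(addr)
            if reply_time is not None:
                replies.append((reply_time, addr))
        if replies:
            # The quickest responder stands in for "the first SN that replies".
            return min(replies)[1]
    return None  # no operational supernode found in the whole list
```

The same function could be reused on the updated 200-entry list when the current supernode goes down.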
FastTrack: Queries • Node first sends query to supernode • Supernode responds with matches • If x matches found, done • Otherwise, supernode forwards query to subset of supernodes • If total of x matches found, done • Otherwise, query further forwarded • Probably by original supernode rather than recursively
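The query steps above can be sketched as a loop in which the original supernode keeps forwarding until x matches are found. This is an assumption-laden model, not the real protocol: each supernode is a plain dict from keyword to matching files, and `run_query`, `x` and `fanout` are hypothetical names.

```python
def run_query(keyword, own_supernode, other_supernodes, x=3, fanout=2):
    """Collect matches until at least `x` are found or no supernodes remain.

    The original supernode forwards the query itself (non-recursively),
    contacting `fanout` further supernodes per round.
    """
    matches = list(own_supernode.get(keyword, []))
    remaining = list(other_supernodes)
    while len(matches) < x and remaining:
        batch, remaining = remaining[:fanout], remaining[fanout:]
        for supernode in batch:
            matches.extend(supernode.get(keyword, []))
    return matches
```

Because a whole batch is queried per round, the result can contain more than x matches.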
FastTrack: Parallel Downloading and Recovery • If a file is found on multiple nodes, the user can select parallel downloading • Identical copies are identified by ContentHash • HTTP byte-range header is used to request different portions of the file from different nodes • Automatic recovery when a serving peer stops sending the file • ContentHash is used to search for a new copy of the file
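To illustrate the byte-range idea, the sketch below splits a file into contiguous portions and builds one HTTP `Range` header value per source. Splitting into equal-sized portions is an assumption (the slides do not say how the client divides the file), and `split_ranges` is a hypothetical name.

```python
def split_ranges(file_size, sources):
    """Assign each source a contiguous byte range of the file.

    Returns (source, Range header value) pairs. HTTP byte ranges are
    inclusive at both ends, hence the `- 1` on the end offset.
    """
    if not sources or file_size <= 0:
        return []
    chunk = -(-file_size // len(sources))  # ceiling division
    plan = []
    for i, source in enumerate(sources):
        start = i * chunk
        if start >= file_size:
            break  # more sources than bytes left to hand out
        end = min(start + chunk, file_size) - 1
        plan.append((source, f"bytes={start}-{end}"))
    return plan
```

If one source stops sending, its range can be handed to another node that holds a copy with the same ContentHash.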
eDonkey2000 (ED2K) • Semi-centralized network, includes index servers • Many clients: eDonkey2000, MLDonkey, eMule, Shareaza... • Index server: Lugdunum • Also used for legal content delivery • Files identified by hash (MD4) • Files can be searched for on the web, and the ed2k links found this way can be used to start a file download • ed2k://|file|gentoo.linux.install-x86-minimal-2004.1 [found via www.FileDonkey.com].iso|85764096|F1819D1C731923327E140F09DB7400B6|/
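The example link has the shape `ed2k://|file|<name>|<size>|<hash>|/`, so its fields can be pulled apart by splitting on `|`. The parser below is a minimal sketch for that shape only; `parse_ed2k_link` is a hypothetical helper, and any optional extra fields that some clients append are simply ignored.

```python
def parse_ed2k_link(link):
    """Split an ed2k file link into name, size in bytes, and hex hash."""
    parts = link.split("|")
    if len(parts) < 6 or parts[0] != "ed2k://" or parts[1] != "file":
        raise ValueError("not an ed2k file link")
    name, size, file_hash = parts[2], int(parts[3]), parts[4].upper()
    if len(file_hash) != 32:
        raise ValueError("expected a 128-bit hash as 32 hex digits")
    int(file_hash, 16)  # raises ValueError if the hash is not hexadecimal
    return {"name": name, "size": size, "hash": file_hash}


link = ("ed2k://|file|gentoo.linux.install-x86-minimal-2004.1 "
        "[found via www.FileDonkey.com].iso|85764096|"
        "F1819D1C731923327E140F09DB7400B6|/")
info = parse_ed2k_link(link)
```

The hash and size are enough to ask index servers for sources, which is why a download can start from a link without a prior search.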
ED2K • Communication: • client-connected index server: TCP • client-other index servers: UDP • index server-index server: UDP • client-client: TCP • File transfer using Multisource File Transmission Protocol (MFTP) • also HTTP and BitTorrent supported
ED2K: Registration • Peer 1 registers to Index Server 1 and tells the server its own shared files (XXXX.txt, XXXX.exe) • Peer 2 (registered to index server 1) shares YYYY.txt, YYYY.exe, XXXX.exe • Peer 3 (registered to index server 2) shares YYYY.txt, ZZZZ.exe • Peer 4 (registered to index server 3) shares ZZZZ.txt, ZZZZ.exe • Index Server 1 holds: XXXX.exe (Peer 2), YYYY.txt (Peer 2), YYYY.exe (Peer 2) • Index Server 2 holds: YYYY.txt (Peer 3), ZZZZ.exe (Peer 3) • Index Server 3 holds: ZZZZ.txt (Peer 4), ZZZZ.exe (Peer 4)
ED2K: Registration Reply • Index Server 1 replies to Peer 1 with a list of other index servers known by index server 1 • Peer 1 (registered to index server 1) shares XXXX.txt, XXXX.exe • Index Server 1 now holds: XXXX.txt (Peer 1), XXXX.exe (Peer 1 & Peer 2), YYYY.txt (Peer 2), YYYY.exe (Peer 2) • Peers 2, 3 and 4 and Index Servers 2 and 3 are unchanged
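The registration exchange on these two slides can be modelled with a small class. This is a sketch under simplifying assumptions: files are keyed by name as in the slide figures (real servers identify files by hash), and `IndexServer` with its `register` method is a hypothetical name.

```python
class IndexServer:
    """Toy index server: maps each file name to the peers sharing it."""

    def __init__(self, name, known_servers=()):
        self.name = name
        self.known_servers = list(known_servers)
        self.index = {}  # file name -> list of peers sharing that file

    def register(self, peer, shared_files):
        """Record the peer's shared files and reply with other known servers."""
        for file_name in shared_files:
            peers = self.index.setdefault(file_name, [])
            if peer not in peers:
                peers.append(peer)
        return list(self.known_servers)


server1 = IndexServer("Index Server 1", ["Index Server 2", "Index Server 3"])
server1.register("Peer 2", ["YYYY.txt", "YYYY.exe", "XXXX.exe"])
reply = server1.register("Peer 1", ["XXXX.txt", "XXXX.exe"])
```

After both calls the index lists XXXX.exe under Peer 2 and Peer 1, matching the state shown on the Registration Reply slide.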
ED2K: File Search • A peer sends Search Files to its connected index server (TCP) and to other index servers (UDP) • Search Files message includes: • keyword • optionally: min file size, max file size, availability, etc. • [Figure: Index Servers 1-3 and Peers 1-4]
ED2K: File Search Reply • Index servers answer with Search File Results (TCP from the connected server, UDP from the others) • Search File Results message includes one or more file info: • file hash • client IP and port (optional?) • file name • [Figure: Index Servers 1-3 and Peers 1-4]
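A minimal sketch of how a server might evaluate a Search Files request with the optional filters listed above. The catalog layout, the substring keyword match, the use of source count as "availability", and the name `search_files` are all assumptions; the optional client IP and port are left out of the result.

```python
def search_files(catalog, keyword, min_size=None, max_size=None,
                 min_availability=None):
    """Return file info (hash and name) for catalog entries matching a query.

    Each catalog entry is a dict with "hash", "name", "size" and "sources",
    where "sources" is a list of (ip, port) pairs of clients sharing the file.
    """
    wanted = keyword.lower()
    results = []
    for entry in catalog:
        if wanted not in entry["name"].lower():
            continue
        if min_size is not None and entry["size"] < min_size:
            continue
        if max_size is not None and entry["size"] > max_size:
            continue
        if (min_availability is not None
                and len(entry["sources"]) < min_availability):
            continue
        results.append({"hash": entry["hash"], "name": entry["name"]})
    return results
```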
ED2K: File Downloading 1/3 • A peer sends Get Sources to its connected index server (TCP) and to other index servers (UDP) • Done if the Search File Results message(s) do not include client IP and port pair(s), or if an ED2K link is used to start downloading • Get Sources message includes: • file hash • [Figure: Index Servers 1-3 and Peers 1-4]
ED2K: File Downloading 2/3 • Index servers answer with Found Sources (TCP from the connected server, UDP from the others) • Found Sources message includes: • file hash • address list: client IP and port • [Figure: Index Servers 1-3 and Peers 1-4]
ED2K: File Downloading 3/3 • File requests and downloading take place between the peers • [Figure: Index Servers 1-3 and Peers 1-4]
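The Get Sources / Found Sources exchange amounts to merging address lists from several servers. In this sketch each server is modelled as a dict from file hash to (IP, port) pairs; the model and the name `get_sources` are assumptions, and the TCP/UDP distinction is only noted in comments.

```python
def get_sources(file_hash, connected_server, other_servers):
    """Merge the address lists that the index servers hold for one file hash."""
    sources = []
    # The connected server would be asked over TCP, the others over UDP.
    for server in [connected_server, *other_servers]:
        for address in server.get(file_hash, []):
            if address not in sources:
                sources.append(address)  # drop duplicates, keep first-seen order
    return sources
```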
BitTorrent • Centralized network, includes tracker • .torrent files • Google search for .torrents • Legal material available
BitTorrent: Get .torrent • Downloader sends a GET for the .torrent file to the HTTP server, which returns the .torrent file • In .torrent file: • file size • file name • SHA1 hashes of the file's pieces • URL of tracker • [Figure: HTTP Server, Tracker, Seed 1, Seed 2, Leecher 1, Downloader]
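BitTorrent: Get Peer List • Downloader sends a GET announce request to the tracker • Tracker responds with the peer list • [Figure: HTTP Server, Tracker, Seed 1, Seed 2, Leecher 1, Downloader]
BitTorrent: Query File Pieces • Downloader requests pieces of the file from the peers • [Figure: HTTP Server, Tracker, Seed 1, Seed 2, Leecher 1, Downloader]
BitTorrent: File Pieces • Peers send pieces of the file • Info about download status is sent to the tracker • [Figure: HTTP Server, Tracker, Seed 1, Seed 2, Leecher 1, Leecher 2]
BitTorrent: Status Information • Info about the complete download is sent to the tracker • [Figure: HTTP Server, Tracker, Seed 1, Seed 2, Seed 3, Seed 4]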
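The GET announce request is an ordinary HTTP GET whose query string identifies the torrent and the client. The sketch below builds such a URL with the standard tracker parameters (`info_hash`, `peer_id`, `port`, `uploaded`, `downloaded`, `left`, `event`); the function name `announce_url`, the tracker address and the sample values are made up, and the 20-byte info hash is taken as given rather than computed from a .torrent file.

```python
from urllib.parse import urlencode


def announce_url(tracker_url, info_hash, peer_id, port, left,
                 uploaded=0, downloaded=0, event="started"):
    """Build the tracker announce URL for one torrent."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    query = urlencode({
        "info_hash": info_hash,    # raw bytes, percent-encoded by urlencode
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,              # bytes still to download
        "event": event,
    })
    return f"{tracker_url}?{query}"


url = announce_url("http://tracker.example/announce", b"\x12" * 20,
                   b"-XX0001-abcdefghijkl", 6881, left=100)
```

The tracker's reply carries the peer list that the downloader then contacts for pieces.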
Skype • Skype is a P2P VoIP client developed by the people who created KaZaA • Allows its users to place voice calls and send text messages to other users of Skype clients • Two types of nodes in the overlay network: ordinary hosts (OH) and super nodes (SN) • OH is a Skype application that can be used to place voice calls and send text messages • SN is an ordinary host’s end-point on the Skype network • Any node with a public IP address and sufficient CPU, memory, and network bandwidth is a candidate to become a SN
Skype • OH must connect to a SN and must register itself with the Skype login server for a successful login • 7 bootstrap super nodes • The host cache (HC) is a list of super node IP address and port pairs that OH builds and refreshes regularly • HC contains a maximum of 200 entries
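A bounded host cache like the one described can be sketched with an ordered dictionary. Only the 200-entry limit comes from the slide; evicting the least recently refreshed entry is an assumed policy, and `HostCache` is a hypothetical name (Skype's actual implementation is not public).

```python
from collections import OrderedDict


class HostCache:
    """List of super node (IP, port) pairs, capped at `max_entries`."""

    def __init__(self, max_entries=200):
        self.max_entries = max_entries
        self._entries = OrderedDict()  # (ip, port) -> None, oldest first

    def refresh(self, ip, port):
        """Add a super node, or move an existing one to the newest position."""
        key = (ip, port)
        self._entries.pop(key, None)
        self._entries[key] = None
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the oldest entry

    def candidates(self):
        """Super nodes to try at login, most recently refreshed first."""
        return list(reversed(self._entries))
```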
Skype Network • [Figure: Skype login server, super nodes and ordinary hosts, with the message exchange during login]
Skype • Uses its Global Index technology to search for a user • Firewall traversal: first UDP, second TCP, third TCP port 80 (HTTP), fourth TCP port 443 (HTTPS) • Call signaling is always carried over TCP
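The firewall traversal order can be read as a simple fallback loop. In this sketch the connection attempt is an injected callable so that nothing touches the network; `FALLBACK_ORDER`, `first_working_transport` and `try_connect` are hypothetical names, and `None` stands for "the client's normal port".

```python
# Order taken from the slide: UDP, then TCP, then TCP 80, then TCP 443.
FALLBACK_ORDER = [("udp", None), ("tcp", None), ("tcp", 80), ("tcp", 443)]


def first_working_transport(try_connect):
    """Return the first (protocol, port) for which try_connect succeeds.

    `try_connect(protocol, port)` is supplied by the caller and returns
    True when a connection could be established.
    """
    for protocol, port in FALLBACK_ORDER:
        if try_connect(protocol, port):
            return protocol, port
    return None  # every option was blocked
```

Ports 80 and 443 come last because they are the ones most firewalls leave open for web traffic.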
NAT and Firewall Traversal • If the caller is behind a port-restricted NAT, call signaling (TCP) is forwarded through a node that has a public IP address • If the caller, the callee, or both are behind a port-restricted NAT, voice traffic (UDP) is forwarded through the same node • If both caller and callee have a public IP address, call signaling (TCP) and voice traffic (UDP) flow directly between them • If both caller and callee are behind a port-restricted NAT and a UDP-restricted firewall, signaling traffic and voice traffic are forwarded through another node over TCP
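The four cases above can be written as one decision function. This is a sketch of the rules as stated on the slide, not of Skype's actual logic; `call_paths` and its flags are hypothetical names. One case is an assumption: the slide does not say how signaling is routed when only the callee is behind a NAT, so the sketch leaves it direct.

```python
def call_paths(caller_behind_nat, callee_behind_nat, udp_restricted=False):
    """Return how signaling and voice traffic are routed for a call."""
    if caller_behind_nat and callee_behind_nat and udp_restricted:
        # Both behind a port-restricted NAT and a UDP-restricted firewall.
        return {"signaling": "relay/tcp", "voice": "relay/tcp"}
    # Assumption: signaling stays direct when only the callee is behind a NAT.
    signaling = "relay/tcp" if caller_behind_nat else "direct/tcp"
    if caller_behind_nat or callee_behind_nat:
        voice = "relay/udp"
    else:
        voice = "direct/udp"
    return {"signaling": signaling, "voice": voice}
```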