P2P Overlay Network for TCP Programming with UDP Hole Punching
Takayuki Okamoto, Taisuke Boku, Mitsuhisa Sato, Osamu Tatebe
Graduate School of Systems and Information Engineering, University of Tsukuba
2nd NEGST workshop

Abstract
• A large number of idle PCs exist around the world
  • Most are behind NATs and firewalls
  • Special programming is required for them to communicate with each other
    • Relay servers, NAT traversal
• We are developing a P2P communication library that makes PCs behind NATs and firewalls easy to use
  • UDP hole punching
  • Original reliable communication library on UDP/IP
  • User-level management
Hereafter, we use the term “NAT” for both NAT boxes and firewalls.

Outline
• Motivation and objective
  • P2P computing
  • Proposal of a scalable communication framework based on NAT traversal
• Design and implementation of the communication library
• Evaluation of communication performance
  • Performance of UDP with our reliable communication library
• Work in France

Motivation & background
• NAT problem
  • Most computing nodes are behind firewalls or NAT (Network Address Translation) boxes
  • These nodes cannot communicate with each other directly
  • With relay transfer, the bandwidth of the relay nodes becomes a bottleneck
• NAT traversal techniques
  • After several negotiation steps, nodes can communicate directly through the intermediate NATs
  • Complicated negotiation code is required in each application program

Objective
• Goal: provide a communication framework for efficient and easily programmable HPC-P2P computing
  • Easy use of nodes behind NATs
  • High scalability
  • High throughput
  • High portability across a large variety of environments

Requirement specification
• Direct communication based on NAT traversal
• Name space independent of the physical one
• Fully distributed management system
• User-level implementation

Overlay networks
• Virtual networks constructed at the application layer
  • Generally defined as “a routing (relay) system among involved nodes”
  • Independent of the physical network
  • Relay nodes may become bottlenecks
  • Applications neglect the network topology
• Our system
  • Provides a name space and communication methods between any pair of nodes, without packet relay
  • Applications can be designed for effective communication on the physical network
  • Supports both applications and frameworks

Design concept of our system
• Two different types of communication
  • Management and control of our system
  • Data transfer for applications

Design of communication library
• Socket API compatible with TCP/IP
  • Easy porting of existing applications written for TCP/IP
  • Easy programming with great flexibility, not limited to the “master-slave” style
• The communication method is selected automatically
  • Pure (direct) TCP/IP is the best
  • UPnP is supported by a wide class of home-use NATs
  • UDP hole punching is available on most NATs
    ⇒ For TCP programming, reliable streaming communication must be provided in software
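
The automatic selection described on this slide can be pictured as a simple preference order. The sketch below is only an illustration: the `PathInfo` fields, the method names, and the ranking of UPnP ahead of UDP hole punching are assumptions taken from the order of the bullets, and the final relay fallback is borrowed from the management-system slide ("relays packets only when necessary"). It is not the library's actual interface.

```python
from dataclasses import dataclass


@dataclass
class PathInfo:
    """Hypothetical summary of what is known about the path to a peer."""
    peer_publicly_reachable: bool  # peer accepts inbound TCP directly
    upnp_available: bool           # the NAT can be configured through UPnP
    cone_nat: bool                 # NATs on both sides allow UDP hole punching


def select_method(path: PathInfo) -> str:
    """Pick a connection method, most preferred first (assumed order)."""
    if path.peer_publicly_reachable:
        return "direct-tcp"
    if path.upnp_available:
        return "upnp-tcp"
    if path.cone_nat:
        return "sou-udp-hole-punching"
    return "relay"  # last resort, assumed from the management-system slide


chosen = select_method(PathInfo(False, False, True))
print(chosen)
```

Because the application only sees a TCP-like socket, the choice made here would stay hidden behind the API.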

Reliable communication on UDP/IP
• RI2N/UDP
  • Developed by the JST-CREST “Mega-Scale Computing” project
  • Designed primarily for fault-tolerant communication on PC clusters with Ethernet
  • Based on UDP/IP, but provides TCP-like streaming communication, retransmission and a simple congestion control algorithm
• Ported to our communication layer for P2P computing ⇒ SoU (Stream on UDP) library
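
To make "TCP-like streaming with retransmission on top of datagrams" concrete, here is a minimal stop-and-wait sketch. It is not the RI2N/UDP or SoU algorithm: it has no congestion control, models only lost data packets (not lost acknowledgements), and replaces the network with a `drop` function. All names and the tiny segment size are hypothetical.

```python
from typing import Callable, List, Tuple

MSS = 4  # hypothetical payload size per datagram, tiny for illustration


def segment(data: bytes, mss: int = MSS) -> List[Tuple[int, bytes]]:
    """Split a byte stream into (sequence number, payload) datagrams."""
    return [(i // mss, data[i:i + mss]) for i in range(0, len(data), mss)]


class Receiver:
    """Delivers payloads in order and acknowledges the next expected number."""

    def __init__(self) -> None:
        self.expected = 0
        self.stream = bytearray()

    def on_datagram(self, seq: int, payload: bytes) -> int:
        if seq == self.expected:   # in-order datagram: deliver it
            self.stream += payload
            self.expected += 1
        return self.expected       # cumulative ACK; duplicates are ignored


def send_reliably(data: bytes, receiver: Receiver,
                  drop: Callable[[int], bool], max_tries: int = 10) -> int:
    """Stop-and-wait sender; drop(n) says whether transmission n is lost.

    Returns the total number of transmissions, retransmissions included.
    """
    sent = 0
    for seq, payload in segment(data):
        for _ in range(max_tries):
            lost = drop(sent)
            sent += 1
            if lost:
                continue           # no ACK arrives: timeout, then retransmit
            if receiver.on_datagram(seq, payload) == seq + 1:
                break              # acknowledged, move to the next datagram
        else:
            raise TimeoutError(f"datagram {seq} was never acknowledged")
    return sent


receiver = Receiver()
message = b"hello, overlay"
transmissions = send_reliably(message, receiver, drop=lambda n: n % 3 == 0)
print(transmissions, bytes(receiver.stream))
```

With every third transmission dropped, the 14-byte message needs 6 transmissions for its 4 datagrams, and the receiver still reassembles the original byte stream.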

Preliminary performance evaluation
• Performance evaluation of the SoU library
  • Throughput
  • Latency
• Environment
  • Two client nodes in two houses under different ISPs, connected over the Internet
  • The server node is at the University of Tsukuba
  • Home-use “broadband routers” are used
    • BBR-4HG: max 92 Mbps
    • BLR3-TX4: max 90 Mbps
• Four connection methods
  • TCP DMZ
  • SoU DMZ
  • TCP relay
  • SoU + UDP hole punching
[Figure labels: University, SINET (MEXT), ISP1 (So-net), ISP2 (BB.Excite)]

Connection methods (1) and (2)
• Method (1): TCP/IP with the DMZ function of the NAT
• Method (2): SoU with the “UDP” DMZ function of the NAT
• DMZ function: a port-forwarding function that transfers all inbound packets arriving at the NAT to one node behind it
[Figure: TCP DMZ and SoU DMZ; TCP/IP or UDP/IP forwarding is set manually]

Connection method (3)
• TCP/IP packet relay through the server
  • Each node opens a TCP/IP channel to the server
  • The server relays packets from one side to the other through these TCP/IP channels
  • Each packet must be transmitted twice to reach its destination
[Figure: TCP relay over TCP/IP]

Connection method (4): SoU + UDP hole punching
• SoU over UDP hole punching
  • All nodes share their IP addresses and ports via the server, over a TCP/IP management channel
  • The two client nodes establish a direct UDP/IP communication channel by UDP hole punching
  • Over this UDP channel, SoU provides reliable streaming communication between Node-A and Node-B
[Figure labels: Information = address + port; UDP hole punching; SoU connection; Data transfer]
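
The server's part in method (4) is small: it remembers the public address and port of each node and hands each node the other's endpoint. The sketch below shows only that bookkeeping. The class, method names, node names and addresses are hypothetical, and how the server learns a node's public UDP endpoint is not described in the slides, so it is simply passed in here.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (public IP address, public port)


class Rendezvous:
    """Hypothetical registry kept by the server for the management channel."""

    def __init__(self) -> None:
        self.endpoints: Dict[str, Endpoint] = {}

    def register(self, name: str, observed: Endpoint) -> None:
        """Record the public endpoint under which a node is reachable."""
        self.endpoints[name] = observed

    def introduce(self, a: str, b: str) -> Dict[str, Endpoint]:
        """Tell each node the endpoint of the other (Information = address + port)."""
        return {a: self.endpoints[b], b: self.endpoints[a]}


server = Rendezvous()
server.register("node-a", ("203.0.113.1", 1000))   # illustrative addresses
server.register("node-b", ("198.51.100.2", 2000))
targets = server.introduce("node-a", "node-b")
print(targets["node-a"])  # endpoint that node-a should send its UDP packets to
```

After this exchange the server is out of the data path: both nodes send UDP packets to the endpoint they were given, and SoU runs over the resulting channel.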

Throughput
• TCP DMZ vs. SoU + UDP hole punching
  • Simple vs. complex
  • The difference is only 15%
  • Direct P2P communication is realized without the NAT problem
• TCP DMZ vs. TCP relay
  • Direct vs. indirect
  • TCP relay is 45% higher
  • Communication path between the ISPs
    • Throughput depends on the bandwidth between the ISPs
    • The university has a strong connection to both ISPs
• TCP relay becomes a bottleneck in a scalable system
• SoU + UDP hole punching is the best way for P2P computing
[Figure: throughput of single-sided burst transfer]
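
Why a relay that wins for a single pair still becomes a bottleneck can be shown with a back-of-the-envelope model. The model and all numbers below are assumptions, not measurements from the slides: the relay's capacity is shared equally among concurrent pairs, while each direct pair keeps the full bandwidth of its own inter-ISP path.

```python
def relay_share_mbps(relay_capacity_mbps: float, pairs: int) -> float:
    """Throughput per pair when all pairs share one relay (equal split assumed)."""
    return relay_capacity_mbps / pairs


def max_pairs_where_relay_wins(relay_capacity_mbps: float,
                               direct_path_mbps: float) -> int:
    """Largest number of pairs for which the relay share still beats a direct path."""
    if direct_path_mbps <= 0:
        raise ValueError("direct_path_mbps must be positive")
    pairs = 0
    while relay_capacity_mbps / (pairs + 1) > direct_path_mbps:
        pairs += 1
    return pairs


# Illustrative values only: a 100 Mbps relay versus 20 Mbps direct paths.
limit = max_pairs_where_relay_wins(100.0, 20.0)
print(limit, relay_share_mbps(100.0, limit + 1))
```

In this toy setting the relay is faster for up to 4 concurrent pairs; from the fifth pair on, direct channels are at least as fast, and their aggregate throughput keeps growing with the number of pairs.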

Latency
• Three methods
  • Very small difference
  • Physical latency is large
  • The difference among protocols is relatively small
  • Same hop count ≈ same latency
• TCP relay
  • The largest
  • Double the hop count
• Latency depends on the number of hops in the WAN
• Throughput depends on absolute bandwidth
[Figure: average time for a 1-byte message transfer]

Work in France (1)
• Porting UDP hole punching to Private Virtual Cluster (tun version)
  • PVC provides IP-level virtualization
  • Reliability is not required
  • Throughput on a LAN reaches 90 Mbps on 100BASE-TX with MTU tuning

Work in France (2)
• Making arrangements for a performance evaluation between France and Japan
  • Nodes in Grid5000 can be used only among themselves
  • 2 nodes in France and 4 nodes in Japan are available

Future work
• Performance improvement of the SoU library
  • Implementing more sophisticated flow control algorithms
• Performance evaluation between France and Japan
  • Comparing SoU with TCP
  • Improving SoU throughput on high-latency paths

The Procedure of UDP hole punching
[Figure labels: Sharing the information of IP address and port; Server; to NAT-2:2000; to NAT-1:1000; Created by outbound packets]
This method is available with “Cone NATs”.
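
The procedure on this slide can be traced with a small simulation. The sketch assumes a port-restricted cone NAT: one public port per private endpoint regardless of destination, with inbound packets accepted only from endpoints the inside node has already sent to. The class, all addresses and the private ports are hypothetical; only the public ports 1000 and 2000 follow the figure labels.

```python
from typing import Dict, Optional, Set, Tuple

Endpoint = Tuple[str, int]


class RestrictedConeNAT:
    """Toy model of a port-restricted cone NAT (an assumption, see above)."""

    def __init__(self, public_ip: str, first_port: int) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.mapping: Dict[Endpoint, int] = {}       # private endpoint -> public port
        self.allowed: Dict[int, Set[Endpoint]] = {}  # public port -> contacted remotes

    def outbound(self, private: Endpoint, remote: Endpoint) -> Endpoint:
        """Translate an outbound packet; this creates the mapping and the hole."""
        if private not in self.mapping:
            self.mapping[private] = self.next_port
            self.next_port += 1
        port = self.mapping[private]
        self.allowed.setdefault(port, set()).add(remote)
        return (self.public_ip, port)

    def inbound(self, source: Endpoint, public_port: int) -> Optional[Endpoint]:
        """Return the private endpoint if the packet passes, otherwise None."""
        if source not in self.allowed.get(public_port, set()):
            return None
        for private, port in self.mapping.items():
            if port == public_port:
                return private
        return None


server = ("192.0.2.10", 9000)
nat1 = RestrictedConeNAT("203.0.113.1", 1000)
nat2 = RestrictedConeNAT("198.51.100.2", 2000)
node_a, node_b = ("192.168.0.2", 5000), ("192.168.1.2", 5000)

# Both nodes contact the server, which learns NAT-1:1000 and NAT-2:2000.
pub_a = nat1.outbound(node_a, server)
pub_b = nat2.outbound(node_b, server)

# A sends to NAT-2:2000: dropped by NAT-2, but it opens a hole in NAT-1.
first = nat2.inbound(nat1.outbound(node_a, pub_b), pub_b[1])
# B sends to NAT-1:1000: passes through the hole and opens one in NAT-2.
second = nat1.inbound(nat2.outbound(node_b, pub_a), pub_a[1])
# From now on A's packets reach B as well.
third = nat2.inbound(nat1.outbound(node_a, pub_b), pub_b[1])
print(first, second, third)
```

The trace shows why the method depends on cone behaviour: the public port that NAT-1 assigned when Node-A contacted the server is reused when Node-A sends to NAT-2, so the endpoint the server handed out stays valid.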

Motivation & background
• P2P (peer-to-peer) computing and its potential power
  • Utilizes the large potential computation power provided by a great number of PCs
  • Public Resource Computing: aggregating the computation power of idle PCs in homes and offices in a P2P manner
  • Volunteer computing (BOINC, etc.)
    • Supports only master-worker style applications

Conclusion
• We proposed a highly scalable communication framework for P2P computing with HPC applications
  • Easily programmable even through NATs
  • Scales to many nodes without a relay-server bottleneck
• Performance evaluation in a WAN environment
  • The SoU library provides acceptable performance
  • The cost of establishing a connection is relatively large, but negligible for long-running HPC applications
• Our system has acceptable performance and scalability for HPC-P2P

Related work
• Generic studies: JXTA, NAT BLASTER, STUNT, OCALA, Skype A2A API, …
• NAT traversal techniques
  • Wide-Area Communication for Grids: An Integrated Solution to Connectivity, Performance and Security Problems [Alexandre et al. HPDC’04]
    • Simultaneous TCP: another TCP connection establishment procedure in RFC 793
    • User-level implementation
    • Usable only under more restrictive conditions than UDP hole punching
• Overlay network without relays
  • Private Virtual Cluster: Infrastructure and Protocol for Instant Grids [Ala et al. Euro-Par’06]
    • High application portability with TUN/TAP
    • Installation requires root privileges

NAT traversal techniques
• Techniques that allow direct communication among nodes behind NATs
  • UDP hole punching
    • The most widely used method, and easy to implement at user level
    • Communication is limited to UDP/IP
  • UPnP (Universal Plug and Play)
    • Configures hardware devices temporarily through the network
    • UDP/IP and TCP/IP are available
    • Each NAT box must support the feature explicitly
• They are used mainly in multimedia applications
  • VoIP (Skype, Google Talk, etc.)
  • Constant throughput is required for long periods
  • Some packet loss is tolerated, without retransmission, over UDP/IP
• For a wider variety of applications, we need more concrete and easy-to-control communication methods

Cost to establish a connection
• Very preliminary results
• TCP DMZ, SoU DMZ and TCP relay
  • Same as the round-trip time
• SoU + UDP hole punching
  • Negotiation, UDP hole punching and SoU connection establishment are required
  • About 7 times the round-trip time
• For HPC, this is a small overhead
[Figure: the shortest time to establish a connection]

Cost to establish a connection
• SoU + UDP hole punching requires 7 transmissions over the WAN:
  • 1 for DNS resolution
  • 4 for sharing address information
  • 1 for UDP hole punching
  • 1 for SoU connection establishment
• Acceptable for HPC applications as a small overhead
[Figure: the shortest time to establish a connection]
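
The breakdown on this slide adds up as follows. The step counts come from the slide; the time per step (taken as one round trip, as on the companion slide) and the one-hour job length are illustrative assumptions, and the function names are hypothetical.

```python
# WAN steps needed to set up SoU + UDP hole punching (counts from the slide).
WAN_STEPS = {
    "DNS resolution": 1,
    "sharing of address information": 4,
    "UDP hole punching": 1,
    "SoU connection establishment": 1,
}


def setup_time(step_time_s: float) -> float:
    """Connection setup time if every step costs step_time_s seconds."""
    return sum(WAN_STEPS.values()) * step_time_s


def overhead_fraction(step_time_s: float, run_time_s: float) -> float:
    """Share of the total time spent on setup for a job of run_time_s seconds."""
    setup = setup_time(step_time_s)
    return setup / (setup + run_time_s)


# Assumed values: 20 ms per step, a one-hour HPC job.
print(sum(WAN_STEPS.values()), setup_time(0.020), overhead_fraction(0.020, 3600.0))
```

Under these assumptions the setup takes about 0.14 s, a vanishing share of a one-hour run, which matches the slide's argument that the cost is acceptable for long-running HPC applications.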

Design of management system
• Distributed “super-nodes” manage the system
  • Name space management based on a DHT (Distributed Hash Table)
  • Help the negotiation among NATs for UDP hole punching
  • Relay packets only when necessary
[Figure labels: Server nodes, Client nodes]
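
One common way to spread a name space over several super-nodes is consistent hashing, sketched below. The slides do not say which DHT the system uses, so this is a generic illustration; the class, the node names and the choice of SHA-1 are assumptions.

```python
import bisect
import hashlib
from typing import List, Tuple


def ring_position(key: str) -> int:
    """Map a name to a position on the hash ring (SHA-1 chosen for illustration)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class NameRing:
    """Assigns each virtual node name to the super-node that follows it on the ring."""

    def __init__(self, super_nodes: List[str]) -> None:
        self.ring: List[Tuple[int, str]] = sorted(
            (ring_position(name), name) for name in super_nodes)
        self.positions = [position for position, _ in self.ring]

    def responsible_for(self, virtual_name: str) -> str:
        index = bisect.bisect_left(self.positions, ring_position(virtual_name))
        return self.ring[index % len(self.ring)][1]  # wrap around past the end


super_nodes = ["super-node-a", "super-node-b", "super-node-c"]
ring = NameRing(super_nodes)
owner = ring.responsible_for("node-42")
print(owner)
```

Every super-node computes the same owner for a given name without central coordination, and removing a super-node only moves the names that it owned.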

Structure of Management System
[Figure labels: A server and many clients; Many super-nodes and many common nodes]

System design overview
[Figure: our system, annotated with the roles below]
• Monitoring for overlapping names
• Holding TCP connections with all client nodes
• Providing direct communication for data through NATs
• A DHT (Distributed Hash Table) is used for consistent and scalable management

System design overview
[Figure: our system, annotated with the roles below]
• Name resolution from a virtual name to a real IP address
• Node-pair rendezvous for NAT traversal
• Providing direct communication for data through NATs

Latency
[Figure: measured latencies of 11 ms, 10 ms and 15 ms]

Cost to establish a connection
• Very preliminary results
• TCP DMZ, SoU DMZ, TCP relay
  • Request and reply on TCP or SoU = round-trip time
• SoU + UDP hole punching
  • Negotiation, UDP hole punching and SoU connection establishment = round-trip time × 7

The Procedure of UDP hole punching
[Figure labels: Information transfer through a server; Server; Reachable using mapping information; Reachable to Node-B; to NAT-2:2000 (×); to NAT-2:2000; to NAT-1:1000; Automatically created]
This method is available with “Cone NATs”.

Reliable communication on UDP/IP
• RI2N/UDP
  • Developed by the JST-CREST “Mega-Scale Computing” project
  • Designed primarily for fault-tolerant communication on PC clusters with Ethernet
  • Based on UDP/IP, but provides TCP-like streaming communication, retransmission and a simple congestion control algorithm
• Ported to our communication layer for P2P computing ⇒ RUDP (Reliable UDP) library
• All RI2N channels share a single UDP port for selective acknowledgements, so that failure information is shared