Design and Implementation of HTTP-Gnutella Gateway Baoning Wu (baw4) Wei Zhang (wez5) CSE Department Lehigh University
Motivation • Peer-to-peer networking is a hot topic. • Can P2P nodes search and get files from Web sites? • Can one P2P network search and get files from other P2P networks? • In our project, we have built a special gateway between Gnutella and Web sites.
Related Work • David McNab has launched a Freenet search engine. • Asiayeah is a Gnutella search engine. • Filedonkey.com is an Edonkey search engine. • Kalepa Networks, Inc. is working on connecting different P2P systems. • Our work goes in the reverse direction of all of the work above.
Mechanism of Gnutella Searching • Node A sends a query to its neighbor B; • Node B broadcasts the query to its neighbors C and D; • Node C has the objects node A needs and returns a query hit message to node B; • Node B consults its local state and forwards the query hit message back to node A.
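The flooding and reverse-path routing described on this slide can be sketched as a small in-memory simulation. This is only an illustration: the `Node` class, its method names, the exact-match file lookup, and the default TTL of 7 are assumptions made for the example, not part of the gateway. The "local state" is modeled as a table mapping each message ID (MUID) to the neighbor the query arrived from.

```python
class Node:
    """Toy Gnutella node (hypothetical helper, not the authors' code)."""

    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.routes = {}  # MUID -> neighbor the query came from (None at the origin)
        self.hits = []    # query hits that reached this node as the origin

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

    def query(self, muid, keyword, ttl=7):
        # The originating node has no upstream neighbor for this MUID.
        self.receive_query(muid, keyword, ttl, sender=None)

    def receive_query(self, muid, keyword, ttl, sender):
        if muid in self.routes:
            return  # already seen this query: drop the duplicate
        self.routes[muid] = sender
        if sender is not None and keyword in self.files:
            # Answer along the reverse path, starting with the sender.
            sender.receive_hit(muid, keyword, owner=self.name)
        if ttl > 1:
            for peer in self.neighbors:
                if peer is not sender:
                    peer.receive_query(muid, keyword, ttl - 1, sender=self)

    def receive_hit(self, muid, keyword, owner):
        if muid not in self.routes:
            return  # no record of the matching query: drop the hit
        upstream = self.routes[muid]
        if upstream is None:
            self.hits.append((keyword, owner))  # we issued the query
        else:
            upstream.receive_hit(muid, keyword, owner)


a, b, c, d = Node("A"), Node("B"), Node("C", files=["foo.mp3"]), Node("D")
a.connect(b)
b.connect(c)
b.connect(d)
a.query("muid-1", "foo.mp3")
```

After the call, node A holds one hit naming node C, and node B only relayed messages in both directions.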
Mechanism of the gateway • Node A broadcasts a query message that reaches the HTTP-Gnutella gateway directly or indirectly; • The HTTP-Gnutella gateway translates the query message and forwards it to a search engine; • The search engine returns a set of query results to the gateway; • The gateway translates the results into Gnutella format and forwards them to node A; • If node A initiates a download request to the gateway, the gateway translates the Gnutella request into a well-formed HTTP request to the Web server; • The gateway fetches the data from the Web server; • The gateway forwards the data from the Web server to node A.
Handle Query Messages • We still use the original Gnutella mechanism to decide whether or not to forward a message. • The gateway captures all queries with a hop count below 5 and sends them to the search engine.
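A minimal sketch of the hop-count filter follows. It assumes the Gnutella 0.4 descriptor header layout (16-byte message ID, 1-byte payload type, 1-byte TTL, 1-byte hops, 4-byte little-endian payload length) and a Query payload made of a 2-byte minimum speed followed by a NUL-terminated search string. The function name `search_text_if_wanted` and the Latin-1 decoding are assumptions for illustration only.

```python
import struct

# Assumed Gnutella 0.4 descriptor header: 23 bytes, little-endian length field.
HEADER = struct.Struct("<16sBBBI")
QUERY = 0x80   # payload type of a Query descriptor
HOP_LIMIT = 5  # the slide's rule: only queries with hops < 5 are searched


def search_text_if_wanted(message):
    """Return the search string of a Query with hops < HOP_LIMIT, else None."""
    if len(message) < HEADER.size:
        return None
    _muid, ptype, _ttl, hops, length = HEADER.unpack_from(message)
    if ptype != QUERY or hops >= HOP_LIMIT:
        return None
    payload = message[HEADER.size:HEADER.size + length]
    if len(payload) < 3:
        return None  # need the 2-byte minimum speed plus at least a terminator
    # Skip the minimum-speed field and cut at the first NUL byte.
    return payload[2:].split(b"\x00", 1)[0].decode("latin-1")


def make_query(hops, text, ptype=QUERY):
    """Build a test message (helper for the example only)."""
    payload = struct.pack("<H", 0) + text.encode("latin-1") + b"\x00"
    return HEADER.pack(b"\x01" * 16, ptype, 7, hops, len(payload)) + payload
```

Whether the message is also forwarded to other Gnutella neighbors is a separate decision that, per the slide, still follows the normal Gnutella rules.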
Search Engine API • The Google search engine API allows at most 1,000 requests per day. • The search engine API consists of three main functions: • Query conversion • Extraction of URLs • Measurement of content size
Generate Query Hit Messages • Two considerations: • Let Gnutella nodes contact Web servers directly • Let the gateway work as a proxy • The gateway fills its own IP address and a specific port number (currently 9999) in the query hit messages. • File names are URLs of Web objects.
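The proxy approach can be illustrated by building a Query Hit payload in which the gateway advertises itself and uses the URL as the file name. This sketch assumes the Gnutella 0.4 Query Hit layout (hit count, port, IP address, speed, result set, 16-byte servent identifier); the function name, the speed value, and the sample file index are hypothetical.

```python
import socket
import struct


def build_query_hit_payload(gateway_ip, port, speed_kbps, results, servent_id):
    """Pack a Query Hit payload; results is a list of (index, url, size)."""
    if len(servent_id) != 16:
        raise ValueError("servent identifier must be 16 bytes")
    if not 0 < len(results) <= 255:
        raise ValueError("hit count must fit in one byte")
    out = bytearray()
    out += struct.pack("<BH", len(results), port)  # number of hits, port
    out += socket.inet_aton(gateway_ip)            # gateway's own IPv4 address
    out += struct.pack("<I", speed_kbps)
    for index, url, size in results:
        out += struct.pack("<II", index, size)
        # The "file name" is the URL of the Web object, double-NUL terminated.
        out += url.encode("ascii") + b"\x00\x00"
    out += servent_id
    return bytes(out)


url = "http://www.foo.com/foo.mp3"
payload = build_query_hit_payload(
    "123.123.123.123", 9999, 1000, [(1234, url, 4096)], b"\x00" * 16
)
```

Because the advertised address is the gateway's, a Gnutella node that wants the file connects to the gateway on port 9999 rather than to the Web server.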
Downloading Service • Translate a Gnutella download request into a well-formed HTTP request, e.g.
GET /get/1234/http://www.foo.com/foo.mp3 HTTP/1.1
User-Agent: Gnutella
Host: 123.123.123.123:6346
=>
GET http://www.foo.com/foo.mp3 HTTP/1.1
User-Agent: Gnutella
Host: www.foo.com
• The gateway must handle Gnutella handshakes properly. • It also records the number of bytes transferred.
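The translation in the example above can be sketched as a small function. It assumes the request arrives as text with CRLF line endings and that the URL contains no spaces; the name `translate_request` and the choice to copy every header except `Host` are assumptions for illustration.

```python
import re
from urllib.parse import urlsplit

# "GET /get/<file index>/<URL> HTTP/1.x" as shown in the slide's example.
REQUEST_LINE = re.compile(r"^GET /get/(\d+)/(\S+) (HTTP/1\.[01])$")


def translate_request(raw):
    """Turn a Gnutella download request into a proxy-style HTTP request."""
    lines = raw.split("\r\n")
    match = REQUEST_LINE.match(lines[0])
    if match is None:
        raise ValueError("not a Gnutella download request")
    _index, url, version = match.groups()
    host = urlsplit(url).netloc
    if not host:
        raise ValueError("file name is not an absolute URL")
    out = [f"GET {url} {version}"]
    for line in lines[1:]:
        if not line:
            break  # blank line ends the header section
        name = line.partition(":")[0]
        if name.strip().lower() == "host":
            continue  # replaced below with the Web server's host
        out.append(line)
    out.append(f"Host: {host}")
    return "\r\n".join(out) + "\r\n\r\n"


gnutella_request = (
    "GET /get/1234/http://www.foo.com/foo.mp3 HTTP/1.1\r\n"
    "User-Agent: Gnutella\r\n"
    "Host: 123.123.123.123:6346\r\n"
    "\r\n"
)
http_request = translate_request(gnutella_request)
```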
Problems & Solutions • Irregular handshakes • We handle all possibilities • File size • We use an HTTP HEAD request to get the file size • Broken pipe signal • We use a forked process
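The file-size lookup can be sketched with a HEAD request that reads the `Content-Length` header. This is one plausible way to do it with Python's standard `http.client`, not the gateway's actual code; the function name `remote_file_size` and the `connection_cls` parameter (added so the function can be exercised without a network) are assumptions.

```python
import http.client
from urllib.parse import urlsplit


def remote_file_size(url, timeout=10.0, connection_cls=http.client.HTTPConnection):
    """Return the Content-Length reported for url, or None if unavailable."""
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.hostname:
        raise ValueError("only absolute http:// URLs are supported here")
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    conn = connection_cls(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request("HEAD", path)  # headers only, no body is transferred
        response = conn.getresponse()
        length = response.getheader("Content-Length")
        if response.status != 200 or length is None:
            return None
        try:
            return int(length)
        except ValueError:
            return None  # malformed header value
    finally:
        conn.close()
```

A server may omit `Content-Length` (for example with dynamically generated pages), in which case the sketch returns `None` and the caller has to decide how to report the size.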
Experiment Results • Outline • Basic verification and validation • Log file format • Results #1 to #4
Basic Verification & Validation • Run our special gateway on machine 1 and a normal gtk-gnutella client on machine 2. After machine 2 connects to machine 1, we use machine 2 to send query messages and download requests to machine 1. • For each file downloaded through machine 1, we use wget to fetch the same file directly from the Web server and use diff to check that the two copies are identical.
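The same check that diff performs on the two copies can be sketched in Python with the standard `filecmp` module. The function name and the file paths in the usage note are placeholders, not names from the experiment.

```python
import filecmp


def same_content(path_via_gateway, path_direct):
    """Compare two files byte for byte, as diff does for this check."""
    # shallow=False forces a content comparison instead of trusting
    # file size and modification time alone.
    return filecmp.cmp(path_via_gateway, path_direct, shallow=False)
```

For example, `same_content("via_gateway/foo.mp3", "via_wget/foo.mp3")` would return `True` only when the gateway delivered the object unchanged.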
Log File Format • Log 1 • Time stamp, MUID, IP address, Type, Query • Log 2 • Time stamp, IP address, URL, Size, Code, Success
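As an illustration of the Log 2 fields, the record below is written and read back as one tab-separated line. The slide lists only the field names, so the separator, the field types, the `1`/`0` encoding of Success, and the names `DownloadRecord`, `format_record`, and `parse_record` are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class DownloadRecord:
    """One Log 2 entry: Time stamp, IP address, URL, Size, Code, Success."""
    timestamp: str
    ip: str
    url: str
    size: int      # bytes transferred
    code: int      # assumed to be the HTTP status code from the Web server
    success: bool


def format_record(rec):
    fields = [rec.timestamp, rec.ip, rec.url, str(rec.size), str(rec.code),
              "1" if rec.success else "0"]
    return "\t".join(fields)


def parse_record(line):
    timestamp, ip, url, size, code, success = line.rstrip("\n").split("\t")
    return DownloadRecord(timestamp, ip, url, int(size), int(code), success == "1")
```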
Result #1 • No. of Query messages: 319,245 • No. of Query Hit messages: 930,860 • No. of served requests: 113,391 • Average Response Time: 16.33 seconds
Result #3 • No. of download requests: 952 • No. of different IP addresses: 67 • No. of served requests: 945 • No. of successfully served requests: 740 • Total size transferred: 244,227,881 bytes • Average response time: 3.15 seconds • Average total download time: 15.92 seconds
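A few derived ratios help put these counts in context. The sketch below uses only the numbers reported on this slide. The mean transfer size assumes that the total byte count covers only the successfully served requests, which the slide does not state, and the function name `summarize` is hypothetical.

```python
def summarize(requests, served, successful, total_bytes):
    """Derive simple ratios from the Result #3 counters."""
    return {
        "served_ratio": served / requests,
        "success_ratio": successful / served,
        # Assumption: total_bytes counts successful transfers only.
        "mean_bytes_per_success": total_bytes / successful,
    }


stats = summarize(requests=952, served=945, successful=740,
                  total_bytes=244_227_881)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```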
Future Work • Support a variety of file types and measure their popularity • Build a gateway to connect different P2P systems • Deployment of such gateways
Conclusion • An HTTP-Gnutella gateway was built and worked for Gnutella users. • In only 5 days, the gateway transferred about 244 MB of data from Web sites to Gnutella nodes. • The system achieved all of our design goals.