ISP-aided Biased Query Search in P2P Systems • Vinay Aggarwal and Anja Feldmann • Vinay.Aggarwal@telekom.de • Deutsche Telekom Laboratories / TU Berlin, Berlin, Germany
Introduction • P2P traffic >50% of Internet traffic • BitTorrent, eDonkey, Skype, GoogleTalk… • P2P systems form overlays at application layer • neighbour selection arbitrary • routing independent of Internet AS routing • Routing layer functionality duplicated at application layer • P2P users want performance • Measure topology themselves (use RTT): overhead • Build topology agnostic of underlay: performance loss • ISPs in a dilemma • P2P spurs broadband demand, yet ISPs still lose money • Traffic Engineering difficult with P2P traffic • Lack of coordination: tension!
ISP-P2P tension • Random/RTT-based peer selection: peerings cross ISP boundaries multiple times, often unnecessarily
Solution: ISP-P2P Cooperation • Concept: ISP knows its network • Node: last-hop bandwidth, geographical location, service class • Routing policy, OSPF/BGP metrics, AS distance to other ISPs • Each ISP offers Oracle service • P2P nodes query it during neighbour selection or file exchange, send list of potential neighbours • Oracle ranks these by proximity • Inside network, last-hop bandwidth, geographical location (city/PoP), AS hops • ISP-aided optimal P2P neighbour selection • Simple and general solution, open for all overlays • Run as Web server or UDP service at known location
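The ranking step on this slide can be sketched as follows. This is a minimal illustration, not the authors' oracle: the `Candidate` fields, the `rank_candidates` name, the example addresses, and the exact order of the criteria (inside the ISP first, then fewer AS hops, then higher last-hop bandwidth) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record an ISP might hold for a potential neighbour."""
    ip: str
    same_isp: bool         # True if the peer is inside the oracle's own network
    as_hops: int           # AS-hop distance from the querying node's ISP
    bandwidth_mbps: float  # last-hop bandwidth


def rank_candidates(candidates):
    """Return candidates ordered from most to least preferred.

    Assumed order of criteria: peers inside the ISP first, then fewer
    AS hops, then higher last-hop bandwidth.
    """
    return sorted(
        candidates,
        key=lambda c: (not c.same_isp, c.as_hops, -c.bandwidth_mbps),
    )


# Made-up candidate list, as a P2P node might send it to the oracle.
peers = [
    Candidate("10.0.3.7", False, 2, 100.0),
    Candidate("10.0.1.5", True, 0, 16.0),
    Candidate("10.0.2.9", False, 1, 50.0),
    Candidate("10.0.1.8", True, 0, 50.0),
]
ranked = rank_candidates(peers)
print([c.ip for c in ranked])
```

A tuple sort key keeps the policy in one place, so an ISP could reorder or drop criteria without touching the rest of the service.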
Advantage for ISP/P2P • Measurement overhead eliminated • Utilize knowledge of ISP • Avoid high-latency paths and bottlenecks at inter-ISP transit/peering links • ISPs regain control of network traffic • Traffic across ISP boundaries reduced: immense cost savings • Better QoS to other applications, improved service to customers
Impact on network structure • Node degree and mean overlay path length unchanged • Graph remains connected, overlay & underlay diameter constant • Large improvement in AS distance and intra-AS peerings • Impact on flow conductance minimal • Densely connected subgraphs local to ISPs • P2P topology correlated with AS topology
Overlay-Underlay Topology Correlation • Figure: random vs biased Gnutella topology
Why Testlab? • Real traffic instead of simulated flows • Configure network devices (routers, switches, machines) • Generate variegated network scenarios and traffic environments • Wide range of experiments using real applications, network stacks, OS • Better control & visibility vs Internet • No adverse effect on Internet traffic
Experimental Topologies • Internet consists of Autonomous Systems (AS) • Prefix-based packet forwarding, based on AS policies • P2P systems set up an overlay topology • Implement own routing on query/key basis • Design multiple-AS topology, each AS hosts multiple P2P users • Router is an abstraction of AS boundary • 5 routers: 5-AS topology • Each router connects 3 machines, each machine runs 3 P2P applications concurrently • 5 ASes, 15 machines, 45 P2P users
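The sizing arithmetic and the notion of AS-hop distance can be checked with a small sketch. The slide gives only the counts (5 routers, 3 machines each, 3 applications each); the concrete link layouts below for a ring and a star, the choice of router 0 as the star's hub, and all function names are assumptions for illustration.

```python
from collections import deque

NUM_AS = 5             # one router per AS
MACHINES_PER_AS = 3
APPS_PER_MACHINE = 3


def ring_links(n):
    """Router i is linked to router (i + 1) mod n."""
    return [(i, (i + 1) % n) for i in range(n)]


def star_links(n):
    """Router 0 is the hub (an assumption); all others attach to it."""
    return [(0, i) for i in range(1, n)]


def as_hops(links, n, src):
    """Breadth-first search: AS-hop distance from src to every router."""
    adj = {i: set() for i in range(n)}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


machines = NUM_AS * MACHINES_PER_AS    # 15
users = machines * APPS_PER_MACHINE    # 45
ring = as_hops(ring_links(NUM_AS), NUM_AS, src=0)
star = as_hops(star_links(NUM_AS), NUM_AS, src=1)
print(machines, users, ring, star)
```

In the ring no AS is more than 2 hops from any other; in the star every pair of non-hub ASes is exactly 2 hops apart.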
AS Topologies • Figure panels: Realistic Topology, Ring Topology, Star Topology, Tree Topology
Configuration of a topology …using VLAN, VTP, ifconfig, route
P2P System: Gnutella • Unstructured, open-source, popular file-sharing P2P system • Each servent bootstraps by flooding Pings to known nodes, answered by Pongs • Search content by flooding Query, answered by QueryHit (QH) • Msgs carry TTL (max 7) and msg ID • Servent selects a node randomly from all QHs to download desired content from • File exchange using HTTP, outside Gnutella • Ultrapeers (UP) and leafs form 2-level hierarchy
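Query flooding with a TTL and duplicate suppression by message ID can be sketched as a toy simulation. This is not GTK-Gnutella code: the `Servent` class, `flood_query`, and the four-node overlay are hypothetical. As a simplification, hits are collected in a list instead of being routed back as QueryHit messages along the reverse path.

```python
from collections import deque
import itertools

MAX_TTL = 7                       # maximum TTL stated on the slide
_msg_ids = itertools.count(1)     # stand-in for unique message IDs


class Servent:
    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbours = []
        self.seen = set()         # message IDs already handled

    def connect(self, other):
        self.neighbours.append(other)
        other.neighbours.append(self)


def flood_query(origin, keyword, ttl=MAX_TTL):
    """Flood a Query from origin; return (names with a hit, messages sent)."""
    msg_id = next(_msg_ids)
    origin.seen.add(msg_id)
    queue = deque((n, origin, ttl) for n in origin.neighbours)
    hits, sent = [], len(origin.neighbours)
    while queue:
        node, sender, ttl_left = queue.popleft()
        if msg_id in node.seen:
            continue              # duplicate message ID: drop
        node.seen.add(msg_id)
        if keyword in node.files:
            hits.append(node.name)
        if ttl_left > 1:          # forward only while TTL remains
            for n in node.neighbours:
                if n is not sender:
                    queue.append((n, node, ttl_left - 1))
                    sent += 1
    return hits, sent


# Toy overlay: a-b, b-c, c-d, plus a shortcut a-c; only d holds the file.
a, b, c = Servent("a"), Servent("b"), Servent("c")
d = Servent("d", files={"song.mp3"})
a.connect(b); b.connect(c); c.connect(d); a.connect(c)

hits_full, sent_full = flood_query(a, "song.mp3")
hits_ttl1, sent_ttl1 = flood_query(a, "song.mp3", ttl=1)
print(hits_full, sent_full, hits_ttl1, sent_ttl1)
```

The message count grows with every extra overlay link, which is why the choice of neighbours affects how much Query traffic a search generates.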
Experimental Setup • Each machine has 1 UP & 2 leafs, all run GTK-Gnutella • Central machine runs oracle • Servents send list of IPs to oracle, which sorts them according to parent AS and AS-hop distance, returns list to servent • File-sharing schemes • Uniform: 6 unique files on all servents • Variable: UP shares 12 files, one leaf 6, the other leaf 0 • Compare number of responses to Query • Each servent introduces a unique Query string • Realistic query string distribution (mp3, album/artist) • Run unmodified and biased P2P experiments
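The oracle's sorting of an IP list by parent AS and AS-hop distance might look like the sketch below. The slides do not give the testlab's addressing or the oracle's data structures, so the prefix table, the ring-shaped hop metric, and the function names are assumptions.

```python
import ipaddress

# Hypothetical prefix-to-AS table for a 5-AS testlab.
AS_PREFIXES = {
    asn: ipaddress.ip_network(f"10.{asn}.0.0/16") for asn in range(1, 6)
}


def parent_as(ip):
    """Map an IP address to the AS whose prefix contains it."""
    addr = ipaddress.ip_address(ip)
    for asn, prefix in AS_PREFIXES.items():
        if addr in prefix:
            return asn
    raise ValueError(f"no known AS for {ip}")


def as_hop_distance(a, b, num_as=5):
    """AS hops between ASes a and b, assuming they form a ring."""
    d = abs(a - b)
    return min(d, num_as - d)


def oracle_sort(requester_ip, candidate_ips):
    """Sort candidates by AS-hop distance from the requester.

    Peers in the requester's own AS (distance 0) come first. Python's
    sort is stable, so ties keep the order in which they were submitted.
    """
    home = parent_as(requester_ip)
    return sorted(
        candidate_ips,
        key=lambda ip: as_hop_distance(home, parent_as(ip)),
    )


result = oracle_sort(
    "10.1.0.11",
    ["10.3.0.12", "10.1.0.13", "10.5.0.11", "10.2.0.12"],
)
print(result)
```

Returning the full sorted list, rather than a single best peer, leaves the final choice with the servent.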
Number of Query Messages => 50% reduction with ISP-aided biased P2P neighbour selection
Query responses (Uniform FS) => Query responses with biased P2P (dotted) similar to unbiased P2P (bold)
Query Responses (Variable FS) => Effects similar across file distribution patterns and topologies
Query responses (rare queries) => Effects hold even for different query types
Large-scale simulations • SSFNet: discrete-event, packet-level simulator • 700-node P2P network, 16 ASes, with churn and free-riding • behaviour as observed in the real world • Query traffic reduced by 54% • Swarming pattern of Queries benefits • Reachability at remote locations improves • Network discovery traffic reduced by 42% • Number of responses per Query similar • Number of unsuccessful Queries same
Responses per Query • Distribution of queries similar • Mean: 127 vs 102, Median: 78 vs 62
Conclusion & Future Work • Unique and simple ISP-P2P collaboration concept, so that both benefit • Scalability of P2P networks improves • Negotiation and query traffic reduced by 50% • No adverse effects on query search process • Stable across popular, rare, unsuccessful Queries • Reachability of Queries at remote locations improves • Advantages hold across topologies and scale • PlanetLab experiments coming…