Peer-to-peer Networks : promise and trouble. Bart Dhoedt Ghent University - Faculty of Applied Sciences Department of Information Technology (INTEC). e-mail : bart.dhoedt@intec.ugent.be phone : ++32 9 264 99 66. Presentation at NORDUnet Network Conference August 24-27, Reykjavik, 2003
Peer-to-peer Networks : promise and trouble.
Bart Dhoedt, Ghent University - Faculty of Applied Sciences, Department of Information Technology (INTEC)
e-mail : bart.dhoedt@intec.ugent.be, phone : ++32 9 264 99 66
Presentation at NORDUnet Network Conference, August 24-27, Reykjavik, 2003. Tuesday, August 27, 2003.
OUTLINE 1. Introduction 2. Taxonomy of P2P-systems 3. Issues in P2P-systems 4. P2P-trends 5. Concluding remarks
Defining P2P
P2P is
• about sharing
• symmetric (architectural view)
• creating an application-level overlay network
• decentralized
• application-critical infrastructure owned by many
Resources shared (figure): content, computer cycles, disk space, liability, bandwidth, grouped into software resources and hardware resources
1. Introduction

Sharing resources ?
• estimate of edge resources available for a P2P network
Inputs: total number of Internet hosts : 150 M; average disk capacity : 10 GB; average available memory : 128 MB; average processing power : 1 GFLOPS; average BW : 100 Kb/s
Fraction available: 1% of hosts; 50% processing power; 50% memory; 10% disk space; 25% network bandwidth
Result: 1.5 M processors; disk storage : 1.5 PB; processing power : 1.5 PFLOPS; BW/link : 25 Kb/s
1. Introduction

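The estimate on this slide can be reproduced with a few lines of arithmetic. The sketch below uses only the slide's inputs; the variable and function names are illustrative, and the unit conventions (decimal for disk, binary for memory) are assumptions chosen because they reproduce the 1.5 PB figure here and the 92 TB memory figure quoted on the next slide. Note that the 1.5 PFLOPS figure corresponds to counting the full 1 GFLOPS per participating host; applying the 50% processing share as well would give 0.75 PFLOPS.

```python
# Back-of-the-envelope estimate of edge resources, using the slide's inputs.
HOSTS = 150e6          # total number of Internet hosts
PARTICIPATION = 0.01   # 1% of hosts take part

AVG_DISK_GB = 10
AVG_MEMORY_MB = 128
AVG_GFLOPS = 1
AVG_BW_KBPS = 100

# Fraction of each resource a participating host makes available.
SHARE = {"cpu": 0.50, "memory": 0.50, "disk": 0.10, "bandwidth": 0.25}


def estimate():
    peers = HOSTS * PARTICIPATION
    return {
        "peers": peers,
        # GB -> PB, decimal units (assumption)
        "disk_pb": peers * AVG_DISK_GB * SHARE["disk"] / 1e6,
        # MB -> TB, binary units (assumption; reproduces the 92 TB figure)
        "memory_tb": peers * AVG_MEMORY_MB * SHARE["memory"] / 2**20,
        # processing power with and without the 50% share applied
        "pflops_with_share": peers * AVG_GFLOPS * SHARE["cpu"] / 1e6,
        "pflops_full": peers * AVG_GFLOPS / 1e6,
        "bw_kbps_per_link": AVG_BW_KBPS * SHARE["bandwidth"],
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:.2f}")
```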
Sharing resources ?
• What about supercomputers ?
IBM ASCI White: 12.3 TFLOPS; 8192 processors (512 RS/6000 processing nodes); 6.2 TB memory storage; 160 TB disk storage; 110 M$; 106 tons
P2P-supercomputer (> x 10 !): 1.5 PFLOPS; 1.5 M processors; 92 TB memory storage; 1.5 PB disk storage; ? M$; ? tons
1. Introduction

P2P @ edge ? • How to unleash the power of the “Internet’s dark matter ?” 1. Introduction
P2P popularity
2003 summer download hit parade [www.download.com]
Rank  Application                      Last week    Total
1.    Kazaa Media Desktop              2 644 777    261405295
2.    ICQ Lite                         588 141      25423064
3.    AOL Instant Messenger (AIM)      532 897      17521190
4.    iMesh                            392 703      55145269
5.    WinZip                           351 865      100741790
6.    ICQ Pro 2003a beta               332 624      233204712
7.    Spybot – Search & Destroy        232 993      2764380
8.    Ad-aware                         224 720      19078555
9.    Morpheus                         179 347      114140262
10.   DownloadAccelerator Plus         119 601      36355895
The slide tags six of these entries as P2P.
1. Introduction

P2P popularity Napster : the early days … Gnutella network : up to 400 000 nodes operating world wide 1. Introduction
Architectural view
Mediated P2P: Napster, Audiogalaxy
Pure P2P: Early Gnutella, FreeNet
Hybrid P2P: Gnutella, FastTrack, Kazaa
2. Taxonomy

P2P-architectures
data traffic: mediated: P2P | pure: P2P | hybrid: P2P
control traffic: mediated: client-server | pure: P2P | hybrid: local : client-server, long distance : P2P
efficiency: mediated: + efficient search, + efficient control | pure: - inefficient search, - BW consuming | hybrid: +/-
scalability: mediated: - control hot spot (mirrors needed ?) | pure: - BW needed grows rapidly | hybrid: good compromise
robustness: mediated: - single point of failure, - easy to attack | pure: + graceful degradation, + difficult to attack | hybrid: ?
accountability: mediated: easy | pure: difficult | hybrid: difficult
2. Taxonomy

P2P taxonomy
Application areas: content sharing, distributed computing, instant messaging, collaborative working
Architectures: mediated, pure, hybrid
2. Taxonomy

File Sharing performance 1.6 M downloads/day 150 M searches/day 10 TB data transfer/day 1-2 TB data transfer/day 100 servers 15000 servers 2. Taxonomy
Distributed computing performance
• SETI = “Search for Extraterrestrial Intelligence”
• started in 1998 as a 2 year project (but still running)
• 4 M users signed up so far
• Radio telescope data sent to clients for digital signal analysis
• Nodes process data when cycles are available (works as screen saver)
• Using resources to allow better signal analysis
Data flow (figure): 35 GB/tape, 16 hours recorded data; 10 tapes/week, 350 GB; 10 000 work units of 0.3 MB
2. Taxonomy

Distributed computing performance
SETI@home versus ASCI White@DoE
Processing: SETI@home 25 TFLOPS | ASCI White 12.3 TFLOPS
Cost: SETI@home 1 M USD | ASCI White 110 M USD
SETI@home throughput: 3.1 x 10^12 FP-operations per work unit; 700 000 work units/day; 22 x 10^17 FLOP/day, i.e. > 25 TFLOPS
2. Taxonomy

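The 25 TFLOPS figure follows directly from the work-unit numbers on this slide. A minimal check, with illustrative names and the only added assumption being a day of 86 400 seconds:

```python
# Sustained SETI@home throughput derived from the slide's work-unit figures.
OPS_PER_WORK_UNIT = 3.1e12      # FP-operations per work unit
WORK_UNITS_PER_DAY = 700_000
SECONDS_PER_DAY = 86_400


def seti_throughput():
    """Return (FLOP per day, sustained TFLOPS)."""
    flop_per_day = OPS_PER_WORK_UNIT * WORK_UNITS_PER_DAY
    tflops = flop_per_day / SECONDS_PER_DAY / 1e12
    return flop_per_day, tflops


def cost_per_tflops(cost_musd, tflops):
    """Cost in M USD per sustained TFLOPS."""
    return cost_musd / tflops


if __name__ == "__main__":
    per_day, tflops = seti_throughput()
    print(f"{per_day:.2e} FLOP/day, {tflops:.1f} TFLOPS")
    print(f"SETI@home:  {cost_per_tflops(1, 25):.2f} M USD per TFLOPS")
    print(f"ASCI White: {cost_per_tflops(110, 12.3):.2f} M USD per TFLOPS")
```

The product is about 2.17 x 10^18 FLOP/day, which the slide rounds to 22 x 10^17, and dividing by the seconds in a day gives just over 25 TFLOPS.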
Scaling problems
Mechanisms in GNUTELLA to limit traffic
• Network horizon set by TTL
• Descriptor IDs avoid cyclic routing
• PONG/QueryHIT/Push NOT flooded
BUT ... “1 Gnutella request would cause 90MB data traffic on Napster scale network”
3. Issues

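The first two mechanisms can be illustrated with a small flooding simulation. This is a sketch, not Gnutella's wire protocol: the overlay is a plain adjacency dictionary, `flood_query` is a hypothetical helper, and a set of visited nodes stands in for the per-node table of descriptor IDs already seen. Each forwarded copy counts as one message, including copies that the receiver drops as duplicates.

```python
from collections import deque


def flood_query(graph, origin, ttl):
    """Flood one query through the overlay.

    graph  : dict mapping node -> list of neighbouring nodes
    origin : node issuing the query
    ttl    : maximum number of overlay hops (the network horizon)
    Returns (number of nodes reached, number of messages sent).
    """
    seen = {origin}                      # nodes that already saw this descriptor ID
    queue = deque([(origin, None, ttl)])
    messages = 0
    while queue:
        node, sender, remaining = queue.popleft()
        if remaining == 0:
            continue                     # TTL expired: do not forward further
        for neighbour in graph[node]:
            if neighbour == sender:
                continue                 # never send back on the incoming link
            messages += 1                # one copy crosses one overlay link
            if neighbour in seen:
                continue                 # duplicate descriptor ID: dropped
            seen.add(neighbour)
            queue.append((neighbour, node, remaining - 1))
    return len(seen) - 1, messages


if __name__ == "__main__":
    # Six nodes in a ring: the cycle would loop forever without duplicate detection.
    ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
    for ttl in (1, 2, 3, 10):
        print(ttl, flood_query(ring, 0, ttl))
```

On a real overlay with many neighbours per node the message count grows roughly geometrically with the TTL, which is the effect behind the quoted 90 MB figure.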
Scaling answers
1. Reduce network horizon to reduce f
2. Use of reflectors = node with high BW available; mimics a peer sharing all files of its “clients”
3. Use of UltraPeers = same principle as reflector, but chosen dynamically
Figure: nodes with high BW access handle all PING/PONG and QUERY/QUERYHIT traffic; nodes with low access BW handle ONLY download traffic
3. Issues

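The reflector and UltraPeer idea can be sketched as a node that indexes the files of its low-bandwidth clients and answers queries on their behalf, so the clients only see download requests. The class and method names below are hypothetical, and keyword matching on whitespace-separated words is a simplifying assumption rather than how any particular client matches queries.

```python
class UltraPeer:
    """Hypothetical high-bandwidth node answering queries for its leaf nodes."""

    def __init__(self):
        self.index = {}  # keyword -> set of (leaf_id, filename)

    def register(self, leaf_id, filenames):
        """A leaf uploads its file list when it connects."""
        for name in filenames:
            for word in name.lower().split():
                self.index.setdefault(word, set()).add((leaf_id, name))

    def unregister(self, leaf_id):
        """Drop all entries of a leaf that disconnected."""
        for word in list(self.index):
            self.index[word] = {e for e in self.index[word] if e[0] != leaf_id}
            if not self.index[word]:
                del self.index[word]

    def query(self, keyword):
        """Answer a query without forwarding it to any leaf."""
        return sorted(self.index.get(keyword.lower(), set()))


if __name__ == "__main__":
    up = UltraPeer()
    up.register("leafA", ["blue moon.mp3", "holiday video.avi"])
    up.register("leafB", ["blue sky.jpg"])
    print(up.query("blue"))   # hits name the leaf to download from
    up.unregister("leafA")
    print(up.query("blue"))
```

The leaf is contacted only when another peer actually downloads one of the listed files, which matches the split between query traffic and download traffic shown on the slide.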
Robustness
• self-organization leads to power-law networks
• (1% of servents shows server-like behaviour …)
• very robust to random node failure
• more vulnerable to targeted attacks
Figure: simulation result for FreeNet peers [T. Hong, “Performance”, Chapter 14 in “Peer-to-peer : Harnessing the Benefits of a Disruptive Technology”, ISBN 0-596-00110-X, O’Reilly, March 2001.]
3. Issues

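The contrast between random failure and targeted attack can be reproduced on a synthetic network. The sketch below is not the FreeNet simulation cited on the slide: it grows a graph by preferential attachment (an assumed stand-in for a self-organized power-law overlay), removes a fraction of nodes either at random or by highest degree, and compares the size of the largest connected component that remains. All names and parameter values are illustrative.

```python
import random


def preferential_attachment_graph(n, m, rng):
    """Grow a graph in which new nodes prefer already well-connected nodes."""
    adj = {i: set() for i in range(n)}
    endpoints = []  # every node appears once per incident edge
    for i in range(m + 1):              # small fully connected starting core
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:         # degree-proportional choice of m targets
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj


def largest_component(adj, removed):
    """Size of the largest connected component after deleting `removed`."""
    alive = set(adj) - removed
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb in alive and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best


def compare_failures(n=1000, m=2, fraction=0.1, seed=1):
    """Return (largest component after random failure, after targeted attack)."""
    rng = random.Random(seed)
    adj = preferential_attachment_graph(n, m, rng)
    k = int(n * fraction)
    random_removed = set(rng.sample(sorted(adj), k))
    by_degree = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    targeted_removed = set(by_degree[:k])
    return largest_component(adj, random_removed), largest_component(adj, targeted_removed)


if __name__ == "__main__":
    after_random, after_targeted = compare_failures()
    print("largest component after random failure :", after_random)
    print("largest component after targeted attack:", after_targeted)
```

Removing the best-connected nodes fragments the network much more than removing the same number of random nodes, because a small set of hubs carries most of the connectivity.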
Free-riding on Gnutella Network size since Jan 2002 • only 30 % of nodes offering content • 50% of queries satisfied by 1% of servents [www.limewire.com] 3. Issues
Overlay mismatch
Mismatch between application layer network and physical network
Based on network traffic analysis
• 40% of Gnutella clients belong to the top 10% AS
• only 2-5% of links stay within an AS
Based on domain names
• Gnutella’s clustering logic shows little or no correlation with domain name based clustering
[M. Ripeanu, A. Iamnitchi, I. Foster, “Mapping the Gnutella Network”, IEEE Internet Computing, January-February 2002.]
3. Issues

Business Models ?
How to monetise P2P ?
• authors agree on “P2P business models are unclear”
• reality : few companies make money on P2P
• current situation : file sharing applications sponsored by advertisement (banners)
• some other possibilities
  • micropayment mechanisms
  • indirect mechanisms (P2P will increase BW-need and hence …)
  • tip based strategy (cf. US-model …)
  • make “low”-quality content available to get people interested in specific content
  • make use of end users’ devices to reduce cost !
3. Issues

Problems/issues/barriers/challenges
Problems
• node/link transient nature
• robustness
• scalability
• bandwidth consumption
• Network discontinuities (firewalls, (dynamic) NAT)
Solutions
• File-sharing : content redundancy
• Cycle-sharing : checkpointing ?
• Hybrid approach
• Avoid floodings (e.g. FreeNet : intelligent routing)
• Content/Query caching
• TTL
• Avoid routing cycles
• (Ab)use of port 80
• Rendez-vous servers
3. Issues

Problems/issues/barriers/challenges
Problems
• Privacy/trust
• Anonymity
• application redesign
• free-riding
• accountability
• asymmetric bandwidth in access (ADSL, HFC)
• inefficient overlay
• business models ?
Solutions
• Encryption techniques (e.g. FreeNet : plausible deniability for node operators)
• ? P2P-frameworks
• micro-payment
• combine uplink capacity (e-donkey)
• Network/infrastructure aware routing
• ???
3. Issues

P2P-trends • emergence of platforms • convergence between Grid-computing and P2P-technology • enhance P2P-performance • semantic searches (Tapestry, Content Addressable Networks …) • Query/result caching 4. Trends
Platform emergence
Levels: Dedicated Application Programs and Protocols, Frameworks, Platforms
• for 1 application area
• non-generic
• 1 application class
• 1 specific problem
• network interoperability ?
• offer generic services
• support the P2P paradigm
• used to build P2P applications
Examples placed in the figure: Freenet, Gnutella, SETI@home, Groove, eDonkey
Application areas: File sharing, Distributed computing, Instant Messaging, Collaboration
4. Trends

JXTA
• developed by Sun Microsystems
• set of 6 XML based open protocols
• Java API offered
Architecture (figure)
Applications (e-mail, auctioning, data storage): JXTA Community Applications, Sun JXTA Applications
JXTA Shell, Peer Commands
Services (indexing, searching, file sharing): JXTA Community Services, Sun JXTA Services
Core (peer establishment, communication management, routing): Peer Groups, Peer Pipes, Peer Monitoring, Security
[http://www.jxta.org]
4. Trends

BOINC
• Berkeley Open Infrastructure for Network Computing
• lets participants contribute resources to solving selected problems
• = “generic SETI@Home”
[http://boinc.berkeley.edu]
4. Trends

Conclusions
For network operators
P2P applications can be very BW-consuming
• extremely popular (and addictive)
• use of inefficient strategies (broadcast, flooding, …)
• “tragedy of the commons”
Danger of bottlenecks
• overlay network has little relation to physical infrastructure
• symmetric relations between peers
Change in user behaviour
• “always” online
• information provider AND information consumer
5. Conclusions

Conclusions
For application developers
People are (extremely) interested in digital content
People are willing to share resources for free (and even want to spend money …)
• make people feel they participate in a large project
• give some credit to users (competition) (top 10 list, eternal fame if solution is found, …)
To avoid digging one’s own grave
• avoid BW-consuming strategies
• include micropayment/trust mechanisms to
  - encourage participation
  - avoid free-riding
  - avoid DoS attacks
5. Conclusions

Conclusions
For application developers
Hacker danger
• need for encryption mechanisms
High performance P2P-platforms are emergent
• reuse of efforts
• reuse of user community
Make sure your application has some scaling effect
• the more users, the more interesting to join !
5. Conclusions