Explore the challenges and solutions in provisioning and scheduling resources for peer-to-peer file sharing, enhancing data sharing services globally. Discusses SuperPeers and Grid provisioning approaches with experimental insights on application-specific scheduling policies.
VL-e Provisioning and Scheduling Resources for World-Wide Data-Sharing Services • Looking at Peer-to-Peer File-Sharing Environments A. Iosup, P. Garbacki, D.H.J. Epema PDS Group, ST/EWI, TU Delft IEEE e-Science Conference, 4-6 Dec 2006, Amsterdam, NL.
World-Wide Data-Sharing is Growing at an Incredible Pace • Broad definition: data-sharing applications in the Internet, e.g., BitTorrent, eDonkey, KaZaA, Gnutella. • 85M daily users [Pouwelse, ICT Kenniscongres’06] • From 10% to 70% of Internet traffic in 5 years [Parker, CacheLogic’04 & ’06] • Average bandwidth is 500 Kbps, double that observed 2 years ago [Pouwelse et al., IPTPS’05; Iosup et al., GP2PC’06] [Figure: ~70% of traffic is P2P file-sharing; courtesy CacheLogic]
Motivating Scenario: P2P File-Sharing Environments • Data-sharing communities • Community peers collaborate in sharing the same file • Service – data sharing with reliability guarantees • Customers – scientists that need data, people that want a movie, etc. • Community operation and Problems • Bootstrap – scalability and availability issues; • Data delivery – depends on node properties (availability, uptime, connectivity), poor performance during flashcrowds; • Data search – polluted responses.
Solution (1/2): SuperPeers • SuperPeer = high-capacity, high-availability peer • SuperPeer-based World-Scale Data-Sharing Environments • Assign SuperPeers additional responsibilities in the system: bootstrap peers, offer data at high speeds for high average community performance, discover and remove polluted content • (this work) Providing reliable world-scale data-sharing services = guaranteeing that enough SuperPeers exist
Solution (2/2): Grid Provisioning • Deploy SuperPeer services on selected Grid nodes • Provide Grid resources, with NO impact on existing load • Assume co-allocation: the Grid is a big bag of resources • Need application-specific scheduling for SuperPeer services (this work)
Outline • Motivation and Goals • SuperPeers Provisioning and Scheduling • Experimental Setup • Results • Conclusion
SuperPeers Provisioning and Scheduling (1/3): System Model • P2P Communities • Common (CC) or flashcrowds (FC) • Duration – δCC, δFC [days] • Size – σCC, σFC [# peers] • Required capacity • Service dispatch rate α • Needed capacity Π = σ/α • Shared or exclusive [Figure: CC and FC communities (labeled λCC, λFC) feed the Grid Provisioning and Scheduling Unit (Sched), which places shared or exclusive SuperPeers on Grid nodes; example with Π = 2]
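As a rough illustration of the capacity formula above, the sketch below computes Π = σ/α for one community. It reads α as the number of peers one SuperPeer can serve, and it rounds up to whole Grid nodes; both are assumptions, as are the function name, parameter names, and sample numbers. The slide itself only gives the ratio.

```python
import math


def needed_capacity(size_peers: int, dispatch_rate: float) -> int:
    """Return the needed capacity Π = σ/α as a whole number of Grid nodes.

    size_peers    -- community size σ, in peers
    dispatch_rate -- service dispatch rate α (assumed: peers served per node)
    Rounding up is an assumption; the slide only states Π = σ/α.
    """
    if dispatch_rate <= 0:
        raise ValueError("dispatch rate must be positive")
    return math.ceil(size_peers / dispatch_rate)


# Hypothetical numbers: 2,000 peers, 1,000 peers served per SuperPeer.
print(needed_capacity(2000, 1000))
```

With these made-up inputs the community needs two Grid nodes, which matches the Π = 2 example in the slide's figure.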
Resource Provisioning and Scheduling (2/3): Performance Metrics • Community Coverage • Peer Coverage • Overall Coverage • Extra Service Cost F, e.g., F = 1 • Legend: C – all communities; Cs – served communities; P – all peers; Ps – served peers; XRes – extra resources needed by communities; SysQRes – already queued load in the Grid
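The metric formulas did not survive in this transcript. As a hedged reading of the legend, the sketch below treats Community Coverage as |Cs| / |C| and Peer Coverage as |Ps| / |P|, and it counts every peer of a served community as a served peer. These definitions and all names are assumptions, not the authors' stated formulas. Overall Coverage and Extra Service Cost are left out because their definitions cannot be recovered from the slide text.

```python
def ratio(served: int, total: int) -> float:
    """Fraction served; defined as 1.0 when there is nothing to serve."""
    return 1.0 if total == 0 else served / total


def community_coverage(sizes: dict, served: set) -> float:
    """Assumed reading: |Cs| / |C|."""
    return ratio(len(served), len(sizes))


def peer_coverage(sizes: dict, served: set) -> float:
    """Assumed reading: |Ps| / |P|, with Ps = peers of served communities."""
    served_peers = sum(sizes[name] for name in served)
    return ratio(served_peers, sum(sizes.values()))


# Hypothetical communities (name -> number of peers) and a served subset.
sizes = {"a": 100, "b": 300, "c": 600}
served = {"a", "b"}
print(community_coverage(sizes, served), peer_coverage(sizes, served))
```

In this made-up case two of three communities are served but they hold well under half of the peers, which shows how the two coverage metrics can diverge.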
Resource Provisioning and Scheduling (3/3): Application-specific Scheduling Policies • CommF – Communities First: maximize the number of served communities • PeerF – Peers First: maximize the number of served peers • OldF – Oldest Community First: favor the oldest community • NewF – Newest Community First: favor the most recent community • NonExclF – Non-Exclusive Resources First: favor communities that can share resource usage
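One way to picture the five policies above is as different sort orders fed to the same greedy allocator. The sketch below is a simplification under stated assumptions, not the authors' scheduler: the `Community` fields, the greedy loop, and the sort keys (smallest capacity first for CommF, most peers per node first for PeerF) are all hypothetical, and resource sharing is reduced to an ordering preference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Community:
    name: str
    size: int         # σ, number of peers
    capacity: int     # Π, Grid nodes needed (must be > 0)
    arrival_day: int  # day the community appeared
    exclusive: bool   # True if it cannot share resources


# Each policy is a sort key; lower key means scheduled earlier (assumed heuristics).
POLICIES = {
    "CommF": lambda c: c.capacity,             # cheapest first, so more communities fit
    "PeerF": lambda c: -c.size / c.capacity,   # most peers per node first
    "OldF": lambda c: c.arrival_day,           # oldest first
    "NewF": lambda c: -c.arrival_day,          # newest first
    "NonExclF": lambda c: c.exclusive,         # sharable (False) before exclusive (True)
}


def schedule(communities, free_nodes: int, policy: str) -> list:
    """Greedily serve communities in policy order while nodes remain."""
    served = []
    for c in sorted(communities, key=POLICIES[policy]):
        if c.capacity <= free_nodes:
            served.append(c.name)
            free_nodes -= c.capacity
    return served


# Hypothetical workload: four communities competing for three Grid nodes.
demo = [
    Community("A", 900, 3, 1, True),
    Community("B", 200, 1, 2, False),
    Community("C", 300, 1, 3, False),
    Community("D", 400, 2, 4, True),
]
for name in POLICIES:
    print(name, schedule(demo, 3, name))
```

On this made-up workload CommF serves two small communities while PeerF serves the single largest one, so each policy favors the metric it is named after at the expense of the other.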
Experimental Setup • Trace-Based Simulator • Daily resource availability from Grid3, LCG, NG traces • Five Data Services load sizes: Low, Normal, High, Very High, Extreme [Figures: Data Service Load; Daily Resource Availability]
Sample Results (1/2) • Normal load: 100% Overall Coverage • Higher loads: demand > supply, so scheduling policies are needed
Sample Results (2/2) • X-axis: time • The scheduling policies behave as designed • CommF, PeerF optimize Community, Peer Coverage • OldF, NonExclF represent trade-offs • Random behaves surprisingly well • No scheduling policy is optimal for all environments and all load sizes (see our paper)
Conclusions and Ongoing Work • Grid for world-wide data-sharing services • Millions of potential users • SuperPeers & Grid resource provisioning • This work • Five application-specific scheduling policies • Trace-based simulations show that: • Normal load supported with NO impact on current load • High+ loads NOT (yet) fully supported (read paper) • System performance comparison (read paper) • Policy design space covered (read paper) • Ongoing work • Failures, service migration, finer time granularity • Actual implementation
Thank you! Questions? Remarks? Observations? p2p@ewi.tudelft.nl http://www.pds.ewi.tudelft.nl/~iosup/index.html [google: “iosup”]