Provisioning and Scheduling Resources for World-Wide Data-Sharing Services

VL-e
Looking at Peer-to-Peer File-Sharing Environments
A. Iosup, P. Garbacki, D.H.J. Epema
PDS Group, ST/EWI, TU Delft
IEEE e-Science Conference, 4-6 Dec 2006, Amsterdam, NL

World-Wide Data-Sharing is Growing at an Incredible Pace
• Broad definition: data-sharing applications in the Internet, e.g., BitTorrent, eDonkey, KaZaA, Gnutella
• 85M daily users [Pouwelse, ICT Kenniscongres’06]
• From 10% to 70% of Internet traffic in 5 years [Parker, CacheLogic’04 & ’06]
• Average bandwidth is 500 Kbps, double that observed 2 years ago [Pouwelse et al., IPTPS’05; Iosup et al., GP2PC’06]
[Figure: ~70% P2P file-sharing. Courtesy CacheLogic]

Motivating Scenario: P2P File-Sharing Environments
• Data-sharing communities
  • Community peers collaborate in sharing the same file
  • Service – data sharing with reliability guarantees
  • Customers – scientists who need data, people who want a movie, etc.
• Community operation and problems
  • Bootstrap – scalability and availability issues
  • Data delivery – depends on node properties (availability, uptime, connectivity); poor performance during flashcrowds
  • Data search – polluted responses

Solution (1/2): SuperPeers
• SuperPeer = high-capacity, high-availability peer
• SuperPeer-based World-Scale Data-Sharing Environments
• Assign SuperPeers additional responsibilities in the system: bootstrap peers, offer data at high speeds for high average community performance, discover and remove polluted content
• (this work) Providing reliable world-scale data-sharing services = guaranteeing that enough SuperPeers exist

Solution (2/2): Grid Provisioning
• Deploy SuperPeer services on selected Grid nodes
• Provide Grid resources, with NO impact on existing load
• Assume co-allocation: the Grid is a big bag of resources
• Need application-specific scheduling for SuperPeer services (this work)

Outline
• Motivation and Goals
• SuperPeers Provisioning and Scheduling
• Experimental Setup
• Results
• Conclusion

SuperPeers Provisioning and Scheduling (1/3): System Model
• P2P Communities
  • Common (CC) or flashcrowds (FC)
  • Duration – δCC, δFC [days]
  • Size – σCC, σFC [# peers]
• Required capacity
  • Service dispatch rate α
  • Needed capacity Π = σ/α
  • Shared or exclusive
[Figure: system model diagram. Labels: CC, FC, λCC, λFC, Grid, Provisioning and Scheduling Unit (Sched), Π=2, Grid nodes, shared, exclusive, SuperPeers]
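The needed-capacity relation Π = σ/α from the system model can be illustrated with a minimal sketch. It assumes that α counts the peers one SuperPeer node can serve and that a fractional result is rounded up to a whole Grid node; neither the rounding nor the function name `needed_capacity` comes from the slides.

```python
import math


def needed_capacity(sigma: int, alpha: float) -> int:
    """Return the number of Grid nodes needed by a community.

    sigma: community size in peers.
    alpha: service dispatch rate (assumed here: peers served per node).
    Rounding up to a whole node is an assumption of this sketch.
    """
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return math.ceil(sigma / alpha)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the formula.
    print(needed_capacity(1000, 500))
    print(needed_capacity(1001, 500))
```

With these hypothetical inputs, a community of 1000 peers at α = 500 needs 2 nodes, and one extra peer pushes the requirement to 3 because of the assumed rounding.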

Resource Provisioning and Scheduling (2/3): Performance Metrics
Legend
• C: all communities; Cs: served communities
• P: all peers; Ps: served peers
• XRes: extra resources needed by communities
• SysQRes: already queued load in the Grid
Metrics
• Community Coverage
• Peer Coverage
• Overall Coverage
• Extra Service Cost F, e.g., F = 1
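The metric formulas did not survive in this transcript. As a hedged illustration only, the sketch below assumes that Community Coverage is the served fraction |Cs|/|C| and that Peer Coverage is |Ps|/|P|, which is one natural reading of the legend. Overall Coverage and Extra Service Cost are not sketched because their definitions cannot be recovered from the slide. The function name `coverage` is hypothetical.

```python
def coverage(served: int, total: int) -> float:
    """Fraction of served items, assumed to be the coverage metric.

    For Community Coverage pass |Cs| and |C|;
    for Peer Coverage pass |Ps| and |P|.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= served <= total:
        raise ValueError("served must lie between 0 and total")
    return served / total


if __name__ == "__main__":
    # Hypothetical counts: 3 of 4 communities and 150 of 200 peers served.
    print(coverage(3, 4))
    print(coverage(150, 200))
```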

Resource Provisioning and Scheduling (3/3): Application-specific Scheduling Policies
• CommF – Communities First: maximize the number of served communities
• PeerF – Peers First: maximize the number of served peers
• OldF – Oldest Community First: favor the oldest community
• NewF – Newest Community First: favor the most recent community
• NonExclF – Non-Exclusive Resources First: favor communities that can share resource usage
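One way to picture the five policies is as different orderings fed to the same greedy allocator. The sketch below is not the authors' scheduler: the `Community` fields, the greedy loop, and the specific sort keys (smallest capacity first for CommF, largest community first for PeerF) are assumptions made for illustration, and the sketch does not model how shared nodes would actually reduce resource usage.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Community:
    name: str
    size: int         # sigma: number of peers
    arrival_day: int  # day the community appeared (smaller = older)
    capacity: int     # Pi: Grid nodes needed
    can_share: bool   # True if the community accepts shared nodes


# Each policy is a sort key; communities are tried in ascending key order.
# These keys are illustrative guesses at what "favor" means for each policy.
POLICIES: Dict[str, Callable[[Community], Tuple]] = {
    "CommF": lambda c: (c.capacity,),         # cheapest communities first
    "PeerF": lambda c: (-c.size,),            # largest communities first
    "OldF": lambda c: (c.arrival_day,),       # oldest first
    "NewF": lambda c: (-c.arrival_day,),      # most recent first
    "NonExclF": lambda c: (not c.can_share,)  # sharing-capable first
}


def schedule(communities: List[Community], free_nodes: int,
             policy: str) -> List[Community]:
    """Greedily serve communities in policy order while nodes remain."""
    served = []
    for c in sorted(communities, key=POLICIES[policy]):
        if c.capacity <= free_nodes:
            served.append(c)
            free_nodes -= c.capacity
    return served


if __name__ == "__main__":
    # Hypothetical workload: three communities competing for 4 free nodes.
    demo = [
        Community("a", size=100, arrival_day=1, capacity=4, can_share=False),
        Community("b", size=30, arrival_day=2, capacity=1, can_share=True),
        Community("c", size=40, arrival_day=3, capacity=2, can_share=True),
    ]
    for name in POLICIES:
        print(name, [c.name for c in schedule(demo, 4, name)])
```

Keeping each policy as a sort key makes it easy to add a further ordering (for example a random baseline) without touching the allocation loop.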

Experimental Setup
• Trace-based simulator
• Daily resource availability from Grid3, LCG, NG traces
• Five data-service load sizes: Low, Normal, High, Very High, Extreme
[Figures: Data Service Load; Daily Resource Availability]

Sample Results (1/2)
• Normal load: 100% Overall Coverage
• Higher loads: demand > supply, so scheduling policies are needed
[Figure annotation: demand > supply]

Sample Results (2/2)
• X-axis: time
• The scheduling policies behave as designed
  • CommF, PeerF optimize Community, Peer Coverage
  • OldF, NonExclF represent trade-offs
  • Random behaves surprisingly well
• No optimal scheduling policy in all environments / all load sizes (our paper)

Conclusions and Ongoing Work
• Grid for world-wide data-sharing services
  • Millions of potential users
  • SuperPeers & Grid resource provisioning
• This work
  • Five application-specific scheduling policies
  • Trace-based simulations show that:
    • Normal load supported with NO impact on current load
    • High+ loads NOT (yet) fully supported (read paper)
    • System performance comparison (read paper)
    • Policy design space covered (read paper)
• Ongoing work
  • Failures, service migration, finer time granularity
  • Actual implementation

Thank you! Questions? Remarks? Observations?
p2p@ewi.tudelft.nl
http://www.pds.ewi.tudelft.nl/~iosup/index.html [google: “iosup”]
