Efficient P2P backup through buffering at the edge
S. Defrance, A.-M. Kermarrec (INRIA), E. Le Merrer, N. Le Scouarnec, G. Straub, A. van Kempen
Peer-to-peer backup system
Exploit users' resources: each user provides storage space
• "Pure" P2P backup systems are severely limited by:
- Low availability
- Asymmetric bandwidth (low uplink speed)
- Asynchrony
[Figure: Peer 1 and Peer 2 on a timeline from 0 h to 24 h]
Time To Backup (TTB) and Time To Restore (TTR) may be very high
Practical deployment is limited
CDN-assisted architecture
Architecture proposed in P2P 2010: the server is the reliable component
Performance approaches that of client-server systems (in terms of Time To Backup and Time To Restore)
• However:
- A centralized part remains
- Not fully convenient for users
What we propose
Take into account the low-level structure of the network (i.e., the presence of gateways in home networks)
Use gateways to distribute the centralized part of the hybrid scheme
[Figure: several home networks (LANs)]
Gateways are turned into a stable buffering layer
They mask the asynchrony between peers
Why are gateways good candidates?
• Already present in users' homes
• Storage capable (for buffering)
• Highly available
• At the frontier between a fast LAN and a slow WAN
[Figure: home network]
Gateways are highly available
• We periodically pinged a random set of static IP addresses of a French ISP*
- 25,000 gateways
- For 7.5 months
• Average gateway availability: 86%
- A large part is very stable
- A few have power-off habits (on a daily or holiday basis)
*The trace is available at: http://www.thlab.net/~lemerrere/trace_gateways
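An average availability like the one reported above can be computed as the fraction of answered pings per gateway, averaged over all gateways. The following is a minimal sketch under assumptions: the in-memory trace layout, the gateway names, and the function names are hypothetical and are not the format of the published trace.

```python
from statistics import mean

def gateway_availability(samples):
    """Fraction of probes that were answered; samples is a list of booleans."""
    if not samples:
        raise ValueError("no ping samples for this gateway")
    return sum(samples) / len(samples)

def average_availability(trace):
    """Mean of the per-gateway availabilities; trace maps gateway id -> samples."""
    return mean(gateway_availability(s) for s in trace.values())

# Hypothetical toy trace (not the real data): True means the gateway answered.
toy_trace = {
    "gw-a": [True, True, True, True],    # always on
    "gw-b": [True, False, True, False],  # powered off half of the time
}

print(average_availability(toy_trace))
```

Averaging per gateway first, rather than pooling all pings, keeps gateways with more samples from dominating the result.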
How does it work?
• Prepare (LAN speed)
• Backup (WAN speed)
• Offload (LAN speed)
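To give a feel for why the LAN-speed phases are short compared with the WAN-speed phase, the sketch below compares raw transfer times for a single archive. It rests on assumptions: the LAN throughput of 10 MB/s is a hypothetical value that does not come from the slides, the WAN value is the 66 kB/s mean uplink used in the evaluation, and peer availability and erasure-code redundancy are ignored.

```python
def transfer_hours(size_bytes, speed_bytes_per_s):
    """Raw transfer time in hours, ignoring availability and protocol overhead."""
    return size_bytes / speed_bytes_per_s / 3600

ARCHIVE_BYTES = 1e9   # 1 GB archive, as in the evaluation scenario
LAN_SPEED = 10e6      # assumed LAN throughput (hypothetical): 10 MB/s
WAN_UPLINK = 66e3     # mean uplink from the evaluation setup: 66 kB/s

prepare_h = transfer_hours(ARCHIVE_BYTES, LAN_SPEED)   # LAN-speed phase
backup_h = transfer_hours(ARCHIVE_BYTES, WAN_UPLINK)   # WAN-speed phase

print(f"prepare: {prepare_h:.3f} h, backup: {backup_h:.1f} h")
```

Under these assumptions the user's machine is busy for a couple of minutes, while the WAN upload takes hours and can be carried by the gateway.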
How do we evaluate?
Trace-based simulation using public traces
• To model peer behavior:
- Skype: 28 days, 1269 peers, mean availability = 0.5
- Jabber: 28 days, 465 peers, mean availability = 0.27
• To model gateway behavior: our gateway trace
• To model uplink bandwidth: trace from a study of residential broadband networks
- Mean uplink = 66 kB/s
Scenario:
- Size of archive: 1 GB
- Data creation: Poisson process (3 backups/month/user on average)
- Erasure code
- 50 simulations/curve
We randomly assign one gateway and one uplink speed to each peer of each trace
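The data-creation model in the scenario can be illustrated by drawing backup request times from a Poisson process, that is, with exponentially distributed gaps between requests. This is a minimal sketch and not the authors' simulator: the function name, the 30-day month used to convert the rate, and the fixed seed are assumptions.

```python
import random

def backup_request_times(rate_per_day, horizon_days, rng):
    """Request times (in days) of a Poisson process over [0, horizon_days)."""
    times = []
    t = rng.expovariate(rate_per_day)  # gap before the first request
    while t < horizon_days:
        times.append(t)
        t += rng.expovariate(rate_per_day)  # exponential inter-arrival gap
    return times

RATE_PER_DAY = 3 / 30   # 3 backups/month/user, assuming a 30-day month
TRACE_DAYS = 28         # length of the Skype and Jabber traces

rng = random.Random(42)  # fixed seed so the run is repeatable
requests = backup_request_times(RATE_PER_DAY, TRACE_DAYS, rng)
print(len(requests), "backup requests for one simulated user")
```

With these values a user issues 2.8 requests on average over the 28-day trace.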
What do we evaluate?
• We evaluate:
- Time To Backup (hours)
- Time To Restore (hours)
- Mean and max data buffered (MB)
TTB: time between the backup request and the moment the last block has been completely uploaded
TTR: time between the restore request and the moment enough data has been downloaded to reconstruct the file
We compare: pure P2P (P2P), Gateway-Assisted (GWA), CDN-Assisted (CDNA)
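The two metrics can be written down directly from their definitions. The sketch below is illustrative only: the function names are hypothetical, and the parameter k (the number of blocks sufficient to reconstruct the file) is an assumption based on the usual any-k-of-n property of erasure codes, since the slides do not give the code parameters.

```python
def time_to_backup(request_time, upload_done_times):
    """TTB: from the backup request until the last block is fully uploaded."""
    return max(upload_done_times) - request_time

def time_to_restore(request_time, download_done_times, k):
    """TTR: from the restore request until k blocks have been downloaded."""
    if len(download_done_times) < k:
        raise ValueError("not enough blocks were downloaded to reconstruct")
    kth_completion = sorted(download_done_times)[k - 1]
    return kth_completion - request_time

# Toy timestamps in hours (hypothetical values)
print(time_to_backup(10.0, [12.0, 15.5, 13.0]))            # 5.5
print(time_to_restore(20.0, [26.0, 22.0, 30.0, 24.0], 2))  # 4.0
```

TTB is driven by the slowest block, whereas TTR only needs the k fastest ones, which is one reason the two metrics can differ widely.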
TTB & TTR (Skype trace)
• Time To Backup (stored safely at a remote place)
• Time To Restore (retrieve an archive locally)
Scaling (Skype trace)
Better scaling with archive size: this enables users to back up larger amounts of data
Dimensioning (Skype trace)
• Low storage needs
- 1 GB archives: 2.5 GB needed (99%)
- Realistic for current gateways
• Average usage remains low
- Less than 1 MB here
- Data is really offloaded to peers
- Gateways are effectively used as buffers
[Figure: average storage on gateways (MB); annotation: stopping backups]
Conclusion
• Realistic architecture for P2P backup systems
• Evaluation using trace-based simulation
• TTB and TTR are greatly reduced
- (The network connection can be used more efficiently)
• More convenient for users:
- Backup tasks can be offloaded quickly (at LAN speed) from the user's machine to the gateway
• Fully decentralized
• Trace of gateway availability